11 Comments

Phenomenal! Would love to get in-depth ideas about the data engineering stack, especially the signal-to-noise ratio (for data-driven sourcing) and how to optimize it.

Stoked about what's to come!!!

Sep 22, 2022 · Liked by Andre Retterath

Hi, I would add "company data" to the enrichment point of data-driven sourcing approaches: for example, which technologies companies are using, based on their self-description in job postings (e.g., as used in https://techmap.io).


Love your thinking! I have come at this problem from a different perspective. I am an industry analyst who covers one sector, albeit a fast-growing one: cybersecurity. I even recently changed the messaging for IT-Harvest to "A data-driven analyst firm." :-)

Most analyst firms take a top-down approach. When I was at Gartner, we created Magic Quadrants that often had a minimum company size requirement. When I left Gartner, I wanted to be able to answer a very simple question: How many cybersecurity companies are there?

So it is the same problem that you describe: How to find ALL the companies? Luckily, the cybersecurity industry is a lot easier to research than, say, the restaurant industry. :-) Because I have been in the business for 27 years, people reach out to me via email, Twitter, and LinkedIn (inbound). Other methods I use:

-Conference exhibitors. This is best for finding new companies in regions outside the US.

-Investor portfolio pages. It is amazing that even after an investment many companies are still hidden.

-AI applied to a news feed from Feedly on "cybersecurity." I check that every day.

-Crunchbase has 12,000 results if you search on cybersecurity; PitchBook has 25,000. But you have to go through each one to verify that they are still in business and are actual product companies. The big data sources don't have industry analysts to determine what a company does. It is easy for a law firm, consultant, or insurance firm to plaster "Cyber" all over their website, and the "researchers" just tag them with those words.

-LinkedIn has 41,000 "cybersecurity" listings. We are systematically working our way through those. We have a success rate of 0.75%, so it is an arduous process.

After 17 years of doing this, with an intensive focus over the last three years, we have 3,015 cybersecurity vendors in 17 categories and 880 subcategories. We launched a SaaS dashboard that lets investors subscribe and get access to all the data. Certainly, for an investor just entering the field of cybersecurity, we can save them thousands of hours of research.

Here is an idea: Track new domain registrations. Maybe crawl them automatically and use ML to identify new companies? According to this site https://whoisds.com/newly-registered-domains there are about 100K per day!
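A minimal sketch of what the first filtering step of that idea might look like, assuming the day's newly registered domains have already been downloaded as a plain list of names. The keyword list and the `candidate_domains` helper are hypothetical stand-ins for illustration, and simple keyword matching takes the place of the ML step here:

```python
# Hypothetical keyword list; a real pipeline would likely replace this
# with a trained classifier run on the crawled page content.
SECURITY_KEYWORDS = ("cyber", "security", "secure", "threat", "infosec", "malware")


def candidate_domains(domains):
    """Return the lower-cased domains whose name contains a security keyword."""
    matches = []
    for raw in domains:
        name = raw.strip().lower()
        if not name:
            continue
        # Drop the top-level domain so ".security"-style TLDs do not match by themselves.
        label = name.rsplit(".", 1)[0]
        if any(keyword in label for keyword in SECURITY_KEYWORDS):
            matches.append(name)
    return matches


# Made-up sample names, not real registration data.
sample = ["acme-threatintel.com", "bobs-pizza.net", "SecureVaultLabs.io", "gardenhose.org"]
print(candidate_domains(sample))
```

At roughly 100K registrations per day, a cheap filter like this would only narrow the list; the surviving domains would still need to be crawled and verified as actual product companies.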
