Market Segmentation and Lead Enrichment with LLMs

Market segmentation is an essential part of any go-to-market strategy. Segmenting a large list of prospects can be very challenging. Fortunately, it is precisely the kind of work that large language models excel at, but it does take a few tricks.

From a marketer's perspective, potential customers belong to the same segment if they:

  1. Will use the same product
  2. Can be reached by the same sales process
  3. Will look at each other as references

Let’s say we have a list of companies, where each company is described by two attributes: Name and Location. We could obtain this list by hosting a webinar series and asking each attendee for the name of their company.

How do we segment this list?

For a small list, we could Google each company, visit their website, understand their business, and assign the company to one or more segments.

But what if our list has hundreds of companies?

Generative AI to the rescue!

Large language models can use tools such as Python and web search.

Ideally, we should be able to hand the list to an AI model and ask it to enrich the list with additional attributes by researching the companies online. We could then ask the model to segment the list using those attributes.

In reality, it’s easier said than done.

To begin with, the list is too big to fit into an LLM's context window. We must break it into chunks, have the model process each chunk, and collate the results.
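
To make the chunking step concrete, here is a minimal Python sketch. It is not PyAQ itself; the chunk size of 20 records and the Name/Location dictionaries are illustrative assumptions.

  def chunk_records(records, chunk_size=20):
      # Yield successive slices small enough to fit in a single prompt.
      for start in range(0, len(records), chunk_size):
          yield records[start:start + chunk_size]

  companies = [
      {"Name": "Example Co", "Location": "Austin, TX"},       # illustrative records
      {"Name": "Sample GmbH", "Location": "Berlin, Germany"},
      # ... hundreds more
  ]

  chunks = list(chunk_records(companies))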

Second, capable models hosted by OpenAI and Microsoft are guarded by request rate limiters. If we just throw all our data at a model at once, we will exceed a rate limit and get a bunch of errors.
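
One common way to cope is to wrap every model call in an exponential-backoff retry. The sketch below is generic: call_model and RateLimitError are placeholders for whatever client function and provider-specific exception you actually use.

  import random
  import time

  class RateLimitError(Exception):
      # Placeholder for the provider-specific rate-limit exception.
      pass

  def call_with_backoff(call_model, prompt, max_retries=5):
      delay = 1.0
      for _ in range(max_retries):
          try:
              return call_model(prompt)
          except RateLimitError:
              # Back off with a little jitter before trying again.
              time.sleep(delay + random.uniform(0, 0.5))
              delay *= 2
      raise RuntimeError("Gave up after repeated rate-limit errors")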

Third, models are getting better at producing results in a standard format, such as JSON or CSV. Still, with many records, we are almost guaranteed to get garbage from a model occasionally. When this happens, the best approach is to resubmit the request while pointing out the error to the model. 
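
A simple version of that validate-and-resubmit loop might look like the sketch below. Again, ask_model is a placeholder; the only real logic is parsing the reply as JSON and, on failure, resubmitting the prompt with the parser's error message appended.

  import json

  def enrich_chunk(ask_model, prompt, max_attempts=3):
      error = None
      for _ in range(max_attempts):
          attempt_prompt = prompt
          if error:
              # Tell the model what went wrong and ask for valid JSON only.
              attempt_prompt += (
                  f"\n\nYour previous reply was not valid JSON ({error}). "
                  "Please respond with valid JSON only."
              )
          reply = ask_model(attempt_prompt)
          try:
              return json.loads(reply)
          except json.JSONDecodeError as exc:
              error = str(exc)
      raise ValueError("The model never returned valid JSON")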

Slowly but surely, this is turning into a week-long programming project. 

But don’t despair.

PyAQ, an open-source Generative AI platform from AnyQuest, takes care of this for you automatically (a generic sketch of the overall pattern follows the list):

  1. Breaks down a long list of records into chunks digestible by a model
  2. Maps each chunk to a separate worker for parallel processing
  3. Equips each worker with access to LLMs, web search, and Python
  4. Throttles requests to honor request and token rate limits
  5. Verifies results produced by the model and resubmits the ones that failed
  6. Collates and saves the results produced by multiple workers  
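
Taken together, these steps form a map-and-collate pattern. The sketch below shows the shape of that pattern with a plain thread pool; it is not PyAQ's actual API, and process_chunk stands in for the chunking, throttling, and validation pieces sketched above.

  from concurrent.futures import ThreadPoolExecutor

  def process_chunk(chunk):
      # Placeholder: enrich one chunk of companies via the LLM,
      # using the backoff and validation helpers sketched earlier.
      return [{**company, "Segment": "TBD"} for company in chunk]

  def enrich_all(chunks, max_workers=4):
      # Map each chunk to a worker thread, then collate the results.
      with ThreadPoolExecutor(max_workers=max_workers) as pool:
          per_chunk = list(pool.map(process_chunk, chunks))
      return [record for chunk_result in per_chunk for record in chunk_result]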

Thanks to these built-in services, you only need to do the fun part: craft a prompt instructing the model to search the web and gather information about the companies on the list. 

Which is exactly what I did. Check out my super-low-code example here:

https://github.com/anyquest/pyaq/blob/main/examples/apps/companies.yml

Reach out if you have a list or two that must be “enriched.” We are here to help.
