Market Segmentation and Lead Enrichment with LLMs
Market segmentation is an essential part of any go-to-market strategy. From a marketer's perspective, potential customers belong to the same segment if they:
- Will use the same product
- Can be reached by the same sales process
- Will look at each other as references
Let’s say we have a list of companies, where each company is described by two attributes: Name and Location. We could obtain this list by hosting a webinar series and asking each attendee for the name of their company.
How do we segment this list?
For a small list, we could Google each company, visit their website, understand their business, and assign the company to one or more segments.
But what if our list has hundreds of companies?
Generative AI to the rescue!
Large language models can use tools such as Python and web search.
Ideally, we should be able to give the list to an AI model and ask it to enrich the list with additional attributes by researching the companies online. Next, we could ask the model to segment the list using these attributes.
In reality, it’s easier said than done.
To begin with, the list is too big to fit in an LLM's context window. We must break it down into chunks, have the LLM process each chunk, and collate the results.
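To make the chunking step concrete, here is a minimal Python sketch. The helper name `chunked`, the chunk size, and the sample records are all made up for illustration; a real chunk size would depend on the model's context window and on how much text each record produces.

```python
from typing import Iterator


def chunked(records: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Hypothetical sample data: seven companies with a name and a location.
companies = [{"name": f"Company {i}", "location": "Unknown"} for i in range(7)]

# Seven records in chunks of three: two full chunks and one partial chunk.
chunks = list(chunked(companies, 3))
```

Each chunk can then be sent to the model on its own, and the per-chunk results concatenated in the same order.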
Second, capable models hosted by OpenAI and Microsoft are protected by request rate limits. If we just throw all our data at a model, we will exceed a rate limit and get a bunch of errors.
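One simple way to stay under a request limit is to space the calls out. The sketch below is an assumption-laden illustration, not any provider's SDK: the `Throttle` class and its `max_per_minute` parameter are hypothetical, and it only handles requests per minute, not token limits.

```python
import time


class Throttle:
    """Space calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock    # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.next_allowed = None

    def wait(self):
        """Block until the next request is allowed, then reserve the next slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Calling `throttle.wait()` right before every model request keeps a single worker under the limit. With several parallel workers, the throttle would have to be shared and protected by a lock.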
Third, models are getting better at producing results in a standard format, such as JSON or CSV. Still, with many records, we are almost guaranteed to get garbage from a model occasionally. When this happens, the best approach is to resubmit the request while pointing out the error to the model.
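Here is a rough sketch of that resubmit-with-feedback loop, assuming the model is asked for a JSON array. The `ask_model` callable is a stand-in for whatever client you use to call the LLM, and the wording of the follow-up prompt is only an example.

```python
import json
from typing import Callable


def enrich_with_retry(
    prompt: str,
    ask_model: Callable[[str], str],  # hypothetical: takes a prompt, returns the reply text
    max_attempts: int = 3,
) -> list[dict]:
    """Ask for a JSON array; on a bad reply, resubmit with the error attached."""
    current = prompt
    last_error = None
    for _ in range(max_attempts):
        reply = ask_model(current)
        try:
            data = json.loads(reply)
            if not isinstance(data, list):
                raise ValueError("expected a JSON array of records")
            return data
        except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
            last_error = err
            # Point out the error to the model and ask again.
            current = (
                f"{prompt}\n\nYour previous reply could not be used: {err}\n"
                "Reply again with a valid JSON array only."
            )
    raise RuntimeError(f"no valid reply after {max_attempts} attempts: {last_error}")
```

Capping the number of attempts matters: a record the model keeps getting wrong should surface as an error instead of burning through the rate limit.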
Slowly but surely, this is turning into a week-long programming project.
But don’t despair.
PyAQ, an open-source Generative AI platform from AnyQuest, takes care of this for you automatically:
- Breaks down a long list of records into chunks digestible by a model
- Maps each chunk to a separate worker for parallel processing
- Equips each worker with access to LLMs, web search, and Python
- Throttles requests to honor request and token rate limits
- Verifies results produced by the model and resubmits the ones that failed
- Collates and saves the results produced by multiple workers
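To give a feel for the chunk, map, and collate steps in the list above, here is a generic Python sketch built on the standard library's thread pool. It is not PyAQ's implementation: `process_chunk` and `run_pipeline` are hypothetical names, and the worker only tags each record where a real one would call an LLM and search the web.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    # Placeholder for the real work (LLM call plus web search).
    # Here each record just gets a dummy "segment" attribute.
    return [{**record, "segment": "unknown"} for record in chunk]


def run_pipeline(records, chunk_size=2, workers=4):
    """Split records into chunks, process the chunks in parallel, collate the rows."""
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_chunk, chunks)  # map() preserves chunk order
        return [row for chunk_result in results for row in chunk_result]
```

Threads are a reasonable fit here because the workers spend most of their time waiting on network calls.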
Thanks to these built-in services, you only need to do the fun part: craft a prompt instructing the model to search the web and gather information about the companies on the list.
Which is exactly what I did. Check out my super low code here:
https://github.com/anyquest/pyaq/blob/main/examples/apps/companies.yml
Reach out if you have a list or two that must be “enriched.” We are here to help.