Synthetic Data is Transforming Market Research

09-08-2025 Technology

By Managing Director James Butcher

In today’s fast-moving consumer landscape, companies are demanding richer, “real-time” insights to provide a better understanding of their products, customers, industry and competition. Yet traditional methods of market research – surveys, panels and focus groups – are intermittent, labor-intensive and expensive, limiting their utility for companies seeking a granular understanding of changing market sentiment.

Synthetic data – which is artificially generated using AI models to mimic the characteristics of real-world surveys and panel responses – promises to alleviate many of these industry pain points. In this article, we consider how synthetic data compares to the real thing and the broader implications for the market research industry.

Synthetic vs. Real-World Data

When synthetic data is used on its own, results have been disappointing, with only 31% of respondents in one survey rating the results as “great.” However, the quality of synthetic data improves markedly when the model that generates it is trained on real-world survey responses. In fact, when Gen AI is trained on primary research, the resulting synthetic datasets have more than held their own against traditional methods.

For example, one synthetic data company partnered with EY to conduct a double-blind test using the global professional services firm’s annual brand-survey questionnaire aimed at CEOs of US companies with more than $1 billion in revenue. When EY compared results generated from a thousand synthetic personas to its actual survey results, it found a 95% correlation. Notably, the EY synthetic data survey was produced in days, not months, for a fraction of the cost of a traditional survey.
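For readers who want to see what such a comparison might look like in practice, here is a minimal sketch. It assumes a Pearson correlation computed over per-question results; the article does not say which measure EY used, and the function name and all figures below are hypothetical, for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: share of respondents agreeing with each of five questions.
real_survey = [0.62, 0.48, 0.71, 0.35, 0.54]
synthetic_panel = [0.60, 0.51, 0.68, 0.38, 0.55]

r = pearson(real_survey, synthetic_panel)
print(f"Correlation between real and synthetic results: {r:.2f}")
```

A high correlation shows that the two sets of results move together across questions. It does not by itself show that individual answers match, so it is worth reviewing question-level differences as well.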

Exploring Diverse Use Cases

From simulation testing to expanding representation of niche groups or populations, synthetic data has a variety of applications across market research. Some of the most promising use cases include:

Scenario simulation: Testing product concepts, pricing strategies, or campaign ideas before launch.
Questionnaire testing: Evaluating survey logic and clarity before field deployment.
Sample augmentation: Boosting representation in hard-to-reach groups (e.g. Gen Z or rural populations).
Synthetic personas: Creating AI-driven chatbots (also known as “digital twins”) that simulate customer segments for ongoing engagement.
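
To make the sample augmentation use case concrete, the sketch below works out how many synthetic records would be needed to lift a group to a target share of the combined sample. It is simple arithmetic under stated assumptions, not a description of any vendor's method; the function name and the figures are hypothetical.

```python
from math import ceil


def synthetic_records_needed(group_count, total_count, target_share):
    """Records to add so the group reaches target_share of the enlarged sample.

    Solves (group_count + n) / (total_count + n) >= target_share for the
    smallest whole number n, assuming every added record belongs to the group.
    """
    if not 0 < target_share < 1:
        raise ValueError("target_share must be between 0 and 1 (exclusive)")
    if total_count <= 0 or not 0 <= group_count <= total_count:
        raise ValueError("counts must satisfy 0 <= group_count <= total_count")
    shortfall = target_share * total_count - group_count
    if shortfall <= 0:
        return 0  # the group already meets or exceeds the target share
    return ceil(shortfall / (1 - target_share))


# Hypothetical example: 50 rural respondents in a sample of 1,000,
# with a target of 15% rural representation.
extra = synthetic_records_needed(group_count=50, total_count=1000, target_share=0.15)
print(f"Synthetic rural records to add: {extra}")
```

Note that the required number is larger than the raw shortfall, because each added record also enlarges the total sample.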

Evaluating the Pluses and Minuses of Synthetic Data

Beyond the cost and timing benefits, synthetic data offers other advantages: increased representation of diverse and/or hard-to-reach groups, stronger privacy and data compliance, predictive power to improve forecasting of consumer behavior and market trends, and agile testing that allows companies to roll out multiple versions of a concept or message.

Of course, synthetic data is far from a silver bullet. Like human researchers, the models that generate it can produce less-than-reliable results when trained on information that is incomplete and/or unrepresentative. In those cases, synthetic data can exaggerate biases and produce results lacking in depth and variety, which can lead to flawed decision-making.
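
One basic safeguard is to check how representative the training sample is before any synthetic data is generated. The sketch below compares each group's share of the sample against a population benchmark and flags large gaps. The function name, the 5-point tolerance and all figures are assumptions chosen for illustration.

```python
def representation_gaps(sample_counts, population_shares, tolerance=0.05):
    """Return groups whose sample share differs from the benchmark by more
    than `tolerance`, mapped to the gap (sample share minus population share).
    """
    total = sum(sample_counts.values())
    if total <= 0:
        raise ValueError("sample must contain at least one respondent")
    gaps = {}
    for group, population_share in population_shares.items():
        # Groups missing from the sample count as a share of zero.
        sample_share = sample_counts.get(group, 0) / total
        gap = sample_share - population_share
        if abs(gap) > tolerance:
            gaps[group] = gap
    return gaps


# Hypothetical training sample and population benchmark.
training_sample = {"urban": 590, "suburban": 290, "rural": 120}
population = {"urban": 0.55, "suburban": 0.27, "rural": 0.18}

flagged = representation_gaps(training_sample, population)
for group, gap in flagged.items():
    direction = "over" if gap > 0 else "under"
    print(f"{group}: {direction}-represented by {abs(gap):.0%}")
```

A negative gap marks a group that is under-represented in the training data, which is where a model is most likely to produce thin or biased synthetic responses.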

Ultimately a Complement to, not a Replacement for, Traditional Market Research

Synthetic data is a powerful tool that, when used responsibly, can revolutionize market research. It addresses long-standing challenges – cost, speed, representation, and privacy – while opening new avenues for predictive modeling and agile testing. However, it would be irresponsible to ignore its limitations. Real human data remains irreplaceable for capturing emotional depth, cultural nuance, and authentic behavior.

For market research firms, the path forward lies in thoughtful integration – leveraging synthetic data where it excels, validating it rigorously, and combining it with real-world insights to drive smarter decisions.

The companies best positioned to thrive in this new environment have extensive real-world data, advanced capabilities in areas such as AI, machine learning, and data science, and a consumer-grade user interface.

Implications for Deal Activity in Market Research

The rapid adoption of synthetic data is likely to lead to increased M&A across the sector following several years of reduced activity. In this new environment, scale matters more than ever. As such, established players should pursue consolidation to gain access to additional real-world data to train their AI models. Traditional market research companies should also seek out specialist synthetic data providers to enhance their modeling capabilities and their ability to draw actionable insights.

A key focus area for buyers / investors should be understanding a target’s data usage rights – the ability to use customers’ past and future survey results for training models and generating synthetic data.

Companies with the flexibility to exploit customers’ data will be best positioned to harness the power and promise of synthetic data to drive meaningful value creation.
