By Centico Research
There is a growing temptation in research to fill gaps with synthetic data: model-generated “respondents” that answer the way a real person supposedly would. It is fast, cheap, and, for some early-stage exploration, genuinely useful. But it is also where a lot of studies will quietly go wrong.
A synthetic respondent can only reflect what a model already learned. It cannot surprise you, cannot tell you about the thing nobody predicted, and cannot represent an audience the model never really understood — which, awkwardly, is most of the hard-to-reach B2B and niche audiences that research exists to study. Lean on it too early and you end up with findings that confirm assumptions rather than challenge them.
As AI-generated answers spread, the same threat to data quality shows up inside real panels: bots, fraud, and people speeding through for the incentive. The value, then, is not just “sample”; it is sample you can prove is real. Verified identity, fraud screening, attention checks, and honest sourcing are turning into the premium part of the offer, not the boring back office.
Use synthetic methods to sharpen hypotheses and pressure-test design. But when a decision rides on the answer, make sure a real, screened human gave it. Quality at the source is the foundation everything else — cleaning, coding, reporting — is built on.
— Centico Research