The "Shit In, Shit Out" Fallacy

Why Synthetic Research Without Real Customers Is Just Expensive Guessing

18.02.2026

The oldest rule in data — and why it refuses to die

There's a phrase every analyst learns in their first week on the job: Shit in, shit out.

You can build the most elegant financial model. The most beautiful dashboard. The most sophisticated regression. None of it matters if the numbers going in are wrong. The output isn't insight — it's fiction with formatting.

This rule was born in a deterministic world. Spreadsheets. Databases. Structured inputs, structured outputs. And for decades, it served as a reliable immune system against overconfidence in bad data.

Then came AI. And suddenly, the outputs got really articulate.

Large language models don't just process your assumptions — they dress them up in eloquent paragraphs, present them with the confidence of a McKinsey partner, and never once pause to say: "Actually, I'm not sure about this." Which is precisely what makes the shit-in-shit-out problem more dangerous than ever.

Because in a probabilistic world, bad inputs don't just produce wrong answers. They produce convincing wrong answers.

The synthetic respondent gold rush

Let's be honest: the appeal is obvious.

Synthetic interview panels — AI-generated participants that simulate your target audience — promise everything a research team dreams of. They're faster (minutes, not months). Cheaper (pennies, not thousands). Infinitely scalable. Available at 3 AM. Never cancel. Never ramble off-topic.

And the market is moving fast. According to Qualtrics' 2025 Market Research Trends Report (surveying 3,000+ researchers across 14 countries), 71% of researchers expect synthetic responses to make up more than half of all data collection within three years. That's not a fringe prediction — it's a consensus.

Recent fundraising rounds reflect this conviction. We won't name our competitors here, but the race is on, particularly in the US.

But here's where it gets interesting — and where most people stop thinking critically.

The confidence trap

Researchers at Carnegie Mellon interviewed 19 qualitative researchers about AI-generated interview responses. Their finding was striking — not because AI responses were obviously bad, but because they were deceptively good.

AI-generated answers sound plausible. Articulate. Structured. But they lack something no model architecture can fabricate: actual lived experience.

The researchers coined a term for it: the "surrogate effect." When AI stands in for real communities, it doesn't just approximate their voice — it can distort or erase it entirely. Marginalized perspectives. Edge cases. The messy contradictions that make human beings human. All smoothed away by a model optimized for coherence.

And coherence, it turns out, is the enemy of insight.

Consider what studies comparing synthetic and real survey responses found: the headline numbers looked similar. Encouraging, right? But dig one layer deeper and the statistics were fundamentally broken. Variance was artificially tight — synthetic responses clustered far more closely around the mean than real human data. The result: false precision. The feeling of certainty, without the messy reality that creates actual understanding.
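To make the false-precision point concrete, here is a minimal sketch in Python. The two score lists are made-up illustrative numbers on a 1-to-5 scale, not data from any of the cited studies. They are built so the headline mean matches while the spread does not.

```python
from statistics import mean, stdev

# Hypothetical 1-5 satisfaction scores, invented purely for illustration.
real_scores = [1, 2, 5, 3, 5, 1, 4, 5, 2, 2]
synthetic_scores = [3, 3, 4, 3, 2, 3, 3, 4, 3, 2]


def summarize(scores):
    """Return the headline number (mean) and the spread (sample std dev)."""
    return {"mean": mean(scores), "stdev": stdev(scores)}


real = summarize(real_scores)
synthetic = summarize(synthetic_scores)

# A ratio well below 1 means the synthetic panel is far more tightly
# clustered around the mean than the real respondents are.
spread_ratio = synthetic["stdev"] / real["stdev"]

print(f"real:      mean={real['mean']:.2f}, stdev={real['stdev']:.2f}")
print(f"synthetic: mean={synthetic['mean']:.2f}, stdev={synthetic['stdev']:.2f}")
print(f"spread ratio (synthetic / real): {spread_ratio:.2f}")
```

In this toy example both panels report the same average, yet the synthetic one shows well under half the spread. A dashboard that only displays the mean would never reveal the difference.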

Or take the sycophancy bias. LLMs are optimized to be helpful. Agreeable. In everyday use, that's a feature. In a research context, it's a catastrophe. Synthetic respondents are structurally inclined to give you the answer you're hoping for — which is precisely the answer you should be most skeptical of.

Insights Map: Real human insights are messy. And that's the point. Outliers matter!

The micro-expression moment you'll never get from a model

Here's a story that crystallizes the problem.

In a real-world qualitative study, researchers were interviewing engineers about their sustainability practices. The verbal answers were clean: yes, sustainability is a priority. Yes, we're committed.

But the researcher caught something else. Micro-expressions — a flash of contempt, a moment of surprise, a flicker of disgust — that suggested the spoken words and the felt reality were misaligned. When pressed, the engineers revealed an uncomfortable truth: under supply chain pressure, they consistently prioritized availability over sustainability. Not because they didn't care, but because the system incentivized it.

That insight was invisible to synthetic respondents. Not because the AI was poorly trained, but because it was never in the room. It never saw the hesitation. Never noticed the jaw tighten. Never felt the temperature of the silence between two sentences.

This is the kind of insight that changes strategy. And it cannot be generated — only witnessed.

So what are synthetic respondents actually good for?

Let's be fair. Dismissing synthetic research entirely would be just as intellectually lazy as blindly adopting it.

Greylock, one of the most respected early-stage VCs and an investor in LinkedIn, Figma, and Discord, frames it well: synthetic user personas can play a "complementary role in prototyping, stress-testing ideas, or augmenting human feedback loops." But their thesis is clear: "Most buyers today prioritize insights from real users."

The key word is complementary. Synthetic panels are powerful when they're built on a foundation of genuine understanding. They fall apart when they're used as a shortcut to skip that understanding.

The curly thesis: orchestration, not religion

At curly, we don't believe in choosing sides. The question isn't "synthetic or human?" It's "which one, when, and why?"

Our approach starts where good research has always started: with real people.

We run deep, voice-based interviews at scale. Not 10 interviews. Not 30. Hundreds — simultaneously, asynchronously, powered by Voice AI that adapts in real time. Every participant goes through the same structured core questions (so results are comparable and segmentable), but between those questions, curly asks individualized follow-ups triggered by what the participant actually says.

When someone mentions that something was "too complicated," curly doesn't move on to the next question. It probes: What exactly? At which step? What would have helped? This turns surface-level frustration into actionable, specific insight — without breaking the quantitative structure.

The result is what we call qualitative research at survey scale: the depth of a McKinsey interview program with the consistency and statistical rigor of a quantitative survey. In days, not months. At a fraction of the cost.

And here's where it gets interesting.

The Confidence Score: knowing when you know enough

As interviews accumulate, curly tracks a real-time Confidence Score — a measure of how robust your current understanding is for a given topic or segment.

At 95%, patterns are stable. New interviews confirm what you've already learned. At this point, you can safely extrapolate: synthetic panels, built entirely from your verified customer truth, can extend your insights into adjacent segments, scenarios, and edge cases.

At 55%, patterns are still emerging. Themes are shifting with each new interview. Here, curly tells you the opposite: "Keep going. You don't know enough yet."

This isn't a philosophical position. It's a system design.
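For readers who want an intuition for how a score like this could behave, here is a deliberately simple sketch. It is not curly's actual Confidence Score, whose method isn't described here. It is a hypothetical theme-saturation measure: the share of the most recent interviews that surfaced no theme you hadn't already heard. The function name, the `window` parameter, and the example themes are all assumptions made for illustration.

```python
def saturation_score(interview_themes, window=5):
    """Hypothetical saturation measure (an illustration, not curly's method).

    interview_themes: one set of theme labels per interview, in the order
    the interviews were conducted.
    Returns the share of the last `window` interviews that introduced no
    previously unseen theme, as a value between 0.0 and 1.0.
    """
    seen = set()
    introduced_new = []
    for themes in interview_themes:
        # An interview "adds something" if it mentions any unseen theme.
        introduced_new.append(bool(set(themes) - seen))
        seen |= set(themes)

    recent = introduced_new[-window:]
    if not recent:
        return 0.0
    return 1 - sum(recent) / len(recent)


# Every interview still surfaces something new: patterns are emerging.
emerging = [{"price"}, {"onboarding"}, {"price", "support"},
            {"integrations"}, {"trust"}]

# The last five interviews only repeat known themes: patterns are stable.
stable = [{"price"}, {"onboarding"}, {"price"}, {"onboarding"},
          {"price"}, {"price", "onboarding"}, {"onboarding"}]

emerging_score = saturation_score(emerging)
stable_score = saturation_score(stable)
print(f"emerging: {emerging_score:.0%}, stable: {stable_score:.0%}")
```

A real system would need far more than this, for example per-segment tracking and a notion of how strongly each theme is supported. The sketch only shows the basic logic: confidence rises when new interviews stop changing the picture.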

The math: why this isn't just idealism

The economics of AI-moderated research have fundamentally shifted. Conveo's 2025 benchmarks found that AI-moderated qualitative research costs $45 per insight compared to $180 for traditional methods, a 75% reduction. In our early curly pilots, we have seen even stronger cost reductions of up to 90% (with up to 5x more insights than a traditional survey).
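The 75% figure follows directly from the two cost-per-insight numbers quoted above. A quick check, with a helper function name that is ours and purely illustrative:

```python
def percent_reduction(baseline, new):
    """Percentage by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline * 100


# Cost-per-insight figures as quoted from Conveo's 2025 benchmarks.
traditional_cost = 180
ai_moderated_cost = 45

reduction = percent_reduction(traditional_cost, ai_moderated_cost)
print(f"Reduction: {reduction:.0f}%")
```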

The point is not (only) that AI makes research cheaper. It's that AI removes the excuse for not doing real research in the first place. When 100 deep customer conversations cost less than one agency sprint, the question isn't whether you can afford to listen to your customers. It's whether you can afford not to.

The market is telling you something

The broader signals are hard to ignore.

89% of researchers now use AI tools regularly or experimentally. 64% increased their number of AI tools in 2025 alone. And critically: teams that don't use AI are 4x more likely to lose organizational influence.

But adoption isn't the same as wisdom. The question isn't whether to use AI for research — that debate is over. The question is whether you use AI to skip understanding your customers, or to deepen it.

Greenbook's 2026 outlook describes a shift from insights as one-off projects to continuous intelligence — research as an always-on capability that actively shapes decisions in real time. That future isn't built on synthetic assumptions refreshed quarterly. It's built on a living, growing body of real customer understanding — continuously enriched, continuously validated.

The bottom line

Shit in, shit out hasn't changed. Shit has just gotten more eloquent.

Synthetic respondents are a powerful tool — when they're built on a foundation of genuine, verified, emotionally rich customer understanding. Without that foundation, they're a mirror reflecting your own biases back at you, dressed up as customer truth.

The magic isn't in choosing between humans and AI. It's in knowing exactly when to switch.

Your customers should always be the starting point. Not the afterthought.

curly gives you qualitative research at survey scale. Voice-AI interviews that go deep — with hundreds of real customers, not simulated ones. Curious what 100 deep customer conversations in a weekend looks like? Let's talk →

Sources

Qualtrics — 2025 Market Research Trends Report — Survey of 3,000+ researchers, 14 countries
Qualtrics — 2026 Market Research Trends Report — AI adoption & organizational influence data
Carnegie Mellon / arXiv — Surrogate Effect Study — 19 qualitative researchers on AI-generated responses
Merrill Research — Synthetic Respondents: Promise, Pitfalls, Reality Check — Variance analysis
FieldworkHub — Synthetic Respondents: Innovation or Illusion? — Sycophancy bias
Quirk's — The Future of Synthetic Respondents — Micro-expression case study
Greylock — The Rise of AI-Native User Research — Investment thesis
Conveo — AI-Moderated Research: Framework & ROI Benchmarks 2025 — Cost-per-insight data
Rival Technologies — AI Research Insights 2026 — Response depth data
Rival Group — 2026 Market Research Trends Report — AI tool adoption data
Greenbook — What To Expect in 2026 — Continuous intelligence trend
