- Tension: A landmark field study finds AI-generated ads perform comparably to human-made ads overall — but the highest-performing AI ads are the ones that don’t look like AI at all.
- Noise: Most coverage is still framing this as a competition: AI creative is catching up to humans, humans should be worried, the robots are coming for the copywriters.
- Direct Message: The competitive variable was never AI vs. human — it was always how well the ad communicates trust, and the study accidentally proved it.
There is a version of this story that writes itself easily. AI ads now perform as well as human ads. Cue the think pieces about creative departments. Cue the quotes from agency executives who are very calm about everything and definitely not nervous. That version of the story is available, and it is not wrong exactly — but it is answering a question that turns out to be less interesting than the one the data actually raises.
The more honest read of the research is this: a major field study set out to test whether AI can match human creative performance, and in doing so, stumbled onto a finding that makes the original question seem slightly beside the point.
What the study actually measured
The research comes from a collaboration between Columbia Business School, Harvard, the Technical University of Munich, and Carnegie Mellon, conducted in partnership with Taboola’s Realize platform. The full report is worth reading if you care about methodology, because the methodology is the thing that makes this worth taking seriously.
Most comparisons of AI and human creative share the same flaw: they’re not actually comparing like with like. You end up measuring advertiser quality, campaign timing, audience targeting, and budget differences at the same time as you’re measuring creative origin. The researchers here used what they call “sibling ads” — matched pairs of AI-generated and human-made ads drawn from the same advertiser, the same campaign, run on the same day. That controls for advertiser identity, timing, audience targeting, and the landing page the ad leads to. What’s left, once you strip those variables out, is as clean a test of creative execution as you’re likely to get in a live environment.
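For readers who think in code, the sibling-ad idea can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' pipeline: the record fields (`advertiser`, `campaign`, `day`, `origin`) and every number are hypothetical, and the real study also controls for targeting and landing page, which this sketch does not model.

```python
from collections import defaultdict

# Hypothetical ad records. Field names and numbers are illustrative only
# and are NOT taken from the study.
ads = [
    {"advertiser": "A", "campaign": "c1", "day": "2025-01-01",
     "origin": "ai", "impressions": 10_000, "clicks": 80},
    {"advertiser": "A", "campaign": "c1", "day": "2025-01-01",
     "origin": "human", "impressions": 10_000, "clicks": 70},
    # These two ran on different days, so they are not siblings.
    {"advertiser": "B", "campaign": "c2", "day": "2025-01-01",
     "origin": "ai", "impressions": 5_000, "clicks": 30},
    {"advertiser": "B", "campaign": "c2", "day": "2025-01-02",
     "origin": "human", "impressions": 5_000, "clicks": 40},
]


def sibling_ctr_differences(records):
    """Return AI-minus-human CTR for each advertiser/campaign/day group
    that contains both an AI ad and a human-made ad."""
    # Each group accumulates [impressions, clicks] per creative origin.
    groups = defaultdict(lambda: {"ai": [0, 0], "human": [0, 0]})
    for r in records:
        key = (r["advertiser"], r["campaign"], r["day"])
        totals = groups[key][r["origin"]]
        totals[0] += r["impressions"]
        totals[1] += r["clicks"]

    diffs = {}
    for key, g in groups.items():
        ai_imp, ai_clicks = g["ai"]
        human_imp, human_clicks = g["human"]
        if ai_imp == 0 or human_imp == 0:
            continue  # no matched sibling, so the group is dropped
        diffs[key] = ai_clicks / ai_imp - human_clicks / human_imp
    return diffs


differences = sibling_ctr_differences(ads)
```

Only advertiser A's pair survives the matching. Advertiser B's ads are discarded because they ran on different days, which is the point of the design: anything that cannot be compared like with like never enters the comparison.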
The scale was substantial: 500 million impressions, 3 million clicks, hundreds of thousands of live ads. This is not a lab study with 200 participants rating mock advertisements. It is observational data from real campaigns at real spending levels, with real people clicking or not clicking.
The headline finding, and the more interesting one underneath it
The headline result is that AI-generated ads performed statistically comparably to human-made ads overall. The raw numbers lean slightly toward AI, with a click-through rate of 0.76% versus 0.65%, but the researchers are appropriately cautious about reading too much into that gap once tight statistical controls are applied. The honest summary is: comparable. Not dramatically better, not worse. Comparable.
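To put the raw gap in perspective, here is a small worked calculation in Python. It uses only the two click-through rates quoted above; the variable names and the per-100,000-impressions framing are added for illustration, and none of it says anything about statistical significance.

```python
# Raw click-through rates quoted in the article, as fractions.
ai_ctr = 0.76 / 100
human_ctr = 0.65 / 100

# Absolute gap in percentage points.
absolute_gap_pp = (ai_ctr - human_ctr) * 100

# Relative lift of AI over human-made ads.
relative_lift = (ai_ctr - human_ctr) / human_ctr

# Extra clicks the raw gap would imply per 100,000 impressions.
extra_clicks_per_100k = (ai_ctr - human_ctr) * 100_000

print(f"Absolute gap: {absolute_gap_pp:.2f} percentage points")
print(f"Relative lift: {relative_lift:.1%}")
print(f"Extra clicks per 100,000 impressions: {extra_clicks_per_100k:.0f}")
```

The raw difference works out to 0.11 percentage points, or roughly a 17% relative lift. That sounds large in relative terms and small in absolute ones, which is one reason the cautious reading is "comparable" rather than "better."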
That finding matters for the practical question of whether brands can use AI-generated creative without sacrificing performance. The answer appears to be yes. The study found that AI visuals increased or maintained click-through rates without reducing downstream conversion performance — which addresses the concern that AI might inflate clicks from the curious while failing to deliver customers who actually buy anything. Early adopters in food and drink and personal finance, two sectors with high creative volume and significant testing appetite, appear to be figuring this out faster than others.
But the more interesting finding is the disaggregated one. When the researchers separated AI ads that read as visibly AI-generated from those that didn’t “look like AI,” the group performing best of all wasn’t human-made ads. It was AI ads that passed as human-made, the ones viewers would not have flagged as artificial. That group outperformed both human-made ads and AI ads that read as clearly synthetic.
This is the moment the study changes shape. If the best-performing ads were the AI ads that didn’t look like AI, then “AI-generated” is not actually the variable doing the work. Something else is.
The human face as the real variable
The study identifies what that something else appears to be: human faces. Large, clear, human faces in ads were among the most consistently impactful trust signals the researchers identified. This will not surprise anyone who has spent time in direct-response advertising — the face-as-anchor principle has been a practitioner heuristic for decades. What the study adds is scale and rigor. It also adds a twist.
AI-generated ads were more likely to include prominent human faces than human-made ads. Not marginally more likely, but measurably and consistently so. The researchers suggest this may be because generative AI learns from best-practice creative templates: it follows the playbook more reliably than human creators, who get bored, diverge, experiment, or simply miss the memo. AI doesn’t get bored. It keeps putting the face in the frame.
This is a genuinely strange inversion. The route by which AI ads outperform human ads, in the cases where they do, turns out to run through AI being better at deploying human warmth. The image that communicates trust does so regardless of whether a person or a model generated it. The person looking at the ad has no idea and, more importantly, no reason to care. They’re reading a face. They’re reading warmth or coolness, openness or guardedness, someone who looks like them or doesn’t. The origin of the pixel is invisible and irrelevant to that judgment.
What the CTR number doesn’t capture
There is a complication, and it comes from a different body of research entirely. A click is an intention, a moment of interest, a micro-decision made in a fraction of a second. What click-through rates don’t measure is what stays with you after the screen is gone — brand recall, emotional association, the slow accumulation of preference that shapes purchasing decisions over months and years.
Nielsen’s 2025 annual marketing research points to a related tension. AI is rapidly being adopted for creative production, and 59% of global marketers identify AI-powered personalization and optimization as the most impactful trend. Yet the same research surfaces ongoing concern about whether AI-generated creative can build the kind of long-term brand equity that human storytelling has historically produced. The worry isn’t clicks. It’s whether anything is being built that lasts.
That concern is real and the Taboola study doesn’t resolve it, because it wasn’t designed to. The study measures performance advertising — direct response, short-cycle, click-to-conversion. It says little about brand campaigns, awareness advertising, or the kind of creative work whose effects accumulate over years rather than days. The Taboola study and the broader brand-effectiveness literature are not contradicting each other so much as measuring different things — and both measurements are real. The Springer Nature research on AI versus human performance in text-to-image and text-to-video advertising, which finds the gap varies significantly across format, category, and viewer context, adds further texture to that complexity. The honest answer is that it depends on what you’re trying to accomplish and on what timescale you’re measuring it.
What the right question actually is
What the Columbia-led study did, probably without quite intending to, is demonstrate that the “AI vs. human” frame was always a category error. What viewers are responding to when they see an ad is not a disclosure about production method. They are responding to visual cues that their brains have learned to associate with trustworthiness, warmth, credibility, or relevance. Those cues can be assembled by a person. They can be assembled by a model. They can be assembled badly by either. The label attached to the assembly process is not part of the signal.
This reframes the actual competitive question, which is not “can AI match human creative?” but rather “what are we actually trying to communicate, and how do we communicate it most reliably?” Human faces work because they trigger the fast, instinctive reading of expression and intent that viewers do without thinking. That process does not first check whether the face was rendered in Midjourney. It just reads the face.
If AI tools are better at consistently deploying these trust signals — not because they’re more creative, but because they’re more consistent — then the practical advantage accrues to the tool that executes the fundamentals reliably. That’s a much more mundane story than “AI is replacing human creativity.” It’s a story about templates and consistency and the gap between knowing what works and reliably doing it.
What’s genuinely interesting about all this is what it reveals about persuasion. We’ve been worrying about whether AI can be creative. The study suggests the more pressing question is whether the audience can tell — and the answer is that they mostly can’t and don’t need to, because what they’re actually doing is something much older than either creativity or artificial intelligence. They’re looking for a face they can trust.