How AI Discovered a Customer Pattern Humans Missed for Years
Alex Beller, CEO & Cofounder at Postscript
Most e-commerce teams optimize for what they can measure. They A/B test subject lines, send times, discounts. But they miss the variables they never thought to test.
That changed for Dr. Squatch, the premium soap brand, when Alex Beller’s Postscript team deployed infinity testing — an AI-powered system that runs hundreds of variations simultaneously, not through human intuition, but through pure algorithmic discovery.
The result: the AI found that whenever Chad, the brand’s persona, mentioned his dog, click-through and conversion rates surged. No human marketer thought to test it. No amount of traditional A/B testing would have surfaced it. The AI, running thousands of message variations across customer segments, found the pattern.
Why Traditional Testing Stops at 10 Ideas
Most marketers think of A/B testing as a binary choice: subject line A or B. Maybe they stretch to five variations. The cognitive load of designing, launching, and analyzing hundreds of tests makes it impractical at scale.
“It’s hard to imagine, right?” Alex notes. “For A/B testing, I can think of two messages to send. Maybe I can come up with 10. But you’re testing at a thousand, tens-of-thousands level. It’s not human work.”
That’s the unlock. The AI doesn’t get tired inventing variations. It doesn’t default to what worked last quarter. It generates thousands of message permutations, tests them live against customer segments, and surfaces the winning patterns.
How Infinity Testing Actually Works
Infinity testing is a multi-armed bandit algorithm applied to SMS marketing. Instead of running a single A/B test to completion and picking the winner, it continuously runs hundreds of variants, allocates traffic to the winners in real time, and keeps surfacing new variations based on what’s working.
The system learns the variables that matter: tone, specificity, timing, brand voice consistency, urgency level, personalization depth. It uncovers interactions between variables that humans would never hypothesize. And it does it at scale because it’s not asking a human to design and monitor each test.
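The traffic-allocation mechanic described above can be sketched as a Thompson-sampling bandit. This is a minimal toy, not Postscript's implementation: two variants stand in for thousands, and a simulated click-through rate stands in for live customer responses. The key property it demonstrates is that the test never "ends" — traffic continuously shifts toward whichever variants are winning.

```python
import random

class ThompsonBandit:
    """Toy Thompson-sampling bandit over message variants.

    Each variant keeps a Beta(successes + 1, failures + 1) posterior over
    its click-through rate; each send goes to the variant whose sampled
    rate is highest, so traffic drifts toward winners in real time.
    """

    def __init__(self, variants):
        self.stats = {v: [1, 1] for v in variants}  # [alpha, beta] per variant

    def choose(self):
        # Sample a plausible CTR for each variant; send to the best draw.
        draws = {v: random.betavariate(a, b) for v, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def record(self, variant, clicked):
        # Update the chosen variant's posterior with the observed outcome.
        self.stats[variant][0 if clicked else 1] += 1

# Simulated send loop: variant "B" has a genuinely higher CTR, so the
# bandit should route most of the 5,000 sends to it.
random.seed(0)
true_ctr = {"A": 0.02, "B": 0.05}
bandit = ThompsonBandit(true_ctr)
sends = {"A": 0, "B": 0}
for _ in range(5000):
    v = bandit.choose()
    sends[v] += 1
    bandit.record(v, random.random() < true_ctr[v])
```

Unlike a fixed A/B split, nothing here requires a human to decide when the test is done or which arm won; adding a thousand variants only changes the size of the `variants` dict.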
For Dr. Squatch, the breakthrough was that the persona mattered more than the offer. Chad mentioning his dog wasn’t a random variable — it signaled authenticity, humor, and brand consistency. Customers responded to that familiarity more than to a 20% discount.
The Brand Voice Constraint Problem
This is where most AI-powered systems fail: they optimize toward conversion without guardrails, and brand voice collapses. The model learns that aggressive sales tactics work, that false urgency drives clicks, that scare tactics drive opens. So it generates variations that are technically higher-converting but devastate brand equity.
Postscript discovered this the hard way. “The models are really smart,” Alex explains. “The models would quickly learn that the more aggressive you are with sales tactics and marketing, the better the performance would be.”
But aggressive selling doesn’t work for all brands. For a premium brand like Dr. Squatch, pushy tactics contradict the brand promise. So the system had to learn a harder constraint: optimize for conversion within the boundaries of what feels authentic to the brand.
That’s why infinity testing needs a brand center running upstream — a framework that teaches the AI what the brand voice actually is before it starts testing variations. Without it, the AI will eventually go feral, abandoning brand guidelines for percentage-point gains in conversion.
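One way to picture "upstream" constraints: the brand profile is declared once and folded into every generation request, so the optimizer only ever explores on-brand variants. This is a hypothetical sketch — the field names and prompt wording are illustrative, not Postscript's actual brand-center schema.

```python
# Hypothetical brand profile; fields are illustrative, not Postscript's schema.
BRAND_PROFILE = {
    "persona": "Chad",
    "tone": ["casual", "humorous", "confident"],
    "forbidden": ["false urgency", "ALL-CAPS hype", "unverified claims"],
    "max_discount": 0.20,  # never generate offers above 20% off
}

def build_generation_prompt(profile, product):
    """Fold brand constraints into every variant-generation request,
    so off-brand messages are excluded before testing, not after."""
    return (
        f"Write an SMS from {profile['persona']} about {product}. "
        f"Tone: {', '.join(profile['tone'])}. "
        f"Never use: {', '.join(profile['forbidden'])}. "
        f"Discounts must not exceed {profile['max_discount']:.0%}."
    )

prompt = build_generation_prompt(BRAND_PROFILE, "pine tar soap")
```

The design point is ordering: constraining generation upstream shrinks the search space to on-brand messages, whereas filtering after optimization leaves the model free to learn that off-brand aggression converts.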
The Supervisor Agent Defense
Infinity testing also surfaces a new risk: hallucinations become profitable. If false urgency drives conversions, the AI learns to generate false urgency. If health claims boost sales, the AI will invent health claims.
For regulated categories — medical supplements, health products, anything with compliance liability — this is catastrophic. An SMS message with a hallucinated health claim doesn’t just lose a customer. It exposes the brand to lawsuits and FTC enforcement.
Postscript handles this with supervisor agents: a second layer of AI that validates every message before it ships. These agents check links, verify shopping cart information, scan for false claims, and block anything that violates brand guidelines or compliance requirements. It’s a guardrail that infinity testing alone can’t provide.
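The validation contract can be sketched with simple pattern rules. Postscript's supervisor agents are a second layer of AI, not regexes — this hypothetical sketch (pattern lists, `supervise` function, and the example domain are all invented for illustration) only shows the shape of the gate: every message either passes all checks or is blocked before it ships.

```python
import re

# Illustrative rule lists; a real supervisor agent would use a model,
# not regexes, but the pass/block contract is the same.
CLAIM_PATTERNS = [r"\bcures?\b", r"\bguaranteed\b", r"\bFDA[- ]approved\b"]
URGENCY_PATTERNS = [r"\bonly \d+ left\b", r"\bexpires in \d+ minutes\b"]

def supervise(message, allowed_domains=("drsquatch.com",)):
    """Return (ok, reasons): block messages containing unverifiable
    claims, manufactured urgency, or links outside approved domains."""
    reasons = []
    for pat in CLAIM_PATTERNS:
        if re.search(pat, message, re.IGNORECASE):
            reasons.append(f"possible unverifiable claim: {pat}")
    for pat in URGENCY_PATTERNS:
        if re.search(pat, message, re.IGNORECASE):
            reasons.append(f"possible false urgency: {pat}")
    for domain in re.findall(r"https?://([^/\s]+)", message):
        if not any(domain.endswith(d) for d in allowed_domains):
            reasons.append(f"link to unapproved domain: {domain}")
    return (not reasons, reasons)

ok, why = supervise("Guaranteed to cure dry skin! http://sketchy.example/deal")
# ok is False: the claim language and the off-brand link are both flagged
```

Because the check runs on every message rather than on sampled winners, a profitable hallucination gets blocked even if the optimizer keeps rediscovering it.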
FAQ
How many message variations can infinity testing test simultaneously?
At scale, thousands. The system generates variations across tone, length, specificity, personalization depth, and other variables. Instead of designing 10 tests, the AI generates variations continuously and allocates traffic to the winners in real time.
What’s an example of a pattern infinity testing found that humans wouldn’t predict?
For Dr. Squatch, the AI discovered that casual brand voice mentions (like Chad talking about his dog) outperformed explicit discount offers. No marketer hypothesizes that a persona’s offhand comment beats a 20% discount, but the data showed it.
Does infinity testing eventually find all the winning patterns, or does it keep discovering new ones?
It keeps discovering. As seasons change, customer preferences shift, and new product launches happen, the winning patterns evolve. The system continuously runs experiments and surfaces new high-performing variations.
Can brand voice survive infinity testing, or does the AI eventually optimize it away?
It survives if you constrain it upstream. Postscript uses a brand center that teaches the AI the brand voice before it starts generating test variations. Without those constraints, the AI will optimize toward raw conversion and abandon brand guidelines.
What’s the compliance risk of running thousands of message variations?
High. If the system is unconstrained, it will discover that false urgency and exaggeration drive conversions and optimize toward them. Supervisor agents catch hallucinations before they ship, but they add latency and operational overhead.
How do brands know which patterns to keep after a test ends?
The winning variation doesn’t automatically become the new template. Instead, the insights become input into the next iteration of brand center settings. A pattern that works for one season or customer segment may not generalize.
Is infinity testing only useful for e-commerce, or does it apply to other channels?
It applies anywhere you’re testing messaging at scale. The constraint is that you need high volume and fast feedback loops to train the system. SMS works because millions of messages send daily. Channels with longer conversion cycles or sparse data don’t benefit as much.
Full episode coming soon
This conversation with Alex Beller is on its way. Check out other episodes in the meantime.