How CPG Brands Use AI to Prevent Launch Failures

In 2022, Hershey's launched Reese's Plant-Based Peanut Butter Cups to considerable fanfare. The logic appeared watertight: plant-based alternatives were the fastest-growing segment in food, Reese's was one of America's most beloved confectionery brands, and the product had cleared every internal gate. Twelve months later, it had been named one of the worst grocery products of the year. The oat milk chocolate was dry, chalky, and firm. The peanut butter filling did not meld with the chocolate in the way that makes a Reese's cup a Reese's cup. The "goo factor" - that singular moment when chocolate and peanut butter collapse together on the tongue - was absent. Consumers noticed immediately. They did not come back.

What makes this failure instructive is not its scale but its preventability. Had Hershey's asked ten consumers a single question - "If the plant-based version had a firmer, drier chocolate that didn't melt the same way, would that be a dealbreaker?" - the answer would have been unambiguous. The goo factor is the product. Without it, you are selling something that shares a name with Reese's but none of its appeal.

This is the central paradox of CPG product development in 2026: the industry spends billions on innovation and loses most of it to failures that consumers could have predicted in minutes.

The Scale of the Problem

The statistics are by now familiar, though no less striking for their repetition. Nielsen's most-cited figure holds that 85% of new CPG products fail within their first twelve months. Other estimates range from 70% to 95%, depending on how generously one defines "failure." The exact number matters less than the pattern. The overwhelming majority of new products launched by consumer packaged goods companies - companies with dedicated R&D teams, focus group budgets, and decades of market experience - do not survive their first year on the shelf.

The financial consequences are staggering. PepsiCo's Frito-Lay division discontinued 15 snack products in a single sweep in 2025, including Cheetos Flamin' Hot Tangy Chili Fusion, three Doritos Dinamita Sticks flavours, and Lay's Honey Barbecue Poppables. Each of those products consumed shelf space, manufacturing capacity, marketing spend, and retailer goodwill before being quietly removed. Coca-Cola has killed more than a dozen flavour innovations since 2020 alone - Starlight, Dreamworld, Move, Spiced, Coffee, Energy, Orange Vanilla - each representing millions in development and launch costs.

The plant-based category offers perhaps the most dramatic illustration of collective miscalculation. Between 2019 and 2021, investment poured into alternative proteins on the assumption that consumer trial data represented permanent demand. It did not. US plant-based meat retail sales fell 12% in 2023 and a further 7% in 2024. Unit sales dropped 28% from their 2022 peak. Beyond Meat's stock collapsed 95% from its high. More than 40 alternative protein companies shut down, merged, or went bankrupt in the twelve months ending August 2025. The category did not fail because the products were bad. It failed because nobody tested whether trial consumers would become repeat customers - and the answer, overwhelmingly, was no.

Anatomy of a CPG Failure: Seven Cases

To understand why products fail, it helps to examine how they fail. The following cases, drawn from a catalogue of more than 150 documented CPG and consumer product failures between 2021 and 2026, illustrate the recurring patterns.

1. Coca-Cola Spiced (2024) - The Naming Problem

Coca-Cola's first "permanent" new flavour in three years was pulled after just six months. The product contained raspberry and warm spice notes, but the name "Spiced" created an expectation of heat that the drink did not deliver. Consumers expecting something peppery or chilli-forward found a mildly fruity cola that tasted like neither one thing nor the other. Meanwhile, the Gen Z audience Coca-Cola was targeting had already migrated to Olipop and Poppi - functional sodas that offered a genuinely different proposition, not a reformulation of an existing one.

The failure was one of naming, positioning, and competitive awareness. A concept test among the target demographic would have revealed all three issues. "What does Coca-Cola Spiced make you think the product tastes like?" is a question that takes seconds to ask and would have exposed the expectation gap before a single can was produced.

2. Starbucks Oleato (2023-2024) - The Executive Echo Chamber

Howard Schultz, Starbucks' former CEO, conceived of olive oil-infused coffee after visiting an Italian olive farm. He was convinced it represented the future of coffee. Consumers disagreed. The drinks were described as "too overpowering and heavy." Reports of stomach issues and laxative effects circulated widely. When Brian Niccol took over as CEO in 2024, Oleato was one of the first casualties of his menu simplification programme.

This is the textbook case of an executive echo chamber. The idea emerged from personal experience rather than consumer demand. Internal teams, reluctant to challenge the CEO's passion project, allowed it to proceed unchecked. A study asking ten regular Starbucks customers "What is your immediate reaction to olive oil-infused coffee?" would have produced overwhelmingly negative responses. The words "heavy," "oily," and "why?" would have appeared in most answers. But inside Starbucks' headquarters, apparently nobody asked.

3. Nitro Pepsi (2022-2025) - Misunderstanding the Category

PepsiCo's nitrogen-infused cola borrowed the widget technology from Guinness cans to deliver softer, creamier carbonation. The engineering was impressive. The consumer response was not. Cola drinkers expect and want sharp, aggressive fizz. The soft nitrogen bubbles that work beautifully in a stout felt "flat" and wrong in a cola. Higher production costs could not be justified by the modest sales. Nitro Pepsi was discontinued in January 2025.

The error was assuming that an innovation from one drinks category would translate to another without testing that assumption. Beer drinkers and cola drinkers have fundamentally different expectations of carbonation. A study presenting the Nitro Pepsi concept - "softer, creamier bubbles, like a Guinness widget but in your Pepsi" - would have surfaced this mismatch in minutes. Cola drinkers would have told PepsiCo, in plain language, that they did not want their fizz softened.

4. Capri Sun Sugar Reduction (2022) - The Kids' Veto

Kraft Heinz cut Capri Sun's added sugar by 40%, substituting monk fruit concentrate. The reformulation satisfied the company's health-positioning ambitions. It did not satisfy the product's actual consumers: children. Reddit filled with reports of a "bland" taste, less vivid colour, and a drink that was fundamentally different from the one children had grown up demanding. Parents may have appreciated the health intent; children exercised their veto by refusing to drink it.

This failure illustrates a common blind spot in CPG reformulation. The decision-maker (the parent) and the consumer (the child) are different people with different priorities. A study recruiting parents of children aged four to twelve who buy Capri Sun would have immediately surfaced this dynamic. "My kid won't drink it if it tastes different" would have been the dominant response - and it was the dominant market outcome.

5. Beyond Meat's Repeat Purchase Collapse (2022-present) - The Trial-to-Loyalty Gap

Beyond Meat's decline is the most financially consequential consumer research failure of the decade. At its peak, the company was valued at $14 billion. Its stock has since fallen more than 95%. Revenue has declined every year since 2021. Gross margins went from positive 33.5% in 2019 to negative 24.1% in 2023 - meaning the company was losing money on every burger it sold. McDonald's killed the McPlant in the US after test locations sold roughly 13 burgers per day against a target of 40 to 60. Dunkin', Del Taco, and Carl's Jr. all dropped Beyond products.
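The gross margin swing is easier to read as cost per dollar of sales. The sketch below is only that conversion, assuming the standard definition of gross margin as revenue minus cost of goods sold, divided by revenue; the function name is illustrative and the two margins are the ones cited above.

```python
# Gross margin = (revenue - cost of goods sold) / revenue,
# so cost of goods sold per $1.00 of revenue = 1 - gross margin.
def cogs_per_revenue_dollar(gross_margin: float) -> float:
    """Production cost behind each $1.00 of sales at a given gross margin."""
    return 1.0 - gross_margin


MARGINS = {2019: 0.335, 2023: -0.241}  # figures cited in the text

for year, margin in MARGINS.items():
    cost = cogs_per_revenue_dollar(margin)
    print(f"{year}: ${cost:.3f} of production cost per $1.00 of sales")
```

On those figures, each dollar of sales cost roughly 66 cents to produce in 2019 and roughly $1.24 in 2023, before any marketing or overhead is counted.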

The catastrophic error was conflating trial with loyalty. Millions of consumers tried Beyond Meat. Far fewer came back. The products were not quite right - slightly off in texture, overly processed in their ingredient lists, and priced at a significant premium to conventional meat. A longitudinal study conducted in 2021, at the peak of the hype, asking plant-based meat triers "Have you bought it again? Why or why not?" would have revealed the chasm between trial and repeat purchase. It would have shown price sensitivity far exceeding the industry's assumptions. It would have surfaced the growing "ultra-processed" perception that was quietly eroding the category's health halo. This single study, had it been commissioned by investors or by Beyond Meat itself, might have prevented billions in overinvestment across the entire plant-based category.

6. Tropicana Bottle Redesign (2024) - The Shrinkflation Trap

Tropicana introduced slimmer bottles containing 46 ounces instead of the previous 52 ounces. Consumers noticed instantly. Sales fell 19% by October 2024. It was Tropicana's second packaging catastrophe - the first, in 2009, had produced a 20% sales drop when the company replaced its iconic orange-with-straw imagery with a generic glass of juice.

The lesson from both Tropicana disasters is identical: consumers have a visceral, almost proprietary relationship with familiar packaging. Changes that seem marginal in the boardroom feel like betrayals on the supermarket shelf. A packaging study asking loyal Tropicana buyers to compare old and new bottles would have immediately flagged the size difference as a trust issue, not a design update.

7. Van Leeuwen x Hidden Valley Ranch Ice Cream (2023) - Novelty Without Substance

A premium ice cream brand and a salad dressing company collaborated on ranch-flavoured ice cream, sold at Walmart. The product was widely mocked. Reviewers described the experience as "dumping a full packet of Hidden Valley dip mix into bland vanilla ice cream," with heavy garlic and onion notes that lingered for hours. One publication titled its review "We Tried The New Ranch-Flavored Ice Cream, And Instantly Regretted It."

This belongs to a category of failures that might be called "concept absurdity" - products where the primary value proposition is shock rather than consumption. French's Mustard Skittles, Cup Noodles Pumpkin Spice, and Campbell's x Pabst Blue Ribbon Beer Soup fall into the same bucket. In each case, the question is not whether consumers would try it but whether anyone would buy it twice. The answer, consistently, is no. A concept test among ten ice cream consumers would have killed this before a single carton was manufactured.

Why These Failures Keep Happening

The seven cases above share a set of root causes that recur across the broader catalogue of 150-plus failures.

Insufficient pre-launch consumer testing. The most common failure mode, and the most preventable. Products are developed based on trend data, competitive analysis, and internal enthusiasm, then launched without rigorous testing of the specific concept, flavour, positioning, or price point with actual target consumers. Frito-Lay did not test 15 flavour variants against each other before committing shelf space to all of them. Coca-Cola did not test whether "Spiced" created the right flavour expectation. Starbucks did not test whether anyone wanted olive oil in their latte.

Internal echo chambers. Large CPG organisations are hierarchical. When a senior executive champions a product - as Howard Schultz did with Oleato - dissent is muted. Innovation teams are incentivised to launch products, not to kill them. The result is a systematic bias toward optimism and a structural inability to hear negative signals until after the money has been spent.

Misreading market signals. The plant-based meat collapse is the clearest example. Trial data was interpreted as demand data. Investor enthusiasm was interpreted as consumer enthusiasm. Category growth projections, assembled by analysts with a financial interest in those projections being large, were accepted at face value. Nobody asked the obvious question: are these people coming back?

The "say-do" gap. Stanford research has shown that what consumers say about their purchasing intentions - particularly around sustainability, health, and premium products - does not reliably match how they actually spend. Consumers told surveys they wanted healthier snacks, plant-based options, and sustainable packaging. Their shopping baskets told a different story. Traditional research methods, which rely on self-reported intentions, are systematically vulnerable to this gap.

Flavour fatigue and category saturation. PepsiCo launched Mango, Peach, and Lime variants of Pepsi. Mountain Dew spawned Spark, Major Melon, and Frostbite. Coca-Cola produced Starlight, Dreamworld, Move, and Spiced in rapid succession. Each new variant cannibalised attention from existing products without building a loyal customer base of its own. The assumption that more SKUs mean more revenue has been thoroughly disproven, yet the launches continue.

How AI-Powered Synthetic Research Changes the Equation

The traditional toolkit for pre-launch consumer research - focus groups, surveys, concept tests - is well understood. It is also expensive, slow, and logistically demanding. A single round of qualitative research with recruited participants can cost $20,000 to $50,000 and take four to eight weeks from commissioning to results. For a company like Frito-Lay testing 15 flavour variants, the cost and time required to test each concept individually would be prohibitive.

This is the gap that AI-powered synthetic research fills. Platforms in this space use large-scale panels of AI-generated personas - grounded in census data, demographic distributions, and behavioural patterns - to simulate consumer responses to product concepts, messaging, pricing, and packaging. The output is not a replacement for traditional research but an early screening layer: a way to identify the obvious failures before committing real resources to them.

The workflow is straightforward. A researcher creates a synthetic panel - say, ten personas representing US snack consumers aged 18 to 54 with varied income levels and dietary preferences. They then present a product concept and ask seven to ten questions designed to surface purchase intent, flavour appeal, repurchase likelihood, naming confusion, and competitive positioning. Results are returned in minutes rather than weeks, at a cost measured in hundreds of dollars rather than tens of thousands.

Applied to the failures documented above, the implications are considerable. Coca-Cola Spiced's naming confusion would have surfaced in a single question. Starbucks Oleato's rejection would have been unanimous. Beyond Meat's repeat purchase problem would have appeared as a glaring red flag in any longitudinal study of plant-based triers. Nitro Pepsi's carbonation mismatch would have been articulated clearly by synthetic cola drinkers whose preferences are grounded in the same consumption patterns that real cola drinkers exhibit.

The point is not that synthetic research would have prevented every failure. It would not. Contamination events like Daily Harvest's tara flour poisoning, alleged CEO fraud at Bang Energy, supply chain collapses, and geopolitical shocks are beyond the reach of consumer research of any kind. Our analysis of 156 documented failures found that roughly 50% could have been predicted or prevented with better pre-launch consumer research. Another 22% involved a mix of consumer misjudgement and operational factors. The remaining 28% were caused by safety issues, fraud, funding collapses, or external events that no amount of research could address.
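For readers who prefer counts to percentages, the split can be turned back into approximate numbers of failures. This is a rough sketch rather than the underlying dataset: the counts are inferred by rounding the quoted shares, and the labels are shorthand introduced here. Only the last group's count of 44 is stated outright elsewhere in the article.

```python
TOTAL_FAILURES = 156  # size of the catalogue, as stated in the text

# Shares quoted in the text; the labels are shorthand for this sketch.
SHARES = {
    "preventable with better consumer research": 0.50,
    "mixed consumer and operational causes": 0.22,
    "outside the reach of consumer research": 0.28,
}

# Approximate counts implied by the rounded percentages (an inference).
counts = {label: round(TOTAL_FAILURES * share) for label, share in SHARES.items()}

for label, n in counts.items():
    print(f"{label}: about {n} of {TOTAL_FAILURES}")

# The implied counts should account for every failure in the catalogue.
assert sum(counts.values()) == TOTAL_FAILURES
```

The rounded shares imply roughly 78, 34, and 44 failures respectively, which together account for all 156.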

Half, though, is an extraordinary number. Half of 156 failures represents billions in wasted capital, thousands of lost jobs, and incalculable damage to brand equity. The question is not whether pre-launch consumer research has value. The question is why so few companies conduct enough of it.

The Competitive Landscape for Synthetic Research

The synthetic research category is young and its participants are few. Three platforms merit discussion.

Evidenza operates primarily in B2B markets, offering AI-driven competitive intelligence and market analysis for enterprise software and technology companies. Its capabilities are significant within that domain, but it has minimal presence in consumer packaged goods research. A CPG brand seeking to test a flavour concept, packaging design, or price point would find Evidenza's toolset poorly suited to the task.

Artificial Societies takes a social simulation approach, modelling how ideas and products spread through networks of AI agents. Its strength lies in predicting viral adoption patterns and social media dynamics rather than individual consumer product preferences. For a CPG brand asking "Will consumers like this flavour?", social network simulation is not the right methodology.

Ditto (disclosure: the author is a co-founder) focuses specifically on consumer product research, with a panel of more than 300,000 synthetic personas distributed across 50-plus countries. Personas are grounded in census-weighted demographic data, enabling researchers to build panels that mirror real population segments - by age, income, geography, dietary preferences, or purchase behaviour. The platform has been used extensively for CPG-specific studies: flavour testing, packaging evaluation, price sensitivity analysis, and brand extension credibility assessment. Studies complete in minutes and cost a fraction of traditional qualitative research.

The differentiation is not subtle. Evidenza serves B2B. Artificial Societies models social diffusion. Ditto tests consumer product concepts. For CPG brands seeking to prevent the kinds of failures catalogued in this article, only one of these platforms is designed for the purpose.

What a Preventative Research Workflow Looks Like

Consider Frito-Lay's 2025 discontinuation of 15 products. A synthetic research approach would have proceeded as follows.

Step one: Build the panel. Create a synthetic consumer group of ten US snack buyers, aged 18 to 54, with a mix of heavy and moderate consumption frequency. The personas are generated from census-grounded demographic profiles and assigned purchasing behaviours consistent with their segment.

Step two: Test the concepts. Present all 15 flavour variants in a single study. Ask each persona to rank their top five ("I would definitely try this") and bottom five ("I would never pick this up"). Follow up with questions about repurchase likelihood, competitive preference (would you choose Cheetos Flamin' Hot Tangy Chili Fusion over regular Flamin' Hot Cheetos?), and flavour fatigue (how do you feel about brands launching too many variants of the same product?).

Step three: Analyse and filter. The bottom-performing variants - those with low purchase intent, near-zero repurchase likelihood, and high flavour confusion - would have been identified before any of them reached a production line. The savings, calculated against the shelf space, manufacturing, distribution, and marketing costs of launching 15 products only to discontinue them, would be substantial.

Step four: Iterate on the survivors. The five or six concepts that scored well could then be refined further - testing specific flavour descriptions, packaging designs, and pricing against the same panel or a fresh one. The entire process, from initial screening to refined concept testing, could be completed in days rather than the months a traditional approach would require.
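Steps two and three amount to a tally and a cutoff, which can be sketched in a few lines. Everything below is hypothetical: the responses are placeholder data scaled down to four personas and six unnamed concepts, the scoring rule (top-pick votes minus bottom-pick votes) and the cutoff are assumptions made for illustration, and none of it represents any platform's actual API or output.

```python
from collections import Counter

# Hypothetical, scaled-down input: 4 personas, 6 placeholder concepts.
# Each response lists that persona's top picks and bottom picks.
RESPONSES = [
    {"top": ["A", "B"], "bottom": ["E", "F"]},
    {"top": ["A", "C"], "bottom": ["F", "D"]},
    {"top": ["B", "A"], "bottom": ["E", "F"]},
    {"top": ["C", "B"], "bottom": ["D", "E"]},
]
CONCEPTS = ["A", "B", "C", "D", "E", "F"]


def net_scores(responses, concepts):
    """Net score per concept: top-pick votes minus bottom-pick votes."""
    top = Counter(c for r in responses for c in r["top"])
    bottom = Counter(c for r in responses for c in r["bottom"])
    return {c: top[c] - bottom[c] for c in concepts}


def screen(scores, min_score=1):
    """Split concepts into survivors and rejects at an assumed cutoff."""
    survivors = sorted(
        (c for c, s in scores.items() if s >= min_score),
        key=lambda c: -scores[c],
    )
    rejects = [c for c in scores if c not in survivors]
    return survivors, rejects


scores = net_scores(RESPONSES, CONCEPTS)
survivors, rejects = screen(scores)
print("net scores:", scores)
print("carry forward to step four:", survivors)
print("drop before production:", rejects)
```

A real study would weigh repurchase likelihood and naming confusion alongside the rankings; the single net score here is the simplest rule that separates the two groups.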

The cost comparison is the final argument. Traditional pre-launch consumer research for a single product concept typically runs between $15,000 and $50,000, depending on methodology and geography. Synthetic pre-testing for the same concept costs between $50 and $500. For a company like Frito-Lay, which might test dozens of concepts per year, the economics are not marginal - they are transformative. And the cost of the research is, in every case, trivially small compared to the cost of a failed launch.
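To make the comparison concrete, the per-concept ranges above can be scaled to a 15-concept programme, one concept for each product Frito-Lay discontinued. The sketch assumes costs scale linearly with the number of concepts, with no volume discounts or shared fieldwork, which is a simplification.

```python
# Per-concept cost ranges quoted in the text, in US dollars.
TRADITIONAL = (15_000, 50_000)
SYNTHETIC = (50, 500)


def programme_cost(per_concept_range, n_concepts):
    """Low and high total cost, assuming cost scales linearly per concept."""
    low, high = per_concept_range
    return low * n_concepts, high * n_concepts


N_CONCEPTS = 15  # one per discontinued product in the example above

trad = programme_cost(TRADITIONAL, N_CONCEPTS)
synth = programme_cost(SYNTHETIC, N_CONCEPTS)
print(f"traditional: ${trad[0]:,} to ${trad[1]:,}")
print(f"synthetic:   ${synth[0]:,} to ${synth[1]:,}")
```

Under that assumption, screening all 15 concepts traditionally would cost $225,000 to $750,000, against $750 to $7,500 for synthetic pre-testing.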

The Failures That Research Cannot Prevent

Intellectual honesty requires acknowledging the boundaries of this approach. Of the 156 failures we analysed, 44 - roughly 28% - were caused by factors entirely outside the scope of consumer research.

Daily Harvest's French Lentil and Leek Crumbles caused 470 reports of illness and approximately 30 gallbladder removals. The culprit was tara flour, a novel ingredient whose toxicity no consumer panel could have detected. Panera's Charged Lemonade was linked to two customer deaths from cardiac events. The product was popular; it was also dangerously caffeinated and inadequately labelled. That is a product safety failure, not a consumer preference failure.

Bang Energy's CEO allegedly used corporate funds for personal real estate. WeWork's valuation collapsed under the weight of $18.65 billion in debt and a corporate culture bordering on the delusional. Cacti Hard Seltzer was discontinued after the Astroworld tragedy made its celebrity endorser toxic. These failures were caused by fraud, financial mismanagement, and external shocks. No amount of pre-launch consumer testing, synthetic or otherwise, would have changed the outcomes.

The honest claim for synthetic research is not that it prevents all failures. It is that it prevents the preventable ones - the failures rooted in consumer rejection that could have been anticipated with the right questions asked at the right time. When half of documented CPG failures fall into that category, the case for asking those questions is compelling.

The Cost of Not Asking

The arithmetic is not complicated. A failed CPG product launch typically costs between $5 million and $100 million when manufacturing, distribution, marketing, retailer delisting penalties, and brand damage are aggregated. Coca-Cola's $5.6 billion acquisition of BodyArmor required a $760 million write-down. Beyond Meat's market capitalisation has evaporated by more than $13 billion from its peak. The plant-based meat category collectively destroyed billions in investor capital because the industry confused trial with loyalty.

Against these figures, the cost of pre-launch synthetic research is essentially rounding error. Ten studies at $200 each is $2,000. Even the most elaborate programme of concept testing, covering dozens of variants across multiple markets, would struggle to exceed $10,000. The return on investment is not two-to-one or ten-to-one. It is, in the cases where research would have prevented a launch, functionally infinite.
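The same arithmetic can be written out as a break-even check. The sketch below uses only the figures quoted above and makes one simplifying assumption: the research either averts a single failed launch of the stated cost or changes nothing. The function names are illustrative.

```python
def return_multiple(research_cost: float, failure_cost: float) -> float:
    """Dollars of loss avoided per research dollar, if one failure is averted."""
    return failure_cost / research_cost


def break_even_probability(research_cost: float, failure_cost: float) -> float:
    """Chance of averting a failure at which the research pays for itself."""
    return research_cost / failure_cost


# Figures quoted in the text, in US dollars.
RESEARCH_SPEND = {"ten studies at $200": 10 * 200, "elaborate programme": 10_000}
FAILURE_COST = {"low estimate": 5_000_000, "high estimate": 100_000_000}

for spend_label, spend in RESEARCH_SPEND.items():
    for cost_label, cost in FAILURE_COST.items():
        multiple = return_multiple(spend, cost)
        p = break_even_probability(spend, cost)
        print(f"{spend_label} vs {cost_label}: {multiple:,.0f}-to-one, "
              f"break-even at a {p:.3%} chance of averting the failure")
```

Even the least favourable pairing, a $10,000 programme against a $5 million failure, works out to 500-to-one, so the research pays for itself if it has as little as a 0.2% chance of averting the launch.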

The CPG industry does not lack data. It does not lack analytical capability. It lacks the habit of asking consumers - whether real or synthetic - the questions that matter before the money is committed. The failures documented here are not tales of bad luck or unforeseeable market shifts. They are, in the majority, tales of questions not asked.

The technology to ask them now exists, is fast, is inexpensive, and is demonstrably effective at identifying the products that consumers will reject. The remaining question is whether CPG brands will use it, or whether the next catalogue of failures will look much like this one.

Disclosure: The author is co-founder of Ditto (askditto.io), one of the synthetic research platforms discussed in this article. Competitors Evidenza and Artificial Societies are assessed based on publicly available information about their capabilities and market positioning. Readers should weigh the analysis accordingly.

Phillip Gales is co-founder and CEO of [Ditto](https://askditto.io). He writes about synthetic consumer research, CPG strategy, and the gap between what companies assume and what consumers actually want.