Designing the Future of Chemical Products with AI – Don't Be Confidently Incorrect
Why LLMs Fall Short for Formulation Design

Spanning dozens of industries, formulated products drive trillions of dollars in global economic activity each year. Cosmetics, foods, oil & gas, biopharmaceuticals, agrochemicals, personal care, and many other sectors rely heavily on formulations to create and deliver their products. However, these industries can no longer scale their products the way they have historically. With rising regulatory and consumer pressure for sustainable, environmentally friendly products, healthy ingredients, personalization, and novel solutions, simply leaning on economies of scale will not resolve these constraints. Even with high-throughput experimentation, the critical bottleneck remains the initial design of the formulation and the extensive characterization and analysis of its physicochemical properties.

We are no longer in an age where only the result matters; specific ingredient choices and development processes matter more than ever. Because adapting to these constraints requires more complex formulations, companies are increasingly interested in applying AI to accelerate their R&D and formulation development. LLMs such as ChatGPT, Perplexity, Gemini, and Claude have taken the corporate world by storm, and for good reason. However, they are not an adequate solution to the formulation design bottleneck. Domain-specific, chemistry-aware models are required to make reliable predictions of the physicochemical properties that drive formulation performance.

Formulation Design Across Industries


While formulated products across industries may look nothing alike in their final form, the science underlying their design is largely the same. Surfactants, polymers, emulsifiers, and active ingredients are not only the building blocks of a skin cream or a shampoo; they appear in agrochemicals, food products, biopharmaceuticals, and industrial fluids alike. Similarly, many of the performance properties that determine whether a product succeeds are shared across industries, including rheology, stability, surface tension, and foaming. Whether a product is being pumped through a pipeline, sprayed onto a crop, injected into a patient, or applied to skin, its success ultimately comes down to how it flows, and that flow behavior is governed by how its viscosity varies across shear rates. Whether a product lasts on a shelf, in a field, or in the body comes down to stability. Across all these dimensions, the guiding principles trace back to the same foundation: soft matter physics and colloid science.
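To make the shear-rate dependence concrete, here is a minimal sketch using the classic Ostwald–de Waele power-law model of apparent viscosity. The parameter values are illustrative only, not measurements from any real product:

```python
# Minimal sketch: apparent viscosity of a power-law (Ostwald-de Waele) fluid,
#   eta = K * shear_rate**(n - 1)
# K (consistency index, Pa*s^n) and n (flow behavior index) are illustrative
# values, not measured data. n < 1 means shear-thinning; n > 1 shear-thickening.

def apparent_viscosity(shear_rate, K=5.0, n=0.5):
    """Apparent viscosity in Pa*s for a power-law fluid at a given shear rate (1/s)."""
    return K * shear_rate ** (n - 1)

# A shear-thinning product (n = 0.5) thins as shear rate rises: the same
# formulation behaves very differently at rest, in a pump, and in a spray nozzle.
for rate in (0.1, 1.0, 100.0, 10_000.0):
    print(f"shear rate {rate:>8} 1/s -> viscosity {apparent_viscosity(rate):.4f} Pa*s")
```

The same few lines, swept across shear rates, already show why a single "viscosity" number cannot describe a product's behavior in every application.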

Neither discipline is trivial to master, and it takes years of hands-on formulation experience to even begin appreciating the complexity of how ingredients interact. Even seasoned formulators cannot reliably predict how a new combination of ingredients will behave, because the relationships between chemical inputs and physical outputs are highly complex and often defy intuition. What makes this tractable at all is that, however complex those relationships are, they are not arbitrary. The behavior of a formulation is fully determined by the underlying chemistry: the chemical structure of its ingredients, their molecular sizes, and formulation variables such as pH, ionic strength, and temperature. That consistency is precisely what makes machine learning a compelling tool, and also why applying it carelessly produces dangerously misleading results.

Modern challenges amplify this complexity. Consumers and regulators increasingly demand bio-based, degradable, healthier, or personalized products, and companies must meet these requirements without compromising performance. This shift adds more variables to formulation design, increasing the number of potential ingredient combinations and interactions that must be evaluated. Traditional trial-and-error approaches are increasingly impractical, motivating the adoption of computational tools and AI to accelerate discovery and optimize performance.

The Confidently Incorrect LLM Approach

Some companies have begun exploring LLMs like ChatGPT, Gemini, and Claude as tools for formulation design: proposing ingredients, setting concentrations, and predicting physicochemical properties. The appeal is understandable. LLMs are fluent in the language of scientific literature and ingredient categories, and they produce outputs that sound authoritative and technically grounded. On the surface this seems promising, but LLMs are fundamentally ill-suited for the task. There has also been a significant rise in AI platforms that claim to be built for chemistry and formulation yet simply run LLMs in the background; without any further training on chemical data, they inherit the same problems.

LLMs are trained on text, which is their strength in many contexts but their fundamental limitation in deep science. They have processed vast amounts of published formulation science and understand, at a linguistic level, what a surfactant is, what viscosity means, and how stability is generally discussed in the literature. But understanding a concept as language is categorically different from encoding it as chemistry. LLMs have no mechanism for representing chemical structure, no access to the proprietary experimental data that lives in company labs, and no way to model the complex, structure-driven interactions between ingredients that determine how a formulation actually performs. Predicting physicochemical properties requires precise, structured knowledge about molecules and their interactions, which LLMs simply do not have. Asking an LLM to predict the physicochemical properties of a specific, complex formulation is a fundamental mismatch between tool and task, akin to using a financial forecasting algorithm to predict the weather.

This mismatch becomes actively dangerous because of how LLMs communicate uncertainty, or rather, how they fail to. They are autoregressive systems: each word (more precisely, each token) they generate is fed back into the model as the basis for the next. Any error introduced early in a reasoning chain compounds progressively, producing outputs that can be substantially wrong by the end while showing no sign of doubt. Research from MIT has shown that LLMs can be confidently incorrect, not merely uncertain, but wrong in ways their own internal confidence metrics do not flag. The anthropomorphic quality of their outputs also makes them feel more trustworthy than a numerical model returning a poor confidence score. A wrong number simply looks wrong; a wrong answer from a fluent, expert-sounding authority does not.
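A toy calculation makes the compounding intuition concrete. If we make the simplifying assumption that each generated step is independently correct with probability p, a chain of n steps is fully correct only with probability p^n. Real models have correlated errors, so this is an illustration of the trend, not a model of any specific system:

```python
# Toy illustration of autoregressive error compounding under an independence
# assumption: each step correct with probability p, so a full chain of n steps
# is correct with probability p**n. Deliberately simplified; real LLM errors
# are correlated, but the qualitative decay is the point.

def chain_correct_probability(p_step, n_steps):
    """Probability an n-step chain is fully correct, assuming independent steps."""
    return p_step ** n_steps

# Even a 99%-accurate step degrades quickly over a long chain.
for n in (10, 100, 500):
    print(f"{n:>4} steps at 99% per step -> {chain_correct_probability(0.99, n):.3f}")
```

At 99% per-step accuracy, a 500-step chain is almost never fully correct, yet nothing in the output flags that decay to the reader.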

The broader scientific community is beginning to confront this problem directly. A 2026 Nature analysis found that tens of thousands of publications from 2025 may contain invalid references generated by AI. These errors made it through peer review precisely because they were wrapped in credible-sounding language. As Yann LeCun, one of the most respected researchers in machine learning and Meta's former Chief AI Scientist, has noted, LLMs largely memorize knowledge and retrieve answers rather than reason from first principles, which is one reason they require so many parameters to function at all. In domains where language itself is the substrate of reasoning, such as coding, legal analysis, and summarization, this works remarkably well. But formulation science is not one of those domains.

To be clear, none of this means LLMs have no role in a formulation workflow. They are genuinely useful for literature synthesis, regulatory document drafting, ingredient sourcing research, and communicating results. The mistake lies in using them for physicochemical prediction and formulation design, where a confident, plausible-sounding error can send an R&D team far down the wrong path.

Domain-Specific Machine Learning


Machine learning is not a magic bullet; it is a blank slate that performs only as well as the data it is trained on, the task it is trained for, and the architecture suited to that task. Consider the analogy of a precisely calibrated diet: even the most carefully optimized nutritional plan, designed for a specific body, becomes useless or even harmful when given to someone of a different age, physiology, or lifestyle, especially if they do not follow it correctly. The same logic applies to ML models. Architecture matters, but data quality and training design matter more. An optimal model trained on the wrong data, or toward the wrong objective, produces worthless outputs regardless of its theoretical capacity. You cannot take a model built to predict the next word in a sentence and repurpose it to predict how a polymer will behave in a surfactant system.

What formulation science requires is something that most general-purpose ML tools do not provide: chemistry-aware models trained specifically on the kinds of complex, multi-ingredient systems that real formulations involve. This means encoding chemical structure, accounting for molecular size and interaction type, and learning the relationship between composition and physicochemical output from actual experimental data collected under controlled, reproducible conditions. It also means acknowledging that every lab is different. Measurement protocols, instrument configurations, and experimental conventions vary across organizations, and a model that cannot quickly adapt to these differences will struggle to work with a company's specific data environment.
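As a hedged sketch of what a chemistry-aware input looks like in practice, the snippet below encodes a formulation as numbers, composition-weighted molecular descriptors plus formulation variables, rather than as free text. All ingredient names and descriptor values are hypothetical placeholders; production systems would encode real molecular structure, for example as molecular graphs or learned embeddings:

```python
# Hypothetical sketch: turning a formulation into a fixed-length numeric vector.
# Ingredient names and descriptor values are illustrative only; a real
# chemistry-aware model would encode actual molecular structure, not a
# hand-built lookup table like this.

# Toy per-ingredient descriptors: [molecular weight (g/mol), is_surfactant flag]
DESCRIPTORS = {
    "surfactant_A": [288.4, 1.0],
    "polymer_B":    [1.0e6, 0.0],
    "humectant_C":  [92.1,  0.0],
}

def featurize(composition, pH, temp_C):
    """Weight-fraction-weighted descriptors plus formulation variables."""
    total = sum(composition.values())
    n = len(next(iter(DESCRIPTORS.values())))
    feats = [0.0] * n
    for name, wt in composition.items():
        frac = wt / total
        for i, d in enumerate(DESCRIPTORS[name]):
            feats[i] += frac * d
    return feats + [pH, temp_C]

# A three-ingredient toy formulation becomes a numeric vector that a regression
# model could map to properties like viscosity or surface tension.
x = featurize({"surfactant_A": 12.0, "polymer_B": 0.5, "humectant_C": 3.0},
              pH=5.5, temp_C=25.0)
```

The point of the sketch is the contrast with an LLM: the model's input here carries composition, molecular scale, and formulation conditions as structured numbers, which is the representation property prediction actually needs.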

For a company attempting to build this kind of model from scratch, the challenge is significant. It requires not only a large volume of high-quality experimental data across a diverse range of formulations, but also the specialized machine learning expertise to architect models that can encode chemical structure and learn complex ingredient interactions. Most formulation teams are not machine learning teams, and most machine learning teams do not understand formulation science. Bridging that gap internally demands substantial time, resources, and a level of interdisciplinary depth that is difficult to build and even harder to retain.

This is the problem that FastFormulator was built to solve. Rather than applying a general-purpose model to chemistry and hoping it transfers, FastFormulator developed its approach from the ground up, building models that are explicitly chemistry-aware and training them on thousands of proprietary, expertly designed formulations spanning a wide range of ingredient types and industries. That training data was not collected opportunistically; it was designed to systematically build up the model's understanding of individual ingredients, binary interactions, and higher-order multi-component behavior. The result is a model with a genuine chemical foundation rather than an LLM's linguistic one.

Practically, this means companies do not need to arrive with thousands of their own data points or an in-house machine learning team to benefit. FastFormulator's pre-trained models provide physicochemical property predictions from day one. As a company runs experiments and uploads its results, even in very small batches, the models fine-tune to its specific ingredients, formulation styles, and lab conditions. Over time, the system becomes a customized, company-specific prediction engine that improves with continued use. FastFormulator does not replace the expertise of a formulator or the value of a well-run experiment; it ensures both are applied in the right direction, faster.
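The small-batch fine-tuning idea can be illustrated with a deliberately tiny model. The sketch below is not FastFormulator's method; it is a generic online-learning toy in which a two-parameter linear model takes one gradient step per new lab measurement, showing how even a handful of uploaded results nudges predictions toward a specific lab's data:

```python
# Toy sketch of incremental fine-tuning on small batches of lab data: a linear
# model takes one stochastic-gradient step per new measurement. Illustrative
# only; a real system would update a pretrained, chemistry-aware network in an
# analogous incremental fashion.

class OnlineLinearModel:
    def __init__(self, w=1.0, b=0.0, lr=0.01):
        self.w, self.b, self.lr = w, b, lr   # "pretrained" slope and intercept

    def predict(self, x):
        return self.w * x + self.b

    def fine_tune(self, x, y_measured):
        """One gradient step on squared error for a single (x, y) pair."""
        err = self.predict(x) - y_measured
        self.w -= self.lr * err * x
        self.b -= self.lr * err

# Start from a generic model, then adapt to lab measurements that (in this toy)
# follow y = 2x. Each upload is a tiny batch of three points.
model = OnlineLinearModel(w=1.0, b=0.0)
lab_data = [(1.0, 2.0), (2.0, 4.0), (1.5, 3.0)]
for x, y in lab_data * 1000:
    model.fine_tune(x, y)
```

After many small updates the model's predictions track the lab's own relationship rather than the generic starting point, which is the essential behavior of a prediction engine that specializes with use.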

Summary

Formulated products across industries share a common scientific foundation, and meeting modern demands for sustainability, personalization, and regulatory compliance requires more sophisticated tools for property prediction. LLMs, while they have a place in scientific workflows, are not those tools. Their strengths lie in language, not chemistry, and confidently wrong predictions are more dangerous than uncertain ones. Domain-specific, chemistry-aware machine learning is the right approach, and it is what FastFormulator was built to deliver: not to replace formulators or experimentation, but to dramatically narrow the space of where to look.