Technology

MIT Study Reveals Flaws in Large Language Models’ Responses

Editorial
  • Published November 26, 2025

A recent study by researchers at the Massachusetts Institute of Technology (MIT) has uncovered significant flaws in the reliability of large language models (LLMs). The models sometimes learn spurious associations that lead them to answer based on grammatical patterns rather than genuine understanding of the subject matter. This finding has critical implications for the deployment of LLMs in sectors such as customer service and healthcare.

The researchers, led by Marzyeh Ghassemi, an associate professor in the MIT Department of Electrical Engineering and Computer Science, discovered that LLMs can mistakenly link certain syntactic patterns to specific topics. For instance, an LLM might produce a convincing answer by recognizing familiar phrasing instead of truly comprehending the question being asked. Their experiments demonstrated that even the most advanced models can err in this manner, raising concerns about their reliability in critical applications.

Ghassemi emphasized the unexpected nature of these findings, stating, “This is a byproduct of how we train models, but models are now used in practice in safety-critical domains far beyond the tasks that created these syntactic failure modes.” Joining her in the study were co-lead authors Chantal Shaib, a graduate student at Northeastern University, and Vinith Suriyakumar, an MIT graduate student, along with Levent Sagun, a research scientist at Meta, and Byron Wallace, an associate dean of research at Northeastern University’s Khoury College of Computer Sciences. The findings will be presented at the Conference on Neural Information Processing Systems.

Understanding Syntactic Templates

LLMs are trained on vast amounts of text drawn from the internet, where they learn the relationships between words and phrases. This training allows them to respond to queries, but it can also instill associations that are inaccurate. The researchers found that LLMs pick up on part-of-speech patterns, which they refer to as “syntactic templates.”

In news writing, for example, certain sentence structures are common. The model learns both the semantics and the syntax, but it may lean on the syntactic template alone when answering questions. If an LLM learns that a question like “Where is Paris located?” follows a particular structure, it may apply that association to other questions that share the same grammatical framework, regardless of their actual meaning.
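To make the idea concrete, the short sketch below reduces a question to its part-of-speech sequence, which is roughly what a “syntactic template” is. It is illustrative only, not the authors’ tooling, and it assumes spaCy and its small English model are installed.

```python
# Minimal sketch: a "syntactic template" as a coarse part-of-speech sequence.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def pos_template(text: str) -> list[str]:
    """Reduce a sentence to its sequence of coarse part-of-speech tags."""
    return [token.pos_ for token in nlp(text)]

# Two questions that produce the same tag sequence share a syntactic template,
# even if they are about entirely different topics. Exact tags depend on the tagger,
# e.g. something like ['SCONJ', 'AUX', 'PROPN', 'VERB', 'PUNCT'].
print(pos_template("Where is Paris located?"))
print(pos_template("Where is Tokyo situated?"))
```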

To illustrate this, the researchers conducted experiments using synthetic questions that maintained the same syntactic structure but included nonsensical words. Remarkably, the LLMs frequently provided correct answers, even when the questions had no coherent meaning. Conversely, when the questions were restructured with a different part-of-speech pattern, the models often failed to respond accurately despite the underlying meaning remaining unchanged.
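A hedged sketch of that probing setup is shown below: it keeps a familiar grammatical frame but fills it with nonsense content words. The frame and word lists are hypothetical stand-ins, not the study’s actual materials.

```python
# Sketch: build probe questions that preserve a familiar syntactic frame
# but carry no coherent meaning. Illustrative only.
import random

TEMPLATE = "Where is {propn} {verb}?"          # same frame as "Where is Paris located?"
NONSENSE_NOUNS = ["Blorfin", "Quazzle", "Trindle"]
NONSENSE_VERBS = ["splorted", "gribbled", "vanked"]

def make_nonsense_probe() -> str:
    """Fill the familiar syntactic frame with meaningless content words."""
    return TEMPLATE.format(propn=random.choice(NONSENSE_NOUNS),
                           verb=random.choice(NONSENSE_VERBS))

probes = [make_nonsense_probe() for _ in range(3)]
print(probes)
# If a model still returns confident, domain-typical answers to probes like
# "Where is Blorfin splorted?", it is likely keying on the syntactic template
# rather than on meaning.
```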

Implications and Future Directions

The researchers also examined the potential for malicious exploitation of these vulnerabilities. They discovered that by phrasing questions using syntactic templates linked to “safe” datasets, individuals could trick LLMs into generating harmful content, even if those models had been designed to refuse such requests.

Vinith Suriyakumar noted the necessity for robust defenses against these security vulnerabilities, stating, “In this paper, we identified a new vulnerability that arises due to the way LLMs learn.” While the current research did not address specific mitigation strategies, it introduced an automatic benchmarking technique to evaluate an LLM’s reliance on incorrect syntactic-domain correlations. This method could empower developers to address these shortcomings proactively, thereby enhancing safety and performance.
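The paper’s benchmarking method is not detailed here, but a rough sketch of how such a check could be organized follows; query_model is a hypothetical wrapper around whatever LLM is under evaluation.

```python
# Sketch, under assumptions: score how often nonsense probes that reuse a
# familiar syntactic template still elicit the template-typical answer.
from typing import Callable

def syntactic_reliance_score(
    query_model: Callable[[str], str],   # hypothetical LLM wrapper: prompt -> response text
    nonsense_probes: list[str],          # nonsense questions reusing a familiar template
    expected_answers: list[str],         # answers the template alone would "suggest"
) -> float:
    """Fraction of nonsense probes that still yield the template-typical answer.

    A high score suggests the model is matching the syntactic template
    rather than understanding the question.
    """
    hits = sum(
        expected.lower() in query_model(probe).lower()
        for probe, expected in zip(nonsense_probes, expected_answers)
    )
    return hits / max(len(nonsense_probes), 1)
```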

Looking ahead, the researchers plan to explore various mitigation strategies, including augmenting training data with a broader range of syntactic templates. They are also interested in investigating this phenomenon within reasoning models, which are specialized LLMs designed for complex tasks.
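As a speculative illustration of that augmentation idea, the snippet below generates several grammatical framings of the same underlying question so that no single template becomes tied to a single domain; the rephrasing patterns are assumptions, not the researchers’ method.

```python
# Sketch: pair one answer with several syntactic framings of the same question,
# so the answer is linked to meaning rather than to one part-of-speech pattern.
def syntactic_variants(entity: str, relation: str = "located") -> list[str]:
    """Return the same query expressed through different grammatical frames."""
    return [
        f"Where is {entity} {relation}?",
        f"In which country is {entity} {relation}?",
        f"Can you tell me where {entity} is {relation}?",
        f"{entity} is {relation} where, exactly?",
    ]

# Each (question, answer) pair would be added to the training or fine-tuning mix.
augmented = [(q, "France") for q in syntactic_variants("Paris")]
print(augmented)
```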

As noted by Jessy Li, an associate professor at the University of Texas at Austin who was not involved in the study, this research highlights the importance of linguistic knowledge in ensuring LLM safety. “This work serves as a reminder that understanding language intricacies is crucial for developing safer AI systems,” she stated.

This research is supported by funding from sources including the National Science Foundation, the Gordon and Betty Moore Foundation, and a Google Research Award, among others. The findings underscore the urgent need for continuous improvement and evaluation of LLMs, particularly as they increasingly permeate safety-critical domains.
