AI-trends.today

SLMs: A New Agentic AI Model? Efficiency, Cost and Deployment

Tech · By Gavin Wallace · 18/06/2025 · 4 Mins Read

Agentic AI Systems Need to Shift

LLMs are widely praised for their conversational, humanlike abilities, and with the rapid growth of AI they are increasingly applied to repetitive and specialized tasks as well. This shift is gaining momentum: over half of major IT companies now deploy AI agents, backed by significant funding and projected market growth. These agents rely on LLMs for task planning, decision making, and execution, and the huge investments flowing into LLM infrastructure suggest the industry is betting its agentic future on this one model class.

The Case for SLMs: Efficiency, Applicability, and Suitability

Researchers from NVIDIA and Georgia Tech argue that small language models (SLMs) are not only capable of handling many agentic tasks but are also cheaper and more efficient at doing so. SLMs, they contend, are better suited to the simple, repetitive nature of many agentic workloads. They propose a heterogeneous mixture of models chosen by task complexity: larger models remain essential for open-ended conversational requirements, but most agentic subtasks do not need them. The position challenges the current reliance of agentic systems on LLMs and offers a framework for migrating from LLMs to SLMs, with open discussion encouraged to promote more resource-conscious AI deployment.

Why SLMs Suffice for Agentic Operations

According to the researchers, SLMs can perform most agentic tasks while being more cost-effective and practical to operate. They define SLMs as models that can run efficiently on consumer devices, and highlight their strengths: lower latency, reduced energy consumption, and easier customization. For many repetitive, well-scoped tasks, SLMs are sufficient or even preferable. The paper proposes a move toward modular agent systems that use SLMs by default and invoke LLMs only when needed, promoting a sustainable, flexible, and inclusive approach to building intelligent systems.
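The SLM-default, LLM-fallback architecture described above can be sketched as a simple dispatcher. Everything here is illustrative: the model calls are stubs, and the complexity heuristic (an explicit open-dialogue flag plus a prompt-length cutoff) is an assumption, not the paper's method.

```python
# Minimal sketch of a modular agent dispatcher: an SLM handles routine
# tasks by default, and requests escalate to an LLM only when a simple
# complexity heuristic flags them. Model calls are stand-in stubs.

from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    requires_open_dialogue: bool = False  # e.g. free-form conversation


def call_slm(prompt: str) -> str:
    return f"[SLM] handled: {prompt}"  # stand-in for a local small model


def call_llm(prompt: str) -> str:
    return f"[LLM] handled: {prompt}"  # stand-in for a hosted large model


def route(task: Task) -> str:
    # Default to the cheap SLM; escalate only for open-ended dialogue
    # or unusually long inputs (arbitrary 200-token cutoff).
    if task.requires_open_dialogue or len(task.prompt.split()) > 200:
        return call_llm(task.prompt)
    return call_slm(task.prompt)


print(route(Task("extract the invoice total from this JSON")))
print(route(Task("chat with the user about travel plans",
                 requires_open_dialogue=True)))
```

In a real system the routing decision would come from a trained classifier or the agent framework's planner, but the structural point is the same: the expensive model sits behind an explicit escalation path rather than in the default loop.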

Arguments in Favor of LLM Dominance

Some argue that LLMs will always outperform SLMs because of their superior semantic understanding and the benefits of scaling. Others claim that centralizing LLM inference is cheaper thanks to economies of scale, or that LLMs dominate simply because their head start attracted most of the industry's attention. The study counters that SLMs can be highly adaptive, are cheaper to operate, and handle well-defined subtasks within agent systems with ease. Adoption still faces obstacles, however, including existing infrastructure investments and an evaluation bias toward LLM-centric benchmarks.

A Framework for Transitioning from LLMs to SLMs

The process begins with collecting usage data while protecting privacy. The data is then cleaned to remove sensitive information, and common tasks are clustered together to identify work that SLMs can take over. A suitable SLM is selected for each task and fine-tuned on a custom dataset, often using parameter-efficient techniques such as LoRA; LLM outputs can guide SLM training where helpful. This is not a one-time process: models should be regularly updated and refined to stay aligned with evolving user interactions and tasks.
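The clustering step above can be illustrated with a toy version: group anonymized usage logs into recurring task types and flag high-volume clusters as candidates for a specialized SLM. The keyword bucketing here is a deliberate simplification; a production pipeline would cluster sentence embeddings instead, and the log lines are invented examples.

```python
# Illustrative sketch of the clustering step in the LLM-to-SLM pipeline:
# bucket logged prompts by task type, then surface frequent, narrow
# task types as candidates for SLM fine-tuning (e.g. LoRA adapters).

from collections import Counter

# Hypothetical anonymized usage logs (sensitive data already removed).
logs = [
    "summarize this support ticket",
    "summarize this email thread",
    "extract order id from message",
    "extract shipping address",
    "summarize meeting notes",
    "write a poem about autumn",
]


def task_type(prompt: str) -> str:
    # Crude intent bucketing by leading verb; real systems would use
    # embedding-based clustering rather than string heuristics.
    return prompt.split()[0]


counts = Counter(task_type(p) for p in logs)

# Tasks seen repeatedly are good candidates for a specialized SLM.
slm_candidates = [t for t, n in counts.most_common() if n >= 2]
print(slm_candidates)  # → ['summarize', 'extract']
```

The rare, open-ended requests (here, the one-off creative prompt) stay with the general-purpose LLM, while the two recurring task types become fine-tuning targets.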

Conclusion: Toward Sustainable, Resource-Efficient AI

The researchers conclude that switching from LLMs to SLMs can improve the efficiency and sustainability of agentic AI systems, particularly for repetitive, narrowly focused tasks. SLMs, they argue, are sufficiently capable, more cost-effective, and better suited to such roles than general-purpose LLMs. Where greater conversational ability is required, a mixture of models is recommended. They invite comments and feedback on their position and commit to sharing responses publicly, hoping to encourage more resource-efficient and deliberate use of AI.


For more details, see the paper. All credit for this research goes to the project's researchers.


Sana Hassan is a consulting intern at Marktechpost and a dual-degree student at IIT Madras who is passionate about applying technology and AI to real-world challenges.
