DeepSWE is a state-of-the-art, fully open-source software engineering agent trained exclusively through reinforcement learning (RL). Built on the Qwen3-32B language model, it achieves 59% accuracy on the SWE-bench Verified benchmark. With this launch, Together AI marks a notable shift from traditional pre-training pipelines toward autonomous agents that continuously learn and improve through feedback.
Reinforcement Learning Meets Code Generation
DeepSWE is the result of post-training the Qwen3-32B model with rLLM, Agentica's reinforcement learning framework designed for language agents. In contrast to conventional supervised fine-tuning, rLLM lets agents adapt to real-world workflows by learning from experience. DeepSWE is specifically trained to solve complex software engineering problems using feedback loops rather than static datasets.
The training pipeline incorporates Agentica's R2E-Gym dataset, a software engineering benchmark designed for RL-style agent development. The framework focuses on training language models for action-oriented tasks such as fixing bugs, completing functions, and editing code. This aligns DeepSWE more closely with the way human engineers iterate and learn.
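To make the idea of learning from feedback rather than static data concrete, here is a minimal sketch of one common way to turn test results into a reward signal for a bug-fixing task. It is a toy illustration only: the article does not describe DeepSWE's actual reward function, and the names `ToyRepairTask`, `buggy_add`, and `fixed_add` are hypothetical and not part of rLLM or R2E-Gym.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types for illustration; none of these names come from rLLM or R2E-Gym.
Patch = Callable[[int, int], int]        # a candidate implementation of the function under repair
TestCase = Tuple[Tuple[int, int], int]   # ((arguments), expected result)


@dataclass
class ToyRepairTask:
    """Stand-in for a bug-fixing task: a set of tests the patched function must pass."""
    tests: List[TestCase]

    def reward(self, patch: Patch) -> float:
        """Sparse outcome reward (an assumption): 1.0 only if every test passes, else 0.0."""
        for args, expected in self.tests:
            try:
                if patch(*args) != expected:
                    return 0.0
            except Exception:
                # A crashing patch counts as a failed attempt.
                return 0.0
        return 1.0


def buggy_add(a: int, b: int) -> int:
    return a - b  # the "bug" the agent is supposed to fix


def fixed_add(a: int, b: int) -> int:
    return a + b  # a candidate patch that repairs it


task = ToyRepairTask(tests=[((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)])
rewards = {patch.__name__: task.reward(patch) for patch in (buggy_add, fixed_add)}
print(rewards)
```

In an RL setup, a scalar like this would be fed back to the policy after each attempt, so the agent is optimized for outcomes (passing tests) instead of for imitating a fixed set of reference edits.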
Performance Benchmarks & Capabilities
With test-time scaling, DeepSWE scores 59% on SWE-bench Verified, one of the most rigorous benchmarks available for software engineering agents. This is a significant improvement over previous open-weight models. In Pass@1 evaluations, which measure the probability that the agent solves a problem correctly on the first attempt, DeepSWE reaches 42.2%.
These results demonstrate the potential of RL-based training to improve agent behavior, particularly in domains that demand iterative reasoning and precise outputs, such as code synthesis. The model's architecture, inherited from Qwen3-32B, lets it scale efficiently while remaining suitable for real-world applications.
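For readers unfamiliar with the metric, the sketch below shows how Pass@1 can be computed, along with the widely used unbiased Pass@k estimator, 1 - C(n-c, k) / C(n, k), for n sampled attempts of which c are correct. The function names and the sample data are hypothetical; this is not the evaluation harness used for DeepSWE.

```python
from math import comb
from typing import List


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: n attempts sampled, c of them correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect attempts exist, so any k samples include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(first_attempt_solved: List[bool]) -> float:
    """Pass@1 over a benchmark: fraction of tasks solved on the single first attempt."""
    if not first_attempt_solved:
        raise ValueError("need at least one task")
    return sum(first_attempt_solved) / len(first_attempt_solved)


# Made-up results for four tasks, for illustration only.
toy_results = [True, False, True, True]
toy_score = benchmark_pass_at_1(toy_results)
print(f"toy Pass@1 = {toy_score:.2f}")
```

With k = 1 the estimator reduces to c / n, the plain success rate per attempt, which is why Pass@1 is read as first-try accuracy.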

Open Source and Reproducibility at Its Core
The full transparency of the release is one of its standout features. Together AI and Agentica have open-sourced not only the DeepSWE model but the entire training recipe, including the R2E-Gym dataset and the training configuration scripts. This openness supports reproducibility and invites the research and developer communities to build on DeepSWE.
Both DeepSWE and rLLM are publicly available to developers.
From Language Reasoners to Language Agents
DeepSWE represents a shift in both philosophy and practice: from building models that reason about language to building agents that learn through interaction. Traditional LLMs have strong reasoning skills but lack the ability to adapt to feedback and improve with use. Reinforcement learning allows these models not only to perform well at launch but also to improve over time.
This approach also enables local deployment. Because DeepSWE is fully open-source and modular, it can be extended and retrained for organization-specific use cases. Researchers and developers can also build their own agents on top of DeepSWE using rLLM, targeting diverse domains such as web navigation and robotics.
Conclusion
DeepSWE represents a significant milestone in the evolution of generative AI for software engineering. By applying reinforcement learning to large language models such as Qwen3-32B and releasing the entire training infrastructure, Together AI is enabling a future in which agents are not only pre-trained but also continuously trained and improved. This shift from language understanding to action-oriented agency has profound implications for programming, automation, intelligent system design, and other areas.
All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. An engineer and entrepreneur, he is dedicated to harnessing the potential of Artificial Intelligence for social good. His most recent venture is Marktechpost, a media platform covering machine learning and deep learning news in a way that is both technically sound and understandable to a broad audience. The platform draws over 2 million views per month.


