Meta FAIR has released Code World Model (CWM), a 32-billion-parameter dense, decoder-only LLM that injects world modeling into code generation by training on execution traces and long-horizon agent–environment interactions, not just static source text.
Learning code by predicting execution
CWM mid-trains on two large families of observation–action trajectories: (1) Python interpreter traces that record the local variables after each executed line, and (2) agentic interactions inside Dockerized repositories. This grounding is intended to teach semantics (how program state evolves) rather than only syntax.
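The per-line trace idea can be illustrated with Python's standard tracing hook: run a function and snapshot its local variables at every executed line. This is a minimal sketch of the general concept; the trace format CWM actually trains on is defined in the paper and will differ in detail.

```python
import sys

def trace_locals(func, *args):
    """Run func(*args) and record (line_number, locals-snapshot) for each
    executed line, loosely mimicking the per-line execution traces above."""
    frames = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            # Copy the locals as this line is about to execute.
            frames.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, frames

def demo(n):
    total = 0
    for i in range(n):
        total += i
    return total

result, frames = trace_locals(demo, 3)
print(result)         # 3
print(frames[-1][1])  # final locals snapshot, e.g. {'n': 3, 'total': 3, 'i': 2}
```

Pairing source code with such (line, state) sequences is what lets a model learn how state evolves, not just what code looks like.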
To collect this data at scale, the research team built executable Docker images from thousands of GitHub projects and used a software-engineering agent ("ForagerAgent") to gather multi-step trajectories. The release reports roughly 3M trajectories across 10k images and 3.15k repositories.
Model and context window
CWM is a dense, decoder-only Transformer (no MoE) with 64 layers, GQA (48 query heads / 8 KV heads), SwiGLU, RMSNorm, and scaled RoPE. Attention alternates between local 8k and global 131k sliding-window blocks, yielding an effective context of 131k tokens; training uses document-causal masking.
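The difference between the local and global blocks comes down to the attention mask. Below is an illustrative sketch of a causal sliding-window mask (window sizes and masking details here are toy values, not the released model's exact implementation):

```python
def sliding_window_mask(seq_len, window):
    """Boolean causal attention mask: query i may attend key j when
    j <= i and i - j < window. window=None means full (global) causal
    attention. Illustrative sketch only."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i
            local = window is None or (i - j) < window
            row.append(causal and local)
        mask.append(row)
    return mask

# Local layer, window of 4 over 6 tokens: token 5 can no longer see token 0.
local = sliding_window_mask(6, 4)
print(local[5])  # [False, False, True, True, True, True]

# Global layer: plain causal attention over the full context.
glob = sliding_window_mask(6, None)
print(glob[5])   # [True, True, True, True, True, True]
```

In CWM the local layers would use an 8k window and the global layers the full 131k context, with the two interleaved across depth.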
Training recipe (pre → mid → post)
- Pre-training: 8T tokens (code-heavy) at 8k context.
- Mid-training: +5T tokens at long context (131k) with Python execution traces, ForagerAgent data, PR-derived diffs, IR/compiler data, and Triton kernels.
- Post-training: 100B-token SFT for instruction following and reasoning, then multi-task RL (~172B tokens) across verifiable coding, math, and multi-turn SWE environments using a GRPO-style algorithm and a minimal toolset (bash/edit/create/submit).
- Quantized inference fits on a single 80 GB H100.
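The core of a GRPO-style update is a group-relative advantage: sample several rollouts per task, then normalize each reward against the group's mean and standard deviation. A minimal sketch (the paper's full objective, with clipping and any KL terms, is not reproduced here):

```python
def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for one task's sampled rollouts:
    (reward - group mean) / (group std + eps). Sketch of the GRPO-style
    normalization only, not the complete RL objective."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four rollouts for one coding task, rewarded by (assumed) 0/1 hidden tests:
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 2) for a in advs])  # [1.0, -1.0, -1.0, 1.0]
```

Because the baseline is the group itself, no learned value model is needed, which is what makes this family of methods attractive for verifiable-reward settings like hidden test suites.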
Benchmarks
The research team reports the following pass@1 scores (with test-time scaling where applicable):
- SWE-bench Verified: 65.8% (with test-time scaling).
- LiveCodeBench-v5: 68.6%; LCB-v6: 63.5%.
- Math-500: 96.6%; AIME-24: 76.0%; AIME-25: 68.2%.
- CruxEval-Output: 94.3%.
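For context on how such numbers are computed, the standard unbiased pass@k estimator (Chen et al., 2021) gives the probability that at least one of k samples, drawn from n generations of which c are correct, passes. This is the conventional metric definition, not necessarily the exact harness used in this release:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: 1 - C(n-c, k) / C(n, k), the chance that a
    random size-k subset of n generations contains a correct one."""
    if n - c < k:
        return 1.0  # too few failures to fill a size-k subset
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass@1 reduces to the fraction of correct samples:
print(round(pass_at_k(200, 130, 1), 3))  # 0.65
```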
CWM is competitive with similar-size open-weights models, and with larger ones, on SWE-bench Verified.
See the official benchmark resources for more on SWE-bench Verified task design and metrics.

Why does world modeling matter for code?
This release highlights two operational abilities:
- Execution-trace prediction: given a function and a trace start, CWM predicts the stack frame (locals) and the executed line at each step via a structured format, usable as a "neural debugger" for grounded reasoning without live execution.
- Agentic coding: multi-turn reasoning with tool use against real repositories, verified by hidden tests and rewarded for patch similarity; this setup teaches the model to localize faults and generate end-to-end patches as git diffs rather than snippets.
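The end-to-end patch format described above is an ordinary unified diff. As an illustration of what such a patch looks like (the agent's actual edit/submit tooling is not shown here), Python's standard library can render one:

```python
import difflib

def make_patch(path, before, after):
    """Render a git-diff-style unified patch for one file. Sketch of the
    output format only; 'path', 'before', and 'after' are made up."""
    return "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))

before = "def add(a, b):\n    return a - b\n"
after  = "def add(a, b):\n    return a + b\n"
print(make_patch("calc.py", before, after))
```

Rewarding whole diffs of this shape, rather than isolated snippets, pushes the model toward edits that apply cleanly to a real repository.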
Notable details
- Tokenizer: Llama-3 family with reserved control tokens; the reserved IDs demarcate trace and reasoning segments during SFT.
- Attention layout: a 3:1 local:global interleave repeated across the full depth; long-context training uses large token batch sizes to stabilize gradients.
- Compute scaling: learning rates and batch sizes follow internal scaling laws tailored to long-context overheads.
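The 3:1 interleave can be written out as a per-layer schedule. A sketch, assuming the 64 layers from the article and a repeating local-local-local-global pattern (the true placement of global layers in the released model may differ):

```python
def attention_schedule(num_layers=64,
                       pattern=("local", "local", "local", "global")):
    """Assign each layer a window type under a 3:1 local:global
    interleave. Assumed pattern for illustration only."""
    return [pattern[i % len(pattern)] for i in range(num_layers)]

layers = attention_schedule()
print(layers[:8])              # ['local', 'local', 'local', 'global', ...]
print(layers.count("global"))  # 16 global layers out of 64
```

With this layout, three quarters of the layers pay only the 8k-window attention cost, while the periodic global layers keep the full 131k context reachable.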
Summary
CWM is a pragmatic step toward grounded code generation: Meta ties a 32B dense transformer to execution-trace learning and agentic, test-verified patching, releases intermediate/post-trained checkpoints, and gates usage under the FAIR Non-Commercial Research License—making it a useful platform for reproducible ablations on long-context, execution-aware coding without conflating research with production deployment.
Check out the Paper, the GitHub Page, and the Model on Hugging Face.
Asif Razzaq is the CEO of Marktechpost Media Inc. A visionary engineer and entrepreneur, he is dedicated to harnessing Artificial Intelligence for social good. His latest venture, Marktechpost, is a media platform known for in-depth coverage of machine learning and deep learning news that is technically accurate yet accessible to a broad audience. The platform's popularity is reflected in over 2 million monthly views.


