Liquid AI has released LFM2-24B-A2B, a model designed for low-latency, on-device tool calling, together with LocalCowork, an open-source desktop agent available in the Liquid4All GitHub cookbook. The pair provides a deployable architecture for enterprise workflows that runs entirely on-device, eliminating API calls and data egress for sensitive environments.
Architecture and Serving Configuration
LFM2-24B-A2B uses a sparse Mixture-of-Experts (MoE) architecture to deliver low-latency performance on consumer hardware. Although the model contains 24 billion parameters in total, it activates only about 2 billion per token during inference.
This design lets the model retain a broad knowledge base while cutting the compute cost of each generation step. Liquid AI stress-tested the model on the following hardware/software stack:
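Sparse activation can be illustrated with a toy top-k gating router. The expert count, k value, and random gate scores below are illustrative assumptions for the sketch, not Liquid AI's published configuration:

```python
import math
import random

def route_top_k(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
num_experts = 32  # illustrative; the real expert count is not stated in the article
gate_logits = [random.gauss(0, 1) for _ in range(num_experts)]
active = route_top_k(gate_logits, k=2)
# Only the selected experts run their feed-forward pass for this token,
# so per-token compute scales with k, not with num_experts.
print(active)
```

This is why total parameter count (knowledge capacity) and active parameter count (per-token compute) can diverge so sharply in MoE models.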
- Hardware: Apple M4 Max, 36 GB unified memory, 32 GPU cores.
- Serving engine: llama-server with Flash Attention enabled.
- Quantization: Q4_K_M format.
- Memory footprint: ~14.5 GB of RAM.
- Hyperparameters: temperature set to 0.1 for near-deterministic tool selection.
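Assuming llama-server is exposing its OpenAI-compatible `/v1/chat/completions` endpoint on the default port, a request using the benchmark's low-temperature sampling might be built like this. The model name, host, and prompt are placeholders:

```python
import json
import urllib.request

payload = {
    "model": "LFM2-24B-A2B-Q4_K_M",  # placeholder model name
    "messages": [{"role": "user", "content": "List the files in the reports folder"}],
    "temperature": 0.1,  # low temperature favors deterministic tool choice
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # llama-server's default address
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # uncomment with a running server
```

Because the endpoint is OpenAI-compatible, existing client code can usually be pointed at the local server by changing only the base URL.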
LocalCowork Tool Integration
LocalCowork, an offline AI desktop agent, uses the Model Context Protocol (MCP) to run pre-built AI tools without relying on cloud APIs or compromising data privacy; every action is recorded in a local audit log. It ships with 75 tools across 14 MCP servers, covering tasks such as OCR and security scanning. The demo focuses on a curated subset of highly reliable tools spread across six servers, each tested to over 80% single-step accuracy and verified to participate correctly in multi-step chains.
LocalCowork puts the model to work in practice, entirely offline. The preconfigured tools cover a full suite of business-grade capabilities:
- File operations: browse, read, and search the host filesystem.
- Security scanning: detect API keys and personally identifiable information (PII) in local directories.
- Document processing: OCR plus PDF parsing, diffing, and generation.
- Audit logging: every tool call is recorded locally for compliance tracking.
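The dispatch-plus-audit pattern behind these tools can be sketched as follows. The tool names and log schema here are illustrative stand-ins, not LocalCowork's actual implementation:

```python
import json
import time
from pathlib import Path

AUDIT_LOG = Path("audit_log.jsonl")  # local-only trail; nothing leaves the machine

TOOLS = {
    # illustrative stand-ins for tools served over MCP
    "fs.search": lambda pattern: sorted(p.name for p in Path(".").glob(pattern)),
    "fs.read": lambda path: Path(path).read_text(errors="replace")[:1000],
}

def call_tool(name, **kwargs):
    """Run a registered tool and append the call to the local audit log."""
    result = TOOLS[name](**kwargs)
    entry = {"ts": time.time(), "tool": name, "args": kwargs}
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return result

files = call_tool("fs.search", pattern="*.py")
```

Logging at the dispatch boundary means every tool invocation is captured once, regardless of which MCP server ultimately handles it.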
Performance Benchmarks
The Liquid AI team evaluated the model on 100 single-step tool-selection prompts and 50 multi-step chains, each chain requiring three to six discrete tool operations (searching, OCR, parsing, exporting, and so on).
Latency
The model averaged ~385 ms per tool-selection response. Sub-second dispatch of this kind suits applications that require immediate feedback.
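Figures like this can be reproduced by averaging wall-clock time over repeated calls. The stub below times a placeholder function rather than a real model invocation, so the numbers it prints are not the benchmark's:

```python
import time

def select_tool(prompt):
    # placeholder for a real model inference call
    time.sleep(0.001)
    return "fs.search"

def mean_latency_ms(fn, prompt, runs=50):
    """Average wall-clock latency of fn(prompt) over several runs, in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(prompt)
    return (time.perf_counter() - start) / runs * 1000.0

avg_ms = mean_latency_ms(select_tool, "find the Q3 report")
print(f"avg tool-selection latency: {avg_ms:.1f} ms")
```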
Accuracy
- Single-step executions: 80% accuracy.
- Multi-step chains: 26% completion rate.
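The gap between these two numbers is consistent with independent per-step errors compounding: at 80% per-step success, a six-step chain succeeds about 0.8^6 ≈ 26% of the time. A quick check (the independence assumption is mine, not the team's):

```python
per_step = 0.80  # reported single-step accuracy

for steps in range(3, 7):  # chains in the benchmark span 3-6 tool operations
    p_chain = per_step ** steps
    print(f"{steps}-step chain: {p_chain:.0%} expected completion")
# the 6-step case lands near the reported 26% multi-step completion rate
```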
What you need to know
- Privacy-First Local Execution: LocalCowork runs entirely on-device, with no cloud APIs and no data exfiltration, making it well suited to enterprise environments with strict privacy requirements.
- Efficient MoE Architecture: LFM2-24B-A2B uses a sparse Mixture-of-Experts (MoE) design, activating 2 billion of its 24 billion parameters per token, which lets it fit comfortably within a 14.5 GB RAM footprint under Q4_K_M quantization.
- Sub-Second Latency on Consumer Hardware: Benchmarked on an Apple M4 Max, the model averages 385 ms per tool selection, enabling highly interactive, real-time workflows.
- Standardized MCP Tool Integration: The agent leverages the Model Context Protocol (MCP) to seamlessly connect with local tools—including filesystem operations, OCR, and security scanning—while automatically logging all actions to a local audit trail.
- High Single-Step Accuracy, Multi-Step Limits: The model achieves 80% accuracy on single-step tool execution but drops to a 26% success rate on multi-step chains, largely due to 'sibling confusion' (selecting a similar but incorrect tool), indicating it currently works best in a guided, human-in-the-loop setup rather than as a fully autonomous agent.

