These are not just simple chatbots. Agent systems are evolving into complex systems capable of reasoning step by step, calling APIs, updating dashboards, and collaborating with humans. That raises a practical question: what should the agent say to the user interface? Custom APIs and ad-hoc sockets work well for prototyping, but they do not scale, and every project handles user corrections, tool calls, and output streams differently. This gap is exactly what the AG-UI (Agent–User Interaction) Protocol aims to fill. AG-UI: what it brings to the table. AG-UI is not a streaming…
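To make the idea concrete, a protocol like this implies a shared, typed event vocabulary between agent backend and UI. The sketch below is illustrative only: the event names and fields are our own assumptions, not the actual AG-UI schema.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical event types -- illustrative assumptions, not the real AG-UI spec.
@dataclass
class AgentEvent:
    type: str      # e.g. "text_delta", "tool_call", "state_update"
    payload: dict

def emit(event: AgentEvent) -> str:
    """Serialize one event as a JSON frame any UI could consume."""
    return json.dumps(asdict(event))

# One agent turn becomes an ordered stream of events instead of an opaque blob:
# the UI can render partial text, show tool activity, and update a dashboard.
stream = [
    AgentEvent("text_delta", {"text": "Looking that up"}),
    AgentEvent("tool_call", {"name": "search", "args": {"q": "AG-UI protocol"}}),
    AgentEvent("state_update", {"dashboard": {"status": "searching"}}),
]
frames = [emit(e) for e in stream]
```

Because every frame is self-describing, a UI can handle user corrections or mid-stream tool calls uniformly rather than with per-project glue code.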
Author: Gavin Wallace
Michael Calore: Yeah. That's an entirely different topic.
Kylie Robison: It is. Will, do you have any recommendations?
Will Knight: This is what I recommend as a useful and practical home robot. My cat Leono is very nice. He brings in animals, dead or alive, and he does it overnight. Many times, large rabbits will run about.
Kylie Robison: Oh, my God!
Will Knight: The cat flap is equipped with a computer and camera that can detect whether there's a mouse or other object in your cat's mouth. It will say, "Contraband detected," and then stop them from coming in. I find that…
A new trend is emerging in psychiatric institutions: people arriving in crisis with grandiose, often dangerous, delusions and paranoid thinking. They all have one thing in common: they have been having marathon chats with AI bots. WIRED interviewed more than a dozen researchers and psychiatrists who have become increasingly worried. Keith Sakata, a psychiatrist at UCSF in San Francisco, says he has counted 12 cases this year serious enough to require hospitalization, in which artificial intelligence "played a significant role in their psychotic episodes." In the wake of this crisis, headlines are now using a catchier label:…
Alibaba's Tongyi Lab has open-sourced Tongyi-DeepResearch-30B-A3B. The model uses a mixture-of-experts (MoE) design with web-based tools: roughly 30.5B total parameters with about 3–3.3B active per token, allowing high-performance reasoning while maintaining high throughput. It targets multi-turn research workflows (searching, browsing, extracting, cross-checking, and synthesizing evidence) under ReAct-style tool use, plus a heavier test-time scaling mode. The release includes inference scripts (Apache 2.0), weights, and evaluation tools. Benchmarks: what do they show? The Tongyi DeepResearch report cites results from agentic suites frequently used to test "deep research" agents: Humanity's Last Exam (HLE) 32.9; BrowseComp 43.4 (EN) and 46.7 (ZH); xbench-DeepSearch 75; WebWalkerQA also scored well…
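The ReAct-style loop such a model is trained for can be sketched generically. In this sketch both the "model" and the search tool are stubs of our own, for illustration only; they stand in for Tongyi's actual harness.

```python
# Minimal ReAct-style agent loop: alternate model steps, tool calls, and
# observations until the model emits a final answer.

def search(query: str) -> str:
    """Stub web-search tool; a real harness would call a search API."""
    return f"results for: {query}"

def stub_model(transcript: str) -> str:
    """Stub policy: request one search, then answer from the observation."""
    if "Observation:" not in transcript:
        return "Action: search[MoE active parameters]"
    return "Final Answer: about 3B parameters are active per token."

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = stub_model(transcript)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        # Parse "Action: tool[arg]", run the tool, append the observation.
        arg = step.split("[", 1)[1].rstrip("]")
        transcript += f"\n{step}\nObservation: {search(arg)}"
    return "no answer"

answer = react_loop("How many parameters are active per token?")
```

Multi-turn research workflows are this same loop run longer, with browsing and extraction tools in place of the single stubbed search.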
H Company, a French AI startup, has released Holo1.5, a set of open-source vision models designed for agents that interact with real-world user interfaces via screenshots and keyboard/pointer actions. The release includes 3B, 7B, and 72B checkpoints, with a documented accuracy improvement of 10% over Holo1 across all sizes. The 7B model is Apache 2.0; the 3B and 72B inherit licenses from their upstream base models. The series focuses on two capabilities important for computer-use (CU) stacks: precise UI element localization (coordinate estimation) and UI-VQA for state comprehension. https://www.hcompany.ai/blog/holo-1-5 Why is localization of UI elements important? The localization process is the…
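Coordinate estimation typically means the model emits a click target in the screenshot's pixel space, which the agent must then map onto the live screen. A minimal sketch of that bookkeeping follows; the `Click(x, y)` output format is an assumption for illustration, not Holo1.5's actual schema.

```python
import re

def parse_click(model_output: str) -> tuple[int, int]:
    """Parse a hypothetical 'Click(x, y)' string from a localization model."""
    m = re.search(r"Click\((\d+),\s*(\d+)\)", model_output)
    if not m:
        raise ValueError(f"no click coordinates in: {model_output!r}")
    return int(m.group(1)), int(m.group(2))

def to_screen(xy, shot_size, screen_size):
    """Rescale screenshot-space coordinates to the live screen resolution."""
    x, y = xy
    sw, sh = shot_size
    tw, th = screen_size
    return round(x * tw / sw), round(y * th / sh)

pt = parse_click("Click(512, 384)")
screen_pt = to_screen(pt, shot_size=(1024, 768), screen_size=(2560, 1440))
```

Localization errors compound across an agent's action sequence, which is why a few points of coordinate accuracy matter so much for CU stacks.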
IBM has released Granite-Docling-258M, an open-source (Apache 2.0) vision-language model designed for end-to-end document conversion. The model targets layout-faithful extraction (tables, code, equations, lists, captions, and reading order), emitting a structured, machine-readable representation rather than lossy Markdown. Hugging Face hosts both a live demo and an MLX build for Apple Silicon. What makes it different from SmolDocling? Granite-Docling replaces SmolDocling-256M as a ready-to-use product. IBM swapped the earlier backbone for an upgraded Granite 165M language encoder and the SigLIP2 vision model (base, patch16-512), while retaining Idefics3's connector style (pixel-shuffle projector). This model, which has 258M parameters,…
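The payoff of a structured representation is that downstream code can consume layout elements deterministically instead of re-parsing lossy Markdown. As a toy illustration only (the tag format below is invented for this sketch and is not Granite-Docling's real output schema), a tagged element stream can be rendered into whatever target a pipeline needs:

```python
# Toy renderer over a hypothetical tagged document stream. The element kinds
# mirror what layout-faithful extraction preserves: titles, captions, tables.
doc = [
    ("title", "Quarterly Report"),
    ("caption", "Table 1: revenue by region"),
    ("table", [["Region", "Revenue"], ["EMEA", "1.2M"]]),
]

def render_markdown(elements):
    """Render structured elements to Markdown; other targets (HTML, JSON)
    would reuse the same element stream without re-parsing anything."""
    out = []
    for kind, content in elements:
        if kind == "title":
            out.append(f"# {content}")
        elif kind == "caption":
            out.append(f"*{content}*")
        elif kind == "table":
            header, *rows = content
            out.append(" | ".join(header))
            out.append(" | ".join("---" for _ in header))
            out.extend(" | ".join(r) for r in rows)
    return "\n".join(out)

md = render_markdown(doc)
```

Reading order, captions, and table structure survive as data here, which is exactly what a Markdown-only dump throws away.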
A group of researchers from Meta Reality Labs and Carnegie Mellon University has released MapAnything, an end-to-end transformer architecture that directly regresses factored metric 3D scene geometry from images and optional sensor inputs. Released under Apache 2.0 with full training and benchmarking code, MapAnything advances past specialist pipelines by supporting over 12 distinct 3D vision tasks in a single feed-forward pass. https://map-anything.github.io/assets/MapAnything.pdf Why a universal model for 3D reconstruction? Image-based 3D reconstruction has traditionally relied on fragmented pipelines: feature detection, two-view pose estimation, bundle adjustment, multi-view stereo, or monocular depth inference. While effective, these modular solutions require task-specific…
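"Factored" geometry means the scene is decomposed into per-pixel ray directions, depth along each ray, camera pose, and a metric scale, rather than one entangled point map. The numpy sketch below shows only the standard composition of such factors into world-space points, not MapAnything's architecture; the variable layout is our own assumption.

```python
import numpy as np

def compose_points(rays, depth, R, t, scale):
    """Compose factored geometry into world-space 3D points.

    rays:  (N, 3) unit ray directions in the camera frame
    depth: (N,)   up-to-scale depth along each ray
    R, t:  camera-to-world rotation (3, 3) and translation (3,)
    scale: global metric scale factor
    """
    cam_pts = rays * (scale * depth)[:, None]  # points in the camera frame
    return cam_pts @ R.T + t                   # rigid transform to world frame

rays = np.array([[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])  # unit-norm directions
depth = np.array([2.0, 5.0])
R, t = np.eye(3), np.array([0.0, 0.0, 1.0])
pts = compose_points(rays, depth, R, t, scale=1.0)
```

Regressing the factors separately lets one network cover tasks that classical pipelines split apart, since pose-only, depth-only, or full-reconstruction outputs are all subsets of the same factorization.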
This tutorial builds an advanced voice AI using Hugging Face's free models, and it is simple to run on Google Colab. Transformers pipelines combine Whisper for speech recognition, FLAN-T5 for natural-language understanding, and Bark for speech synthesis. This lets us avoid heavy dependencies, complicated setups, and API keys; we focus instead on transforming voice input into meaningful conversation and returning natural-sounding speech in real time. See the FULL CODES here.

!pip -q install "transformers>=4.42.0" accelerate torchaudio sentencepiece gradio soundfile

import os, torch, tempfile, numpy as np
import gradio as gr
…
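A minimal sketch of how the three pipelines chain together is below. The model IDs are common default checkpoints chosen as an assumption, not necessarily the tutorial's exact selections, and loading is wrapped in a function so nothing downloads at import time.

```python
def build_voice_chain():
    """Lazily load the three stages as standard transformers pipelines."""
    from transformers import pipeline  # assumes transformers >= 4.42
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    nlu = pipeline("text2text-generation", model="google/flan-t5-base")
    tts = pipeline("text-to-speech", model="suno/bark-small")
    return asr, nlu, tts

def make_prompt(user_text: str) -> str:
    """Wrap the transcript in an instruction FLAN-T5 can follow."""
    return f"Answer the user conversationally: {user_text.strip()}"

def voice_turn(audio_path, asr, nlu, tts):
    """One voice turn: audio file -> transcript -> reply text -> speech."""
    text = asr(audio_path)["text"]
    reply = nlu(make_prompt(text), max_new_tokens=64)[0]["generated_text"]
    return tts(reply)  # dict with an "audio" array and "sampling_rate"

prompt = make_prompt("  what's the weather like?  ")
```

The returned audio dict can be handed straight to a Gradio audio output, which is how the tutorial surfaces the result in the browser.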
Recently, we realized that our method of calculating MRR and ARR wasn't giving us the most accurate picture of our company. A few months back, we decided to cancel Buffer's annual subscriptions for 1,361 legacy accounts that were inactive. Those customers were told that Buffer remains free for them and that they can sign up for an annual plan again at any time. After sending the email and cancelling the annual plans, we expected a drop of $14,000 in monthly recurring revenue (MRR). However, the figures didn't change. When we saw no immediate effect from the cancellations, we knew there was a problem. These cancellations are…
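The expected drop follows from how annual plans flow into MRR: each annual subscription contributes one twelfth of its yearly price per month. The per-account figure below is back-derived from the numbers in the post, so treat it as an approximation rather than a quoted price.

```python
# Back-of-the-envelope check on the expected MRR drop from the cancellations.
accounts = 1361
expected_drop = 14_000  # dollars of MRR the team expected to lose

mrr_per_account = expected_drop / accounts   # roughly $10.29 of MRR each
implied_annual_price = mrr_per_account * 12  # roughly $123 per year
```

If MRR didn't move after removing that much annual revenue, the likely culprit is the MRR formula itself, which is exactly what the team goes on to investigate.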
Nvidia CEO Jensen Huang is in London, standing in front of an audience of reporters and declaring his love for Gemini's Nano Banana. "How could anyone not love Nano Banana? I mean, Nano Banana, how good is that? Tell me it's not true! I love it. It's so good," he tells the room. "I saw [Hassabis, CEO of DeepMind] yesterday and I said, 'How about that Nano Banana! What a great idea!'" It looks like lots of people agree with him: the popularity of Nano…
