Creating publication-ready illustrations is labor-intensive, and this bottleneck slows the research workflow. AI scientists can now handle literature reviews and code, but they still struggle to communicate complex findings visually. A research team from Google and Peking University has introduced a new framework called ‘PaperBanana’ that aims to change this by using a multi-agent system to automate high-quality academic diagrams and plots.
5 Specialized Agents: The Architecture
PaperBanana does not rely on a single prompt. Instead, it orchestrates a collaborative team of five agents that transform raw text into professionally designed visuals.

Phase 1: Linear Planning
- Retriever Agent: Identifies the 10 most relevant reference examples from a database to guide style and structure.
- Planner Agent: Translates the technical methodology text into a detailed textual description of the target figure.
- Stylist Agent: Acts as a design consultant to ensure the output matches the “NeurIPS Look”, using specific color palettes and layouts.
Phase 2: Iterative Refinement
- Visualizer Agent: Converts the textual description into visual output. It uses image models such as Nano-Banana-Pro for diagrams, and writes executable Python Matplotlib code for statistical plots.
- Critic Agent: Compares the generated image against the source text to detect factual errors or visual glitches. Its feedback drives 3 rounds of refinement.
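The two phases above can be sketched as a small control loop. This is a minimal illustration, not PaperBanana's actual code: the `FigureDraft` class, the `run_pipeline` function, and the five callables passed into it are hypothetical names, and stopping early when the critic reports no issues is an assumption rather than something the article states.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FigureDraft:
    """Hypothetical container for the evolving figure."""
    description: str                      # textual plan of the figure
    image: str = ""                       # placeholder for the rendered output
    feedback: List[str] = field(default_factory=list)


def run_pipeline(
    method_text: str,
    retrieve: Callable,
    plan: Callable,
    style: Callable,
    visualize: Callable,
    critique: Callable,
    max_rounds: int = 3,
) -> FigureDraft:
    # Phase 1: linear planning (retrieve -> plan -> style).
    references = retrieve(method_text, k=10)
    description = style(plan(method_text, references))
    draft = FigureDraft(description=description, image=visualize(description))

    # Phase 2: iterative refinement, up to max_rounds critic passes.
    for _ in range(max_rounds):
        issues = critique(draft.image, method_text)
        if not issues:
            break  # assumption: stop early once the critic is satisfied
        draft.feedback.extend(issues)
        # Fold the critic's feedback into the description and redraw.
        draft.description += "\nFix: " + "; ".join(issues)
        draft.image = visualize(draft.description)
    return draft
```

Passing the agents in as plain callables keeps the sketch independent of any particular model API; each one could wrap an LLM or image-model call in practice.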
The NeurIPS 2025 Benchmark

The research team introduced PaperBananaBench, a dataset of 292 test cases drawn from actual NeurIPS 2025 publications. Using a VLM-as-a-Judge approach, they compared PaperBanana against leading baselines.
| Metric | Improvement Over Baseline |
| Overall Score | +17.0% |
| Conciseness | +37.2% |
| Readability | +12.9% |
| Aesthetics | +6.6% |
| Faithfulness | +2.8% |
The system excels in ‘Agent & Reasoning’ diagrams, achieving a 69.9% Overall score. It also provides an automated ‘Aesthetic Guideline’ that favors ‘Soft Tech Pastels’ over harsh primary colors.
Statistical Plots: Image or Code?
Standard image models lack the numerical precision needed to produce statistical plots. PaperBanana’s Visualizer Agent solves this problem by writing code rather than drawing pixels.
- Image Generation: Excels in aesthetics but often suffers from ‘numerical hallucinations’ or repeated elements.
- Code-Based Generation: Delivers 100% data fidelity by using the Matplotlib library to render the final plot.
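To make the code-based route concrete, here is a minimal sketch of the kind of Matplotlib script a code-writing agent might emit. It is not output from PaperBanana itself; the function name `render_bar_plot` and the pastel color are illustrative choices, and the numbers are simply the improvement figures from the benchmark table earlier in this article. Because the bars are drawn directly from the data, the plotted heights cannot drift from the source values.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt

# Values copied from the benchmark table in this article.
metrics = ["Overall", "Conciseness", "Readability", "Aesthetics", "Faithfulness"]
gains = [17.0, 37.2, 12.9, 6.6, 2.8]


def render_bar_plot(labels, values, path=None):
    """Draw a bar chart straight from raw values (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(6, 3))
    bars = ax.bar(labels, values, color="#9ec5e8")  # soft pastel blue
    ax.set_ylabel("Improvement over baseline (%)")
    # Remove the top and right borders for a cleaner academic look.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    fig.tight_layout()
    if path is not None:
        fig.savefig(path, dpi=300)
    return fig, bars


fig, bars = render_bar_plot(metrics, gains)
```

Calling `render_bar_plot(metrics, gains, path="gains.png")` would also write the figure to disk at print resolution.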
Domain-Specific Aesthetic Preferences in AI Research
According to the PaperBanana style guide, aesthetic choices often shift to match the expectations of different research communities.
| Research Domain | Visual ‘Vibe’ | Key Design Elements |
| Agent & Reasoning | Illustrative, Narrative, “Friendly” | 2D vector robots, human avatars, emojis, and “User Interface” aesthetics (chat bubbles, document icons) |
| Computer Vision & 3D | Spatial, Dense, Geometric | RGB color coding for axis correspondence, ray lines, and point clouds |
| Generative & Learning | Modular, Flow-oriented | 3D cuboids for tensors, matrix grids, and “Zone” groupings that use light pastel fills to group logic |
| Theory & Optimization | Minimalist, Abstract, “Textbook” | Graph nodes drawn as circles, manifolds drawn as planes, and a grayscale palette with a single highlight color |
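One way to picture how a style guide like this could feed a Stylist-type agent is as a lookup table that gets turned into a prompt fragment. The sketch below is an assumption about representation only: `STYLE_GUIDE` and `style_hints` are hypothetical names, and the entries merely restate the table above.

```python
# Hypothetical encoding of the table above as a lookup structure.
STYLE_GUIDE = {
    "Agent & Reasoning": {
        "vibe": ["illustrative", "narrative", "friendly"],
        "elements": ["2D vector robots", "human avatars", "emojis",
                     "UI-style chat bubbles and document icons"],
    },
    "Computer Vision & 3D": {
        "vibe": ["spatial", "dense", "geometric"],
        "elements": ["RGB color coding for axis correspondence",
                     "ray lines", "point clouds"],
    },
    "Generative & Learning": {
        "vibe": ["modular", "flow-oriented"],
        "elements": ["3D cuboids for tensors", "matrix grids",
                     "light pastel zone fills"],
    },
    "Theory & Optimization": {
        "vibe": ["minimalist", "abstract", "textbook"],
        "elements": ["circular graph nodes", "planar manifolds",
                     "grayscale palette with a single highlight color"],
    },
}


def style_hints(domain: str) -> str:
    """Return a short style instruction for a domain (KeyError if unknown)."""
    entry = STYLE_GUIDE[domain]
    vibe = ", ".join(entry["vibe"])
    elements = ", ".join(entry["elements"])
    return f"Visual style: {vibe}. Preferred elements: {elements}."
```

For example, `style_hints("Theory & Optimization")` yields a one-line instruction that could be appended to the figure description before rendering.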
Visualization Paradigms Comparison
For statistical plots, the framework highlights the trade-offs between using an image generation model (IMG) and executable code (Coding).
| Feature | Plots via Image Generation (IMG) | Plots via Coding (Matplotlib) |
| Aesthetics | Often more “visually appealing” | Professional, standard academic look |
| Fidelity | Lower; susceptible to “numerical hallucinations” or element repetition | 100% accurate; strictly represents the raw data provided |
| Readability | Good for sparse data, but struggles with complex datasets | Consistently high; handles dense or multi-series data without errors |
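The trade-offs in this table suggest a simple routing rule: send statistical plots to the code path and everything else to the image model. The snippet below is a hedged sketch of that rule; `FigureKind` and `choose_backend` are hypothetical names, and the real system may decide differently.

```python
from enum import Enum


class FigureKind(Enum):
    DIAGRAM = "diagram"                    # e.g. methodology illustrations
    STATISTICAL_PLOT = "statistical_plot"  # e.g. bar or line charts


def choose_backend(kind: FigureKind) -> str:
    """Pick a rendering path based on the fidelity needs of the figure."""
    if kind is FigureKind.STATISTICAL_PLOT:
        # Numbers must match the raw data exactly, so generate code.
        return "matplotlib_code"
    # Diagrams benefit most from the image model's aesthetics.
    return "image_model"
```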
What you need to know
- Multi-Agent Collaborative Framework: PaperBanana is a reference-driven system that orchestrates 5 specialized agents (Retriever, Planner, Stylist, Visualizer, and Critic) to transform raw technical text and captions into publication-quality methodology diagrams and statistical plots.
- Dual-Phase Generation Process: The workflow consists of a Linear Planning Phase, which retrieves reference examples and sets aesthetic guidelines, followed by a 3-round Iterative Refinement Loop in which the Critic Agent identifies errors and the Visualizer Agent regenerates the image with higher accuracy.
- Strong Performance on PaperBananaBench: Evaluated on 292 test cases from NeurIPS 2025, the framework outperformed vanilla baselines in Overall Score (+17.0%), Conciseness (+37.2%), Readability (+12.9%), and Aesthetics (+6.6%).
- Precision-Focused Statistical Plots: For statistical data, the system switches from direct image generation to executable Python Matplotlib code. This hybrid approach ensures numerical precision and eliminates the “hallucinations” common in standard AI image generators.


