Simon Willison Refactors LLM Library to Support Frontier Reasoning and Tools

Simon Willison

Apr 30, 2026

Simon Willison released an alpha version of his LLM Python library and CLI tool that moves beyond simple text prompts to support complex message sequences. The update introduces a streaming architecture for handling reasoning tokens and tool calls to maintain compatibility with frontier model capabilities.

Simon Willison, the creator of Datasette, released version 0.32a0 of his LLM Python library and CLI tool. The update replaces text-only prompts with a message-based system and a streaming architecture designed to handle reasoning tokens (internal model thinking and logic steps) and tool calls.

As frontier models shift toward agentic workflows, they no longer return simple strings. Models like Claude now produce internal reasoning and structured tool requests alongside standard text. This refactor ensures developers can capture these distinct typed parts without the library's core logic breaking or conflating different output streams.

You can now use the messages=[] array to pass conversation histories to models like GPT-5.5. The CLI highlights reasoning tokens in a different color and provides a --no-reasoning flag to suppress them. The alpha is available now via the LLM GitHub repository for testing.

View the full update on simonwillison.net

Simon Willison

@simonwApr 29

I released LLM 0.32a0 this morning, a major backwards-compatible refactor of my LLM Python library and CLI tool for working with language models - the new changes should help LLM work better with reasoning models and other new frontier capabilities https://t.co/iLhtLrCQCL

370

View on X

Still wondering? A few quick answers below.

LLM is an open-source utility created by Simon Willison that provides a unified interface for interacting with thousands of different language models. It works as both a Python library and a command-line tool, using a plugin system to connect to proprietary APIs like OpenAI and Anthropic or local models running on a user's own hardware.

This alpha release introduces a major refactor that moves beyond simple text prompts to support a sequence of messages with specific roles like user and assistant. It also adds a new streaming architecture that can handle mixed content types, including text, reasoning tokens, and tool calls, allowing the library to better represent the outputs of modern frontier models.

The updated library can now distinguish between reasoning tokens, which represent a model's internal thinking process, and the final response text. In the command-line interface, these thinking tokens are displayed in a different color and sent to a separate output stream. Users can also suppress these tokens entirely using the new -R or --no-reasoning flag.

Yes, the 0.32a0 release is designed to be a backwards-compatible refactor. While it introduces new ways to interact with models using message arrays and event streams, the previous prompt-based methods still work. Under the hood, the library automatically converts single text prompts into the new message-based format to ensure existing code and workflows continue to function.

The alpha release introduces native serialization methods called to_dict and from_dict. These functions allow developers to convert a model response into a JSON-style dictionary that can be stored in any database or file system. This provides a flexible way to persist and reconstruct complex conversations without being forced to use the library's built-in SQLite logging system.

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

Keep reading

Simon Willison Releases Datasette Agent to Query SQLite Databases with Natural Language

Simon Willison released the first alpha of Datasette Agent, an extensible AI assistant that allows users to query SQLite databases using natural language. By integrating his existing LLM library, the tool enables a model-agnostic approach to data analysis that can be customized through a new plugin architecture.

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

OpenClawApr 22

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

OpenClaw v2026.4.20 introduces native support for Moonshot's Kimi K2.6 and a unified reasoning logic that normalizes internal thinking tokens across different model providers. This update stabilizes agentic flows on messaging platforms like iMessage while adding tiered pricing for more accurate token cost estimates.

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

NVIDIAMay 9

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

NVIDIA updated its Dynamo inference framework to support the specific multi-turn requirements of agent harnesses like Claude Code and Codex. The update eliminates infrastructure friction that causes reasoning drift and cache misses, allowing developers to run complex agents on private stacks with the same fidelity as managed frontier endpoints.

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress

LangChainJun 7

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress

LangChain has released Deep Agents v0.6, introducing a Streaming feature that supports highly parallelized AI agent systems. This update enables real-time progress tracking for tools and subagents, addressing a key challenge in observing complex multi-agent workflows.

What is the LLM Python library and CLI tool?

What are the major changes in LLM version 0.32a0?

How does LLM 0.32a0 handle reasoning models like Claude?

Is LLM 0.32a0 backwards compatible with older versions?

How can developers save and load model responses in the new LLM alpha?

Keep reading

Simon Willison Releases Datasette Agent to Query SQLite Databases with Natural Language

Simon Willison Releases Datasette Agent to Query SQLite Databases with Natural Language

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress

Keep reading

Simon Willison Releases Datasette Agent to Query SQLite Databases with Natural Language

Simon Willison Releases Datasette Agent to Query SQLite Databases with Natural Language

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

OpenClaw Adds Kimi K2.6 Support and Provider Aware Reasoning Controls

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

NVIDIA Hardens Dynamo to Match Frontier Agent Performance on Custom Stacks

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress

LangChain Deep Agents v0.6 Streams Parallel Subagent Progress