GLM-5.1 Tool Calling Issue Fix & Chat Template Update

If you are running GLM-5.1 with vLLM/SGLang and using tool calling, please update your chat template. https://t.co/XyyCucws82

Issue
When using tool calling, frameworks including vLLM automatically convert plain-text tool message content into an array of content parts (`[{"type": "text", "text": "..."}]`) before passing it to the chat template. The original template only supported string-formatted tool content, causing array-formatted tool outputs to render empty. As a result, the model does not receive tool results and repeatedly triggers the same tool call in a loop.

Affected Models
All GLM-5.1 variants deployed with vLLM or SGLang.

Fix
Simply replace your existing `chat_template.jinja` with the updated version from the repository.

Zhipu AI releases GLM-5.1 template fix to stop infinite tool calling loops
Zhipu AI, the lab behind the GLM model series, has released a critical update to the chat_template.jinja file for GLM-5.1. The update fixes a template rendering error that occurs when the model is deployed with inference frameworks (software that runs a trained model to generate outputs) such as vLLM or SGLang. These frameworks automatically convert plain-text tool results into arrays of content parts, which the original template could not read: it supported only string-formatted tool content, so array-formatted results rendered as empty. Because the model never received the tool results, it would repeatedly trigger the same tool call in a loop. This broke the model's agentic (autonomous task-solving) capabilities in self-hosted deployments.
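The failure mode described above can be illustrated with a minimal Python sketch. This is not the GLM-5.1 template itself (which is written in Jinja); the function names `render_string_only` and `render_tool_content` are hypothetical, and the only detail taken from the announcement is the content-part shape `[{"type": "text", "text": "..."}]`.

```python
from typing import Any, Union

# Tool content is either a plain string or a list of content parts.
ToolContent = Union[str, list[dict[str, Any]]]


def render_string_only(content: ToolContent) -> str:
    """Hypothetical stand-in for a template that only handles strings."""
    # Anything that is not a string renders as empty text.
    return content if isinstance(content, str) else ""


def render_tool_content(content: ToolContent) -> str:
    """Hypothetical stand-in for a template that handles both shapes."""
    if isinstance(content, str):
        return content
    # Concatenate the text of every {"type": "text", ...} part.
    return "".join(
        part.get("text", "")
        for part in content
        if isinstance(part, dict) and part.get("type") == "text"
    )


parts = [{"type": "text", "text": "72F and sunny"}]
print(repr(render_string_only(parts)))   # ''
print(repr(render_tool_content(parts)))  # '72F and sunny'
```

With the string-only renderer the tool result disappears from the prompt, which is consistent with the model re-issuing the same call; accepting both shapes keeps the result visible.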
To resolve the issue, replace the existing chat_template.jinja file in your local deployment with the updated version from the repository. The fix applies to all GLM-5.1 variants deployed with vLLM or SGLang, including the 744B Mixture-of-Experts model. Updating the template is essential for workflows that involve multi-step reasoning or external tool integration.
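As a rough sketch of the file swap, assuming the updated template has already been downloaded locally, the replacement could be scripted as follows. The helper name `replace_chat_template` and any paths passed to it are hypothetical; only the file name chat_template.jinja comes from the announcement.

```python
import shutil
from pathlib import Path


def replace_chat_template(model_dir: str, updated_template: str) -> Path:
    """Back up the current chat_template.jinja, then copy the updated one over it."""
    target = Path(model_dir) / "chat_template.jinja"
    if target.exists():
        # Keep the old template so the change can be rolled back.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(updated_template, target)
    return target
```

Keeping a `.bak` copy is a design choice of this sketch, not part of the official instructions.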
Z.ai
@Zai_org