Weaviate Adds PDF Support to Agent Skills for Autonomous Document Ingestion

Weaviate AI Database

Apr 8, 2026 · Updated Apr 25, 2026

Weaviate added PDF import capabilities to its Agent Skills framework, allowing AI agents to autonomously configure schemas and ingest document libraries. By combining multimodal embeddings with the MUVERA algorithm, the system supports multi-vector retrieval at lower memory and compute cost while preserving retrieval quality.

Weaviate, an AI database for building applications, added PDF support to its Agent Skills framework. From a single prompt, agents such as Claude Code or Cursor can now set up a collection and ingest PDFs on their own, embedding each page with ColModernVBERT, a multimodal model (one that processes text and images together).

Multi-vector retrieval (a search method that represents each document with many embedding vectors rather than one, which improves accuracy) is often too resource-intensive to run at scale. This update integrates MUVERA, an algorithm that compresses each set of multi-vector embeddings into a single fixed-size vector. According to Weaviate, this reduces the memory usage and computational cost of multi-vector search while preserving retrieval quality.
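To make the trade-off concrete, here is a minimal Python sketch of the two ideas involved: exact multi-vector scoring (often called MaxSim) and a heavily simplified fixed-size encoding in the spirit of MUVERA. This is an illustration only, not Weaviate's implementation. The function names, the toy 4-dimensional vectors, and the choice of 3 hyperplanes are all assumptions made for the example, and the sketch leaves out parts of the published algorithm such as repetitions, dimension-reducing projections, and the filling of empty buckets.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs, doc_vecs):
    """Exact multi-vector score: each query vector takes its best-matching
    document vector, and the per-query maxima are summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def make_hyperplanes(dim, k, seed=0):
    """Draw k random hyperplanes; together they split space into 2**k buckets."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def bucket_of(vec, hyperplanes):
    """SimHash: one bit per hyperplane, set when vec lies on its positive side."""
    bucket = 0
    for plane in hyperplanes:
        bucket = (bucket << 1) | (1 if dot(vec, plane) > 0 else 0)
    return bucket


def fixed_encoding(vecs, hyperplanes, average):
    """Collapse a variable-size set of vectors into one fixed-size vector.

    Vectors that fall into the same bucket are summed (average=False, used
    here for queries) or averaged (average=True, used here for documents).
    The per-bucket results are concatenated. Empty buckets stay at zero,
    which is a simplification made for this sketch.
    """
    dim = len(vecs[0])
    n_buckets = 2 ** len(hyperplanes)
    sums = [[0.0] * dim for _ in range(n_buckets)]
    counts = [0] * n_buckets
    for v in vecs:
        b = bucket_of(v, hyperplanes)
        counts[b] += 1
        for i, x in enumerate(v):
            sums[b][i] += x
    encoded = []
    for b in range(n_buckets):
        scale = 1.0 / counts[b] if (average and counts[b]) else 1.0
        encoded.extend(x * scale for x in sums[b])
    return encoded


# Toy data: a query with 2 vectors and a "page" with 3 vectors, all hypothetical.
planes = make_hyperplanes(dim=4, k=3, seed=42)
query = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
page = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

exact_score = maxsim(query, page)              # compares every query/page vector pair
q_fde = fixed_encoding(query, planes, average=False)
p_fde = fixed_encoding(page, planes, average=True)
approx_score = dot(q_fde, p_fde)               # a single dot product per page

print("exact MaxSim:", exact_score)
print("approximate score:", approx_score)
print("encoding length:", len(p_fde))
```

The point of the sketch is the shape of the computation. Exact scoring compares every query vector with every page vector, while the encoded form has the same length no matter how many vectors a page produced, so it can be stored and searched like an ordinary single-vector embedding.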

You can now point any compatible agent at a document library to build a searchable database without writing ingestion logic by hand. The skill is available via GitHub and can be installed using npx skills add or as a Claude Code plugin. CSV, JSON, and JSONL formats are also supported for agent-led imports.


Weaviate AI Database · @weaviate_io · Apr 7

PDF import just landed in Weaviate Agent Skills! Point Claude Code (or any agent) at a PDF, and it handles everything from schema setup, collection configuration, to embedding each page using ColModernVBERT multimodal model. MUVERA then makes multi-vector retrieval efficient by reducing memory usage and computational cost while preserving retrieval quality. In short: your entire document library becomes searchable in Weaviate with a single prompt. CSV, JSON, and JSONL are already supported! Try it: https://t.co/CvQ30X2pcU


