Perceptron Mk1 is live on OpenRouter, built by @perceptroninc. Frontier video and embodied reasoning in a vision-language model. Analyzes video at a dynamic frame rate (up to 2 FPS) across a 32k multimodal context, with hybrid reasoning and structured spatial primitives (points, boxes, polygons, clips) as first-class outputs.
OpenRouter Hosts Perceptron Mk1 for Structured Video and Embodied Reasoning
- Context window
- 32,768 tokens
- Max output
- 8,192 tokens
- Video analysis rate
- Up to 2 FPS
- Pricing (input)
- $0.15 per million tokens
- Pricing (output)
- $1.50 per million tokens
- Spatial outputs
- Points, boxes, polygons, and clips
This launch fills a gap between general-purpose chat models and specialized robotic perception layers. While frontier models often struggle with precise spatial localization, Perceptron Mk1 allows developers to request specific annotation formats. It mirrors the industry's shift toward Physical AGI development by prioritizing how models interpret the physical world.
Use Perceptron Mk1 for high-volume visual tasks like document parsing and video summarization via the OpenRouter API. A dedicated reasoning parameter can be enabled per request to trade latency for deeper analysis. Access is priced at $0.15 per million input tokens, offering a cost-effective alternative for multimodal reasoning workloads.
Still wondering? A few quick answers below.
