Skip to content
  1.  
  2. © 2023 – 2025 OpenRouter, Inc

    Qwen: Qwen2.5 VL 32B Instruct

    qwen/qwen2.5-vl-32b-instruct

    Created Mar 24, 202516,384 context
    $0.05/M input tokens$0.22/M output tokens

    Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual interpretation within images, and precise event localization in extended videos. Qwen2.5-VL-32B demonstrates state-of-the-art performance across multimodal benchmarks such as MMMU, MathVista, and VideoMME, while maintaining strong reasoning and clarity in text-based tasks like MMLU, mathematical problem-solving, and code generation.

    Providers for Qwen2.5 VL 32B Instruct

    OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.