Qwen3-4B is a 4 billion parameter dense language model from the Qwen3 series, designed to support both general-purpose and reasoning-intensive tasks. It introduces a dual-mode architecture—thinking and non-thinking—allowing dynamic switching between high-precision logical reasoning and efficient dialogue generation. This makes it well-suited for multi-turn chat, instruction following, and complex agent workflows.
Recent activity on Qwen3 4B
Total usage per day on OpenRouter
Prompt
200K
Reasoning
62K
Completion
22K
Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.