Benchmark Technical Report · March 2026
OQENYX
Guardian
Comprehensive performance analysis across language, vision, reasoning, and agentic tasks — compared against GPT, Claude, and Gemini.
81.7
GPQA Diamond — graduate-level reasoning (Guardian Lite)
Vision
Native image, document, and video understanding
Compact
Strong results at a fraction of frontier-model size
Fast
Low latency, efficient inference
Guardian 2.0
Reasoning Benchmarks
Guardian 2.0 scores across reasoning, knowledge, and math
IFBench
Instruction Following
76.5
Guardian 2.0 Thinking · Guardian leads
GPQA Diamond
Graduate-Level Reasoning
86
Guardian 2.0 Thinking · Guardian leads
MMLU-Pro
Professional Knowledge
85
Guardian 2.0 Thinking · Guardian leads
AIME 2026
Math Reasoning
93
Guardian 2.0 Thinking · Guardian leads
MATH-500
Mathematical Reasoning
97
G-2.0-Lite · Guardian leads
Competitor scores are the figures published by the model providers. Guardian figures reflect our internal evaluation. Report date: March 2026.