Performance
Lumen 1.2 vs Veil 1.1 — Benchmarks
Lumen 1.2 outscores Veil 1.1 on every evaluation we run, with gains of 72.0 to 79.8 percentage points across the five benchmarks reported here.
Charts (legend: Veil 1.1, Lumen 1.2):
HumanEval pass@1: Python code generation from docstrings
MBPP: Mostly Basic Python Programming benchmark
Multi-Metric Comparison: radar view across core capabilities
Score by Category: normalized 0–100, higher is better
Full Benchmark Table
All scores, improvement deltas
| Benchmark | Veil 1.1 | Lumen 1.2 | Delta |
|---|---|---|---|
| HumanEval (pass@1) | 3.0% | 80.0% | +77.0pts |
| MBPP | 3.0% | 75.0% | +72.0pts |
| Instruction Following | 14.2% | 91.4% | +77.2pts |
| Reasoning | 7.8% | 87.6% | +79.8pts |
| Context Understanding | 11.4% | 88.2% | +76.8pts |
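
The Delta column is the difference between the two scores in percentage points (Lumen 1.2 minus Veil 1.1), not a relative percentage change. As a minimal sketch, the column can be recomputed from the scores in the table. The `scores` dictionary and the `delta_points` helper are illustrative names, not part of any Lumen or Axion tooling.

```python
# Scores copied from the benchmark table, in percent: (Veil 1.1, Lumen 1.2).
scores = {
    "HumanEval (pass@1)": (3.0, 80.0),
    "MBPP": (3.0, 75.0),
    "Instruction Following": (14.2, 91.4),
    "Reasoning": (7.8, 87.6),
    "Context Understanding": (11.4, 88.2),
}


def delta_points(old: float, new: float) -> float:
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(new - old, 1)


# Hypothetical helper output: one delta per benchmark.
deltas = {name: delta_points(old, new) for name, (old, new) in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: +{delta:.1f}pts")

# Benchmarks with the smallest and largest gains.
smallest = min(deltas, key=deltas.get)
largest = max(deltas, key=deltas.get)
print(f"Smallest gain: {smallest}, largest gain: {largest}")
```

Reporting the difference in points keeps the numbers comparable across benchmarks even when the Veil 1.1 baseline is very low, where a relative change would be dominated by the small denominator.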
Lumen 1.2 is now available in Axion. Pull the latest from GitHub to get started.