Lumen 1.2 — Axion Labs

Performance

Lumen 1.2 vs Veil 1.1 — Benchmarks

On every evaluation we run, Lumen 1.2 outperforms Veil 1.1. On the five benchmarks reported below, the gains range from 72.0 to 79.8 percentage points.

HumanEval pass@1: Python code generation from docstrings

MBPP: Mostly Basic Python Programming benchmark
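
This page does not say how pass@1 was computed. A common definition is the unbiased pass@k estimator introduced with HumanEval: generate n samples per problem, count the c samples that pass the unit tests, and estimate the chance that at least one of k drawn samples is correct. The sketch below assumes that definition. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and not part of any Axion tooling.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the tests, k: draw size.
    Assumed definition; the source page does not state its method.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimate reduces to c / n, the fraction of samples that pass, so a benchmark's pass@1 is the mean of that fraction over all problems.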

Multi-Metric Comparison (radar view across core capabilities)

Score by Category (normalized 0–100, higher is better)

HumanEval (pass@1): Veil 3.0 · Lumen 80.0
MBPP: Veil 3.0 · Lumen 75.0
Instruction Following: Veil 14.2 · Lumen 91.4
Reasoning: Veil 7.8 · Lumen 87.6
Context Understanding: Veil 11.4 · Lumen 88.2

Full Benchmark Table

All scores, improvement deltas

Benchmark	Veil 1.1	Lumen 1.2	Delta (percentage points)
HumanEval (pass@1)	3.0%	80.0%	+77.0 pts
MBPP	3.0%	75.0%	+72.0 pts
Instruction Following	14.2%	91.4%	+77.2 pts
Reasoning	7.8%	87.6%	+79.8 pts
Context Understanding	11.4%	88.2%	+76.8 pts
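
The Delta column is the Lumen 1.2 score minus the Veil 1.1 score, in percentage points. As a quick check, the sketch below recomputes it from the scores in the table. The names `SCORES` and `deltas` are illustrative only.

```python
# Scores copied from the table above, as (Veil 1.1, Lumen 1.2) in percent.
SCORES = {
    "HumanEval (pass@1)": (3.0, 80.0),
    "MBPP": (3.0, 75.0),
    "Instruction Following": (14.2, 91.4),
    "Reasoning": (7.8, 87.6),
    "Context Understanding": (11.4, 88.2),
}


def deltas(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return Lumen minus Veil per benchmark, rounded to one decimal place."""
    return {name: round(lumen - veil, 1) for name, (veil, lumen) in scores.items()}


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        veil, lumen = SCORES[name]
        print(f"{name}\t{veil:.1f}%\t{lumen:.1f}%\t{delta:+.1f} pts")
```

Rounding to one decimal place matches the precision of the published scores and hides floating-point noise in the subtraction.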

Lumen 1.2 is now available in Axion. Pull the latest version from GitHub to get started.