CLUSTER · TIER 2
Nous Research releases Lighthouse Attention for faster long-context pre-training
Nous Research has open-sourced Lighthouse Attention, a selection-based hierarchical attention mechanism for long-context pre-training. It delivers a 1.4–1.7× wall-clock speedup at 98K-token context and runs roughly 17× faster than standard attention at 512K-token context on a single NVIDIA B200 GPU. The approach uses a multi-resolution pyramid with top-k cascade selection: queries score coarse, pooled summaries of the context first, then descend only into the highest-scoring regions. It requires no custom sparse attention kernel, no straight-through estimator, and no auxiliary loss.
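The release itself isn't reproduced here, so the following is only a minimal PyTorch sketch of the general idea behind a coarse-to-fine top-k cascade over a pooled key pyramid. All specifics (block sizes, strides, top-k values, function names) are illustrative assumptions, not the released implementation, and causal masking is omitted for brevity. Note that the selected tokens are attended with ordinary dense attention, so gradients flow to the chosen keys/values through the final softmax without a straight-through estimator.

```python
# Illustrative sketch of selection-based hierarchical attention.
# Assumed, not from the release: block sizes, top-k values, pooling choice.
import torch
import torch.nn.functional as F

def pooled_keys(k, block):
    # Mean-pool keys over non-overlapping blocks of size `block` to form
    # one coarse level of the pyramid (zero-padding the last block).
    T, d = k.shape
    pad = (-T) % block
    k = F.pad(k, (0, 0, 0, pad))
    return k.view(-1, block, d).mean(dim=1)            # (T/block, d)

def cascade_select_attend(q, k, v, coarse_block=64, fine_block=16,
                          top_coarse=4, top_fine=2):
    """One query attends over tokens chosen by a coarse-to-fine cascade."""
    d = q.shape[-1]
    # Level 1: score coarse pooled blocks, keep the top-k regions.
    kc = pooled_keys(k, coarse_block)                  # (Nc, d)
    coarse_scores = kc @ q / d**0.5                    # (Nc,)
    top_c = coarse_scores.topk(min(top_coarse, kc.shape[0])).indices
    # Level 2: expand surviving coarse blocks into their fine sub-blocks,
    # score those, and keep the top candidates across the cascade.
    kf = pooled_keys(k, fine_block)                    # (Nf, d)
    per = coarse_block // fine_block
    fine_ids = (top_c[:, None] * per +
                torch.arange(per, device=q.device)).flatten()
    fine_ids = fine_ids[fine_ids < kf.shape[0]]
    fine_scores = kf[fine_ids] @ q / d**0.5
    keep = fine_scores.topk(min(top_fine * top_coarse,
                                fine_ids.numel())).indices
    sel_blocks = fine_ids[keep]
    # Gather raw tokens inside the selected fine blocks and run ordinary
    # dense attention over just that subset -- no sparse kernel needed,
    # and gradients reach the selected k/v through this softmax.
    tok_ids = (sel_blocks[:, None] * fine_block +
               torch.arange(fine_block, device=q.device)).flatten()
    tok_ids = tok_ids[tok_ids < k.shape[0]]
    attn = F.softmax(k[tok_ids] @ q / d**0.5, dim=0)
    return attn @ v[tok_ids]

# Toy usage: one query over a 4096-token context; only 128 of the
# 4096 tokens are ever touched by the final attention.
T, d = 4096, 64
q, k, v = torch.randn(d), torch.randn(T, d), torch.randn(T, d)
out = cascade_select_attend(q, k, v)
print(out.shape)  # torch.Size([64])
```

Cost intuition under these assumptions: the query scores Nc coarse blocks plus a handful of fine blocks plus the gathered tokens, so work scales with the number of selected tokens rather than the full context length.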
Sources: 1
X mentions: 84k ▲
First seen: 2d ago
Velocity: +2%/6h