NEWSFERENCE
THU, 14 May 2026 00:32:03
CLUSTER · TIER 2 · FIRST SEEN 4D AGO · NOUSRESEARCH

Nous Research releases Token Superposition Training for 2-3x LLM pretraining speedup

Nous Research released Token Superposition Training (TST), a modification to the standard LLM pretraining loop that achieves a 2–3× wall-clock speedup at matched FLOPs without changing model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens using averaged embeddings, then switches to standard next-token prediction for the remainder of the run.
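The bag-construction step described above can be sketched in a few lines. This is a minimal numpy illustration, not Nous Research's implementation: the bag size `k = 4`, the multi-hot bag-membership target, and all function names are assumptions, since the announcement only describes averaging embeddings over contiguous token bags before switching to standard next-token prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d, k = 50, 8, 4                      # toy vocab, embed dim, bag size (assumed)
emb = rng.normal(size=(vocab, d))           # toy embedding table
ids = rng.integers(0, vocab, size=(2, 16))  # batch of 2 sequences, 16 tokens each

def bag_embed(ids, emb, k):
    """Replace each contiguous k-token bag with the mean of its embeddings."""
    B, T = ids.shape                            # T must be divisible by k
    x = emb[ids]                                # (B, T, d) per-token embeddings
    return x.reshape(B, T // k, k, -1).mean(2)  # (B, T//k, d) averaged bags

def bag_targets(ids, k, vocab):
    """Multi-hot target: which vocab ids appear in each bag (one assumed loss target)."""
    B, T = ids.shape
    tgt = np.zeros((B, T // k, vocab))
    bags = ids.reshape(B, T // k, k)
    for b in range(B):
        for p in range(T // k):
            tgt[b, p, bags[b, p]] = 1.0         # mark members of bag p
    return tgt

bags = bag_embed(ids, emb, k)     # (2, 4, 8): sequence length shrinks by k
tgt = bag_targets(ids, k, vocab)  # (2, 4, 50): multi-hot labels per bag position
print(bags.shape, tgt.shape)
```

In this reading, the k-fold shorter input sequence is what buys the wall-clock speedup during the first third of training; after that phase the loop would revert to ordinary per-token embeddings and next-token cross-entropy.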

Sources: 1 · X mentions: 11k · First seen: 4d ago · Velocity: +2%/6h
CONTRIBUTING SOURCES
1 ARTICLE
  1. X (Twitter) · 4D AGO
    x.com/NousResearch/status/2054610062836892054
X DISCOURSE
11k TOTAL · TOP 3
@NousResearch · <1H · 127.1K
RT @akshay_pachaar: https://t.co/Exoyd8tB0d
@NousResearch · 8H · 115.0K
RT @akshay_pachaar: https://t.co/Exoyd8tB0d
@NousResearch · 1D · 85.2K
RT @akshay_pachaar: https://t.co/Exoyd8tB0d