NEWSFERENCE
TUE, 26 May 2026 04:00:00
LIVE
$ today --liveF1TodayF2YesterdayF3ArchiveF4About
NEXT SCAN
← BACK TO TODAY/CLUSTER · ARXIV · RESEARCH
CLUSTER · TIER 3
FIRST SEEN 7D AGO
ARXIVRESEARCH

Benchmarking Patent Embeddings: A Multi-Task Evaluation of 22 Models Across Retrieval, Classification, and Clustering

arXiv:2605.24297v1 Announce Type: cross Abstract: Which fine-tuning signals improve patent embedding models, and do gains transfer across patent landscapes? We benchmark 22 embedding models, from 22M-parameter encoders to 12B instruction-tuned LLMs, on retrieval, classification, and clustering. The study uses 113,148 WIPO assistive-technology patents, 46,069 citation-graph retrieval queries, and the public DAPFAM dataset for external validation. Our framework covers citation-based retrieval, hybrid sparse-dense fusion, multi-label classification over five datasets, unsupervised clustering, six text-section views, domain-adaptive fine-tuning of four models, jurisdiction analysis, and proprietary DWPI (Derwent World Patents Index, Clarivate) expert-written content. Results show that fine-tuning is task-dependent: single-landscape tuning can improve in-domain scores but often hurts retrieval on an external landscape, challenging the assumption that more domain data always helps. Within model families, scale usually predicts performance (Qwen3 0.6B to 4B to 8B; Llama-Nemotron 1B to 8B), but cross-family scaling is noisy: the 12B KaLM-Gemma3 ranks 8th on TAC retrieval, while Qwen3-0.6B leads ARI clustering. Title+Abstract+Claims is the most reliable text representation. Multi-view abstract-claim alignment improves retrieval by up to 7.1 percent nDCG@10, while combined fine-tuning gives the strongest classification gains (+7.1 F1). All models drop by 55-65 percent on out-of-domain queries, and hybrid sparse-dense fusion does not close this gap. BM25-dense interpolation gives modest nDCG@10 gains (+0.002 to +0.015), with larger benefits for weaker zero-shot dense models. Code and evaluation framework are publicly available.

Sources
1
X mentions
First seen
7Dago
Velocity
+2%/6h
CONTRIBUTING SOURCES
1 ARTICLES
  1. arXiv: Artificial Intelligence7D AGO
    arxiv.org/abs/2605.24297
X DISCOURSE
AWAITING X SIGNAL
No notable English-language X chatter on this entity yet.