In a Training Loop 🔄

Yuichi Tateno PRO

hotchpotch

https://secon.dev/

AI & ML interests

Information Retrieval with LLMs

Recent Activity

liked a model about 8 hours ago

perplexity-ai/pplx-embed-v1-4b

liked a dataset about 11 hours ago

sentence-transformers/parallel-sentences-opus-100

upvoted a paper about 8 hours ago

Diffusion-Pretrained Dense and Contextual Embeddings

Paper • 2602.11151 • Published 14 days ago • 8

upvoted a collection 5 days ago

ColBERT-Zero 🐶

First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 6 days ago • 16

upvoted a collection 8 days ago

Bharat-NanoBEIR: Indian Language Retrieval Benchmarks

NanoBEIR retrieval benchmarks translated into 22 Indian languages across 13 datasets. • 22 items • Updated Dec 13, 2025 • 5

upvoted an article 15 days ago

Transformers.js v4 Preview: Now Available on NPM!

Article • 17 days ago • 72

upvoted a collection 26 days ago

CoRNStack

State-of-the-art code retrieval and re-ranking models and datasets • 9 items • Updated Mar 26, 2025 • 20

upvoted an article about 2 months ago

ModernVBERT: Towards Smaller Visual Document Retrievers

Article • Oct 3, 2025 • 46

upvoted 2 collections 2 months ago

NanoBEIR datasets

These datasets are compatible with the (Sparse)NanoBEIREvaluator with Sentence Transformers v5.2+. Also CrossEncoderNanoBEIREvaluator if bm25 column • 18 items • Updated 26 days ago • 14

Embedding Model Datasets

A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 163

upvoted an article 3 months ago

Granite 4.0 Nano: Just how small can you go?

Article • Oct 28, 2025 • 123

upvoted 4 articles 4 months ago

Streaming datasets: 100x More Efficient

Article • Oct 27, 2025 • 81

Provence: efficient and robust context pruning for retrieval-augmented generation

Article • Jan 28, 2025 • 25

huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning

Article • Oct 27, 2025 • 75

Sentence Transformers is joining Hugging Face!

Article • Oct 22, 2025 • 87

upvoted 3 articles 5 months ago

Introducing RTEB: A New Standard for Retrieval Evaluation

Article • Oct 1, 2025 • 137

Nemotron-Personas-Japan: Synthesized Data for Sovereign AI

Article • Sep 23, 2025 • 27

mmBERT: ModernBERT goes Multilingual

Article • Sep 9, 2025 • 133

upvoted 2 articles 7 months ago

Ettin Suite: SoTA Paired Encoders and Decoders

Article • Jul 16, 2025 • 77

Migrating the Hub from Git LFS to Xet

Article • Jul 15, 2025 • 28

upvoted 2 articles 8 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8, 2025 • 70

Training and Finetuning Sparse Embedding Models with Sentence Transformers v5

Article • Jul 1, 2025 • 133