Data, embeddings, and index for MassiveDS, from "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore"
Rulin Shao
rulins
Recent Activity
updated a dataset 3 days ago: rulins/gpqa_preprocessed
published a dataset 3 days ago: rulins/gpqa_preprocessed
updated a dataset 4 days ago: rulins/gpqa_searched_results_from_massiveds_non_cc
Collections: 1
Models: 4
Datasets: 13
rulins/gpqa_preprocessed • Viewer • 1.19k • 47
rulins/gpqa_searched_results_from_massiveds_non_cc • Viewer • 198 • 26
rulins/MassiveDS-1.4T • 2.59k • 10
rulins/reasonir_bright_gpt4_reasoning_query_scores • Preview • 19
rulins/DeepSeek-R1-Distill-Qwen-32B_NUMINA_train_amc_aime_merged_thoughts • Viewer • 3.64k • 40 • 1
rulins/DeepSeek-R1-Distill-Qwen-32B_NUMINA_train_amc_aime • Viewer • 3.64k • 800 • 2
rulins/pes2o_v3 • Viewer • 150M • 164
rulins/MasssiveDS-1.4T-raw-data • Viewer • 514M • 456
rulins/MassiveDS-1.4T-raw-data • Viewer • 514M • 644 • 6
rulins/mmlu_searched_results_from_massiveds • Viewer • 33.5k • 297