Georgios Smyrnis
gsmyrnis
AI & ML interests
None yet
Recent Activity
upvoted a paper 7 days ago
OpenThoughts: Data Recipes for Reasoning Models
published a model 11 days ago
mlfoundations-dev/openthoughts3_100k_code_swap_r1
updated a dataset 13 days ago
mlfoundations-dev/openthoughts3_code_force_stop
Organizations
gsmyrnis's activity
Any rundown on the data sources?
2
5
#2 opened 2 months ago by teknium

Update config.json
1
#4 opened 9 months ago by sedrickkeh
TypeError: Couldn't cast array of type
1
#1 opened 10 months ago by shizhediao2

Seems like WARC metadata is missing from this version?
1
#4 opened 10 months ago by yury-zyphra
Missing files
3
#2 opened 12 months ago by pengyuan

Were the documents shuffled before the dataset was split into shards?
3
#5 opened 11 months ago by yury-zyphra
Would you share the 0.28T-token dataset used to achieve the highest scores in the 7B-2x experiment?
2
#6 opened 11 months ago by Mars2050
How many rows are there in the dataset?
1
#4 opened 12 months ago by yury-zyphra
Reproduce the clip score
1
#1 opened almost 2 years ago by zhangjc404