everything for high quality filtering of HPLT3
JQL-AI
community
Organization Card
JQL-AI (pronounced Jackal-AI) is a community of machine learning researchers committed to advancing the development of multilingual foundation models.
Latest Research
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- Tokenizer Choice For LLM Training: Negligible or Crucial?
- Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
- Do Multilingual Large Language Models Mitigate Stereotype Bias?
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- JQL: Judging Quality Across Languages • 🦊 5 • Filter multilingual data for high-quality language models
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models • Paper • 2505.22232 • 18
- JQL-AI/JQL-Edu-Heads • Text Ranking • 2
- JQL-AI/JQL-LLM-Edu-Annotations • Viewer • 11.4M • 616 • 2
datasets (12)
- JQL-AI/HPLT3-198-500k • 18
- JQL-AI/curated_embeddings • 1.41k
- JQL-AI/fw2_embeddings • 5.27k • 2
- JQL-AI/hplt2_embeddings • 2.56k
- JQL-AI/hplt2_edu_scores • Viewer • 3.36B • 24k • 1
- JQL-AI/fw2_edu_scores • Viewer • 4.92B • 558 • 5
- JQL-AI/curated_edu_scores • Viewer • 475 • 141
- JQL-AI/JQL-LLM-Edu-Annotations • Viewer • 11.4M • 616 • 2
- JQL-AI/JQL-Human-Edu-Annotations • Viewer • 20.4k • 44 • 5
- JQL-AI/Fineweb_2_500k_removed • Viewer • 11.7M • 1.07k