ali-issa/arb_diacritized_tokenized_filtered_dataset_with_custom_tokenizer • Updated about 1 hour ago • 13.4M • 8
ali-issa/arb_diacritized_tokenized_filtered_dataset_with_arb-bpe-tokenizer-32768 • Updated 3 days ago • 141M • 137
ali-issa/new_removed_none_values_arb_filtered_and_diacritized_short_sentences_less_than_5_words • Updated 24 days ago • 141M • 59
ali-issa/arb_tokenized_filtered_dataset_with_arb-bpe-tokenizer-32768 • Updated Jan 27 • 142M • 21
ali-issa/eng_tokenized_filtered_dataset_with_eng-bpe-tokenizer-32768 • Updated Jan 20 • 142M • 54