---
license: mit
datasets:
  - IslamQA/askimam
  - IslamQA/hadithanswers
  - IslamQA/islamqa
language:
  - ar
  - en
  - fr
  - tr
  - fa
  - id
  - te
  - ru
  - hi
  - es
  - ur
  - zh
  - pt
  - de
base_model: intfloat/multilingual-e5-small
tags:
  - embedding
  - retrieval
  - islam
  - multilingual
  - islamqa
library_name: transformers
description: >
  An embedding model optimized for retrieving passages that answer
  questions about Islam. The passages are inherently multilingual,
  as they contain quotes from the Quran and Hadith. They often include
  preambles like "Bismillah" in various languages and follow a specific
  writing style.
finetuned_on: >-
  180k multilingual questions and answers about Islam, using hard
  negative mining.
data_scraped_on: April 2024
sources:
  - https://islamqa.info/
  - https://islamweb.net/
  - https://hadithanswers.com/
  - https://askimam.org/
  - https://sorularlaislamiyet.com/
format:
  question_prefix: 'query: '
  answer_prefix: 'passage: '
---

# Model Card for IslamQA/multilingual-e5-small-finetuned

An embedding model optimized for retrieving passages that answer questions about Islam. The passages are inherently multilingual, as they contain quotes from the Quran and Hadith. They often include preambles such as "Bismillah" in various languages and follow a distinctive writing style.

The model was fine-tuned on 180k multilingual questions and answers about Islam, using hard negative mining. The data was scraped in April 2024.
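The `format` block in the metadata records the E5 input convention: questions are prefixed with `query: ` and candidate answers with `passage: ` before embedding. A minimal sketch of preparing inputs (the helper names here are illustrative, not part of this repository):

```python
# E5-style prefix helpers (names are illustrative, not part of this repo).
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "

def format_query(text: str) -> str:
    """Prefix a question for embedding, per the E5 input convention."""
    return QUERY_PREFIX + text

def format_passage(text: str) -> str:
    """Prefix a candidate answer passage for embedding."""
    return PASSAGE_PREFIX + text
```

Texts embedded without these prefixes will not match the distribution the model was trained on, so retrieval quality usually degrades.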
## Model Details

### Model Sources

- https://islamqa.info/
- https://islamweb.net/
- https://hadithanswers.com/
- https://askimam.org/
- https://sorularlaislamiyet.com/

## Uses

Multilingual embedding and passage retrieval for question answering about Islam.

The model is distributed as a LoRA adapter on top of `intfloat/multilingual-e5-small`:

```python
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model_name = "intfloat/multilingual-e5-small"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModel.from_pretrained(base_model_name)

# Load the LoRA adapter directly from the Hub
adapter_repo = "IslamQA/multilingual-e5-small-finetuned"
model = PeftModel.from_pretrained(base_model, adapter_repo)
```
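E5-family models are typically used with average pooling over the non-padding token embeddings, followed by cosine similarity between query and passage vectors. The sketch below illustrates those two steps on toy arrays rather than real model outputs; the pooling convention is that of the E5 base models and is not separately documented in this repository:

```python
import numpy as np

def average_pool(last_hidden, attention_mask):
    """Mean over non-padding token embeddings (the E5 pooling convention).

    last_hidden: (batch, seq_len, dim) token embeddings.
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    mask = attention_mask[..., None].astype(last_hidden.dtype)
    return (last_hidden * mask).sum(axis=1) / mask.sum(axis=1)

def cosine_scores(query_emb, passage_embs):
    """Cosine similarity between one query vector and a stack of passages."""
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    return p @ q

# Toy stand-ins for model outputs: batch of 1, seq len 4, hidden dim 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(1, 4, 8))
mask = np.array([[1, 1, 1, 0]])  # last position is padding
query_emb = average_pool(hidden, mask)[0]
```

With the real model, `hidden` and `mask` come from `model(**batch).last_hidden_state` and the tokenizer's `attention_mask`; ranking passages is then a single `cosine_scores` call.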