---
base_model: unsloth/llama-3-8b-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** zmilczarek
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit

This Llama model was trained using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

# Training details

## Dataset

The model was trained on the [maribr/publication_dates_fr](https://huggingface.co/datasets/maribr/publication_dates_fr) dataset for the task of extracting the publication date of a document. It is fine-tuned on prompts with the following structure (the "Beggining" spelling is reproduced verbatim from the training data):

```python
{'role': 'user', 'content': f'Beggining and end of the document :\n{context}\nWhat is the publication date of the document? Output as a structured JSON object with a format DD/MM/YYYY.'},
{'role': 'assistant', 'content': f'{gold_date}'}
```

where `context` is the first 3000 and last 3000 characters of the document, and `gold_date` is the date from the original dataset.

## Training

Trained for 3 epochs using QLoRA adapters.

## Evaluation

The model reaches 70% exact-match accuracy (the full predicted date equals the gold date) and 80% partial-match accuracy (both month and year match).

## Usage

To load the model for inference, use the following code snippet:

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "zmilczarek/llama3_8b-finetuned-nlp_industry-adapters",
    max_seq_length = 2048,
    dtype = None,        # auto-detect the best dtype for the GPU
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3.1",
)
```

If you prefer not to use Unsloth, you can also load the adapters with Hugging Face's PEFT and Transformers libraries:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "zmilczarek/llama3_8b-finetuned-nlp_industry-adapters",
    load_in_4bit = True,
)
tokenizer = AutoTokenizer.from_pretrained("zmilczarek/llama3_8b-finetuned-nlp_industry-adapters")
```

To perform inference on a document, you can use the following prompt (the user message, including the "Beggining" spelling, matches the training prompts exactly):

```python
def predict_date(message: list[dict]):
    # Render the chat messages with the Llama-3.1 template and move them to the GPU.
    inputs = tokenizer.apply_chat_template(
        message,
        tokenize = True,
        add_generation_prompt = True,
        return_tensors = "pt",
    ).to("cuda")

    # 13 new tokens are enough for the short JSON date answer.
    outputs = model.generate(input_ids = inputs, max_new_tokens = 13,
                             use_cache = True, temperature = 1.5, min_p = 0.1)

    # Decode only the newly generated tokens, not the prompt.
    outputs_ans_only = outputs[:, len(inputs[0]):]
    answer_only = tokenizer.batch_decode(outputs_ans_only, skip_special_tokens=True)
    return answer_only  # a list with one decoded string per input sequence

context = "[YOUR DOCUMENT]"
prompt = "What is the publication date of the document? Output as a structured JSON object with a format DD/MM/YYYY."
msg = [{'role': 'user', 'content': f'Beggining and end of the document :\n{context}\n{prompt}'}]
predicted_date = predict_date(msg)
```
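For completeness, here is a minimal, hypothetical sketch of the surrounding glue code: it builds `context` the same way the dataset section describes (first 3000 plus last 3000 characters) and extracts the date from the generated text. It reuses the `predict_date` helper defined above; `build_context`, the input file name, and the regex-based extraction are illustrative assumptions, since this card does not pin down the exact JSON schema the model emits.

```python
import re

def build_context(document: str, window: int = 3000) -> str:
    # Mirror the training setup: concatenate the first and last 3000 characters.
    if len(document) <= 2 * window:
        return document
    return document[:window] + document[-window:]

document = open("my_document.txt").read()  # hypothetical input file
context = build_context(document)

prompt = "What is the publication date of the document? Output as a structured JSON object with a format DD/MM/YYYY."
msg = [{'role': 'user', 'content': f'Beggining and end of the document :\n{context}\n{prompt}'}]

raw = predict_date(msg)[0]  # predict_date returns a list of decoded strings

# Assumption: rather than relying on a specific JSON key, match the
# DD/MM/YYYY pattern anywhere in the model's output.
match = re.search(r"\d{2}/\d{2}/\d{4}", raw)
predicted_date = match.group(0) if match else None
print(predicted_date)
```

If the model reliably emits well-formed JSON in your setting, `json.loads` on the raw output may be preferable to the regex fallback shown here.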