---
language: tr
license: llama3.1
tags:
- llama-3.1
- qlora
- sentiment-analysis
- turkish
- text-classification
- peft
pipeline_tag: text-classification
widget:
- text: "Bu film tek kelimeyle muhteşemdi!"
- text: "Kargo çok geç geldi ve ürün hasarlıydı."
- text: "Hava bugün güneşli."
---

# Llama-3.1-8B-Instruct Fine-tuned for Turkish Sentiment Analysis (QLoRA)

This repository contains a version of the `meta-llama/Llama-3.1-8B-Instruct` model fine-tuned for Turkish sentiment analysis using the [winvoker/turkish-sentiment-analysis-dataset](https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset) dataset and the QLoRA (4-bit) method.

**Model Name:** `ceofast/llama3.1-8b-instruct-turkish-sentiment-qlora`

## Model Description

This model classifies the sentiment of a given Turkish text as **positive**, **negative**, or **neutral**. QLoRA (Quantized Low-Rank Adaptation) makes it possible to fine-tune large language models with significantly fewer computational resources; this particular model was trained with 4-bit quantization.

* **Base Model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
* **Fine-tuning Technique:** QLoRA (4-bit NF4)
* **Language:** Turkish (tr)
* **Task:** Text Classification (Sentiment Analysis)
* **Labels:** `LABEL_0` (negative), `LABEL_1` (neutral), `LABEL_2` (positive); a label-mapping sketch follows this list
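
If you prefer the model outputs to carry readable label names instead of `LABEL_0`..`LABEL_2`, you can attach the mapping when loading the classification head. A minimal sketch, assuming the label order documented above:

```python
from transformers import AutoModelForSequenceClassification

# Assumed mapping, following the LABEL_0/1/2 order documented above
id2label = {0: "negative", 1: "neutral", 2: "positive"}
label2id = {v: k for k, v in id2label.items()}

# Combine with the quantization arguments shown in the usage example below
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    num_labels=3,
    id2label=id2label,   # predictions then surface as "negative"/"neutral"/"positive"
    label2id=label2id,
)
```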

## How to Use

To use this model, you need the `transformers`, `peft`, `accelerate`, `bitsandbytes`, and `torch` libraries installed.
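
A typical setup (unpinned here; pin versions to match your CUDA stack as needed):

```bash
pip install transformers peft accelerate bitsandbytes torch
```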

```python
import torch
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForSequenceClassification, BitsAndBytesConfig

# Base model ID
base_model_id = "meta-llama/Llama-3.1-8B-Instruct"
# QLoRA adapter ID (this repository)
adapter_id = "ceofast/llama3.1-8b-instruct-turkish-sentiment-qlora"
# Labels, in LABEL_0/1/2 order
labels = ["negative", "neutral", "positive"]

# 4-bit quantization configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # or torch.float16, depending on your GPU
)

# Load the base model in 4-bit
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_model_id,
    num_labels=len(labels),
    quantization_config=bnb_config,
    device_map="auto",             # place the model on the available device(s)
    # token="YOUR_HF_TOKEN",       # or log in beforehand; the base model is gated
    ignore_mismatched_sizes=True,  # the classification head is freshly initialized
)

# Load the tokenizer and set a PAD token (Llama defines none by default)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base_model.config.pad_token_id = tokenizer.pad_token_id

# Load the PEFT adapter on top of the base model.
# Merging is not required for inference; the PeftModel can be used directly.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()  # set the model to evaluation mode

# Inference function
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    # Move inputs to the same device as the model
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1)
    return labels[prediction.item()]

# Example usage (the model is trained on Turkish text)
text1 = "Bu film tek kelimeyle muhteşemdi!"
text2 = "Kargo çok geç geldi ve ürün hasarlıydı."
text3 = "Hava bugün güneşli."

print(f"'{text1}' -> Sentiment: {predict_sentiment(text1)}")
print(f"'{text2}' -> Sentiment: {predict_sentiment(text2)}")
print(f"'{text3}' -> Sentiment: {predict_sentiment(text3)}")

# Expected output (example):
# 'Bu film tek kelimeyle muhteşemdi!' -> Sentiment: positive
# 'Kargo çok geç geldi ve ürün hasarlıydı.' -> Sentiment: negative
# 'Hava bugün güneşli.' -> Sentiment: neutral
```
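
For deployment you can optionally fold the adapter into the base weights with PEFT's `merge_and_unload`. A minimal sketch, assuming the IDs above; the base model is loaded in bfloat16 here because merging into 4-bit quantized weights is not generally supported:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification

# Load the base model in full precision so the LoRA deltas can be merged
base_fp = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    num_labels=3,
    torch_dtype=torch.bfloat16,
    ignore_mismatched_sizes=True,
)
merged = PeftModel.from_pretrained(
    base_fp, "ceofast/llama3.1-8b-instruct-turkish-sentiment-qlora"
).merge_and_unload()

# The result is a plain transformers model; save or push it as usual
merged.save_pretrained("llama3.1-turkish-sentiment-merged")
```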

## Training Details

* **Hardware:** 1x NVIDIA RTX 3060 (Laptop, Max Performance, 6GB VRAM)
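
For reference, a QLoRA setup of this kind typically looks like the sketch below. All hyperparameters (rank, alpha, dropout, target modules) are illustrative assumptions, not the recorded training values:

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# 4-bit NF4 quantization, as stated in the model description
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    num_labels=3,
    quantization_config=bnb_config,
    device_map="auto",
    ignore_mismatched_sizes=True,
)
model = prepare_model_for_kbit_training(model)

# Illustrative LoRA hyperparameters; the actual values are not documented here
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```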