Committed by ayushsinha
Commit 0cd3fe7 · verified · 1 Parent(s): 4b721b5

Create README.md

  1. README.md +106 -0
README.md ADDED
@@ -0,0 +1,106 @@
# Text-to-Text Transfer Transformer Quantized Model for News Summarization

This repository hosts a quantized version of the T5 model, fine-tuned for news summarization. The model produces concise summaries of semi-structured or unstructured news text.

## Model Details

- **Model Architecture:** T5 (Text-to-Text Transfer Transformer)
- **Task:** Text summarization for news
- **Input Format:** Free-form news text, prefixed with `summarize: `
- **Quantization:** 8-bit (int8) using bitsandbytes
- **Framework:** Hugging Face Transformers
- **Base Model:** t5-base
- **Dataset:** Custom

## Usage

### Installation

```sh
pip install transformers accelerate bitsandbytes torch
```

### Loading the Model

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

model_name = "AventIQ-AI/T5-News-Summarization"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the weights in 8-bit via bitsandbytes (this typically needs a CUDA GPU);
# device_map="auto" lets accelerate decide where to place the layers.
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)


def test_summarization(model, tokenizer):
    """Read news text from stdin and return the generated summary."""
    user_text = input("\nEnter your News text:\n")
    # T5 expects a task prefix; inputs longer than 512 tokens are truncated.
    inputs = tokenizer(
        "summarize: " + user_text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    ).to(model.device)

    output = model.generate(
        **inputs,
        max_new_tokens=100,
        num_beams=5,
        length_penalty=0.8,
        early_stopping=True,
    )

    return tokenizer.decode(output[0], skip_special_tokens=True)


print("\n📝 **Model Summary:**")
print(test_summarization(model, tokenizer))
```

## ROUGE Evaluation Results

After fine-tuning the **T5** model for news summarization, we obtained the following **ROUGE** scores:

| **Metric** | **Score** | **Meaning** |
|-------------|-----------|-------------|
| **ROUGE-1** | **0.4125** (~41%) | Overlap of **unigrams** between the reference and the generated summary. |
| **ROUGE-2** | **0.2167** (~22%) | Overlap of **bigrams**, a rough indicator of fluency. |
| **ROUGE-L** | **0.3421** (~34%) | Longest common subsequence, reflecting shared sentence structure. |
| **ROUGE-Lsum** | **0.3644** (~36%) | ROUGE-L computed sentence by sentence over the whole summary. |

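To make the definitions in the table concrete, here is a minimal pure-Python sketch of ROUGE-N and ROUGE-L F1 scores. It is an illustration only: the function names are hypothetical, tokenization is simplified to a lowercased whitespace split with no stemming, and the scores reported above were not necessarily computed this way.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, ref_total, cand_total):
    """Combine precision and recall into an F1 score (0.0 if undefined)."""
    if overlap == 0 or ref_total == 0 or cand_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    ref = _ngrams(reference.lower().split(), n)
    cand = _ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # & keeps the minimum count per n-gram
    return _f1(overlap, sum(ref.values()), sum(cand.values()))


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence (LCS) of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    # Classic dynamic-programming table for the LCS length.
    table = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref, 1):
        for j, c in enumerate(cand, 1):
            if r == c:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return _f1(table[-1][-1], len(ref), len(cand))
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` evaluates to about 0.6, because 3 of the 5 bigrams on each side match.
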
## Fine-Tuning Details

### Dataset

The model was fine-tuned on a custom-labeled dataset and trained to produce clean, natural summaries from noisy or inconsistently formatted text.

### Training

- Number of epochs: 3
- Batch size: 4
- Evaluation strategy: epoch
- Learning rate: 3e-5

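The hyperparameters above can be collected into a small configuration object, sketched below. The dataclass and its field names are hypothetical (the training script is not shown here), and the dataset size passed to `total_steps` is a made-up figure used only to show how epochs and batch size translate into optimizer steps, assuming a single device and no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters listed above."""

    num_epochs: int = 3
    batch_size: int = 4
    eval_strategy: str = "epoch"  # evaluate once at the end of every epoch
    learning_rate: float = 3e-5

    def steps_per_epoch(self, num_examples: int) -> int:
        # The last, possibly smaller, batch still counts as one step.
        return math.ceil(num_examples / self.batch_size)

    def total_steps(self, num_examples: int) -> int:
        return self.steps_per_epoch(num_examples) * self.num_epochs


config = FineTuneConfig()
# 1,000 training examples is an illustrative figure, not the real dataset size.
print(config.total_steps(1000))
```
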
### Quantization

The model was quantized to 8-bit after training, using the bitsandbytes library through its Hugging Face integration. This reduced the model size and improved inference speed with negligible impact on summarization quality.

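As a rough illustration of what 8-bit quantization does to a weight vector, the sketch below applies simple absmax scaling in pure Python. This is a simplified stand-in, not the bitsandbytes implementation, which operates on tensors and, among other things, handles outlier values separately. The function names are hypothetical.

```python
def quantize_absmax(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return [0] * len(weights), 1.0
    scale = max_abs / 127
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and their scale."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]
quantized, scale = quantize_absmax(weights)
restored = dequantize(quantized, scale)
# Rounding moves each value by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored as one signed byte plus a shared scale, which is where the size reduction comes from. The rounding error per weight is bounded by half the scale.
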
## Repository Structure

```
.
├── model/              # Contains the quantized model files
├── tokenizer_config/   # Tokenizer configuration and vocabulary files
├── model.safetensors   # Quantized model weights
└── README.md           # Model documentation
```

## Limitations

- The model may misinterpret or misformat input with excessive noise or missing key fields.
- Quantized versions may show slight accuracy loss compared to full-precision models.
- Best suited for English-language news text.

## Contributing

Contributions are welcome! If you have suggestions, feature requests, or improvements, feel free to open an issue or submit a pull request.