tdickson17 committed on
Commit 2d42ff7 · 1 Parent(s): 3c032db

Update README.md

Files changed (1)
  1. README.md +15 -56
README.md CHANGED
@@ -2,7 +2,7 @@
  library_name: transformers
  pipeline_tag: summarization
  ---
- # Model Card for Model ID
+ # Text Summarization

  The model used in this summarization task is a T5 transformer-based language model fine-tuned for abstractive summarization. The model generates summaries by treating text summarization as a text-to-text problem, where both the input and the output are sequences of text.
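
Since the card frames the task as text-to-text summarization with the transformers library, a minimal usage sketch along the following lines should apply; the model id is a placeholder rather than the actual repository name, and the length bounds are illustrative.

```python
from transformers import pipeline

# Placeholder id: substitute this repository's model id or a local path to the checkpoint.
summarizer = pipeline("summarization", model="<model-id-or-local-path>")

text = (
    "The committee announced a new infrastructure package on Tuesday, outlining "
    "funding for roads, broadband, and public transit over the next five years."
)

# Length bounds are illustrative; 128 matches the max target length described in this card.
result = summarizer(text, max_length=128, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```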
 
@@ -11,88 +11,47 @@
  The model used in this summarization task is a Transformer-based language model (e.g., T5 or a similar model) fine-tuned for abstractive summarization. The model generates summaries by treating text summarization as a text-to-text problem, where both the input and the output are sequences of text.
  Architecture:

  Model Type: Transformer-based encoder-decoder (e.g., T5 or BART)

  Pretrained Model: The model uses a pretrained tokenizer and model from the Hugging Face transformers library (e.g., T5ForConditionalGeneration).

  Tokenization: Text is tokenized using a subword tokenizer, where long words are split into smaller, meaningful subwords. This helps the model handle a wide variety of inputs, including rare or out-of-vocabulary words.

  Input Processing: The model processes the input sequence by truncating or padding the text to fit within the max_input_length of 512 tokens.

  Output Generation: The model generates the summary through a text generation process using beam search with a beam width of 4 to explore multiple possible summary sequences at each step.

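To make the tokenization and input-processing steps above concrete, here is a small sketch, assuming a generic t5-small tokenizer and the standard T5 "summarize:" prefix (neither is specified in this card):

```python
from transformers import AutoTokenizer

# "t5-small" is only a stand-in T5 checkpoint; this card does not name the exact base model.
tokenizer = AutoTokenizer.from_pretrained("t5-small")

# The "summarize:" prefix follows the usual T5 convention and is an assumption here.
text = "summarize: The opposition party released a lengthy statement criticizing the budget proposal."

# Subword tokenization: rare or long words are split into smaller pieces.
print(tokenizer.tokenize("criticizing"))

# Input processing: truncate or pad the encoded input to the 512-token window.
inputs = tokenizer(text, max_length=512, truncation=True, padding="max_length", return_tensors="pt")
print(inputs["input_ids"].shape)  # torch.Size([1, 512])
```
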
  Key Parameters:

  Max Input Length: 512 tokens — ensures the input text is truncated or padded to fit within the model's processing capacity.

  Max Target Length: 128 tokens — restricts the length of the generated summary, balancing between concise output and content preservation.

  Beam Search: Uses a beam width of 4 (num_beams=4) to explore multiple candidate sequences during generation, helping the model choose the most probable summary.

  Early Stopping: The generation process stops early if the model predicts the end of the sequence before reaching the maximum target length.

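These parameters map directly onto a standard transformers generate() call. A minimal sketch, again assuming t5-small as the stand-in checkpoint and the usual "summarize:" prefix:

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# "t5-small" is a stand-in; the card describes a T5-style checkpoint without pinning a size.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

document = "The city council voted on Tuesday to expand the municipal recycling program to all districts."
inputs = tokenizer("summarize: " + document, max_length=512, truncation=True, return_tensors="pt")

# Key parameters from this card: 128-token summary cap, beam width 4, early stopping.
with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_length=128,
        num_beams=4,
        early_stopping=True,
    )

# Decode token ids back to text, dropping padding and end-of-sequence markers.
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
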
  Generation Process:

  Input Tokenization: The input text is tokenized into subword units and passed into the model.

- Beam Search: The model generates the next token by considering the top 4 possible sequences at each step, aiming to find the most probable summary sequence.
+ Beam Search: The model generates the next token by considering the top 10 possible sequences at each step, aiming to find the most probable summary sequence.

  Output Decoding: The generated summary is decoded from token IDs back into human-readable text using the tokenizer, skipping special tokens like padding or end-of-sequence markers.

  Objective:

  The model is designed for abstractive summarization, where the goal is to generate a summary that conveys the most important information from the input text in a fluent, concise manner, rather than simply extracting text.
  Performance:

  The use of beam search improves the coherence and fluency of the generated summary by exploring multiple possibilities rather than relying on a single greedy prediction.

- The model's output is typically evaluated using metrics such as ROUGE, which measures overlap with reference summaries, or other task-specific evaluation metrics.
+ The model's output is evaluated using metrics such as ROUGE, which measures overlap with reference summaries, or other task-specific evaluation metrics.

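A ROUGE check of generated summaries against references can be sketched with the Hugging Face evaluate library; the strings below are invented examples, and the card does not state the exact evaluation setup:

```python
import evaluate

# Toy prediction/reference pair for illustration; the card does not say which split,
# ROUGE variant, or reference summaries were used for evaluation.
predictions = ["The party criticized the budget and called for new negotiations."]
references = ["The party issued a statement criticizing the budget proposal and demanded renewed negotiations."]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))
# e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```
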
- <!-- Provide the basic links for the model. -->

  - **Repository:** https://github.com/tcdickson/Text-Summarization.git


- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
  ## Training Details

  The summarization model was trained on a dataset of press releases scraped from various party websites. These press releases were selected to represent diverse political perspectives and topics, ensuring that the model learned to generate summaries across a wide range of political content.
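
One plausible way such press-release pairs would be prepared for fine-tuning, using the 512/128 token limits stated above; this is a sketch under those assumptions, not the repository's actual preprocessing:

```python
from transformers import AutoTokenizer

# Assumed seq2seq preprocessing for fine-tuning on press releases; the actual training
# script lives in the linked repository and may differ. The field names "text" and
# "summary" are hypothetical.
tokenizer = AutoTokenizer.from_pretrained("t5-small")

def preprocess(example):
    model_inputs = tokenizer("summarize: " + example["text"], max_length=512, truncation=True)
    labels = tokenizer(text_target=example["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

example = {"text": "Full press release text ...", "summary": "Short reference summary ..."}
print(list(preprocess(example).keys()))  # ['input_ids', 'attention_mask', 'labels']
```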
 