Shivanand Roy 👋
committed · 5a9665e
Parent(s): 8ca2a48
Update README.md

README.md CHANGED

widget:
- text: "summarize: We describe a system called Overton, whose main design goal is to support engineers in building, monitoring, and improving production machine learning systems. Key challenges engineers face are monitoring fine-grained quality, diagnosing errors in sophisticated applications, and handling contradictory or incomplete supervision data. Overton automates the life cycle of model construction, deployment, and monitoring by providing a set of novel high-level, declarative abstractions. Overton's vision is to shift developers to these higher-level tasks instead of lower-level machine learning tasks. In fact, using Overton, engineers can build deep-learning-based applications without writing any code in frameworks like TensorFlow. For over a year, Overton has been used in production to support multiple applications in both near-real-time applications and back-of-house processing. In that time, Overton-based applications have answered billions of queries in multiple languages and processed trillions of records, reducing errors 1.7-2.9 times versus production systems."
license: mit

A T5 model trained on 370,000 research papers to generate a one-line summary from a paper's description or abstract.

Trained with [**simpleT5**](https://github.com/Shivanandroy/simpleT5)⚡️ in just 3 lines of code.

- [**simpleT5**](https://github.com/Shivanandroy/simpleT5)⚡️ is a Python package built on top of **PyTorch Lightning** and **Transformers**🤗 for quickly training T5 models.
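
For context, those 3 lines look roughly like the sketch below. This is illustrative only, not the exact recipe used for this model: `train_df`/`eval_df` are hypothetical pandas DataFrames with the `source_text`/`target_text` columns simpleT5 expects, and the hyperparameters are placeholders.

```python
# Minimal simpleT5 training sketch (illustrative; not the recipe behind this checkpoint)
from simplet5 import SimpleT5

model = SimpleT5()
model.from_pretrained("t5", "t5-base")  # start from a base T5 checkpoint
model.train(
    train_df=train_df,          # hypothetical DataFrame: source_text -> target_text
    eval_df=eval_df,
    source_max_token_len=512,   # abstracts are long; titles are short
    target_max_token_len=50,
    max_epochs=3,
)
```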

## Usage [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1HrfT8IKLXvZzPFpl1EhZ3s_iiXG3O2VY?usp=sharing)

```python
abstract = """We describe a system called Overton, whose main design goal is to support engineers in building, monitoring, and improving production machine learning systems. Key challenges engineers face are monitoring fine-grained quality, diagnosing errors in sophisticated applications, and handling contradictory or incomplete supervision data. Overton automates the life cycle of model construction, deployment, and monitoring by providing a set of novel high-level, declarative abstractions. Overton's vision is to shift developers to these higher-level tasks instead of lower-level machine learning tasks. In fact, using Overton, engineers can build deep-learning-based applications without writing any code in frameworks like TensorFlow. For over a year, Overton has been used in production to support multiple applications in both near-real-time applications and back-of-house processing. In that time, Overton-based applications have answered billions of queries in multiple languages and processed trillions of records, reducing errors 1.7-2.9 times versus production systems."""
```

Transformers🤗

```python
model_name = "snrspeaks/t5-one-line-summary"

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The "summarize: " task prefix matches how the model was trained
input_ids = tokenizer.encode("summarize: " + abstract, return_tensors="pt", add_special_tokens=True)

generated_ids = model.generate(
    input_ids=input_ids,
    num_beams=5,
    max_length=50,
    repetition_penalty=2.5,
    length_penalty=1,
    early_stopping=True,
    num_return_sequences=3,
)

preds = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True) for g in generated_ids]

print(preds)

# output
["Overton: Building, Deploying, and Monitoring Machine Learning Systems for Engineers",
 "Overton: A System for Building, Monitoring, and Improving Production Machine Learning Systems",
 "Overton: Building, Monitoring, and Improving Production Machine Learning Systems"]
```
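
If you prefer a one-liner, the same model can also be driven through the Transformers `text2text-generation` pipeline. A minimal sketch; the generation settings here simply mirror the ones above and are not prescribed by the model card:

```python
from transformers import pipeline

# Wrap the model and tokenizer in a text2text pipeline
summarizer = pipeline("text2text-generation", model="snrspeaks/t5-one-line-summary")

# Keep the "summarize: " prefix the model was trained with
result = summarizer("summarize: " + abstract, num_beams=5, max_length=50)
print(result[0]["generated_text"])
```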

simpleT5⚡️

```python
# pip install --upgrade simplet5
from simplet5 import SimpleT5

model = SimpleT5()
model.load_model("t5", "snrspeaks/t5-one-line-summary")  # load this checkpoint from the Hub
model.predict(abstract)

# output
"Overton: Building, Deploying, and Monitoring Machine Learning Systems for Engineers"
```