Update README.md
---
license: apache-2.0
language:
- en
library_name: transformers
datasets:
- budecosystem/intellecta
---

<div align="center"><img src="https://raw.githubusercontent.com/BudEcosystem/boomer/main/assets/boomer-logo.png" width=200></div>

<p align="center"><i>Democratizing access to LLMs for the open-source community.<br>Let's advance AI, together.</i></p>

----

## Introduction 🎉

We are thrilled to announce the open-sourcing of our boomer-634m model, an important milestone in our ongoing AI research. The model has 634 million parameters and was pre-trained from scratch on a custom synthetic dataset of 12 billion tokens.

## Run the model

Here is a quick guide to get you started with boomer-634m.
Please note that, at the moment, `trust_remote_code=True` is required to run the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True lets transformers run the custom model code shipped with the repo.
model = AutoModelForCausalLM.from_pretrained("budecosystem/boomer-634m",
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("budecosystem/boomer-634m")

# Tokenize the prompt and move the tensors to the model's device.
input_ids = tokenizer("Explain why the sky is blue.", return_tensors='pt').to(model.device)["input_ids"]
outputs = model.generate(input_ids, max_new_tokens=216)
# batch_decode returns a list with one decoded string per generated sequence.
print(tokenizer.batch_decode(outputs))
```
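
For decoder-only models, `generate` typically returns the prompt tokens followed by the newly generated ones, so the decoded text repeats the prompt. Below is a minimal sketch of trimming the prompt before decoding, assuming that behaviour holds for boomer-634m. `strip_prompt` is a hypothetical helper (not part of transformers), and the token IDs are invented for illustration:

```python
def strip_prompt(sequences, prompt_length):
    """Drop the first `prompt_length` token IDs from every generated sequence."""
    return [sequence[prompt_length:] for sequence in sequences]


# Toy stand-ins for `input_ids` and `outputs`; these IDs are made up.
prompt_ids = [[101, 7, 42]]
generated_ids = [[101, 7, 42, 900, 901]]

# Keep only the tokens that come after the prompt.
new_token_ids = strip_prompt(generated_ids, len(prompt_ids[0]))
print(new_token_ids)
```

With the quick-start snippet, this would correspond to something like `tokenizer.batch_decode(strip_prompt(outputs, input_ids.shape[1]), skip_special_tokens=True)`.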

## Evaluations

The boomer-634m model has been evaluated on the following benchmarks:

| Model Name | MMLU | ARC | HellaSwag | GSM8K | WinoGrande | MathQA | LogiQA |
|---|---|---|---|---|---|---|---|
| boomer-634m | 25.91 | 29.86 | 39.24 | 1.67 | 50.67 | 23.55 | 28.42 |
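
To compare these numbers at a glance, the scores can be summarized with a few lines of Python. This is only an illustrative sketch: the unweighted average below is computed here from the table and is not a metric reported by the authors.

```python
# Scores copied from the table above.
scores = {
    "MMLU": 25.91,
    "ARC": 29.86,
    "HellaSwag": 39.24,
    "GSM8K": 1.67,
    "WinoGrande": 50.67,
    "MathQA": 23.55,
    "LogiQA": 28.42,
}

# Unweighted mean across benchmarks (an illustrative summary, not a reported metric).
macro_average = sum(scores.values()) / len(scores)

# Benchmarks sorted from highest to lowest score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(f"Unweighted average: {macro_average:.2f}")
for name, score in ranked:
    print(f"{name:<12}{score:>6.2f}")
```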

### Final thoughts on Boomer!

Embarking on the journey with boomer-634m is just the beginning. We are committed to developing more advanced, efficient, and accessible AI models. Join us in this exciting adventure to shape the future of AI.

### Acknowledgements

Our heartfelt thanks go to the open-source community and the trailblazers in AI research whose work has paved the way for innovations like boomer-634m.