Update README.md
README.md
CHANGED
Removed from the previous version:

```diff
@@ -1,5 +1,4 @@
 ---
-# This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
 license: llama3
 language:
 - en
@@ -10,48 +9,58 @@ tags:
 - large language model
 - domain adaptation
 - scientific domain adaptation
-- crystal generation
 - materials copilot
 - information extraction
 - table understanding
 - table data parsing
 ---
-
-<!-- Provide a quick summary of what the model is/does. -->
-LLaMat-3-chat is a materials research copilot.
-
 ## Model Details
-foundational model that is finetuned from LLaMat-3, which is made by continued pretraining of LLaMA-3 on material science tokens. It has instruction following abilities and can be used as a copilot for information extraction from material science textual or tabular data.
-### Model Description
-- **
-- **
-- **
-- **
-This work was supported by the Edinburgh International Data Facility (EIDF) and the Data-Driven Innovation Programme at the University of Edinburgh. The EIDF provided access to Cerebras CS2 clusters for pretraining the language models.
-Link - https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs
-Pretraining: 2 CS-2 Cerebras Wafer-Scale Engine (WSE-2)
-Finetuning: 8 NVIDIA-A100 80GB GPUs
-Inferencing: 1 NVIDIA-A100 80GB GPU
-#### Software
-PyTorch, HuggingFace, Transformers
```
The updated README.md:

---
license: llama3
language:
- en
tags:
- large language model
- domain adaptation
- scientific domain adaptation
- materials copilot
- information extraction
- table understanding
- table data parsing
---

# Model Card for LLaMat-3-Chat

**LLaMat-3-Chat** is a specialized large language model designed to serve as a copilot for materials research. Finetuned from **LLaMat-3**, this model is adapted for tasks such as information extraction from materials science text and tabular data, table parsing, crystal generation, and more.

---

## Overview

- **Model Type:** Large Language Model (LLM)
- **Base Model:** LLaMat-3 (continued pretraining of LLaMA-3 on materials science data)
- **Language:** English
- **License:** LLaMA-3 License
- **Tags:** Materials Science, Domain Adaptation, Table Understanding, Scientific Data Parsing, Materials Copilot

---

## Model Details

### Key Features

- **Instruction Following:** Optimized for understanding and following instructions in the materials science domain.
- **Domain-Specific Expertise:** Pretrained on materials science tokens, enabling high performance in scientific applications.
- **Applications:** Information extraction, table understanding, and parsing of tabular data for research tasks (see the sketch below).
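
As a rough illustration of the instruction-following interface, here is a minimal sketch of an information-extraction query issued through Hugging Face Transformers. The repo id `m3rg-iitd/llamat-3-chat` is a hypothetical placeholder (substitute the published checkpoint id), and the sketch assumes the checkpoint ships a LLaMA-3-style chat template:

```python
# Minimal sketch: information extraction with LLaMat-3-Chat via Transformers.
# Assumptions: the repo id below is hypothetical, and the checkpoint provides
# a LLaMA-3-style chat template; adjust both for the published model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m3rg-iitd/llamat-3-chat"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision fits on a single A100 80GB
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a materials science copilot."},
    {"role": "user", "content": (
        "Extract every material mentioned in this sentence as a JSON list: "
        "'Thin films of TiO2 and SnO2 were deposited on Si substrates at 450 C.'"
    )},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```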

### Development and Support

- **Developed by:** M3RG, IIT Delhi
- **Compute Support:**
  - **Edinburgh International Data Facility (EIDF):** Provided access to Cerebras CS2 clusters for pretraining.
  - **IIT Delhi High-Performance Computing Cluster:** Supported fine-tuning and inference stages.

---

## Technical Specifications

### Hardware Infrastructure

- **Pretraining:** 2 Cerebras CS-2 Wafer-Scale Engines (WSE-2)
- **Finetuning:** 8 NVIDIA A100 80GB GPUs
- **Inferencing:** 1 NVIDIA A100 80GB GPU

### Software Stack

- **Frameworks:** PyTorch, Hugging Face Transformers
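
For quick experiments, the same stack can be driven through the Transformers `text-generation` pipeline. This is a sketch under the same assumptions as above: the repo id is hypothetical, and it relies on recent Transformers releases that route chat-style message lists through the model's chat template:

```python
# Sketch: the same checkpoint through the high-level text-generation pipeline.
# Recent transformers releases pass a list of chat messages through the model's
# chat template automatically. The repo id remains a hypothetical placeholder.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="m3rg-iitd/llamat-3-chat",  # hypothetical repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",  # one A100 80GB suffices for inference, per the card
)

messages = [{
    "role": "user",
    "content": "Parse this table row into JSON with keys material, band_gap, "
               "structure: 'ZnO | 3.37 eV | wurtzite'",
}]
result = generator(messages, max_new_tokens=64)
# The pipeline returns the whole conversation; the final turn is the reply.
print(result[0]["generated_text"][-1]["content"])
```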

---

## Model Sources

- **Repository:** [LLaMat-3 on GitHub](https://github.com/M3RG-IITD/llamat)
- **Compute Resources:** [EIDF Cerebras CS Clusters](https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs)

---

This model card provides a foundation for understanding the **LLaMat-3-Chat** model, its capabilities, and its applications in advancing materials science research.