m3rg-iitd committed
Commit 336283a · verified · 1 parent: ea58c4b

added Model Card

Files changed (1): README.md ADDED (+57 -0)

---
license: llama3
language:
- en
base_model:
- m3rg-iitd/llamat-3
tags:
- material science
- large language model
- domain adaptation
- scientific domain adaptation
- crystal generation
- materials copilot
- information extraction
- table understanding
- table data parsing
---

# Model Card for llamat-3-chat

LLaMat-3-chat is a materials research copilot.

## Model Details

LLaMat-3-chat is fine-tuned from LLaMat-3, a foundation model created by continued pretraining of LLaMA-3 on materials science tokens. It has instruction-following abilities and can be used as a copilot for information extraction from materials science textual or tabular data.
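
As a rough illustration of the copilot use case, the sketch below asks the model to extract structured information from a materials science sentence using Hugging Face Transformers. The repository id `m3rg-iitd/llamat-3-chat`, the example prompt, and the use of the stock LLaMA-3 chat template are assumptions for illustration, not details confirmed by this card.

```python
# Minimal sketch, not an official example. The repository id and
# chat-template behaviour are assumptions; adjust to the actual checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m3rg-iitd/llamat-3-chat"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Ask the copilot to pull structured facts out of a materials sentence.
messages = [{
    "role": "user",
    "content": (
        "Extract the composition and the reported property from: "
        "'The Zr41.2Ti13.8Cu12.5Ni10Be22.5 alloy shows a glass transition "
        "temperature of 625 K.' Answer as JSON."
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```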

### Model Description

- **Developed by:** M3RG, IIT Delhi
- **Model type:** Large language model based on the LLaMA-3 architecture
- **Language(s) (NLP):** English
- **License:** LLaMA-3
- **Finetuned from model:** m3rg-iitd/llamat-3

### Model Sources

- **Repository:** https://github.com/M3RG-IITD/llamat

### Compute Infrastructure

This work was supported by the Edinburgh International Data Facility (EIDF) and the Data-Driven Innovation Programme at the University of Edinburgh. The EIDF provided access to Cerebras CS-2 clusters for pretraining the language models: https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs

This work was also supported by the High Performance Computing cluster and the Yardi School of AI at IIT Delhi.

#### Hardware

- Pretraining: 2× Cerebras CS-2 Wafer-Scale Engines (WSE-2)
- Finetuning: 8× NVIDIA A100 80 GB GPUs
- Inferencing: 1× NVIDIA A100 80 GB GPU

#### Software

PyTorch, Hugging Face Transformers
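
As a point of reference for the single-GPU inference setup above, here is a minimal loading sketch in bfloat16. The repository id and the assumption of an 8B-parameter LLaMA-3 backbone are illustrative, not confirmed by this card.

```python
# Minimal single-GPU loading sketch matching the stated inference hardware
# (1x NVIDIA A100 80 GB). Repo id and parameter count are assumptions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "m3rg-iitd/llamat-3-chat",   # assumed repository id
    torch_dtype=torch.bfloat16,  # roughly 16 GB of weights for an 8B model
    device_map="cuda:0",         # keep everything on one GPU
)
```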