|
--- |
|
license: llama3 |
|
language: |
|
- en |
|
base_model: |
|
- m3rg-iitd/llamat-3 |
|
tags: |
|
- material science |
|
- large language model |
|
- domain adaptation |
|
- scientific domain adaptation |
|
- materials copilot |
|
- information extraction |
|
- table understanding |
|
- table data parsing |
|
--- |
|
# Model Card for LLaMat-3-Chat |
|
|
|
**LLaMat-3-Chat** is a specialized large language model designed to serve as a copilot for materials research. Finetuned from **LLaMat-3**, this model is adapted for tasks such as information extraction from material science text and tabular data. |
|
|
|
--- |
|
|
|
## Overview |
|
|
|
- **Model Type:** Large Language Model (LLM) |
|
- **Base Model:** LLaMat-3 (continued pretraining of LLaMA-3 on material science data) |
|
- **Language:** English |
|
- **License:** LLaMA-3 License |
|
- **Tags:** Material Science, Domain Adaptation, Table Understanding, Scientific Data Parsing, Materials Copilot |
|
|
|
--- |
|
|
|
## Model Details |
|
|
|
### Key Features |
|
- **Instruction Following Abilities:** Optimized for understanding and processing instructions in the material science domain. |
|
- **Domain-Specific Expertise:** Pretrained on material science tokens, enabling high performance in scientific applications. |
|
- **Applications:** information extraction, table understanding, and parsing data for research tasks. |
|
|
|
### Development and Support |
|
- **Developed by:** [M3RG, IIT Delhi](https://github.com/M3RG-IITD/) & [DAIR, IIT Delhi](https://github.com/dair-iitd) |
|
- **Compute Support:** |
|
- **Edinburgh International Data Facility (EIDF):** Provided access to Cerebras CS2 clusters for pretraining. |
|
- **IIT Delhi High-Performance Computing Cluster:** Supported fine-tuning and inference stages. |
|
|
|
--- |
|
|
|
## Technical Specifications |
|
|
|
### Hardware Infrastructure |
|
- **Pretraining:** 2 Cerebras CS-2 Wafer-Scale Engines (WSE-2) |
|
- **Finetuning:** 8 NVIDIA A100 80GB GPUs |
|
- **Inferencing:** 1 NVIDIA A100 80GB GPU |
|
|
|
### Software Stack |
|
- **Frameworks:** PyTorch, Hugging Face Transformers |
|
|
|
--- |
|
|
|
## Model Sources |
|
- **Repository:** [LLaMat-3 on GitHub](https://github.com/M3RG-IITD/llamat) |
|
- **Compute Resources:** [EIDF Cerebras CS Clusters](https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs) |