---
license: llama3
language:
- en
base_model:
- m3rg-iitd/llamat-3
tags:
- material science
- large language model
- domain adaptation
- scientific domain adaptation
- crystal generation
- materials copilot
- information extraction
- table understanding
- table data parsing
---
# Model Card for llamat-3-chat

LLaMat-3-chat is a materials research copilot.
## Model Details

LLaMat-3-chat is an instruction-finetuned model derived from LLaMat-3, which was created by continued pretraining of LLaMA-3 on materials science tokens. It has instruction-following abilities and can be used as a copilot for information extraction from materials science text and tables.
### Model Description

- Developed by: M3RG, IIT Delhi
- Model type: Large Language Model based on the LLaMA-3 architecture
- Language(s) (NLP): English
- License: LLaMA-3
- Finetuned from model: m3rg-iitd/llamat-3
### Model Sources

- Repository: https://github.com/M3RG-IITD/llamat
## Compute Infrastructure

This work was supported by the Edinburgh International Data Facility (EIDF) and the Data-Driven Innovation Programme at the University of Edinburgh. The EIDF provided access to Cerebras CS-2 clusters for pretraining the language models (https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs).

This work was also supported by the High Performance Computing cluster and the Yardi School of AI at IIT Delhi.
### Hardware

- Pretraining: 2× Cerebras CS-2 Wafer-Scale Engine (WSE-2)
- Finetuning: 8× NVIDIA A100 80GB GPUs
- Inference: 1× NVIDIA A100 80GB GPU
### Software

PyTorch, Hugging Face Transformers
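Since the model is served through Hugging Face Transformers, a chat-style extraction query can be run with the standard `AutoModelForCausalLM` workflow. The sketch below is illustrative only: it assumes the chat checkpoint is published under the Hub id `m3rg-iitd/llamat-3-chat` and uses the standard Llama-3 chat template; the system prompt and extraction instruction are placeholders, not the prompt format the authors trained with. Check the repository linked above for the exact model id and prompt conventions.

```python
MODEL_ID = "m3rg-iitd/llamat-3-chat"  # assumed Hub id; verify against the repository


def build_messages(passage: str) -> list:
    """Wrap a materials-science passage in a chat-style extraction request.

    The system/user wording here is a hypothetical example, not the
    prompt format used during finetuning.
    """
    return [
        {"role": "system", "content": "You are a materials science assistant."},
        {
            "role": "user",
            "content": "Extract the material compositions mentioned in the "
            f"following text:\n{passage}",
        },
    ]


def extract(passage: str, max_new_tokens: int = 256) -> str:
    """Load the model and run one extraction turn (needs a GPU, per the
    hardware notes above: one A100 80GB was used for inference)."""
    # Lazy import so build_messages stays usable without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer.apply_chat_template(
        build_messages(passage),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# extract("The glass composition 70SiO2-20Na2O-10CaO was melted at 1450 C.")
```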