---
license: llama3
language:
- en
base_model:
- m3rg-iitd/llamat-3
tags:
- material science
- large language model
- domain adaptation
- scientific domain adaptation
- crystal generation
- materials copilot
- information extraction
- table understanding
- table data parsing
---
# Model Card for llamat-3-chat

LLaMat-3-chat is a materials research copilot.
## Model Details

LLaMat-3-chat is an instruction-finetuned model derived from LLaMat-3, which was created by continued pretraining of LLaMA-3 on materials science tokens. It has instruction-following abilities and can be used as a copilot for information extraction from materials science text and tables.
### Model Description

- Developed by: M3RG, IIT Delhi
- Model type: Large Language Model based on the LLaMA-3 architecture
- Language(s) (NLP): English
- License: LLaMA-3
- Finetuned from model: m3rg-iitd/llamat-3
### Model Sources

- Repository: https://github.com/M3RG-IITD/llamat
## Compute Infrastructure

This work was supported by the Edinburgh International Data Facility (EIDF) and the Data-Driven Innovation Programme at the University of Edinburgh. The EIDF provided access to Cerebras CS-2 clusters for pretraining the language models (https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs).

This work was also supported by the High Performance Computing cluster and the Yardi School of AI at IIT Delhi.
### Hardware

- Pretraining: 2× Cerebras CS-2 Wafer-Scale Engine (WSE-2)
- Finetuning: 8× NVIDIA A100 80GB GPUs
- Inference: 1× NVIDIA A100 80GB GPU
### Software

PyTorch, Hugging Face Transformers
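Since the model is served through Hugging Face Transformers, a chat-style extraction query can be run with the standard `AutoModelForCausalLM` workflow. The sketch below is illustrative only: it assumes the chat checkpoint is published under the Hub id `m3rg-iitd/llamat-3-chat` and uses the standard Llama-3 chat template; the system prompt and extraction instruction are placeholders, not the prompt format the authors trained with. Check the repository linked above for the exact model id and prompt conventions.

```python
MODEL_ID = "m3rg-iitd/llamat-3-chat"  # assumed Hub id; verify against the repository


def build_messages(passage: str) -> list:
    """Wrap a materials-science passage in a chat-style extraction request.

    The system/user wording here is a hypothetical example, not the
    prompt format used during finetuning.
    """
    return [
        {"role": "system", "content": "You are a materials science assistant."},
        {
            "role": "user",
            "content": "Extract the material compositions mentioned in the "
            f"following text:\n{passage}",
        },
    ]


def extract(passage: str, max_new_tokens: int = 256) -> str:
    """Load the model and run one extraction turn (needs a GPU, per the
    hardware notes above: one A100 80GB was used for inference)."""
    # Lazy import so build_messages stays usable without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer.apply_chat_template(
        build_messages(passage),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# extract("The glass composition 70SiO2-20Na2O-10CaO was melted at 1450 C.")
```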