llamat-3-chat / README.md
m3rg-iitd's picture
added Model Card
336283a verified
|
raw
history blame
2.15 kB
metadata
license: llama3
language:
  - en
base_model:
  - m3rg-iitd/llamat-3
tags:
  - material science
  - large language model
  - domain adaptation
  - scientific domain adaptation
  - crystal generation
  - materials copilot
  - information extraction
  - table understanding
  - table data parsing

Model Card for llamat-3-chat

LLaMat-3-chat is a materials research copilot.

Model Details

foundational model that is finetuned from LLaMat-3, which is made by continued pretraining of LLaMA-3 on material science tokens. It has instruction following abilities and can be used as a copilot for information extraction from material science textual or tabular data.

Model Description

  • Developed by: M3RG, IIT Delhi
  • Model type: Large Language Model based on LLaMA-3 architecture
  • Language(s) (NLP): English
  • License: LLaMA-3
  • Finetuned from model [optional]: m3rg-iitd/llamat-3

Model Sources [optional]

Compute Infrastructure

This work was supported by the Edinburgh International Data Facility (EIDF) and the Data-Driven Innovation Programme at the University of Edinburgh. The EIDF provided access to Cerebras CS2 clusters for pretraining the language models. Link - https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs

This work is also supported by High Performance Computing cluster and Yardi School of AI at IIT Delhi.

Hardware

Pretraining: 2 CS-2 Cerebras Wafer-Scale Engine (WSE-2) Finetuning: 8 NVIDIA-A100 80GB GPUs Inferencing: 1 NVIDIA-A100 80GB GPU

Software

PyTorch, HuggingFace, Transformers