m3rg-iitd
/

llamat-3-chat

material science

large language model

domain adaptation

scientific domain adaptation

materials copilot

information extraction

table understanding

table data parsing

Model card Files Files and versions Community

llamat-3-chat / README.md

m3rg-iitd's picture

Update README.md

bc2a8c0 verified about 2 months ago

|

history blame contribute delete

2.19 kB

	---
	license: llama3
	language:
	- en
	base_model:
	- m3rg-iitd/llamat-3
	tags:
	- material science
	- large language model
	- domain adaptation
	- scientific domain adaptation
	- materials copilot
	- information extraction
	- table understanding
	- table data parsing
	---
	# Model Card for LLaMat-3-Chat

	LLaMat-3-Chat is a specialized large language model designed to serve as a copilot for materials research. Finetuned from LLaMat-3, this model is adapted for tasks such as information extraction from material science text and tabular data.

	---

	## Overview

	- Model Type: Large Language Model (LLM)
	- Base Model: LLaMat-3 (continued pretraining of LLaMA-3 on material science data)
	- Language: English
	- License: LLaMA-3 License
	- Tags: Material Science, Domain Adaptation, Table Understanding, Scientific Data Parsing, Materials Copilot

	---

	## Model Details

	### Key Features
	- Instruction Following Abilities: Optimized for understanding and processing instructions in the material science domain.
	- Domain-Specific Expertise: Pretrained on material science tokens, enabling high performance in scientific applications.
	- Applications: information extraction, table understanding, and parsing data for research tasks.

	### Development and Support
	- Developed by: [M3RG, IIT Delhi](https://github.com/M3RG-IITD/) & [DAIR, IIT Delhi](https://github.com/dair-iitd)
	- Compute Support:
	- Edinburgh International Data Facility (EIDF): Provided access to Cerebras CS2 clusters for pretraining.
	- IIT Delhi High-Performance Computing Cluster: Supported fine-tuning and inference stages.

	---

	## Technical Specifications

	### Hardware Infrastructure
	- Pretraining: 2 Cerebras CS-2 Wafer-Scale Engines (WSE-2)
	- Finetuning: 8 NVIDIA A100 80GB GPUs
	- Inferencing: 1 NVIDIA A100 80GB GPU

	### Software Stack
	- Frameworks: PyTorch, Hugging Face Transformers

	---

	## Model Sources
	- Repository: [LLaMat-3 on GitHub](https://github.com/M3RG-IITD/llamat)
	- Compute Resources: [EIDF Cerebras CS Clusters](https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs)