# Matellem-Gemma3n-E4B-Graphene-1
A fine-tuned language model, part of the Matellem project, specialized for multi-task analysis of scientific literature in the field of graphene research.
## About The Project
The field of materials science, and graphene research in particular, is expanding at a remarkable rate. The sheer volume of published literature makes it difficult for researchers to stay current and find specific information efficiently.
Matellem is designed to address this challenge. This model, built on Google's powerful and efficient `gemma-3n-E4B-it`, has been fine-tuned to understand the complex language, nuances, and key concepts in graphene-related scientific abstracts. It serves as a specialized tool that accelerates the research process through precise data extraction, summarization, and question answering.
## Model Details
- Base Model: `google/gemma-3n-E4B-it`
- Fine-tuning Data: The model was fine-tuned on a custom, high-quality dataset of 2,329 question-answer pairs, generated from 462 research-paper abstracts focused on graphene (a representative record layout is sketched after this list).
- Fine-tuning Technique: The model was trained with Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning (PEFT) method. LoRA adapters were applied to the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the MLP projections (`gate_proj`, `up_proj`, `down_proj`) to adapt the model to the domain while preserving its core capabilities.
- Training Configuration: Trained in `bf16` precision for stability and speed, with the `adamw_8bit` optimizer (see the configuration sketch below).
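For context, each training example pairs an abstract-grounded question with its answer. The exact schema is not published, so the field names below are illustrative assumptions:

```python
# Hypothetical layout of one fine-tuning record. The field names
# ("abstract", "question", "answer") and the text are placeholders;
# the actual dataset schema has not been released.
example_record = {
    "abstract": "We demonstrate electrically tunable graphene plasmons ...",
    "question": "What mechanism was used to tune the plasmon resonance?",
    "answer": "A gate voltage applied across a ferroelectric nanocavity array.",
}
```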
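The stated settings map onto the `peft` and `transformers` APIs roughly as follows. This is a minimal sketch: the LoRA rank, alpha, and dropout are assumed values, since only the target modules, precision, and optimizer are reported.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Target modules come from the model card; r, lora_alpha and
# lora_dropout are placeholders (not reported by the author).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="matellem-lora",  # placeholder path
    bf16=True,                   # stated precision
    optim="adamw_8bit",          # stated optimizer (requires bitsandbytes)
)
```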
## Capabilities
This model is designed to perform a variety of tasks related to scientific literature analysis:
- Precise Question Answering: Answering specific technical questions based on the content of a provided abstract.
- Accurate Summarization: Generating concise yet comprehensive summaries of the key findings and methodologies of a paper.
- Information Extraction: Identifying and extracting specific data points, such as material properties, numerical values, or synthesis methods, from unstructured text (see the prompt sketch after this list).
- Semantic Retrieval: Understanding the core concepts of a research paper, enabling the identification of relevant literature from natural language descriptions.
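To illustrate the extraction task, a prompt of the following shape can be dropped into the chat-template code shown in the next section. The instruction wording and the abstract are hypothetical:

```python
# Hypothetical information-extraction prompt; pair it with the
# chat-template code under "How to Use". The abstract is invented.
EXTRACT_INSTRUCTION = (
    "You are a materials-science data extractor. From the given abstract, "
    "list the synthesis method and any reported numerical properties."
)
EXTRACT_INPUT = (
    "Abstract: Monolayer graphene was grown by chemical vapor deposition "
    "on copper foil and showed a carrier mobility of ~10,000 cm^2/(V*s)."
)
```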
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
import torch

model_id = "Shinapri/Matellem-Gemma3n-E4B-Graphene-1"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

INSTRUCTION = "You are a scientific literature search expert. Your task is to identify the title of a research paper based on a user's description of its key methods and findings."
USER_INPUT = """I'm looking for a paper about manipulating graphene plasmons.
The key method involved using a ferroelectric nanocavity array to create a periodic doping pattern on the graphene.
I remember they could tune the plasmon resonance by dynamically changing the applied gate voltage. Can you identify the title?"""

# Gemma 3n chat format: message content is a list of typed parts.
messages = [
    {"role": "system", "content": [{"type": "text", "text": INSTRUCTION}]},
    {"role": "user", "content": [{"type": "text", "text": USER_INPUT}]},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)
with torch.no_grad():
    _ = model.generate(
        **inputs,
        streamer=streamer,
        do_sample=True,
        max_new_tokens=1024,
        top_p=0.9,
        temperature=0.7,
    )
```
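With `do_sample=True`, repeated runs will phrase the answer differently; for extraction-style tasks where reproducibility matters, passing `do_sample=False` (and dropping `top_p`/`temperature`) switches `generate` to deterministic greedy decoding.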
**Example Output & Evaluation Note:**
The model's response to this query is highly relevant:
> The title of the paper is likely: **"Voltage-tunable plasmonics on few-layer graphene based on a ferroelectric nanocavity array"**
For reference, the original paper's title is "Tunable plasmonic devices by integrating graphene with ferroelectric nanocavity". Comparing the embeddings of the two titles with the `intfloat/e5-large` model yields a cosine similarity of 0.9638.
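The evaluation script itself is not published; a minimal sketch of such a title-similarity check, assuming the `sentence-transformers` library and the `query:` prefix recommended for E5 models, might look like this:

```python
# Minimal sketch of the title-similarity check described above.
# The "query: " prefix and use of sentence-transformers are assumptions;
# the author's exact evaluation script is not published.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("intfloat/e5-large")
titles = [
    "query: Voltage-tunable plasmonics on few-layer graphene based on a ferroelectric nanocavity array",
    "query: Tunable plasmonic devices by integrating graphene with ferroelectric nanocavity",
]
emb = embedder.encode(titles, normalize_embeddings=True)
print(util.cos_sim(emb[0], emb[1]).item())  # reported score: 0.9638
```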
## Future Work / Roadmap
This fine-tuned model is the core component of the larger Matellem project. The next major step is to integrate it into an agentic framework.
- Agentic RAG System: Develop an agent that can use this model to autonomously search for relevant literature via APIs (e.g., arXiv), analyze the retrieved documents, and synthesize information from multiple sources to answer complex user queries.
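As a starting point for that retrieval step, a minimal sketch using the community `arxiv` Python package (an assumption; any arXiv API client would work) could look like this:

```python
# Sketch of the literature-search step of the planned agentic RAG loop.
# Uses the community "arxiv" package (pip install arxiv); the query
# string and result count are illustrative.
import arxiv

search = arxiv.Search(query="graphene plasmon ferroelectric", max_results=5)
for result in arxiv.Client().results(search):
    print(result.title)
    print(result.summary[:200], "...")  # abstract text to feed the model
```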
## Authorship & Contact
- Model fine-tuned by: Shinapri
- GitHub: https://github.com/ShinapriLN