metadata
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
  - text-generation-inference
  - transformers
  - llama
  - trl
license: apache-2.0
language:
  - en
datasets:
  - deshanksuman/WSD_DATASET_FEWS_SEMCOR

Uploaded model

  • Developed by: deshanksuman
  • License: apache-2.0
  • Fine-tuned from model: meta-llama/Llama-3.2-1B-Instruct

Dataset

FEWS training data, arranged in Instruction, Input, and Output format: https://huggingface.co/datasets/deshanksuman/WSD_DATASET_FEWS_SEMCOR
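
As a rough illustration of an Instruction/Input/Output layout, the sketch below joins one record into a prompt string. The lowercase field names (`instruction`, `input`, `output`), the `build_prompt` helper, and the sample record are all assumptions made for this example; check the dataset card for the actual column names and contents.

```python
def build_prompt(record):
    """Join the instruction and the optional input into one prompt string."""
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    if context:
        return f"{instruction}\n\n{context}"
    return instruction


# Made-up record for illustration only; it is not taken from the dataset.
example = {
    "instruction": "Identify the meaning of the marked word.",
    "input": "She sat on the bank of the river.",
    "output": "the land alongside a river",
}

prompt = build_prompt(example)
print(prompt)
```

The `output` field is left out of the prompt on purpose: in this layout it would be the target text the model is expected to produce.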

Code

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the LoRA adapter
local_directory = "meta-llama/Llama-3.2-1B-Instruct"
adapter_repo = "deshanksuman/finetuned-WSD-llama3-8b-Instruct_fews_semcor"
access_token = "hfxtoken"  # placeholder: replace with your Hugging Face access token

# `token` replaces the deprecated `use_auth_token` argument
tokenizer = AutoTokenizer.from_pretrained(local_directory, token=access_token)
base_model = AutoModelForCausalLM.from_pretrained(
    local_directory,
    token=access_token,
    device_map="auto",  # places the weights on the available device(s)
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_repo, token=access_token)
# No model.to(...) call here: device_map="auto" has already placed the model
model.eval()


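# Generate a response from a prompt that asks for JSON output
def generate_structured_response(question, context="You are a helpful assistant. Respond only with valid JSON.", device=None):
    # Default to the device the model is already on, so this also runs without a GPU
    if device is None:
        device = model.device

    prompt = (
        f"{context}\n\n"
        f"Question: {question}\n\n"
        f"Respond with valid JSON only in the format: {{\"meaning\":}}"
    )

    # Tokenize and move both input_ids and attention_mask to the target device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    output_ids = model.generate(
        **inputs,  # passes the attention mask along with the input ids
        max_new_tokens=256,
        temperature=0.3,
        top_p=0.9,
        do_sample=True,
        num_beams=3,
        no_repeat_ngram_size=3,
        early_stopping=True,
    )

    # The decoded text contains the prompt followed by the generated continuation
    response_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return response_text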
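
Because the decoded text contains the prompt as well as the model's continuation, and the prompt itself includes the incomplete template `{"meaning":}`, a caller usually needs to pull the answer out of the returned string. The sketch below is one possible way to do that with the standard library's `json.JSONDecoder.raw_decode`. The helper name `extract_meaning` and the sample string are hypothetical, and the sample is hand-written, not real model output.

```python
import json


def extract_meaning(response_text, key="meaning"):
    """Return the value of `key` from the first valid JSON object in the text, or None."""
    decoder = json.JSONDecoder()
    start = response_text.find("{")
    while start != -1:
        try:
            # Try to parse a JSON value that begins at this opening brace
            obj, _ = decoder.raw_decode(response_text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and key in obj:
            return obj[key]
        # Not a usable object here (for example the prompt template); try the next brace
        start = response_text.find("{", start + 1)
    return None


# Hand-written sample: the prompt template followed by a made-up completion
sample = (
    'Respond with valid JSON only in the format: {"meaning":}\n'
    '{"meaning": "a financial institution"}'
)
print(extract_meaning(sample))
```

The helper returns `None` when no parsable object with the requested key is found, so callers can decide whether to retry generation or fall back to the raw text.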

This model was developed by Deshan Sumanathilaka (https://sumanathilaka.github.io).

Acknowledgement

We acknowledge the support of the Supercomputing Wales project, which is part-funded by the European Regional Development Fund (ERDF) via Welsh Government.