---
language: en
license: apache-2.0
library_name: peft
tags:
- llama
- llama-3
- construction
- building-regulations
- lora
- custom construction industry dataset
---

# LLAMA3.1-8B-Instruct-Construction

This is a fine-tuned version of Llama-3.1-8B-Instruct optimized for construction-industry and building-regulations knowledge.

## Model Details

- **Base Model:** meta-llama/Llama-3.1-8B-Instruct
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Data:** Custom dataset covering construction industry standards, building regulations, and safety requirements
- **Intended Use:** Answering questions about building codes, construction best practices, and regulatory compliance

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel, PeftConfig
import torch
import re

# Load the adapter configuration
config = PeftConfig.from_pretrained("SamuelJaja/llama-3.1-8b-instruct-construction-lora-a100")

# Load the base model with 8-bit quantization
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto"
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "SamuelJaja/llama-3.1-8b-instruct-construction-lora-a100")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# Strip any leftover instruction tags from the output
def clean_response(text):
    return re.sub(r'\[/?INST\]', '', text).strip()

# Generate text
def generate_response(prompt, temperature=0.1, max_tokens=256):
    # Wrap the prompt in the instruction format used during fine-tuning
    if not prompt.startswith("[INST]"):
        formatted_prompt = f"[INST] {prompt} [/INST]"
    else:
        formatted_prompt = prompt

    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=max_tokens,
        temperature=temperature,
        top_p=0.9,
        do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
        pad_token_id=tokenizer.eos_token_id
    )

    full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Remove the prompt from the decoded output
    if formatted_prompt in full_response:
        response = full_response.replace(formatted_prompt, "").strip()
    else:
        response = full_response

    # Clean any remaining instruction tags
    return clean_response(response)

# Example use
question = "What are the main requirements for fire safety in commercial buildings?"
answer = generate_response(question)
print(answer)
```
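
## Lower-Memory Loading (4-bit)

If 8-bit loading does not fit on your GPU, the base model can also be loaded in 4-bit NF4 before attaching the adapter. This is a minimal sketch using the standard bitsandbytes options in transformers; it has not been validated against this specific adapter, so treat the configuration values as a starting point.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization roughly halves memory use compared with 8-bit loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter exactly as in the 8-bit example above
model = PeftModel.from_pretrained(base, "SamuelJaja/llama-3.1-8b-instruct-construction-lora-a100")
```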
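
## Merging the Adapter for Deployment

For deployment without a runtime peft dependency, the LoRA weights can be folded into the base model using PEFT's `merge_and_unload()`. The sketch below assumes you have enough memory to load the base model unquantized in fp16 (merging is not supported on quantized weights); the output directory name is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in fp16 (merging requires unquantized weights)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "SamuelJaja/llama-3.1-8b-instruct-construction-lora-a100")

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers
merged = model.merge_and_unload()

# Save a standalone checkpoint that loads with plain transformers
merged.save_pretrained("./llama-3.1-8b-construction-merged")  # illustrative path
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tokenizer.save_pretrained("./llama-3.1-8b-construction-merged")
```

The merged checkpoint can then be loaded with `AutoModelForCausalLM.from_pretrained` alone, which simplifies serving at the cost of storing a full copy of the weights.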