---
base_model: unsloth/llama-3.2-11b-vision-instruct-bnb-4bit
library_name: peft
---

# Model Card: FloorPlanVisionAIAdaptor

## Model Details

- **Model Name:** FloorPlanVisionAIAdaptor
- **Task:** Floor plan analysis for architectural and interior design insights
- **Framework:** PyTorch with `unsloth`

---

## Model Description

The `FloorPlanVisionAIAdaptor` model is a Vision-Language Model (VLM) designed for analyzing floor plan images. It builds on a deep neural architecture optimized for tasks that require detailed visual understanding combined with textual reasoning, and can infer the layout, room count, key features, and other architectural details from images of floor plans.

### Key Features:

- **Multi-modal Input:** Accepts both image and text input for contextual understanding.
- **Expertise Emulation:** Simulates the expertise of an architect or interior designer.
- **Gradient Checkpointing:** Reduces memory usage, enabling analysis of high-resolution images.
- **Flexible Precision:** Supports 4-bit inference depending on memory constraints.

### Applications:

- Automated floor plan analysis for real estate listings.
- Assisting architects in creating and verifying designs.
- Generating insights for interior design and space planning.
- Educational purposes in architecture and design training.

---

## Intended Use

### Primary Use Cases:

- To analyze and interpret floor plan images, providing detailed descriptions of:
  - Room layout and connections.
  - Room dimensions and count.
  - Unique architectural features.
- To assist architects, designers, and real estate professionals in understanding and documenting floor plans.
### Users:

- Architects
- Interior Designers
- Real Estate Professionals
- Educators and Students in Architecture and Design

---

## How to Use the Model

### Installation

To use the model, ensure that the required libraries `torch`, `unsloth`, and `transformers` are installed:

```bash
pip install torch unsloth transformers
```

### Loading the Model

The following Python script demonstrates how to load and use the model:

```python
import torch
from unsloth import FastVisionModel  # FastVisionModel for Vision-Language tasks
from PIL import Image
from transformers import TextStreamer

# Load the pre-trained model and tokenizer
model, tokenizer = FastVisionModel.from_pretrained(
    "sabaridsnfuji/FloorPlanVisionAIAdaptor",
    load_in_4bit=True,  # Use 4-bit precision to save memory if needed
    use_gradient_checkpointing="unsloth",  # Enable gradient checkpointing for efficiency
)
FastVisionModel.for_inference(model)  # Enable inference mode

# Load an image with PIL, returning None on failure
def load_image(image_path):
    try:
        return Image.open(image_path)
    except Exception as e:
        print(f"Error loading image {image_path}: {e}")
        return None

# Define the instruction
instruction = """You are an expert in architecture and interior design.
Analyze the floor plan image and describe accurately the key features, room count, layout, and any other important details you observe."""

image = load_image("/content/sample_images/5_2.jpg")  # converted_dataset[0]["image"]

# Format the input message
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction},
    ]}
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Prepare inputs (the PIL image loaded above plus the templated prompt)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False,
    return_tensors="pt",
).to("cuda")

# Perform inference, streaming tokens as they are generated
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
output = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=2048,
    use_cache=True,
    temperature=1.5,
    min_p=0.1,
)
```

## Input image:

![Floor Plan Image](https://huggingface.co/sabaridsnfuji/FloorPlanVisionAIAdaptor/resolve/main/5_1.jpg)

## Sample Output:

```
**Room Count:** 1 bedroom, 1 study/office, 1 bathroom, kitchen, living room, dining room, verandah.

**Room Types and Labels:** Bedroom, kitchen, living room, dining room, study/office, bathroom, verandah.

**Room Sizes:**
- Bedroom: 9'8" x 9'10"
- Kitchen: 22'8" x 13'0"
- Dining Room: 10'0" x 13'0"
- Living Room: 13'8" x 15'6"
- Study/Office: 9'8" x 9'10"

**Primary Features:** Stairs, verandah, windows along perimeter, kitchen island.

**Functional Areas:** Bathroom adjacent to kitchen; no pantry or mudroom. Kitchen island provides functional space.

**Layout Overview:** Central stairs with rooms radiating off. Kitchen near bathroom; living and dining areas open-plan.

**Flooring and Attributes:** Tile in bathroom, verandah, and main living spaces. Likely standard ceiling height.

**Summary:** Compact, single-floor layout with essential living spaces and utility rooms.
Open-plan living areas provide fluid movement; stairs likely provide additional storage.<|eot_id|>
```

---

## Limitations

- **Domain-Specific Knowledge:** While the model is trained to emulate architectural expertise, it may not replace human professionals for complex design tasks.
- **Image Quality:** Performance may degrade for low-resolution or incomplete floor plans.
- **Generalization:** The model may struggle with floor plans featuring unconventional layouts or non-standard symbols.

---

## Training Data

The model was trained on a curated dataset of architectural floor plans, annotated with detailed descriptions of rooms, layouts, and features.

---

## Acknowledgements

The development of this model was inspired by advancements in multi-modal AI and the need for intelligent systems in the architectural domain. Special thanks to the contributors of the `unsloth` and `transformers` libraries.

---
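## Appendix: Post-processing Room Dimensions (Illustrative)

The room sizes in the sample output use feet-and-inches notation (e.g. `9'8" x 9'10"`). As a purely illustrative sketch, not part of the model or its pipeline, the hypothetical helpers below (`parse_feet_inches`, `room_area_sqft`) show one way to convert such dimension strings into approximate floor areas:

```python
import re

def parse_feet_inches(s):
    """Parse a feet-and-inches string such as 9'8" into feet as a float."""
    match = re.fullmatch(r"\s*(\d+)'(\d+)\"\s*", s)
    if match is None:
        raise ValueError(f"Unrecognized dimension: {s!r}")
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet + inches / 12.0

def room_area_sqft(dimension):
    """Compute the area in square feet from a 'W x H' dimension string."""
    width, height = dimension.split("x")
    return parse_feet_inches(width) * parse_feet_inches(height)

# Bedroom from the sample output: 9'8" x 9'10" is roughly 95 sq ft
print(round(room_area_sqft("9'8\" x 9'10\""), 1))  # 95.1
```

Because the model's output format may vary between generations, any such post-processing should validate its input rather than assume a fixed layout.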