Emotionally-Aware AI Companion

Fine-tuned VideoLLaMA3 for Digital Arts Analysis

Created by Institution Art

A specialized multimodal AI model for understanding and analyzing digital artwork with emotional intelligence.

🎨 About This Model

Emotionally-Aware AI Companion is a fine-tuned version of VideoLLaMA3-7B, specifically optimized for digital arts analysis and emotional understanding. This model has been trained to recognize artistic styles, interpret visual emotions, identify artists, and provide insightful commentary on digital artwork.

🌟 Key Features

  • 🎭 Emotional Intelligence: Understands and analyzes emotional content in artwork
  • 🖼️ Artwork Recognition: Identifies artists, styles, and artistic movements
  • 🎨 Digital Arts Expertise: Specialized knowledge of digital art techniques and mediums
  • 💬 Conversational Interface: Natural language interaction about artwork
  • 🔍 Detailed Analysis: Provides comprehensive analysis of visual elements, composition, and artistic intent

🎯 Fine-tuning Details

  • Base Model: VideoLLaMA3-7B (DAMO-NLP-SG)
  • Training Epochs: 20
  • Specialized Dataset: Custom artwork dataset with artist annotations and emotional labels
  • Components Trained:
    • ✅ Vision Encoder (fine-tuned for artwork understanding)
    • ✅ Multimodal Projector (enhanced visual-language alignment)
    • ✅ Language Model (specialized for art terminology and analysis)

🚀 Quick Start

import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_name = "OneEyeDJ/videollama3-artwork-institution"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires the flash-attn package to be installed
)
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)

# Artwork analysis example
image_path = "path/to/your/artwork.jpg"
question = "Can you analyze this artwork and identify the artist and emotional themes?"

conversation = [
    {"role": "system", "content": "You are an emotionally-aware AI art companion specialized in analyzing digital artwork and understanding artistic emotions."},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text", "text": question},
        ]
    },
]

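# Preprocess the conversation into model inputs
inputs = processor(conversation=conversation, return_tensors="pt")
# Move tensors to the model's device (works with device_map="auto")
inputs = {k: v.to(model.device) if isinstance(v, torch.Tensor) else v for k, v in inputs.items()}
# Match the vision input dtype to the bfloat16 model weights
if "pixel_values" in inputs:
    inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)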

output_ids = model.generate(**inputs, max_new_tokens=256)
response = processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
print(response)
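
To ask several questions about different images, the conversation structure from the Quick Start can be wrapped in a small helper. This is a minimal sketch: `build_conversation` and `DEFAULT_SYSTEM_PROMPT` are hypothetical names introduced here for illustration and are not part of the model's API. The message layout simply mirrors the example above.

```python
from typing import Any

# Same system prompt as in the Quick Start example.
DEFAULT_SYSTEM_PROMPT = (
    "You are an emotionally-aware AI art companion specialized in analyzing "
    "digital artwork and understanding artistic emotions."
)


def build_conversation(
    image_path: str,
    question: str,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
) -> list[dict[str, Any]]:
    """Hypothetical helper: build a system + user conversation for one image."""
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        },
    ]


conversation = build_conversation(
    "path/to/your/artwork.jpg",
    "What emotions does this artwork convey?",
)
```

The returned list can be passed to `processor(conversation=..., return_tensors="pt")` exactly as in the Quick Start.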

🎨 Use Cases

Artwork Analysis

  • Artist Identification: Recognize artistic styles and identify potential artists
  • Style Analysis: Analyze artistic movements, techniques, and influences
  • Composition Analysis: Understand visual elements, color theory, and composition

Emotional Understanding

  • Mood Detection: Identify emotional themes and feelings conveyed in artwork
  • Sentiment Analysis: Analyze the emotional impact and viewer response
  • Symbolic Interpretation: Understand symbolic elements and their emotional significance

Educational Applications

  • Art History: Learn about different artistic periods and movements
  • Technique Explanation: Understand digital art techniques and tools
  • Creative Inspiration: Generate ideas and artistic direction
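
The use cases above map naturally onto different user questions. The sketch below shows one way to organize them; the prompt wording and the names `USE_CASE_PROMPTS` and `prompt_for` are illustrative assumptions, not prompts the model was trained or evaluated on.

```python
# Hypothetical example prompts, one per use case listed above.
USE_CASE_PROMPTS = {
    "artist_identification": "Which artist or artistic style does this artwork most resemble, and why?",
    "style_analysis": "What artistic movements, techniques, and influences are visible in this artwork?",
    "composition_analysis": "Describe the composition, color choices, and key visual elements of this artwork.",
    "mood_detection": "What emotional themes and feelings does this artwork convey?",
    "symbolic_interpretation": "Which symbolic elements appear in this artwork, and what might they signify emotionally?",
    "technique_explanation": "Which digital art techniques or tools could have produced this artwork?",
}


def prompt_for(use_case: str) -> str:
    """Look up an example prompt, failing loudly on unknown use cases."""
    try:
        return USE_CASE_PROMPTS[use_case]
    except KeyError:
        known = ", ".join(sorted(USE_CASE_PROMPTS))
        raise ValueError(f"Unknown use case {use_case!r}; expected one of: {known}") from None


question = prompt_for("mood_detection")
```

The resulting `question` string can replace the one used in the Quick Start.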

🏛️ Institution Art

This model was developed by Institution Art, an organization dedicated to advancing the intersection of artificial intelligence and creative arts. Our mission is to create AI tools that enhance artistic understanding and creative expression.

Our Vision

To democratize art education and appreciation through AI-powered tools that make artistic knowledge accessible to everyone.

📊 Training Information

  • Training Duration: 20 epochs
  • Dataset: Custom artwork dataset with professional annotations
  • Model Size: ~16GB
  • Training Focus: Digital arts, emotional recognition, artist identification
  • Special Features: Enhanced vision encoder for artistic detail recognition
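
The ~16GB figure is consistent with the parameter count and tensor type reported on the model page (8.04B parameters in BF16, which takes 2 bytes per parameter). A quick back-of-the-envelope check, where `weight_footprint` is a hypothetical helper written for this illustration:

```python
def weight_footprint(num_params: float, bytes_per_param: int = 2) -> tuple[float, float]:
    """Return the weight size as (decimal GB, binary GiB)."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / 1e9, total_bytes / 2**30


# 8.04B parameters stored as BF16 (2 bytes each)
gb, gib = weight_footprint(8.04e9)
print(f"{gb:.2f} GB ({gib:.2f} GiB)")  # 16.08 GB (14.98 GiB)
```

This covers the weights only. Activations and the key-value cache during generation need additional memory, so the GPU memory required in practice is higher.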

🔧 Technical Details

Architecture

  • Vision Encoder: Fine-tuned SigLIP for artwork understanding
  • Multimodal Projector: Enhanced for visual-language alignment in art context
  • Language Model: Qwen2.5-7B with specialized art vocabulary

Performance Optimizations

  • Flash Attention 2 support for efficient inference
  • Optimized for artwork analysis tasks
  • Balanced training for both technical and emotional understanding
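
Flash Attention 2 is only usable when the `flash_attn` package is installed. The sketch below picks an attention backend at load time and falls back to PyTorch's scaled dot-product attention (`"sdpa"`) otherwise. `pick_attn_implementation` is a hypothetical helper, and whether this model's remote code accepts `"sdpa"` is an assumption that should be verified before relying on it.

```python
import importlib.util


def pick_attn_implementation(find_spec=importlib.util.find_spec) -> str:
    """Return "flash_attention_2" if flash_attn is importable, else "sdpa".

    The find_spec parameter exists only so the lookup can be swapped out in tests.
    """
    if find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


attn_implementation = pick_attn_implementation()
```

The result can be passed as `attn_implementation=attn_implementation` to `AutoModelForCausalLM.from_pretrained` in place of the hard-coded value.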

📝 License & Usage

This model is released under the Apache 2.0 license. It builds upon the original VideoLLaMA3 work by DAMO-NLP-SG.

🙏 Acknowledgments

This work builds upon the excellent foundation provided by:

  • VideoLLaMA3 by DAMO-NLP-SG
  • Qwen2.5 by Alibaba Group
  • The broader open-source AI and computer vision community

Citation

If you use this model in your research or applications, please cite:

@misc{emotionally-aware-ai-companion-2025,
  title={Emotionally-Aware AI Companion: Fine-tuned VideoLLaMA3 for Digital Arts Analysis},
  author={Institution Art},
  year={2025},
  howpublished={\url{https://huggingface.co/OneEyeDJ/videollama3-artwork-institution}},
}

Original VideoLLaMA3 Citation

@article{damonlpsg2025videollama3,
  title={VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding},
  author={Boqiang Zhang and Kehan Li and Zesen Cheng and Zhiqiang Hu and Yuqian Yuan and Guanzheng Chen and Sicong Leng and Yuming Jiang and Hang Zhang and Xin Li and Peng Jin and Wenqi Zhang and Fan Wang and Lidong Bing and Deli Zhao},
  journal={arXiv preprint arXiv:2501.13106},
  year={2025},
  url = {https://arxiv.org/abs/2501.13106}
}

Emotionally-Aware AI Companion - Bridging the gap between artificial intelligence and artistic understanding 🎨✨

Safetensors · Model size: 8.04B params · Tensor type: BF16