# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

LIA-X is a Portrait Animator application built with Gradio that enables image animation, image editing, and video editing using deep learning models. It's deployed as a Hugging Face Space with GPU acceleration.

## Architecture

### Core Components

1. **Main Application** (`app.py`): Gradio web interface that loads the model and serves three main tabs
2. **Generator Network** (`networks/generator.py`): Core neural network model that handles animation and editing
   - Uses encoder-decoder architecture
   - Implements motion encoding and style transfer
   - Pre-allocates tensors for performance optimization
3. **Gradio Tabs** (`gradio_tabs/`): UI modules for different functionalities
   - `animation.py`: Handles image-to-video animation
   - `img_edit.py`: Image editing interface
   - `vid_edit.py`: Video editing interface

### Model Architecture

- **Encoder** (`networks/encoder.py`): Encodes source images and motion
- **Decoder** (`networks/decoder.py`): Reconstructs edited/animated outputs
- **Custom Ops** (`networks/op/`): CUDA kernels for optimized operations (fused_act, upfirdn2d)

## Development Commands

### Running the Application

```bash
python app.py
```

The app launches a Gradio interface on a local server. It requires a CUDA-capable GPU.
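Because the app needs a GPU, a quick preflight check before launching can save a failed start. The following is a minimal sketch, assuming PyTorch is installed from `requirements.txt`; the helper name `cuda_ready` is hypothetical and not part of this repository.

```python
import importlib.util


def cuda_ready() -> bool:
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    # Hypothetical helper for illustration; app.py does not define it.
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed in this environment
    import torch

    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_ready())
```

If this prints `False`, `python app.py` is expected to fail when it tries to place the model on the GPU.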

### Installing Dependencies

```bash
pip install -r requirements.txt
```

Key dependencies: PyTorch 2.5.1, torchvision, Gradio 5.42.0, einops, imageio, av

### Model Loading

The model checkpoint is automatically downloaded from Hugging Face Hub:
- Repository: `YaohuiW/LIA-X`
- File: `lia-x.pt`
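
To fetch the checkpoint outside the app, something like the sketch below may work. It assumes the download goes through `huggingface_hub.hf_hub_download`, which this file does not confirm; the function name `fetch_checkpoint` is hypothetical.

```python
REPO_ID = "YaohuiW/LIA-X"  # Hub repository named above
FILENAME = "lia-x.pt"      # checkpoint file named above


def fetch_checkpoint(repo_id: str = REPO_ID, filename: str = FILENAME) -> str:
    """Download the checkpoint (or reuse the local cache) and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `fetch_checkpoint()` needs network access on the first run; later calls should resolve from the local Hub cache.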

## Important Notes

- This is a GPU-only application (uses `torch.device("cuda")`)
- Uses the `@spaces.GPU` decorator from the `spaces` package for Hugging Face Spaces GPU allocation
- Model operates at 512x512 resolution with motion_dim=40
- Videos are processed in chunks of 16 frames
- Custom CUDA kernels in `networks/op/` require compilation with ninja
- Git LFS is configured for large files (models, videos, images)
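
The 16-frame chunking noted above can be pictured with a small, dependency-free sketch. It assumes frames are split into consecutive, non-overlapping chunks with a shorter final chunk, which this file does not spell out; `iter_chunks` and `CHUNK_SIZE` are hypothetical names.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")
CHUNK_SIZE = 16  # frames per chunk, as stated above


def iter_chunks(frames: Sequence[T], size: int = CHUNK_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` frames; the last may be shorter."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(frames), size):
        yield frames[start:start + size]


# 40 frames split into chunks of 16, 16, and 8.
print([len(chunk) for chunk in iter_chunks(list(range(40)))])  # [16, 16, 8]
```

Processing in fixed-size chunks bounds peak GPU memory regardless of video length.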

## File Processing

- Images: Loaded as RGB, resized to 512x512, normalized to [-1, 1]
- Videos: Processed with torchvision; the original FPS is preserved
- Cropping inputs with the tools referenced in `assets/instruction.md` gives better results
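
The [-1, 1] normalization above comes down to simple arithmetic. The sketch below assumes the usual linear mapping from 8-bit values (`x / 127.5 - 1`); the repository may do this through torchvision transforms instead, and both function names are hypothetical.

```python
def to_model_range(pixel: int) -> float:
    """Map an 8-bit channel value in [0, 255] to a float in [-1, 1]."""
    return pixel / 127.5 - 1.0


def to_pixel(value: float) -> int:
    """Inverse mapping: clamp to [-1, 1], then scale back to [0, 255]."""
    clamped = max(-1.0, min(1.0, value))
    return round((clamped + 1.0) * 127.5)


print(to_model_range(0), to_model_range(255))  # -1.0 1.0
print(to_pixel(-1.0), to_pixel(1.0))           # 0 255
```

The inverse mapping is what turns model outputs back into displayable images; clamping guards against values slightly outside the expected range.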

## Testing

The repository has no automated test suite. Test changes manually through the Gradio interface.

## Data Structure

- `data/source/`: Source images for examples
- `data/driving/`: Driving videos for animation examples
- `assets/`: Documentation and UI text (instruction.md, title.md)