Jinglong Xiong committed
Commit 8bf4ef4 · 1 Parent(s): c0f4df5

add dockerfile, add readme

Files changed (7)
  1. .dockerignore +58 -0
  2. .gitignore +1 -0
  3. Dockerfile +54 -0
  4. README.md +170 -0
  5. app.py +31 -7
  6. docker-compose.yml +21 -0
  7. requirements.txt +2 -0
.dockerignore ADDED
@@ -0,0 +1,58 @@
+ results/png/
+ results/svg/
+ results/*.json
+ unsloth_compiled_cache/
+ *.ipynb
+ SVGDreamer/
+ *.parquet
+
+ # Git
+ .git
+ .gitignore
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ env/
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual Environment
+ venv/
+ .env.local
+
+ # Generated files
+ logs/
+ *.log
+ .ipynb_checkpoints
+ results/
+
+ # VSCode
+ .vscode/
+
+ # Model caches
+ .cache/
+ unsloth_compiled_cache/
+
+ # Docker
+ Dockerfile
+ docker-compose.yml
+ .dockerignore
+
+ # Documentation
+ README-HF.md
.gitignore CHANGED
@@ -7,6 +7,7 @@ star-vector/
  SVGDreamer/
  *.parquet
  *.pth
+ diff_image.png

  # Byte-compiled / optimized / DLL files
  __pycache__/
Dockerfile ADDED
@@ -0,0 +1,54 @@
+ FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
+
+ # Set environment variables
+ ENV PYTHONUNBUFFERED=1 \
+     PYTHONDONTWRITEBYTECODE=1 \
+     DEBIAN_FRONTEND=noninteractive
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     python3-pip \
+     python3-dev \
+     git \
+     wget \
+     libcairo2-dev \
+     pkg-config \
+     libgl1 \
+     libglib2.0-0 \
+     libsm6 \
+     libxrender1 \
+     libxext6 \
+     ffmpeg \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements first to leverage Docker cache
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt && \
+     pip install --no-cache-dir 'tensorflow[and-cuda]' && \
+     pip install --no-cache-dir git+https://github.com/openai/CLIP.git
+
+ # Copy the whole application
+ COPY . .
+
+ # Install and build star-vector if it exists
+ # COPY star-vector/ ./star-vector/
+ # RUN if [ -d "star-vector" ]; then cd star-vector && pip install -e . && cd ..; fi
+
+ # Set environment variables for GPU usage
+ ENV NVIDIA_VISIBLE_DEVICES=all \
+     NVIDIA_DRIVER_CAPABILITIES=compute,utility
+
+ # Expose port for Streamlit
+ EXPOSE 8501
+
+ # Healthcheck via wget, since curl is not installed in this image
+ HEALTHCHECK CMD wget --no-verbose --tries=1 --spider http://localhost:8501/_stcore/health || exit 1
+
+ # Set entry point
+ CMD yes | streamlit run app.py --server.port=8501 --server.address=0.0.0.0
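
The healthcheck polls Streamlit's built-in liveness endpoint. The same endpoint can be probed from the host once the container is up; a minimal sketch, assuming the `requests` package (not part of requirements.txt) is installed locally:

```python
import requests

# Streamlit serves /_stcore/health, which returns HTTP 200 with body "ok"
# once the app is ready to accept connections.
resp = requests.get("http://localhost:8501/_stcore/health", timeout=5)
print(resp.status_code, resp.text)  # expect: 200 ok
```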
README.md ADDED
@@ -0,0 +1,170 @@
+ # Drawing with LLM 🎨
+
+ A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.
+
+ ## Overview
+
+ This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:
+ 1. **ML Model** - Uses Stable Diffusion to generate images and vtracer to convert them to SVG
+ 2. **DL Model** - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
+ 3. **Naive Model** - Uses the Phi-4 LLM to generate SVG code directly from text descriptions
+
+ ## Features
+
+ - Text-to-SVG generation with three different model approaches
+ - Adjustable parameters for each model type
+ - Real-time SVG preview and code display
+ - SVG download functionality
+ - GPU acceleration for faster generation
+
+ ## Requirements
+
+ - Python 3.11+
+ - CUDA-compatible GPU (recommended)
+ - Dependencies listed in `requirements.txt`
+
+ ## Installation
+
+ ### Using Miniconda (Recommended)
+
+ ```bash
+ # Install Miniconda
+ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
+ bash miniconda.sh -b -p $HOME/miniconda
+ echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
+ source ~/.bashrc
+
+ # Create and activate environment
+ conda create -n svg-app python=3.11 -y
+ conda activate svg-app
+
+ # Install star-vector
+ cd star-vector
+ pip install -e .
+ cd ..
+
+ # Install other dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### Using Docker
+
+ ```bash
+ # Build and run with Docker Compose
+ docker-compose up -d
+ ```
+
+ ## Usage
+
+ Start the Streamlit application:
+
+ ```bash
+ streamlit run app.py
+ ```
+
+ Or pipe `yes` to automatically accept Streamlit's first-run prompt:
+
+ ```bash
+ yes | streamlit run app.py
+ ```
+
+ The application will be available at http://localhost:8501
+
+ ## Models
+
+ ### ML Model (vtracer)
+ Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG.
+
+ Configurable parameters:
+ - Simplify SVG
+ - Color Precision
+ - Filter Speckle
+ - Path Precision
+
+ ### DL Model (starvector)
+ Uses Stable Diffusion for initial image creation, followed by StarVector, a specialized model that converts images directly to SVG.
+
+ ### Naive Model (phi-4)
+ Directly generates SVG code using the Phi-4 language model with specialized prompting.
+
+ Configurable parameters:
+ - Max New Tokens
+
+ ## Evaluation Data and Results
+
+ ### Data
+ The `data` directory contains synthetic evaluation data created using custom scripts:
+ - The first 15 examples are from the Kaggle competition "Drawing with LLM"
+ - `descriptions.csv` - Text descriptions for generating SVGs
+ - `eval.csv` - Evaluation metrics
+ - `gen_descriptions.py` - Script for generating synthetic descriptions
+ - `gen_vqa.py` - Script for generating visual question answering data
+ - Sample images (`gray_coat.png`, `purple_forest.png`) for reference
+
+ ### Results
+ The `results` directory contains evaluation results comparing the models:
+ - Evaluation results for both the Naive (Phi-4) and ML (vtracer) models
+ - The DL model (StarVector) was not evaluated, as it typically fails on natural images, often returning blank SVGs
+ - Performance visualizations:
+   - `category_radar.png` - Performance comparison across categories
+   - `complexity_performance.png` - Performance relative to prompt complexity
+   - `quality_vs_time.png` - Quality-time tradeoff analysis
+   - `generation_time.png` - Comparison of generation times
+   - `model_comparison.png` - Overall model performance comparison
+ - Generated SVGs and PNGs in respective subdirectories
+ - Detailed results in JSON and CSV formats
+
+ ## Project Structure
+
+ ```
+ drawing-with-llm/                    # Root directory
+ │
+ ├── app.py                           # Main Streamlit application
+ ├── requirements.txt                 # Python dependencies
+ ├── Dockerfile                       # Docker container definition
+ ├── docker-compose.yml               # Docker Compose configuration
+ │
+ ├── ml.py                            # ML model implementation (vtracer approach)
+ ├── dl.py                            # DL model implementation (StarVector approach)
+ ├── naive.py                         # Naive model implementation (Phi-4 approach)
+ ├── gen_image.py                     # Common image generation using Stable Diffusion
+ │
+ ├── eval.py                          # Evaluation script for model comparison
+ ├── eval_analysis.py                 # Analysis script for evaluation results
+ ├── metric.py                        # Metrics implementation for evaluation
+ │
+ ├── data/                            # Evaluation data directory
+ │   ├── descriptions.csv             # Text descriptions for evaluation
+ │   ├── eval.csv                     # Evaluation metrics
+ │   ├── gen_descriptions.py          # Script for generating synthetic descriptions
+ │   ├── gen_vqa.py                   # Script for generating VQA data
+ │   ├── gray_coat.png                # Sample image by GPT-4o
+ │   └── purple_forest.png            # Sample image by GPT-4o
+ │
+ ├── results/                         # Evaluation results directory
+ │   ├── category_radar.png           # Performance comparison across categories
+ │   ├── complexity_performance.png   # Performance by prompt complexity
+ │   ├── quality_vs_time.png          # Quality-time tradeoff analysis
+ │   ├── generation_time.png          # Comparison of generation times
+ │   ├── model_comparison.png         # Overall model performance comparison
+ │   ├── summary_*.csv                # Summary metrics in CSV format
+ │   ├── results_*.json               # Detailed results in JSON format
+ │   ├── svg/                         # Generated SVG outputs
+ │   └── png/                         # Generated PNG outputs
+ │
+ ├── star-vector/                     # StarVector dependency (installed locally)
+ └── starvector/                      # StarVector Python package
+ ```
+
+ ## License
+
+ [Specify your license information here]
+
+ ## Acknowledgments
+
+ This project utilizes several key technologies:
+ - [Stable Diffusion](https://github.com/CompVis/stable-diffusion) for image generation
+ - [StarVector](https://github.com/joanrod/star-vector) for image-to-SVG conversion
+ - [vtracer](https://github.com/visioncortex/vtracer) for raster-to-vector conversion
+ - [Phi-4](https://huggingface.co/microsoft/phi-4) for text-to-SVG generation
+ - [Streamlit](https://streamlit.io/) for the web interface
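
Of the three approaches, only the ML model's tracing step is parameter-driven. A minimal sketch of that raster-to-SVG step using vtracer's Python API, with hypothetical file names and keyword arguments mirroring the sidebar sliders:

```python
import vtracer

# Trace a rasterized Stable Diffusion output into an SVG. The keyword
# arguments correspond to the ML model's sidebar sliders.
vtracer.convert_image_to_svg_py(
    "generated.png",        # hypothetical input path
    "generated.svg",        # hypothetical output path
    colormode="color",
    color_precision=6,      # "Color Precision" slider default
    filter_speckle=4,       # "Filter Speckle" slider default
    path_precision=8,       # "Path Precision" slider default
)
```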
app.py CHANGED
@@ -1,7 +1,8 @@
  import streamlit as st
  import base64
  from ml import MLModel
- from dl import DLModel
+ from naive import NaiveModel
+ import torch

  st.set_page_config(page_title="Drawing with LLM", page_icon="🎨", layout="wide")

@@ -10,18 +11,38 @@ def load_ml_model():
      return MLModel(device="cuda" if st.session_state.get("use_gpu", True) else "cpu")

  @st.cache_resource
- def load_dl_model():
-     return DLModel(device="cuda" if st.session_state.get("use_gpu", True) else "cpu")
+ def load_naive_model():
+     return NaiveModel(device="cuda" if st.session_state.get("use_gpu", True) else "cpu")

  def render_svg(svg_content):
      b64 = base64.b64encode(svg_content.encode("utf-8")).decode("utf-8")
      return f'<img src="data:image/svg+xml;base64,{b64}" width="100%" height="auto"/>'

+ def clear_gpu_memory():
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+         torch.cuda.ipc_collect()
+
  st.title("Drawing with LLM 🎨")

+ # Initialize session state for model type if not already set
+ if "current_model_type" not in st.session_state:
+     st.session_state["current_model_type"] = None
+
  with st.sidebar:
      st.header("Settings")
-     model_type = st.selectbox("Model Type", ["ML Model (vtracer)", "DL Model (starvector)"])
+     previous_model_type = st.session_state.get("current_model_type")
+     model_type = st.selectbox("Model Type", ["ML Model (vtracer)", "Naive Model (phi-4)"])
+
+     # Check if model type has changed
+     if previous_model_type is not None and previous_model_type != model_type:
+         st.cache_resource.clear()
+         clear_gpu_memory()
+         st.success(f"Cleared VRAM after switching from {previous_model_type} to {model_type}")
+
+     # Update current model type in session state
+     st.session_state["current_model_type"] = model_type
+
      use_gpu = st.checkbox("Use GPU", value=True)
      st.session_state["use_gpu"] = use_gpu

@@ -31,6 +52,9 @@ with st.sidebar:
          color_precision = st.slider("Color Precision", 1, 10, 6)
          filter_speckle = st.slider("Filter Speckle", 0, 10, 4)
          path_precision = st.slider("Path Precision", 1, 10, 8)
+     elif model_type == "Naive Model (phi-4)":
+         st.subheader("Naive Model Settings")
+         max_new_tokens = st.slider("Max New Tokens", 256, 1024, 512)

  prompt = st.text_area("Enter your description", "A cat sitting on a windowsill at sunset")

@@ -45,9 +69,9 @@ if st.button("Generate SVG"):
              filter_speckle=filter_speckle,
              path_precision=path_precision
          )
-     else:
-         model = load_dl_model()
-         svg_content = model.predict(prompt)
+     else:  # Naive Model
+         model = load_naive_model()
+         svg_content = model.predict(prompt, max_new_tokens=max_new_tokens)

      col1, col2 = st.columns(2)
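
The unchanged `render_svg` helper wraps the SVG in a base64 data URI so the browser renders it as an image. A minimal sketch of how such markup is typically displayed; the `st.markdown(..., unsafe_allow_html=True)` call is an assumption, since the display code falls outside these hunks:

```python
import base64
import streamlit as st

def render_svg(svg_content: str) -> str:
    # Base64-encode the SVG and embed it in a data URI; the browser renders
    # it like any raster image, sidestepping sanitization of raw <svg> tags.
    b64 = base64.b64encode(svg_content.encode("utf-8")).decode("utf-8")
    return f'<img src="data:image/svg+xml;base64,{b64}" width="100%" height="auto"/>'

svg = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"><circle cx="5" cy="5" r="4" fill="teal"/></svg>'
st.markdown(render_svg(svg), unsafe_allow_html=True)  # assumed display call
```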
 
docker-compose.yml ADDED
@@ -0,0 +1,21 @@
+ version: '3.8'
+
+ services:
+   app:
+     build:
+       context: .
+       dockerfile: Dockerfile
+     restart: unless-stopped
+     ports:
+       - "8501:8501"
+     volumes:
+       - ./.env:/app/.env
+     environment:
+       - NVIDIA_VISIBLE_DEVICES=all
+     deploy:
+       resources:
+         reservations:
+           devices:
+             - driver: nvidia
+               count: 1
+               capabilities: [gpu]
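
To confirm that the reserved GPU is actually visible inside the running `app` service, a quick PyTorch check (torch is already pinned in requirements.txt); the script name is hypothetical:

```python
# check_gpu.py (hypothetical) - run inside the container, e.g.:
#   docker compose exec app python3 check_gpu.py
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```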
requirements.txt CHANGED
@@ -25,6 +25,8 @@ vtracer==0.6.11
  deepspeed==0.16.7
  torch==2.5.1
  torchvision==0.20.1
+ streamlit==1.44.1
+ lxml==5.3.2

  # pip install 'tensorflow[and-cuda]'
  # pip install git+https://github.com/openai/CLIP.git