Spaces: rdesai2/LoRACaptioner

Rishi Desai committed on May 1

Commit 524c601

1 Parent(s): aa6982c

readme + trigger word

Files changed (2)

README.md +20 -40
caption.py +1 -1

README.md CHANGED

@@ -1,18 +1,18 @@
-# AutoCaptioner
-A tool to automatically
-* generate detailed image captions to train higher-quality LoRA and
-* optimize your prompts during inference.
-<div style="text-align: center;">
-  <img src="examples/caption_example.gif" alt="Captioning Example" width="600"/>
-</div>
-## What is AutoCaptioner?
-AutoCaptioner creates detailed, principled image captions for your LoRA dataset. These captions can be used to:
-- Train more expressive LoRAs on Flux or SDXL
-- Make inference easy via prompt optimization
-- Save time compared to manual captioning or ignoring captioning
 ## Installation
@@ -20,7 +20,6 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
 - Python 3.11 or higher
 - [Together API](https://together.ai/) account and API key
 ### Setup
 1. Create the virtual environment:
@@ -30,9 +29,7 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
    python -m pip install -r requirements.txt
    ```
-2. Set your Together API key: `TOGETHER_API_KEY`
-3. Run inference on one set of images:
    ```bash
    python main.py --input examples/ --output output/
@@ -43,8 +40,7 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
    - `--input` (str): Directory containing images to caption.
    - `--output` (str): Directory to save images and captions (defaults to input directory).
-   - `--fix_outfit` (flag): Indicate if character has one outfit (for consistent descriptions).
-   - `--batch_images` (flag): Process images in batches by category.
    </details>
@@ -55,31 +51,15 @@ Launch a user-friendly web interface for captioning and prompt optimization:
 python demo.py
 ```
-### Features
-- High-accuracy image captioning with detailed contextual descriptions
-- Consistent character descriptions when using the outfit flag
-- Batch processing for large image collections
-- Optimized for AI model training datasets
-- Web interface for easy use
-## How It Works
-AutoCaptioner leverages the Llama-4-Maverick model through the Together AI platform to:
-1. Analyze the visual content of your images
-2. Generate detailed, structured captions
-3. Save the captions as text files alongside your images
-## Notes
 - Images are processed individually in standard mode
 - For large collections, batch processing by category is recommended
 - Each caption is saved as a .txt file with the same name as the image
 ### Troubleshooting
-- **API errors**: Ensure your Together API key is set correctly
-- **Unsupported formats**: Only .png, .jpg, .jpeg, and .webp files are supported
-- **Memory issues**: For very large images, try processing in smaller batches
 ### Examples

+---
+title: LoRACaptioner
+emoji: 🤠
+colorFrom: red
+colorTo: green
+sdk: gradio
+sdk_version: 5.25.2
+app_file: demo.py
+pinned: false
+---
+# LoRACaptioner
+- **Image Captioning**: Automatically generate detailed and structured captions for your LoRA dataset.
+- **Prompt Optimization**: Enhance prompts during inference to achieve high-quality outputs.
 ## Installation
 - Python 3.11 or higher
 - [Together API](https://together.ai/) account and API key
 ### Setup
 1. Create the virtual environment:
    python -m pip install -r requirements.txt
    ```
+2. Run inference on one set of images:
    ```bash
    python main.py --input examples/ --output output/
    - `--input` (str): Directory containing images to caption.
    - `--output` (str): Directory to save images and captions (defaults to input directory).
+   - `--batch_images` (flag): Caption images in batches by category.
    </details>
 python demo.py
 ```
+### Notes
 - Images are processed individually in standard mode
 - For large collections, batch processing by category is recommended
 - Each caption is saved as a .txt file with the same name as the image
 ### Troubleshooting
+- **API errors**: Ensure your Together API key is set and has funds
+- **Image formats**: Only .png, .jpg, .jpeg, and .webp files are supported
 ### Examples
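
Reviewer note on the README hunk above: the Notes and Troubleshooting entries say that each caption is saved as a .txt file with the same name as the image, and that only .png, .jpg, .jpeg, and .webp files are supported. A minimal sketch of those two rules follows. The helper names `find_images` and `caption_path_for` are hypothetical and are not taken from this repository; the actual logic in `main.py` may differ.

```python
from pathlib import Path

# Extensions listed under Troubleshooting in the README.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def find_images(directory: Path) -> list[Path]:
    """Return supported image files in `directory`, sorted by name (hypothetical helper)."""
    return sorted(
        p
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def caption_path_for(image_path: Path) -> Path:
    """Return the caption path: same name as the image, with a .txt suffix (hypothetical helper)."""
    return image_path.with_suffix(".txt")
```

With these assumptions, `examples/cat.png` would be paired with `examples/cat.txt`, and a file such as `examples/notes.md` would be skipped.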

caption.py CHANGED

@@ -4,7 +4,7 @@ import os
 from together import Together
 MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"
-TRIGGER_WORD = "tr1gger"
 def get_system_prompt():
     return f"""Automated Image Captioning (for LoRA Training)

 from together import Together
 MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"
+TRIGGER_WORD = "tr1gg3r"
 def get_system_prompt():
     return f"""Automated Image Captioning (for LoRA Training)
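
Reviewer note on the caption.py hunk above: the only change is the value of `TRIGGER_WORD`, from `"tr1gger"` to `"tr1gg3r"`. The diff does not show how the constant is used inside the system prompt, and the commit does not state why the value changed. The sketch below is therefore an assumption: it shows one common way a trigger word is applied to a caption (prepended when it is not already the first token). The function `ensure_trigger_word` is hypothetical and does not appear in this commit.

```python
TRIGGER_WORD = "tr1gg3r"  # new value introduced by this commit


def ensure_trigger_word(caption: str, trigger_word: str = TRIGGER_WORD) -> str:
    """Prepend `trigger_word` unless the caption already starts with it.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    caption = caption.strip()
    if not caption:
        return trigger_word
    # Compare the first whitespace-delimited token, ignoring a trailing comma.
    first_token = caption.split(maxsplit=1)[0].rstrip(",")
    if first_token == trigger_word:
        return caption
    return f"{trigger_word} {caption}"
```

Under this sketch, captions written with the old value would not match the new trigger word, so a dataset captioned before this commit would need its captions regenerated or rewritten before the two are mixed.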