Rishi Desai committed
Commit 524c601 · 1 Parent(s): aa6982c

readme + trigger word

Files changed (2)
  1. README.md +20 -40
  2. caption.py +1 -1
README.md CHANGED
@@ -1,18 +1,18 @@
- # AutoCaptioner
- A tool to automatically
- * generate detailed image captions to train higher-quality LoRA and
- * optimize your prompts during inference.
-
- <div style="text-align: center;">
-   <img src="examples/caption_example.gif" alt="Captioning Example" width="600"/>
- </div>
-
- ## What is AutoCaptioner?
-
- AutoCaptioner creates detailed, principled image captions for your LoRA dataset. These captions can be used to:
- - Train more expressive LoRAs on Flux or SDXL
- - Make inference easy via prompt optimization
- - Save time compared to manual captioning or ignoring captioning
+ ---
+ title: LoRACaptioner
+ emoji: 🤠
+ colorFrom: red
+ colorTo: green
+ sdk: gradio
+ sdk_version: 5.25.2
+ app_file: demo.py
+ pinned: false
+ ---
+
+ # LoRACaptioner
+
+ - **Image Captioning**: Automatically generate detailed and structured captions for your LoRA dataset.
+ - **Prompt Optimization**: Enhance prompts during inference to achieve high-quality outputs.
 
  ## Installation
 
@@ -20,7 +20,6 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
  - Python 3.11 or higher
  - [Together API](https://together.ai/) account and API key
 
-
  ### Setup
 
  1. Create the virtual environment:
@@ -30,9 +29,7 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
     python -m pip install -r requirements.txt
     ```
 
- 2. Set your Together API key: `TOGETHER_API_KEY`
-
- 3. Run inference on one set of images:
+ 2. Run inference on one set of images:
 
     ```bash
     python main.py --input examples/ --output output/
@@ -43,8 +40,7 @@ AutoCaptioner creates detailed, principled image captions for your LoRA dataset.
 
  - `--input` (str): Directory containing images to caption.
  - `--output` (str): Directory to save images and captions (defaults to input directory).
- - `--fix_outfit` (flag): Indicate if character has one outfit (for consistent descriptions).
- - `--batch_images` (flag): Process images in batches by category.
+ - `--batch_images` (flag): Caption images in batches by category.
  </details>
 
 
@@ -55,31 +51,15 @@ Launch a user-friendly web interface for captioning and prompt optimization:
  python demo.py
  ```
 
- ### Features
-
- - High-accuracy image captioning with detailed contextual descriptions
- - Consistent character descriptions when using the outfit flag
- - Batch processing for large image collections
- - Optimized for AI model training datasets
- - Web interface for easy use
-
- ## How It Works
-
- AutoCaptioner leverages the Llama-4-Maverick model through the Together AI platform to:
- 1. Analyze the visual content of your images
- 2. Generate detailed, structured captions
- 3. Save the captions as text files alongside your images
-
- ## Notes
+ ### Notes
  - Images are processed individually in standard mode
  - For large collections, batch processing by category is recommended
  - Each caption is saved as a .txt file with the same name as the image
 
  ### Troubleshooting
 
- - **API errors**: Ensure your Together API key is set correctly
- - **Unsupported formats**: Only .png, .jpg, .jpeg, and .webp files are supported
- - **Memory issues**: For very large images, try processing in smaller batches
+ - **API errors**: Ensure your Together API key is set and has funds
+ - **Image formats**: Only .png, .jpg, .jpeg, and .webp files are supported
 
  ### Examples
 
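Two conventions documented in the README hunks above matter when wiring the output into a LoRA training run: only `.png`, `.jpg`, `.jpeg`, and `.webp` inputs are supported, and each caption is written to a `.txt` file that shares its image's name (in `--output`, or next to the image when no output directory is given). The sketch below illustrates that file layout; `iter_images` and `save_caption` are hypothetical helpers for illustration, not functions from this repository.

```python
from pathlib import Path

# Formats listed in the README's troubleshooting section.
SUPPORTED_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def iter_images(input_dir: str):
    """Yield supported image files from the input directory."""
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix.lower() in SUPPORTED_EXTS:
            yield path

def save_caption(image_path: Path, caption: str, output_dir: str | None = None) -> Path:
    """Write the caption as <image name>.txt, next to the image if no output_dir is given."""
    out_dir = Path(output_dir) if output_dir else image_path.parent
    out_dir.mkdir(parents=True, exist_ok=True)
    txt_path = out_dir / f"{image_path.stem}.txt"
    txt_path.write_text(caption.strip() + "\n", encoding="utf-8")
    return txt_path

if __name__ == "__main__":
    # Pair every image in examples/ with a placeholder caption in output/.
    for image in iter_images("examples"):
        save_caption(image, "placeholder caption", output_dir="output")
```

This sidecar layout is the caption format commonly expected by Flux and SDXL LoRA trainers.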
 
caption.py CHANGED
@@ -4,7 +4,7 @@ import os
  from together import Together
 
  MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"
- TRIGGER_WORD = "tr1gger"
+ TRIGGER_WORD = "tr1gg3r"
 
  def get_system_prompt():
      return f"""Automated Image Captioning (for LoRA Training)
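The only code change in this commit is the trigger word itself: `tr1gger` becomes `tr1gg3r`, a spelling less likely to collide with a token the base model already understands. In LoRA training the trigger word is the tag the adapter binds the subject to, so it is conventionally placed at the start of every caption and reused verbatim in inference prompts. The hunk does not show how `caption.py` applies the constant, so the snippet below is a hypothetical sketch of that convention rather than the repository's actual logic.

```python
# Hypothetical illustration of how a LoRA trigger word is typically applied;
# caption.py's real usage of TRIGGER_WORD is not visible in this hunk.
TRIGGER_WORD = "tr1gg3r"

def apply_trigger(caption: str, trigger: str = TRIGGER_WORD) -> str:
    """Prefix a generated caption with the trigger word unless it already leads with it."""
    caption = caption.strip()
    if caption.lower().startswith(trigger.lower()):
        return caption
    return f"{trigger}, {caption}"

print(apply_trigger("a woman in a red coat standing on a pier at dusk"))
# tr1gg3r, a woman in a red coat standing on a pier at dusk
```

Since `get_system_prompt()` returns an f-string, the constant is presumably interpolated into the captioning prompt as well, which would keep the prompt and the saved captions using the same trigger.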