SAM-Grounding-DINO / README.md

tokeron

Upload folder using huggingface_hub

0a9b595 verified 2 months ago

	---
	title: SAM-Grounding-DINO
	emoji: 🎭
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	app_file: app.py
	---
	# 🎭 SAM 2.1 + Grounding DINO Interactive Segmentation

	A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you need.

	## ✨ Features

	- 🔍 Text-Based Segmentation: Type what you want to segment (e.g., "snoopy", "person", "car")
	- 📍 Point-Based Segmentation: Click on objects for precise manual control
	- 🎭 Multiple Mask Generation: Generate 1-5 masks and browse through them
	- 🤖 SAM 2.1 + Grounding DINO: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
	- 📱 Smart Auto-Detection: Automatically chooses between text and point modes
	- 💾 Multiple Export Formats: Download masks as PNG, JPG, or PyTorch tensors
	- 🖼️ High-Resolution Display: View images and masks in full detail
	- ⚡ Real-Time Processing: Fast inference with GPU acceleration
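
The auto-detection feature listed above can be pictured as a small dispatch rule. The sketch below is only an illustration: `choose_mode` and its arguments are hypothetical names, and the precedence (clicked points win over a text prompt) is an assumption, not something this README states.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def choose_mode(text_prompt: str, points: List[Point]) -> Optional[str]:
    """Pick a segmentation mode from whatever input the user supplied.

    Assumed rule (not taken from the app): clicked points take priority,
    then a non-empty text prompt; with neither, there is nothing to run.
    """
    if points:
        return "point"
    if text_prompt.strip():
        return "text"
    return None


print(choose_mode("snoopy", []))     # text
print(choose_mode("", [(120, 80)]))  # point
print(choose_mode("   ", []))        # None
```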

	## 🚀 Quick Start

	### Installation

	1. Clone or download the repository
	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	### Running the App

	```bash
	streamlit run streamlit_sam_app.py
	```

	The app opens in your browser at `http://localhost:8501`.

	## 🎯 How to Use

	### 1. Upload an Image
	- Click "📷 Upload an image" to select an image file
	- Supported formats: JPG, JPEG, PNG, BMP

	### 2. Add Points
	Choose between Positive (include) and Negative (exclude) point mode:

	#### Quick Presets:
	- 🎯 Center: Add point at image center
	- ↖️ Top-Left: Add point at top-left quarter
	- ↗️ Top-Right: Add point at top-right quarter
	- 🎲 Random: Add random point anywhere
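
As a rough sketch of how the presets above could map to pixel coordinates: the function name `preset_point` is hypothetical, and reading "top-left quarter" as the center of the top-left quadrant is an assumption.

```python
import random
from typing import Optional, Tuple


def preset_point(preset: str, width: int, height: int,
                 rng: Optional[random.Random] = None) -> Tuple[int, int]:
    """Return an (x, y) pixel coordinate for a named preset."""
    if preset == "center":
        return width // 2, height // 2
    if preset == "top-left":
        # Assumed meaning: the middle of the top-left quadrant.
        return width // 4, height // 4
    if preset == "top-right":
        # Assumed meaning: the middle of the top-right quadrant.
        return (3 * width) // 4, height // 4
    if preset == "random":
        rng = rng or random.Random()
        # randrange excludes the upper bound, so the point stays in the image.
        return rng.randrange(width), rng.randrange(height)
    raise ValueError(f"unknown preset: {preset!r}")


print(preset_point("center", 640, 480))     # (320, 240)
print(preset_point("top-left", 640, 480))   # (160, 120)
print(preset_point("top-right", 640, 480))  # (480, 120)
```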

	#### Manual Input:
	- Enter X,Y coordinates manually
	- Points are validated against image boundaries
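
Boundary validation of a typed coordinate might look like the following. This is a minimal sketch, assuming the input is a string such as `"120, 80"` and that valid pixels run from `0` to `width - 1` and `0` to `height - 1`; `parse_point` is a hypothetical helper, not a function from the app.

```python
from typing import Tuple


def parse_point(text: str, width: int, height: int) -> Tuple[int, int]:
    """Parse 'X,Y' and reject points that fall outside the image."""
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated values, e.g. '120,80'")
    # int() raises ValueError itself if a part is not a whole number.
    x, y = (int(part.strip()) for part in parts)
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"point ({x}, {y}) is outside the {width}x{height} image")
    return x, y


print(parse_point("120, 80", 640, 480))  # (120, 80)
```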

	### 3. Generate Segmentation Mask
	- Click "🎯 Generate Segmentation Mask"
	- Adjust the mask threshold in the sidebar (0.0-1.0)
	- Wait for SAM 2.0 to process (may take 10-30 seconds)

	### 4. View Results
	- Original Image with Points: Shows your input selections
	- Generated Segmentation Mask: Red overlay on original image
	- Binary Mask Preview: Black/white mask for download
	- Statistics: Pixel counts and coverage percentage
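
The statistics amount to counting foreground pixels and dividing by the image area. A minimal sketch, assuming the binary mask is available as rows of 0/1 values; the function and key names are illustrative only.

```python
from typing import Dict, List


def mask_stats(mask: List[List[int]]) -> Dict[str, float]:
    """Count mask pixels and compute the share of the image they cover."""
    total = sum(len(row) for row in mask)
    foreground = sum(1 for row in mask for value in row if value)
    coverage = 100.0 * foreground / total if total else 0.0
    return {
        "mask_pixels": foreground,
        "total_pixels": total,
        "coverage_percent": coverage,
    }


mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
stats = mask_stats(mask)
print(stats["mask_pixels"], stats["total_pixels"])  # 4 12
print(round(stats["coverage_percent"], 1))          # 33.3
```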

	### 5. Download Results
	- 📥 Download Mask (PNG): Binary mask file
	- 📥 Download Overlay (PNG): Mask overlaid on original
	- 📥 Download Data (JSON): Complete metadata and statistics
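
The exact layout of the JSON download is not documented here. The sketch below shows one plausible shape, with every field name (`image_size`, `points`, `mask_threshold`, `statistics`) chosen for illustration rather than taken from the app.

```python
import json
from typing import Dict, List, Tuple


def build_export(width: int, height: int,
                 points: List[Tuple[int, int, bool]],
                 threshold: float,
                 stats: Dict[str, float]) -> str:
    """Serialize the inputs and statistics of one segmentation run."""
    payload = {
        "image_size": {"width": width, "height": height},
        "points": [
            {"x": x, "y": y, "label": "positive" if positive else "negative"}
            for x, y, positive in points
        ],
        "mask_threshold": threshold,
        "statistics": stats,
    }
    return json.dumps(payload, indent=2)


exported = build_export(
    640, 480,
    [(320, 240, True), (10, 10, False)],
    0.5,
    {"mask_pixels": 4, "total_pixels": 12},
)
print(exported)
```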

	## 🎛️ Advanced Controls

	### Sidebar Options:
	- Point Mode: Switch between Positive/Negative points
	- Mask Threshold: Control mask sensitivity (lower = larger masks)
	- Clear Points: Remove all points at once
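
To see why a lower threshold yields a larger mask, consider per-pixel scores in the range 0.0-1.0 that are cut at the threshold. This is a sketch under that assumption; how the app actually derives and compares scores is not stated here.

```python
from typing import List


def binarize(scores: List[List[float]], threshold: float) -> List[List[int]]:
    """Mark a pixel as part of the mask when its score exceeds the threshold."""
    return [[1 if score > threshold else 0 for score in row] for row in scores]


scores = [
    [0.1, 0.4, 0.7],
    [0.3, 0.6, 0.9],
]
for threshold in (0.3, 0.5, 0.8):
    mask = binarize(scores, threshold)
    print(threshold, sum(map(sum, mask)))
# 0.3 4
# 0.5 3
# 0.8 1
```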

	### Point Management:
	- View all current points with coordinates
	- Delete individual points with 🗑️ buttons
	- Real-time count of positive/negative points
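
The point list behind these controls can be modeled as a small container. The class below is a hypothetical sketch (the name `PointSet` and its methods are not from the app) that supports adding, deleting, clearing, and counting points.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PointSet:
    """Prompt points stored as (x, y, is_positive) tuples."""
    points: List[Tuple[int, int, bool]] = field(default_factory=list)

    def add(self, x: int, y: int, positive: bool = True) -> None:
        self.points.append((x, y, positive))

    def delete(self, index: int) -> None:
        # Mirrors a per-point delete button: remove one entry by position.
        del self.points[index]

    def clear(self) -> None:
        self.points.clear()

    def counts(self) -> Tuple[int, int]:
        """Return (positive_count, negative_count)."""
        positive = sum(1 for _, _, is_positive in self.points if is_positive)
        return positive, len(self.points) - positive


selection = PointSet()
selection.add(320, 240)
selection.add(10, 10, positive=False)
selection.add(400, 200)
print(selection.counts())  # (2, 1)
selection.delete(0)
print(selection.counts())  # (1, 1)
```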

	## 🔧 Technical Details

	### SAM 2.0 Model
	- Uses `facebook/sam2-hiera-small` by default
	- Automatically downloads model weights on first run
	- Runs on GPU if available, CPU otherwise
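
The GPU-or-CPU fallback is commonly written with `torch.cuda.is_available()`. The sketch below assumes that approach; `pick_device` is a hypothetical helper, and the `ImportError` branch exists only so the snippet still runs where PyTorch is not installed.

```python
def pick_device() -> str:
    """Return 'cuda' when a CUDA GPU is usable, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

A model or tensor would then be moved with `.to(device)` before inference.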

	### Dependencies
	- `streamlit`: Web interface
	- `torch`: PyTorch for model inference
	- `transformers`: Hugging Face model loading
	- `PIL` (installed as `Pillow`): Image processing
	- `matplotlib`: Visualization
	- `numpy`: Numerical operations
	- `opencv-python`: Image processing utilities
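
A quick way to confirm that these packages are importable is to look them up without importing them. The sketch below uses the standard library's `importlib.util.find_spec`; note that two import names differ from the package names (`Pillow` is imported as `PIL`, `opencv-python` as `cv2`). The helper name `check_dependencies` is illustrative.

```python
import importlib.util
from typing import Dict

# Package name (as installed with pip) -> module name (as imported).
MODULES = {
    "streamlit": "streamlit",
    "torch": "torch",
    "transformers": "transformers",
    "Pillow": "PIL",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "opencv-python": "cv2",
}


def check_dependencies(modules: Dict[str, str] = MODULES) -> Dict[str, bool]:
    """Report which packages can be found in the current environment."""
    status = {}
    for package, module in modules.items():
        try:
            status[package] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            status[package] = False
    return status


for package, found in check_dependencies().items():
    print(f"{package:15} {'ok' if found else 'MISSING'}")
```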

	### System Requirements
	- Python 3.8+
	- 4GB+ RAM recommended
	- GPU recommended for faster processing

	## 🐛 Troubleshooting

	### Common Issues:

	1. Model Download Fails:
	   - Check internet connection
	   - Ensure Hugging Face access (may require token for some models)

	2. CUDA Out of Memory:
	   - Try smaller model size
	   - Reduce image resolution
	   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`

	3. Slow Processing:
	   - Use GPU if available
	   - Try `sam2-hiera-tiny` model for faster inference

	4. Import Errors:
	   - Ensure all dependencies are installed: `pip install -r requirements.txt`
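
The CPU-mode workaround for out-of-memory errors can also be applied from Python instead of the shell. This is a sketch, assuming it runs at the very top of the script: the variable has to be set before CUDA is initialized, so setting it before `torch` is imported is the safe choice. `force_cpu` is a hypothetical helper name.

```python
import os


def force_cpu() -> None:
    """Hide all CUDA devices from libraries loaded after this call."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


force_cpu()

try:
    import torch  # imported only after the variable is set
    print("CUDA visible:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed; nothing to check.")
```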

	## 📁 File Structure

	```
	SAM/
	├── streamlit_sam_app.py # Main application
	├── fixed_sam_interface.py # Original Gradio version
	├── requirements.txt # Dependencies
	└── README.md # This file
	```

	## 🎨 Interface Overview

	The app features a clean, modern interface with:
	- Full-width image display
	- Intuitive sidebar controls
	- Real-time point visualization
	- Side-by-side result comparison
	- Comprehensive download options

	## 🤝 Contributing

	Feel free to submit issues, feature requests, or pull requests!

	## 📄 License

	This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.

	## 🙏 Acknowledgments

	- Meta AI for the incredible SAM 2.0 model
	- Streamlit for the amazing web app framework
	- Hugging Face for model hosting
	- The open-source community for all the dependencies