---
title: SAM-Grounding-DINO
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# SAM 2.1 + Grounding DINO Interactive Segmentation

A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you need.
## Features

- **Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **Point-Based Segmentation**: Click on objects for precise manual control
- **Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **Smart Auto-Detection**: Automatically chooses between text and point modes
- **Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **High-Resolution Display**: View images and masks in full detail
- **Real-Time Processing**: Fast inference with GPU acceleration
## Quick Start

### Installation

1. Clone or download the repository
2. Install dependencies:

```bash
pip install -r requirements.txt
```

### Running the App

```bash
streamlit run streamlit_sam_app.py
```

The app will open in your browser at `http://localhost:8501`.
## How to Use

### 1. Upload an Image

- Click "Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP

### 2. Add Points

Choose between **Positive** (include) or **Negative** (exclude) point mode:

#### Quick Presets:

- **Center**: Add a point at the image center
- **Top-Left**: Add a point in the top-left quarter
- **Top-Right**: Add a point in the top-right quarter
- **Random**: Add a random point anywhere in the image

#### Manual Input:

- Enter X,Y coordinates manually
- Points are validated against image boundaries
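
The presets and the boundary check described above can be sketched in a few lines of plain Python. This is only an illustration, not the app's actual code: the function names, the preset keys, and the reading of "top-left quarter" as one quarter of the way in from each edge are all assumptions. The label encoding (1 for a positive point, 0 for a negative one) follows the convention SAM uses for point prompts.

```python
import random


def preset_point(preset, width, height, rng=random):
    """Return an (x, y) pixel coordinate for a named preset (names are hypothetical)."""
    if preset == "center":
        return width // 2, height // 2
    if preset == "top_left":
        # Assumption: "top-left quarter" means the middle of that quadrant.
        return width // 4, height // 4
    if preset == "top_right":
        return (3 * width) // 4, height // 4
    if preset == "random":
        return rng.randrange(width), rng.randrange(height)
    raise ValueError(f"unknown preset: {preset!r}")


def is_inside(x, y, width, height):
    """True when (x, y) is a valid pixel index for a width x height image."""
    return 0 <= x < width and 0 <= y < height


def add_point(points, x, y, positive, width, height):
    """Append a point with its label (1 = include, 0 = exclude) if it is in bounds."""
    if not is_inside(x, y, width, height):
        raise ValueError(f"point ({x}, {y}) is outside the {width}x{height} image")
    points.append({"x": x, "y": y, "label": 1 if positive else 0})
    return points
```

For a 640x480 image, `preset_point("top_right", 640, 480)` returns `(480, 120)`, and `add_point` rejects `x = 640` because valid column indices stop at 639.
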
### 3. Generate Segmentation Mask

- Click "Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to finish processing (this may take 10-30 seconds)

### 4. View Results

- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on the original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
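
To make the threshold and the statistics concrete, here is a minimal sketch of how a grid of mask probabilities could be binarized and summarized. It uses nested Python lists in place of the NumPy array or tensor the app presumably works with, and the function names, the dictionary keys, and the default threshold of 0.5 are assumptions made for the example.

```python
def binarize(probabilities, threshold=0.5):
    """Turn a 2D grid of mask probabilities into 0/1 values."""
    return [[1 if p > threshold else 0 for p in row] for row in probabilities]


def mask_statistics(mask):
    """Count foreground pixels and report coverage as a percentage of the image."""
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    coverage = 100.0 * foreground / total if total else 0.0
    return {
        "mask_pixels": foreground,
        "total_pixels": total,
        "coverage_percent": coverage,
    }


# Toy 2x3 "image" of per-pixel probabilities.
probs = [
    [0.10, 0.40, 0.90],
    [0.20, 0.60, 0.95],
]

stats_default = mask_statistics(binarize(probs, threshold=0.5))
stats_lower = mask_statistics(binarize(probs, threshold=0.3))
```

With these toy values, a threshold of 0.5 keeps 3 of the 6 pixels (50% coverage), while lowering it to 0.3 keeps 4, which illustrates why a lower threshold produces a larger mask.
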
### 5. Download Results

- **Download Mask (PNG)**: Binary mask file
- **Download Overlay (PNG)**: Mask overlaid on the original image
- **Download Data (JSON)**: Complete metadata and statistics
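
The exact layout of the JSON download is not documented here, so the following is only a guess at what such a file might contain. Every field name in it is hypothetical. It shows how metadata and statistics could be gathered into one dictionary and serialized with the standard `json` module.

```python
import json


def build_export(image_size, points, stats, threshold):
    """Collect run metadata into a JSON-serializable dict (field names are hypothetical)."""
    width, height = image_size
    return {
        "image": {"width": width, "height": height},
        "points": points,
        "mask_threshold": threshold,
        "statistics": stats,
    }


payload = build_export(
    image_size=(640, 480),
    points=[{"x": 320, "y": 240, "label": 1}],
    stats={"mask_pixels": 1200, "total_pixels": 307200, "coverage_percent": 0.390625},
    threshold=0.5,
)

# Text that a download button could serve as a .json file.
json_text = json.dumps(payload, indent=2)
```

In a Streamlit app, a string like `json_text` can be passed as the `data` argument of `st.download_button`.
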
## Advanced Controls

### Sidebar Options:

- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once

### Point Management:

- View all current points with coordinates
- Delete individual points with the delete buttons
- Real-time count of positive/negative points
## Technical Details

### SAM 2.0 Model

- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
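
The GPU-or-CPU fallback is typically a one-line check with `torch.cuda.is_available()`. The sketch below shows that pattern under the assumption that the app picks its device this way. The helper name `pick_device` is made up for the example, and the import guard exists only so the snippet also runs on a machine without PyTorch.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; the real app needs it, the sketch just reports CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


MODEL_ID = "facebook/sam2-hiera-small"  # default checkpoint named above
device = pick_device()
print(f"Would load {MODEL_ID} on {device}")
```
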
### Dependencies

- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `Pillow` (imported as `PIL`): Image processing
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities

### System Requirements

- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing
## Troubleshooting

### Common Issues:

1. **Model Download Fails**:
   - Check your internet connection
   - Ensure Hugging Face access (some models may require a token)
2. **CUDA Out of Memory**:
   - Try a smaller model size
   - Reduce the image resolution
   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`
3. **Slow Processing**:
   - Use a GPU if available
   - Try the `sam2-hiera-tiny` model for faster inference
4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
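
Two of the out-of-memory remedies above can also be applied from Python. The sketch below hides all GPUs by setting `CUDA_VISIBLE_DEVICES` before PyTorch is imported, and computes a reduced image size that preserves the aspect ratio. The helper name and the 1024-pixel limit are assumptions chosen for illustration, not values taken from the app.

```python
import os

# Must run before the first `import torch`; once CUDA is initialized the
# variable is no longer consulted.
os.environ["CUDA_VISIBLE_DEVICES"] = ""


def downscale_size(width, height, max_side=1024):
    """Return (width, height) scaled so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A 4000x3000 photo would be reduced to 1024x768 under this limit, and an 800x600 image would be left unchanged. The resulting size can then be passed to an image library's resize call before segmentation.
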
## File Structure

```
SAM/
├── streamlit_sam_app.py      # Main application
├── fixed_sam_interface.py    # Original Gradio version
├── requirements.txt          # Dependencies
└── README.md                 # This file
```
## Interface Screenshots

The app features a clean, modern interface with:

- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options
## Contributing

Feel free to submit issues, feature requests, or pull requests!

## License

This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.

## Acknowledgments

- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies