---
title: Gemma-2 Multimodal Chat
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---

# 🚀 Gemma-2 Multimodal Chat Application

A sophisticated Gradio-based chat application featuring multimodal capabilities with Google's Gemma-2 model.

## ✨ Features

- 💬 **Interactive Chat Interface**: Persistent conversation history with context awareness
- 🖼️ **Vision Capabilities**: Upload and analyze images with AI-powered insights
- 📄 **File Processing**: Support for PDF and TXT file uploads with text extraction
- 🧠 **Contextual Responses**: Maintains conversation context for follow-up questions
- 🎨 **Modern UI**: Clean, responsive interface built with Gradio
- 🔄 **State Management**: Persistent chat history and file context across interactions

## 🛠️ Technologies Used

- **Frontend**: Gradio 4.0+
- **AI Model**: Google's Gemma-2-2B-IT
- **File Processing**: PyPDF2 for PDFs, PIL for images
- **Backend**: Python with Hugging Face Transformers
- **Deployment**: Hugging Face Spaces

## 🚀 Quick Start

### Local Development

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd gemma
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Run the application**:
   ```bash
   python app.py
   ```

4. **Open your browser** and navigate to `http://localhost:7860`

### Hugging Face Spaces Deployment

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Choose "Gradio" as the SDK
3. Upload the files from this repository
4. The app will automatically deploy and be accessible via your Space URL

## 📖 How to Use

### Basic Chat
1. Type your message in the text input box
2. Click "Submit" or press Enter
3. View the AI response in the chat history

### Image Analysis
1. Upload an image using the image upload component
2. Type a question about the image (e.g., "What do you see in this image?")
3. Submit to get AI-powered image analysis

### File Processing
1. Upload a PDF or TXT file using the file upload component
2. Ask questions about the file content
3. The extracted text will be used as context for responses

### Advanced Features
- **Persistent Context**: Previous conversations are remembered
- **File Context**: Uploaded file content persists for follow-up questions
- **Clear Chat**: Reset conversation history and uploaded files

## 🔧 Configuration

### Model Configuration
The application uses Google's Gemma-2-2B-IT model from Hugging Face. The model is loaded and used for inference in the `gemma_3_inference` function in `app.py`.

### Customization
- Modify the UI theme in the `gr.Blocks` configuration
- Adjust file size limits and supported formats
- Customize the chat history display format
- Add additional file processing capabilities

## 📁 Project Structure

```
gemma/
├── .gitattributes            # Git configuration
├── .gitignore                # Git ignore file
├── .huggingface/             # Hugging Face configuration
│   └── CODEOWNERS            # Space ownership configuration
├── app.py                    # Main Gradio application
├── app_config.yaml           # Hugging Face Space configuration
├── HUGGINGFACE_DEPLOYMENT.md # Deployment instructions
├── push_to_huggingface.bat   # Windows deployment script
├── push_to_huggingface.py    # Python deployment script
├── README.md                 # Project documentation (with Space config)
├── README.space.md           # Hugging Face Space README
└── requirements.txt          # Python dependencies
```

## 🔮 Future Enhancements

- [ ] Upgrade to Gemma-3 model when available
- [ ] Support for additional file formats (DOCX, XLSX)
- [ ] Advanced image processing capabilities
- [ ] User authentication and personalized chat history
- [ ] Export chat conversations
- [ ] Multi-language support
- [ ] Voice input/output capabilities

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- Google for the Gemma model family
- Hugging Face for the amazing ecosystem and Spaces platform
- Gradio team for the intuitive UI framework

## 📞 Support

If you encounter any issues or have questions, please open an issue on the repository or contact the maintainers.

---

**Note**: This application uses Google's Gemma-2-2B-IT model. The model doesn't have native vision capabilities, but the application is designed to handle image uploads with appropriate messaging.