---
title: Gemma-2 Multimodal Chat
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---

# 🚀 Gemma-2 Multimodal Chat Application

A sophisticated Gradio-based chat application featuring multimodal capabilities with Google's Gemma-2 model.

## ✨ Features

- 💬 **Interactive Chat Interface**: Persistent conversation history with context awareness
- 🖼️ **Vision Capabilities**: Upload and analyze images with AI-powered insights
- 📄 **File Processing**: Support for PDF and TXT file uploads with text extraction
- 🧠 **Contextual Responses**: Maintains conversation context for follow-up questions
- 🎨 **Modern UI**: Clean, responsive interface built with Gradio
- 🔄 **State Management**: Persistent chat history and file context across interactions

## 🛠️ Technologies Used

- **Frontend**: Gradio 4.0+
- **AI Model**: Google's Gemma-2-2B-IT
- **File Processing**: PyPDF2 for PDFs, PIL for images
- **Backend**: Python with Hugging Face Transformers
- **Deployment**: Hugging Face Spaces

## 🚀 Quick Start

### Local Development

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd gemma
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Run the application**:
   ```bash
   python app.py
   ```

4. **Open your browser** and navigate to `http://localhost:7860`

### Hugging Face Spaces Deployment

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Choose "Gradio" as the SDK
3. Upload the files from this repository
4. The app will automatically deploy and be accessible via your Space URL

## 📖 How to Use

### Basic Chat

1. Type your message in the text input box
2. Click "Submit" or press Enter
3. View the AI response in the chat history

### Image Analysis

1. Upload an image using the image upload component
2. Type a question about the image (e.g., "What do you see in this image?")
3. Submit to get AI-powered image analysis

### File Processing

1. Upload a PDF or TXT file using the file upload component
2. Ask questions about the file content
3. The extracted text will be used as context for responses

### Advanced Features

- **Persistent Context**: Previous conversations are remembered
- **File Context**: Uploaded file content persists for follow-up questions
- **Clear Chat**: Reset conversation history and uploaded files

## 🔧 Configuration

### Model Configuration

The application uses Google's Gemma-2-2B-IT model from Hugging Face. The model is loaded and used for inference in the `gemma_3_inference` function in `app.py`.

### Customization

- Modify the UI theme in the `gr.Blocks` configuration
- Adjust file size limits and supported formats
- Customize the chat history display format
- Add additional file processing capabilities

## 📁 Project Structure

```
gemma/
├── .gitattributes              # Git configuration
├── .gitignore                  # Git ignore file
├── .huggingface/               # Hugging Face configuration
│   └── CODEOWNERS              # Space ownership configuration
├── app.py                      # Main Gradio application
├── app_config.yaml             # Hugging Face Space configuration
├── HUGGINGFACE_DEPLOYMENT.md   # Deployment instructions
├── push_to_huggingface.bat     # Windows deployment script
├── push_to_huggingface.py      # Python deployment script
├── README.md                   # Project documentation (with Space config)
├── README.space.md             # Hugging Face Space README
└── requirements.txt            # Python dependencies
```

## 🔮 Future Enhancements

- [ ] Upgrade to Gemma-3 model when available
- [ ] Support for additional file formats (DOCX, XLSX)
- [ ] Advanced image processing capabilities
- [ ] User authentication and personalized chat history
- [ ] Export chat conversations
- [ ] Multi-language support
- [ ] Voice input/output capabilities

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- Google for the Gemma model family
- Hugging Face for the amazing ecosystem and Spaces platform
- Gradio team for the intuitive UI framework

## 📞 Support

If you encounter any issues or have questions, please open an issue on the repository or contact the maintainers.

---

**Note**: This application uses Google's Gemma-2-2B-IT model. The model doesn't have native vision capabilities, but the application is designed to handle image uploads with appropriate messaging.
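
For readers who want a feel for the persistent-context behavior described under Advanced Features, the following is a minimal sketch, not the code in `app.py`. It assumes a hypothetical `ChatState` container and a plain-text `User:` / `Assistant:` prompt layout; the names `ChatState`, `build_prompt`, `clear_chat`, `max_turns`, and `max_file_chars` are illustrative only, and the actual application may instead rely on the tokenizer's chat template.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ChatState:
    """Hypothetical per-session state (illustrative, not taken from app.py)."""
    # Completed (user message, assistant reply) turns, oldest first.
    history: List[Tuple[str, str]] = field(default_factory=list)
    # Text extracted from the most recently uploaded PDF/TXT file, if any.
    file_context: Optional[str] = None


def build_prompt(state: ChatState, message: str,
                 max_turns: int = 5, max_file_chars: int = 2000) -> str:
    """Assemble a plain-text prompt from file context, recent turns, and the new message."""
    parts = []
    if state.file_context:
        # Truncate long documents so the prompt stays small for a 2B model.
        parts.append("File content:\n" + state.file_context[:max_file_chars])
    # Keep only the most recent turns; older turns are dropped, not summarized.
    recent = state.history[-max_turns:] if max_turns > 0 else []
    for user_msg, bot_msg in recent:
        parts.append(f"User: {user_msg}\nAssistant: {bot_msg}")
    # Leave the final assistant slot open for the model to complete.
    parts.append(f"User: {message}\nAssistant:")
    return "\n\n".join(parts)


def clear_chat(state: ChatState) -> None:
    """Reset both the conversation history and the uploaded-file context."""
    state.history.clear()
    state.file_context = None
```

In this sketch, each completed exchange would be appended to `state.history` after the model replies, so a follow-up question sees both the earlier turns and the uploaded file text. Calling `clear_chat(state)` mirrors the **Clear Chat** behavior by dropping both.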