---
title: JARVIS Gaia Agent
emoji: 🦾
colorFrom: indigo
colorTo: green
sdk: gradio
pinned: false
license: mit
short_description: Enhanced JARVIS AI agent for GAIA benchmark
models:
- meta-llama/Llama-3.2-1B-Instruct
- sentence-transformers/all-MiniLM-L6-v2
datasets:
- gaia-benchmark/GAIA
---
# Evolved JARVIS Gaia Agent
A Python-based AI agent built with `langchain`, `langgraph`, SERPAPI, and OCR for web search, file parsing, image analysis, and data retrieval. It is deployed as a Hugging Face Space (`onisj/jarvis_gaia_agent`) to evaluate performance on the GAIA benchmark, with a target score of at least 30% (6/20 correct).
## Features
- **Web Search**: Integrates SERPAPI and DuckDuckGo for robust, multi-hop searches.
- **File Parsing**: Processes CSV, TXT, Excel, and PDF files for GAIA tasks.
- **Image Parsing**: Uses OCR (`easyocr`) to extract text from images.
- **Data Retrieval**: Includes a guest info retriever for structured queries.
- **External APIs**: Supports weather data (OpenWeatherMap) and Hugging Face Hub stats.
- **State Management**: Employs `langgraph` for multi-step reasoning workflows.
- **Exact-Match Answers**: Optimized for GAIA Level 1 questions with precise formatting (e.g., USD to two decimals, comma-separated lists).
- **Gradio Interface**: Provides a user-friendly UI for running evaluations and submitting answers.
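The exact-match formatting mentioned in the feature list (USD to two decimals, comma-separated lists) could look roughly like the sketch below. It is an illustration only, not the agent's actual code; the helper names `format_usd` and `format_list` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_usd(amount) -> str:
    """Render a number with exactly two decimals, e.g. 1500 -> '1500.00'."""
    # Going through str() avoids binary floating-point surprises when rounding.
    value = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{value:.2f}"


def format_list(items) -> str:
    """Join items with ', ', trimming whitespace and dropping empty entries."""
    cleaned = [str(item).strip() for item in items]
    return ", ".join(item for item in cleaned if item)
```

For example, `format_usd(1500)` gives `"1500.00"` and `format_list([" Burger", "Cola "])` gives `"Burger, Cola"`.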
## Directory Structure
```
jarvis_gaia_agent/
├── app.py                      # Main Gradio application with agent logic
├── state.py                    # Defines JARVISState for LangGraph state management
├── search.py                   # Web search tools (SERPAPI, multi-hop search)
├── tools/                      # Directory for all tools
│   ├── __init__.py             # Exports all tools
│   ├── file_parser.py          # Parses CSV, TXT, Excel, and PDF files
│   ├── image_parser.py         # OCR-based image parsing
│   ├── calculator.py           # Mathematical calculations
│   ├── document_retriever.py   # PDF document retrieval
│   ├── duckduckgo_search.py    # DuckDuckGo search integration
│   ├── weather_info.py         # Weather data via OpenWeatherMap
│   ├── hub_stats.py            # Hugging Face Hub statistics
│   ├── guest_info.py           # Guest information retrieval
├── requirements.txt            # Python dependencies
├── README.md                   # Project documentation
├── .gitignore                  # Excludes .env, temp/, etc.
├── temp/                       # Temporary directory for GAIA files (created at runtime)
```
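As a rough idea of what the state defined in `state.py` might contain, here is a minimal sketch of a typed state dictionary of the kind LangGraph workflows commonly pass between nodes. Every field name below, and the `new_state` helper, is an assumption made for illustration; the real definition in `state.py` may differ.

```python
from typing import Any, Dict, List, Optional, TypedDict


class JARVISState(TypedDict):
    """Hypothetical shape of the shared state; see state.py for the real fields."""
    task_id: str                  # GAIA task identifier
    question: str                 # question text given to the agent
    tools_needed: List[str]       # tool names selected for this question
    tool_results: Dict[str, Any]  # raw outputs keyed by tool name
    file_path: Optional[str]      # downloaded attachment, if any
    answer: str                   # final answer string


def new_state(task_id: str, question: str, file_path: Optional[str] = None) -> JARVISState:
    """Build an initial state with no tool results and an empty answer."""
    return JARVISState(
        task_id=task_id,
        question=question,
        tools_needed=[],
        tool_results={},
        file_path=file_path,
        answer="",
    )
```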
## Models and Datasets
- **Models**:
  - `meta-llama/Llama-3.2-1B-Instruct`: Primary LLM for reasoning and tool selection (Hugging Face Inference API or local).
  - `sentence-transformers/all-MiniLM-L6-v2`: Embedding model for text similarity tasks.
  - Note: Together AI models (`meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`, `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free`) are used via API but not hosted on Hugging Face, so they're not listed in metadata.
- **Datasets**:
  - `gaia-benchmark/GAIA`: Benchmark dataset for evaluating agent performance.
## Prerequisites
- **Python**: 3.9 or higher.
- **Tesseract OCR**: Required for image parsing.
  - macOS: `brew install tesseract`
  - Ubuntu: `sudo apt-get install tesseract-ocr`
  - Windows: Install via [Tesseract Installer](https://github.com/UB-Mannheim/tesseract/wiki).
- **API Keys**: Set in `.env` (local) or Hugging Face Space Secrets (deployment):
  - `HUGGINGFACEHUB_API_TOKEN`: Hugging Face token for model access.
  - `TOGETHER_API_KEY`: Together AI API key for LLM inference.
  - `SERPAPI_API_KEY`: SERPAPI key for web searches.
  - `OPENWEATHERMAP_API_KEY`: OpenWeatherMap key for weather queries.
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`.
- Install dependencies:
```bash
pip install -r requirements.txt
```
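Before launching the app, it can help to confirm that every key listed above is actually set. The sketch below is one possible check, not part of the repository; the function name `missing_env_vars` is hypothetical. It only reads the process environment, so a local `.env` file must already have been loaded by whatever mechanism the project uses.

```python
import os

# Variable names taken from the API Keys list above.
REQUIRED_VARS = [
    "HUGGINGFACEHUB_API_TOKEN",
    "TOGETHER_API_KEY",
    "SERPAPI_API_KEY",
    "OPENWEATHERMAP_API_KEY",
    "SPACE_ID",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` returns an empty list when everything is configured; otherwise it names the variables still to be set.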
## Setup and Local Testing
1. **Clone the Repository**:
```bash
git clone https://huggingface.co/spaces/onisj/jarvis_gaia_agent
cd jarvis_gaia_agent
```
2. **Create Virtual Environment**:
```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
```
3. **Install Dependencies**:
```bash
pip install -r requirements.txt
```
4. **Configure Environment Variables**:
Create a `.env` file:
```text
SPACE_ID=onisj/jarvis_gaia_agent
HUGGINGFACEHUB_API_TOKEN=your_hf_token
TOGETHER_API_KEY=your_together_api_key
SERPAPI_API_KEY=your_serpapi_key
OPENWEATHERMAP_API_KEY=your_openweather_key
```
5. **Test with Mock File** (optional):
```bash
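mkdir -p temp
# printf expands \n portably; plain echo leaves it literal in bash.
# Note: this writes CSV text under an .xlsx name; it is not a real Excel workbook.
printf 'Item,Type,Sales\nBurger,Food,1000\nCola,Drink,500\n' > temp/7bd855d8-463d-4ed5-93ca-5fe35145f733.xlsx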
```
6. **Run Locally**:
```bash
python app.py
```
- Open `http://127.0.0.1:7860` (port may vary).
- Log in with Hugging Face credentials.
- Click "Run Evaluation & Submit All Answers" to test GAIA tasks.
## Deployment to Hugging Face Space
1. **Push Code**:
```bash
git add .
git commit -m "Update JARVIS Gaia Agent with README metadata"
git push origin main
```
2. **Set Space Secrets**:
- Go to `https://huggingface.co/spaces/onisj/jarvis_gaia_agent` > Settings > Repository Secrets.
- Add:
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`
  - `HUGGINGFACEHUB_API_TOKEN`
  - `TOGETHER_API_KEY`
  - `SERPAPI_API_KEY`
  - `OPENWEATHERMAP_API_KEY`
3. **Build and Run**:
- Hugging Face auto-builds the Space after pushing.
- Access the Gradio interface at `https://onisj-jarvis-gaia-agent.hf.space`.
- Log in and click "Run Evaluation & Submit All Answers" to submit GAIA answers.
4. **Verify Submission**:
- Check `status_output` for:
```
Submission Successful!
User: your_username
Overall Score: XX% (Y/20 correct)
Message: ...
```
- Aim for at least 30% (6/20 correct).
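If you want to check the result programmatically rather than by eye, the status message format shown above can be parsed with a regular expression. This is a minimal sketch that assumes the message keeps the `Overall Score: XX% (Y/20 correct)` layout; `parse_score` and `meets_target` are hypothetical helpers, not functions from `app.py`.

```python
import re

# Matches e.g. "Overall Score: 35.0% (7/20 correct)".
SCORE_PATTERN = re.compile(
    r"Overall Score:\s*(?P<percent>\d+(?:\.\d+)?)%\s*"
    r"\((?P<correct>\d+)/(?P<total>\d+) correct\)"
)


def parse_score(status_text: str):
    """Return (percent, correct, total), or None if no score line is found."""
    match = SCORE_PATTERN.search(status_text)
    if match is None:
        return None
    return float(match["percent"]), int(match["correct"]), int(match["total"])


def meets_target(status_text: str, target_correct: int = 6) -> bool:
    """True if the parsed number of correct answers reaches target_correct."""
    score = parse_score(status_text)
    return score is not None and score[1] >= target_correct
```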
## Troubleshooting
- **Model Access (404)**: Verify API keys; test `initialize_llm` locally.
- **SERPAPI Timeout**: Ensure `SERPAPI_API_KEY` is valid; check `search.py` logs.
- **GAIA File Access**: Confirm `temp/` directory permissions; test `download_file`.
- **Low GAIA Score**: Analyze `results_table` for errors; enhance `multi_hop_search_tool` or answer formatting.
- **Logs**: Check Space > Settings > Logs for build/run errors.
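For transient failures such as search timeouts, one common mitigation is to retry the call with exponential backoff. The helper below is a generic sketch under that assumption; it is not taken from `search.py`, and the name `retry_with_backoff` and its parameters are hypothetical.

```python
import time


def retry_with_backoff(func, attempts=3, base_delay=1.0,
                       exceptions=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception wait base_delay * 2**n and retry.

    The last failure is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
```

A search call could then be wrapped as `retry_with_backoff(lambda: search(query))`, where `search` stands in for whichever tool function is timing out.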
## License
MIT License. See [LICENSE](LICENSE) for details.
## Acknowledgements
- Built with `langchain`, `langgraph`, and Hugging Face tools.
- Evaluated on the GAIA benchmark (`gaia-benchmark/GAIA`).