Spaces:
Paused
Paused
Upload 60 files
Browse files- Dockerfile +4 -0
- RUNTIME_FIXES_SUMMARY.md +40 -17
- app/main.py +9 -7
- app/services/database_service.py +1 -1
- validate_fixes.py +57 -2
Dockerfile
CHANGED
@@ -13,6 +13,10 @@ RUN apt-get update && apt-get install -y \
|
|
13 |
# Create volume-safe directories with proper permissions
|
14 |
RUN mkdir -p /app/data /app/cache && chmod -R 777 /app/data /app/cache
|
15 |
|
|
|
|
|
|
|
|
|
16 |
# Copy all project files
|
17 |
COPY . .
|
18 |
|
|
|
13 |
# Create volume-safe directories with proper permissions
|
14 |
RUN mkdir -p /app/data /app/cache && chmod -R 777 /app/data /app/cache
|
15 |
|
16 |
+
# Set environment variables for Hugging Face cache
|
17 |
+
ENV TRANSFORMERS_CACHE=/app/cache
|
18 |
+
ENV HF_HOME=/app/cache
|
19 |
+
|
20 |
# Copy all project files
|
21 |
COPY . .
|
22 |
|
RUNTIME_FIXES_SUMMARY.md
CHANGED
@@ -1,25 +1,25 @@
|
|
1 |
# Runtime Fixes Summary
|
2 |
|
3 |
## Overview
|
4 |
-
This document summarizes the fixes applied to resolve runtime errors in the Legal Dashboard OCR application, specifically addressing:
|
5 |
|
6 |
1. **SQLite Database Path Issues** (`sqlite3.OperationalError: unable to open database file`)
|
7 |
2. **Hugging Face Transformers Cache Permissions** (`/.cache` not writable)
|
8 |
|
9 |
-
## π§ Fixes Applied
|
10 |
|
11 |
### 1. SQLite Database Path Fix
|
12 |
|
13 |
**File Modified:** `app/services/database_service.py`
|
14 |
|
15 |
**Changes:**
|
16 |
-
- Updated default database path
|
17 |
- Added directory creation with `os.makedirs(os.path.dirname(self.db_path), exist_ok=True)`
|
18 |
- Added `check_same_thread=False` parameter for better thread safety
|
19 |
|
20 |
**Code Changes:**
|
21 |
```python
|
22 |
-
def __init__(self, db_path: str = "/app/data/
|
23 |
self.db_path = db_path
|
24 |
self.connection = None
|
25 |
# Create directory if it doesn't exist
|
@@ -38,30 +38,36 @@ def _init_database(self):
|
|
38 |
**File Modified:** `app/main.py`
|
39 |
|
40 |
**Changes:**
|
41 |
-
- Added
|
42 |
-
-
|
43 |
-
- Ensured
|
44 |
|
45 |
**Code Changes:**
|
46 |
```python
|
47 |
-
#
|
48 |
-
os.environ["TRANSFORMERS_CACHE"] = "/app/cache"
|
49 |
os.makedirs("/app/cache", exist_ok=True)
|
|
|
|
|
50 |
```
|
51 |
|
52 |
-
### 3. Dockerfile Updates
|
53 |
|
54 |
**File Modified:** `Dockerfile`
|
55 |
|
56 |
**Changes:**
|
57 |
- Added directory creation for `/app/data` and `/app/cache`
|
58 |
- Set proper permissions (777) for both directories
|
|
|
59 |
- Ensured directories are created before copying application files
|
60 |
|
61 |
**Code Changes:**
|
62 |
```dockerfile
|
63 |
# Create volume-safe directories with proper permissions
|
64 |
RUN mkdir -p /app/data /app/cache && chmod -R 777 /app/data /app/cache
|
|
|
|
|
|
|
|
|
65 |
```
|
66 |
|
67 |
### 4. Docker Ignore Updates
|
@@ -82,10 +88,10 @@ cache/
|
|
82 |
|
83 |
## π― Expected Results
|
84 |
|
85 |
-
After applying these fixes, the application should:
|
86 |
|
87 |
1. **Database Operations:**
|
88 |
-
- Successfully create and access SQLite database at `/app/data/
|
89 |
- No more `sqlite3.OperationalError: unable to open database file` errors
|
90 |
- Database persists across container restarts
|
91 |
|
@@ -93,19 +99,22 @@ After applying these fixes, the application should:
|
|
93 |
- Successfully download and cache models in `/app/cache`
|
94 |
- No more cache permission errors
|
95 |
- Models load correctly on first run
|
|
|
96 |
|
97 |
3. **Container Deployment:**
|
98 |
- Builds successfully on Hugging Face Docker SDK
|
99 |
- Runs without permission-related runtime errors
|
100 |
- Maintains data persistence in volume-safe directories
|
|
|
101 |
|
102 |
## π§ͺ Validation
|
103 |
|
104 |
-
A validation script has been created (`validate_fixes.py`) that tests:
|
105 |
|
106 |
- Database path creation and access
|
107 |
- Cache directory setup and permissions
|
108 |
-
- Dockerfile configuration
|
|
|
109 |
- Docker ignore settings
|
110 |
|
111 |
Run the validation script to verify all fixes are working:
|
@@ -122,7 +131,7 @@ After fixes, the container will have this structure:
|
|
122 |
```
|
123 |
/app/
|
124 |
βββ data/ # Database storage (persistent)
|
125 |
-
β βββ
|
126 |
βββ cache/ # HF model cache (persistent)
|
127 |
β βββ transformers/
|
128 |
βββ app/ # Application code
|
@@ -145,5 +154,19 @@ The application is now ready for deployment on Hugging Face Spaces with:
|
|
145 |
2. **No cache permission errors**
|
146 |
3. **Persistent data storage**
|
147 |
4. **Proper model caching**
|
148 |
-
|
149 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
# Runtime Fixes Summary
|
2 |
|
3 |
## Overview
|
4 |
+
This document summarizes the complete fixes applied to resolve runtime errors in the Legal Dashboard OCR application, specifically addressing:
|
5 |
|
6 |
1. **SQLite Database Path Issues** (`sqlite3.OperationalError: unable to open database file`)
|
7 |
2. **Hugging Face Transformers Cache Permissions** (`/.cache` not writable)
|
8 |
|
9 |
+
## π§ Complete Fixes Applied
|
10 |
|
11 |
### 1. SQLite Database Path Fix
|
12 |
|
13 |
**File Modified:** `app/services/database_service.py`
|
14 |
|
15 |
**Changes:**
|
16 |
+
- Updated default database path to `/app/data/legal_dashboard.db`
|
17 |
- Added directory creation with `os.makedirs(os.path.dirname(self.db_path), exist_ok=True)`
|
18 |
- Added `check_same_thread=False` parameter for better thread safety
|
19 |
|
20 |
**Code Changes:**
|
21 |
```python
|
22 |
+
def __init__(self, db_path: str = "/app/data/legal_dashboard.db"):
|
23 |
self.db_path = db_path
|
24 |
self.connection = None
|
25 |
# Create directory if it doesn't exist
|
|
|
38 |
**File Modified:** `app/main.py`
|
39 |
|
40 |
**Changes:**
|
41 |
+
- Added directory creation for both `/app/cache` and `/app/data`
|
42 |
+
- Set environment variable `TRANSFORMERS_CACHE` to `/app/cache`
|
43 |
+
- Ensured directories are created before any imports
|
44 |
|
45 |
**Code Changes:**
|
46 |
```python
|
47 |
+
# Create directories and set environment variables
|
|
|
48 |
os.makedirs("/app/cache", exist_ok=True)
|
49 |
+
os.makedirs("/app/data", exist_ok=True)
|
50 |
+
os.environ["TRANSFORMERS_CACHE"] = "/app/cache"
|
51 |
```
|
52 |
|
53 |
+
### 3. Dockerfile Complete Updates
|
54 |
|
55 |
**File Modified:** `Dockerfile`
|
56 |
|
57 |
**Changes:**
|
58 |
- Added directory creation for `/app/data` and `/app/cache`
|
59 |
- Set proper permissions (777) for both directories
|
60 |
+
- Added environment variables `TRANSFORMERS_CACHE` and `HF_HOME`
|
61 |
- Ensured directories are created before copying application files
|
62 |
|
63 |
**Code Changes:**
|
64 |
```dockerfile
|
65 |
# Create volume-safe directories with proper permissions
|
66 |
RUN mkdir -p /app/data /app/cache && chmod -R 777 /app/data /app/cache
|
67 |
+
|
68 |
+
# Set environment variables for Hugging Face cache
|
69 |
+
ENV TRANSFORMERS_CACHE=/app/cache
|
70 |
+
ENV HF_HOME=/app/cache
|
71 |
```
|
72 |
|
73 |
### 4. Docker Ignore Updates
|
|
|
88 |
|
89 |
## π― Expected Results
|
90 |
|
91 |
+
After applying these complete fixes, the application should:
|
92 |
|
93 |
1. **Database Operations:**
|
94 |
+
- Successfully create and access SQLite database at `/app/data/legal_dashboard.db`
|
95 |
- No more `sqlite3.OperationalError: unable to open database file` errors
|
96 |
- Database persists across container restarts
|
97 |
|
|
|
99 |
- Successfully download and cache models in `/app/cache`
|
100 |
- No more cache permission errors
|
101 |
- Models load correctly on first run
|
102 |
+
- Environment variables properly set for HF cache
|
103 |
|
104 |
3. **Container Deployment:**
|
105 |
- Builds successfully on Hugging Face Docker SDK
|
106 |
- Runs without permission-related runtime errors
|
107 |
- Maintains data persistence in volume-safe directories
|
108 |
+
- FastAPI boots without SQLite errors
|
109 |
|
110 |
## π§ͺ Validation
|
111 |
|
112 |
+
A comprehensive validation script has been created (`validate_fixes.py`) that tests:
|
113 |
|
114 |
- Database path creation and access
|
115 |
- Cache directory setup and permissions
|
116 |
+
- Dockerfile configuration with environment variables
|
117 |
+
- Main.py updates for directory creation
|
118 |
- Docker ignore settings
|
119 |
|
120 |
Run the validation script to verify all fixes are working:
|
|
|
131 |
```
|
132 |
/app/
|
133 |
βββ data/ # Database storage (persistent)
|
134 |
+
β βββ legal_dashboard.db
|
135 |
βββ cache/ # HF model cache (persistent)
|
136 |
β βββ transformers/
|
137 |
βββ app/ # Application code
|
|
|
154 |
2. **No cache permission errors**
|
155 |
3. **Persistent data storage**
|
156 |
4. **Proper model caching**
|
157 |
+
5. **Environment variables properly configured**
|
158 |
+
6. **FastAPI boots successfully on port 7860**
|
159 |
+
|
160 |
+
All runtime errors related to file permissions, database access, and Hugging Face cache should be completely resolved.
|
161 |
+
|
162 |
+
## β
Complete Fix Checklist
|
163 |
+
|
164 |
+
- [x] SQLite database path updated to `/app/data/legal_dashboard.db`
|
165 |
+
- [x] Database directory creation with proper permissions
|
166 |
+
- [x] Hugging Face cache directory set to `/app/cache`
|
167 |
+
- [x] Environment variables `TRANSFORMERS_CACHE` and `HF_HOME` configured
|
168 |
+
- [x] Dockerfile updated with directory creation and environment variables
|
169 |
+
- [x] Main.py updated with directory creation and environment setup
|
170 |
+
- [x] Docker ignore updated to exclude cache directories
|
171 |
+
- [x] Validation script created to test all fixes
|
172 |
+
- [x] Documentation updated with complete fix summary
|
app/main.py
CHANGED
@@ -8,14 +8,14 @@ Features real-time document processing, AI scoring, and WebSocket support.
|
|
8 |
Run with: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
|
9 |
"""
|
10 |
|
11 |
-
from .models.document_models import LegalDocument
|
12 |
-
from .services.ai_service import AIScoringEngine
|
13 |
-
from .services.database_service import DatabaseManager
|
14 |
-
from .services.ocr_service import OCRPipeline
|
15 |
from .api import documents, ocr, dashboard
|
|
|
|
|
|
|
|
|
|
|
16 |
import asyncio
|
17 |
import logging
|
18 |
-
import os
|
19 |
from fastapi import FastAPI, HTTPException, BackgroundTasks, WebSocket, WebSocketDisconnect, UploadFile, File
|
20 |
from fastapi.middleware.cors import CORSMiddleware
|
21 |
from fastapi.responses import HTMLResponse, JSONResponse
|
@@ -25,9 +25,11 @@ from pydantic import BaseModel
|
|
25 |
import tempfile
|
26 |
from pathlib import Path
|
27 |
|
28 |
-
#
|
29 |
-
os.environ["TRANSFORMERS_CACHE"] = "/app/cache"
|
30 |
os.makedirs("/app/cache", exist_ok=True)
|
|
|
|
|
|
|
31 |
|
32 |
# Import our modules
|
33 |
|
|
|
8 |
Run with: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
|
9 |
"""
|
10 |
|
|
|
|
|
|
|
|
|
11 |
from .api import documents, ocr, dashboard
|
12 |
+
from .services.ocr_service import OCRPipeline
|
13 |
+
from .services.database_service import DatabaseManager
|
14 |
+
from .services.ai_service import AIScoringEngine
|
15 |
+
from .models.document_models import LegalDocument
|
16 |
+
import os
|
17 |
import asyncio
|
18 |
import logging
|
|
|
19 |
from fastapi import FastAPI, HTTPException, BackgroundTasks, WebSocket, WebSocketDisconnect, UploadFile, File
|
20 |
from fastapi.middleware.cors import CORSMiddleware
|
21 |
from fastapi.responses import HTMLResponse, JSONResponse
|
|
|
25 |
import tempfile
|
26 |
from pathlib import Path
|
27 |
|
28 |
+
# Create directories and set environment variables
|
|
|
29 |
os.makedirs("/app/cache", exist_ok=True)
|
30 |
+
os.makedirs("/app/data", exist_ok=True)
|
31 |
+
os.environ["TRANSFORMERS_CACHE"] = "/app/cache"
|
32 |
+
|
33 |
|
34 |
# Import our modules
|
35 |
|
app/services/database_service.py
CHANGED
@@ -20,7 +20,7 @@ logger = logging.getLogger(__name__)
|
|
20 |
class DatabaseManager:
|
21 |
"""Database manager for legal documents"""
|
22 |
|
23 |
-
def __init__(self, db_path: str = "/app/data/
|
24 |
self.db_path = db_path
|
25 |
self.connection = None
|
26 |
# Create directory if it doesn't exist
|
|
|
20 |
class DatabaseManager:
|
21 |
"""Database manager for legal documents"""
|
22 |
|
23 |
+
def __init__(self, db_path: str = "/app/data/legal_dashboard.db"):
|
24 |
self.db_path = db_path
|
25 |
self.connection = None
|
26 |
# Create directory if it doesn't exist
|
validate_fixes.py
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
#!/usr/bin/env python3
|
2 |
"""
|
3 |
Validation Script for Database and Cache Fixes
|
4 |
-
|
5 |
|
6 |
Tests the fixes for:
|
7 |
1. SQLite database path issues
|
@@ -23,7 +23,7 @@ def test_database_path():
|
|
23 |
# Test the new database path
|
24 |
from app.services.database_service import DatabaseManager
|
25 |
|
26 |
-
# Test with default path (should be /app/data/
|
27 |
db = DatabaseManager()
|
28 |
print("β
Database manager initialized with default path")
|
29 |
|
@@ -115,6 +115,19 @@ def test_dockerfile_updates():
|
|
115 |
print("β Permission setting command missing")
|
116 |
return False
|
117 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
118 |
return True
|
119 |
|
120 |
except Exception as e:
|
@@ -122,6 +135,46 @@ def test_dockerfile_updates():
|
|
122 |
return False
|
123 |
|
124 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
125 |
def test_dockerignore_updates():
|
126 |
"""Test .dockerignore updates"""
|
127 |
print("\nπ Testing .dockerignore updates...")
|
@@ -169,6 +222,7 @@ def main():
|
|
169 |
test_database_path,
|
170 |
test_cache_directory,
|
171 |
test_dockerfile_updates,
|
|
|
172 |
test_dockerignore_updates
|
173 |
]
|
174 |
|
@@ -197,6 +251,7 @@ def main():
|
|
197 |
print("\nβ
Runtime errors should be resolved:")
|
198 |
print(" β’ SQLite database path fixed")
|
199 |
print(" β’ Hugging Face cache permissions fixed")
|
|
|
200 |
print(" β’ Docker container ready for deployment")
|
201 |
return 0
|
202 |
else:
|
|
|
1 |
#!/usr/bin/env python3
|
2 |
"""
|
3 |
Validation Script for Database and Cache Fixes
|
4 |
+
============================================
|
5 |
|
6 |
Tests the fixes for:
|
7 |
1. SQLite database path issues
|
|
|
23 |
# Test the new database path
|
24 |
from app.services.database_service import DatabaseManager
|
25 |
|
26 |
+
# Test with default path (should be /app/data/legal_dashboard.db)
|
27 |
db = DatabaseManager()
|
28 |
print("β
Database manager initialized with default path")
|
29 |
|
|
|
115 |
print("β Permission setting command missing")
|
116 |
return False
|
117 |
|
118 |
+
# Check for environment variables
|
119 |
+
if "ENV TRANSFORMERS_CACHE=/app/cache" in content:
|
120 |
+
print("β
TRANSFORMERS_CACHE environment variable found")
|
121 |
+
else:
|
122 |
+
print("β TRANSFORMERS_CACHE environment variable missing")
|
123 |
+
return False
|
124 |
+
|
125 |
+
if "ENV HF_HOME=/app/cache" in content:
|
126 |
+
print("β
HF_HOME environment variable found")
|
127 |
+
else:
|
128 |
+
print("β HF_HOME environment variable missing")
|
129 |
+
return False
|
130 |
+
|
131 |
return True
|
132 |
|
133 |
except Exception as e:
|
|
|
135 |
return False
|
136 |
|
137 |
|
138 |
+
def test_main_py_updates():
|
139 |
+
"""Test main.py updates"""
|
140 |
+
print("\nπ Testing main.py updates...")
|
141 |
+
|
142 |
+
try:
|
143 |
+
main_py_path = "app/main.py"
|
144 |
+
if not os.path.exists(main_py_path):
|
145 |
+
print("β main.py not found")
|
146 |
+
return False
|
147 |
+
|
148 |
+
with open(main_py_path, 'r') as f:
|
149 |
+
content = f.read()
|
150 |
+
|
151 |
+
# Check for directory creation
|
152 |
+
if "os.makedirs(\"/app/cache\", exist_ok=True)" in content:
|
153 |
+
print("β
Cache directory creation found")
|
154 |
+
else:
|
155 |
+
print("β Cache directory creation missing")
|
156 |
+
return False
|
157 |
+
|
158 |
+
if "os.makedirs(\"/app/data\", exist_ok=True)" in content:
|
159 |
+
print("β
Data directory creation found")
|
160 |
+
else:
|
161 |
+
print("β Data directory creation missing")
|
162 |
+
return False
|
163 |
+
|
164 |
+
# Check for environment variable setting
|
165 |
+
if "os.environ[\"TRANSFORMERS_CACHE\"] = \"/app/cache\"" in content:
|
166 |
+
print("β
TRANSFORMERS_CACHE environment variable setting found")
|
167 |
+
else:
|
168 |
+
print("β TRANSFORMERS_CACHE environment variable setting missing")
|
169 |
+
return False
|
170 |
+
|
171 |
+
return True
|
172 |
+
|
173 |
+
except Exception as e:
|
174 |
+
print(f"β main.py test failed: {e}")
|
175 |
+
return False
|
176 |
+
|
177 |
+
|
178 |
def test_dockerignore_updates():
|
179 |
"""Test .dockerignore updates"""
|
180 |
print("\nπ Testing .dockerignore updates...")
|
|
|
222 |
test_database_path,
|
223 |
test_cache_directory,
|
224 |
test_dockerfile_updates,
|
225 |
+
test_main_py_updates,
|
226 |
test_dockerignore_updates
|
227 |
]
|
228 |
|
|
|
251 |
print("\nβ
Runtime errors should be resolved:")
|
252 |
print(" β’ SQLite database path fixed")
|
253 |
print(" β’ Hugging Face cache permissions fixed")
|
254 |
+
print(" β’ Environment variables properly set")
|
255 |
print(" β’ Docker container ready for deployment")
|
256 |
return 0
|
257 |
else:
|