Spaces: Running on Zero
Update app.py
app.py CHANGED
@@ -1,6 +1,3 @@
-I'll create a chat application for the UserLM-8b model with a clean interface and proper GPU optimization. Since this model runs locally, I won't use the @spaces.GPU decorator as it's not needed for external model loading.
-
-```python
 import gradio as gr
 import spaces
 import torch
@@ -305,28 +302,4 @@ if __name__ == "__main__":
 show_error=True,
 server_name="0.0.0.0",
 server_port=7860,
-)
-```
-
-This chat application provides:
-
-## Key Features:
-
-1. **Clean Chat Interface**: A modern, responsive chat UI with message bubbles and avatars
-2. **Streaming Responses**: Character-by-character streaming for better UX
-3. **Customizable Settings**: Temperature, top-p, and max token controls
-4. **System Prompt**: Configurable system prompt with the default sequence example
-5. **Chat Management**: Clear, retry, and undo functionality
-6. **GPU Optimization**: Automatic GPU detection and FP16 precision on CUDA
-7. **Example Messages**: Pre-defined examples to get started quickly
-8. **Model Info Display**: Shows current device and model configuration
-
-## Technical Highlights:
-
-- **Lazy Loading**: Model loads only when first message is sent
-- **Memory Efficient**: Uses `low_cpu_mem_usage=True` and appropriate precision
-- **Proper Token Handling**: Implements the special tokens from your example
-- **State Management**: Maintains conversation history properly
-- **Error Handling**: Graceful fallback to CPU if CUDA unavailable
-
-The interface preserves your original model loading and generation logic while wrapping it in a user-friendly Gradio interface. Users can adjust parameters on the fly and have full control over the conversation flow.
+)
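The first hunk removes a note saying the `@spaces.GPU` decorator was skipped, while `import spaces` stays in app.py and the Space header reads "Running on Zero". For reference only, the usual ZeroGPU pattern decorates the GPU-bound call; the sketch below is an illustration rather than the Space's actual code, and it assumes the checkpoint id `microsoft/UserLM-8b` (the diff only says "UserLM-8b") and a hypothetical `generate` function.

```python
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "microsoft/UserLM-8b"  # assumed repo id; only "UserLM-8b" appears in the diff

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to("cuda")

@spaces.GPU  # ZeroGPU attaches a GPU to the process only while this function runs
def generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```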
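The removed "Technical Highlights" mention lazy loading, `low_cpu_mem_usage=True`, FP16 on CUDA, and a graceful CPU fallback. A minimal sketch of that loading pattern, with illustrative names not taken from app.py:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "microsoft/UserLM-8b"  # assumed repo id

_model = None
_tokenizer = None

def get_model():
    """Load the model the first time it is needed, picking device and dtype at that point."""
    global _model, _tokenizer
    if _model is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"       # fall back to CPU
        dtype = torch.float16 if device == "cuda" else torch.float32  # FP16 only on CUDA
        _tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
        _model = AutoModelForCausalLM.from_pretrained(
            MODEL_ID,
            torch_dtype=dtype,
            low_cpu_mem_usage=True,  # keeps peak CPU RAM down while the checkpoint loads
        ).to(device)
    return _model, _tokenizer
```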
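The "Streaming Responses" item in the removed feature list maps naturally onto transformers' `TextIteratorStreamer` driving a Python generator, which Gradio consumes to redraw the chat bubble on every yield. A sketch under those assumptions:

```python
from threading import Thread
from transformers import TextIteratorStreamer

def stream_reply(model, tokenizer, prompt, max_new_tokens=256, temperature=0.8, top_p=0.95):
    """Run generation in a background thread and yield the growing reply text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_kwargs = dict(
        **inputs,
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=temperature,
        top_p=top_p,
    )
    Thread(target=model.generate, kwargs=generation_kwargs).start()
    partial = ""
    for new_text in streamer:
        partial += new_text
        yield partial  # each yield updates the chat UI with the text so far
```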
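The launch arguments that survive in the second hunk (`show_error`, `server_name`, `server_port`) and the temperature/top-p/max-token controls in the removed feature list fit a standard `gr.ChatInterface` layout. A hedged sketch with a stub responder and placeholder defaults in place of the real model call:

```python
import gradio as gr

def respond(message, history, system_prompt, temperature, top_p, max_new_tokens):
    # Stub: the real app would build a prompt from `history` and stream model output here.
    yield f"(echo) {message}"

demo = gr.ChatInterface(
    fn=respond,
    additional_inputs=[
        gr.Textbox(value="You are a helpful assistant.", label="System prompt"),  # placeholder default
        gr.Slider(0.1, 2.0, value=0.8, step=0.1, label="Temperature"),
        gr.Slider(0.1, 1.0, value=0.95, step=0.05, label="Top-p"),
        gr.Slider(16, 1024, value=256, step=16, label="Max new tokens"),
    ],
)

if __name__ == "__main__":
    demo.launch(
        show_error=True,
        server_name="0.0.0.0",
        server_port=7860,
    )
```

`gr.ChatInterface` also provides clear, retry, and undo controls out of the box, which lines up with the removed "Chat Management" item.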