nazdridoy committed on
Commit 6cb2ff0 · verified · 1 Parent(s): f6b1c52

feat(app): add Gradio AI chat and image generation app


- [add] Implement new Gradio web application (app.py)
- [feat] Add chat completion function with token management (app.py:chat_respond())
- [feat] Add image generation function with token management (app.py:generate_image())
- [feat] Implement image dimension validation utility (app.py:validate_dimensions())
- [feat] Set up Gradio UI with Chat Assistant and Image Generator tabs (app.py)
- [feat] Add handler for image generation button (app.py:on_generate_image())
- [add] Create new module for HF-Inferoxy proxy token utilities (hf_token_utils.py)
- [add] Define function to provision proxy tokens (hf_token_utils.py:get_proxy_token())
- [add] Define function to report token usage status (hf_token_utils.py:report_token_status())

Files changed (5)
  1. .gitattributes +35 -0
  2. README.md +272 -0
  3. app.py +432 -0
  4. hf_token_utils.py +83 -0
  5. requirements.txt +4 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,272 @@
+ ---
+ title: HF-Inferoxy AI Hub
+ emoji: 🚀
+ colorFrom: purple
+ colorTo: blue
+ sdk: gradio
+ app_file: app.py
+ pinned: false
+ ---
+
+ # 🚀 HF-Inferoxy AI Hub
+
+ A comprehensive AI platform that combines conversational AI and text-to-image generation capabilities with intelligent HuggingFace API token management through HF-Inferoxy.
+
+ ## ✨ Features
+
+ ### 💬 Chat Assistant
+ - **🤖 Smart Conversations**: Advanced chat interface with streaming responses
+ - **🎯 Model Flexibility**: Support for any HuggingFace chat model
+ - **⚙️ Customizable Parameters**: Control temperature, top-p, max tokens, and system messages
+ - **🌐 Multi-Provider Support**: Works with Cerebras, Cohere, Groq, Together, and more
+
+ ### 🎨 Image Generator
+ - **🖼️ Text-to-Image Generation**: Create stunning images from text descriptions
+ - **🎛️ Advanced Controls**: Fine-tune dimensions, inference steps, guidance scale, and seeds
+ - **🎯 Multiple Providers**: HF Inference, Fal.ai, Nebius, NScale, Replicate, Together
+ - **📱 Beautiful UI**: Modern interface with preset configurations and examples
+
+ ### 🔄 Smart Token Management
+ - **🚀 Automatic Token Provisioning**: No manual token management required
+ - **⚡ Intelligent Rotation**: Automatic switching when tokens fail or reach limits
+ - **🛡️ Error Resilience**: Failed tokens are quarantined and replaced seamlessly
+ - **📊 Usage Tracking**: Comprehensive monitoring of token usage and errors
+
+ ## 🛠️ Setup
+
+ ### 1. HuggingFace Space Secrets
+
+ Add the following secret to your HuggingFace Space:
+
+ - **Key**: `PROXY_KEY`
+ - **Value**: Your HF-Inferoxy proxy API key
+
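The app reads this secret at request time. A minimal sketch of the lookup (`app.py` yields an error message into the UI instead of raising; the exception here is just for illustration):

```python
import os

# PROXY_KEY is set under Space settings -> Secrets; app.py checks it per request.
proxy_api_key = os.getenv("PROXY_KEY")
if not proxy_api_key:
    raise RuntimeError("PROXY_KEY not set - add it in your Space settings")
```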
+ ### 2. HF-Inferoxy Server
+
+ The app is configured to use the HF-Inferoxy server at: `http://scw.nazdev.tech:11155`
+
+ ### 3. Dependencies
+
+ The app requires:
+ - `gradio` - Modern web interface framework
+ - `huggingface-hub` - HuggingFace API integration
+ - `requests` - HTTP communication with the proxy
+ - `Pillow` - Image processing capabilities
+ - `torch` & `transformers` - Model support (not pinned in `requirements.txt`; add them there if needed)
+
+ ## 🎯 How It Works
+
+ ### Token Management Flow
+ 1. **Token Provisioning**: The app requests a valid token from the HF-Inferoxy server
+ 2. **API Calls**: Uses the provisioned token for HuggingFace API requests
+ 3. **Status Reporting**: Reports token usage success/failure back to the proxy
+ 4. **Automatic Rotation**: HF-Inferoxy handles token rotation and error management (see the sketch below)
+
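A minimal sketch of this flow using the helpers added in `hf_token_utils.py` (model and message are placeholders; error handling is simplified relative to `app.py`):

```python
import os
from huggingface_hub import InferenceClient
from hf_token_utils import get_proxy_token, report_token_status

proxy_api_key = os.environ["PROXY_KEY"]

# 1. Provision a token from the HF-Inferoxy server.
token, token_id = get_proxy_token(api_key=proxy_api_key)

# 2. Use it like a normal HuggingFace token.
client = InferenceClient(provider="auto", api_key=token)
try:
    result = client.chat_completion(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    # 3. Report success so the token stays in rotation.
    report_token_status(token_id, "success", api_key=proxy_api_key)
except Exception as e:
    # 3b. Report the failure; the proxy quarantines/rotates the token (step 4).
    report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
    raise
```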
+ ### Chat Assistant
+ 1. **Model Selection**: Choose any HuggingFace model, with an optional provider suffix (parsed as shown below)
+ 2. **Conversation**: Engage in natural conversations with streaming responses
+ 3. **Customization**: Adjust the AI's personality with system messages and parameters
+
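The optional provider suffix is handled by a simple split; this is the logic from `chat_respond()` in `app.py`:

```python
# "model:provider" -> (model, provider); no colon -> provider auto-routing.
if ":" in model_name:
    model, provider = model_name.split(":", 1)
else:
    model = model_name
    provider = None  # InferenceClient is then created with provider="auto"
```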
+ ### Image Generation
+ 1. **Prompt Creation**: Write detailed descriptions of desired images
+ 2. **Model & Provider**: Select from preset combinations or specify custom ones
+ 3. **Parameter Tuning**: Fine-tune generation settings for optimal results
+ 4. **Image Creation**: Generate high-quality images with automatic token management
+
+ ## 🌟 Supported Models & Providers
+
+ ### Chat Models
+
+ | Model | Provider | Description |
+ |-------|----------|-------------|
+ | `openai/gpt-oss-20b` | Fireworks AI, Cerebras, Groq | Fast general-purpose model |
+ | `meta-llama/Llama-2-7b-chat-hf` | HF Inference | Chat-optimized model |
+ | `mistralai/Mistral-7B-Instruct-v0.2` | Featherless AI | Instruction following |
+ | `CohereLabs/c4ai-command-r-plus` | Cohere | Advanced language model |
+
+ ### Image Models
+
+ | Model | Provider | Description |
+ |-------|----------|-------------|
+ | `stabilityai/stable-diffusion-xl-base-1.0` | HF Inference, NScale | High-quality SDXL model |
+ | `black-forest-labs/FLUX.1-dev` | Nebius, Together | State-of-the-art image model |
+ | `Qwen/Qwen-Image` | Fal.ai, Replicate | Advanced image generation |
+
+ ## 🎨 Usage Examples
+
+ ### Chat Assistant
+
+ #### Basic Conversation
+ 1. Go to the "💬 Chat Assistant" tab
+ 2. Type your message in the chat input
+ 3. Adjust parameters if needed (temperature, model, etc.)
+ 4. Watch the AI respond with streaming text
+
+ #### Custom Model with Provider
+ ```
+ Model Name: openai/gpt-oss-20b:fireworks-ai
+ System Message: You are a helpful coding assistant specializing in Python.
+ ```
+
+ ### Image Generation
+
+ #### Basic Image Creation
+ 1. Go to the "🎨 Image Generator" tab
+ 2. Enter your prompt: "A serene mountain lake at sunset, photorealistic, 8k"
+ 3. Choose a model and provider
+ 4. Click "🎨 Generate Image"
+
+ #### Advanced Settings
+ - **Dimensions**: 1024x1024 (must be divisible by 8; see the helper sketched below)
+ - **Inference Steps**: 20-50 for good quality
+ - **Guidance Scale**: 7-10 for following prompts closely
+ - **Negative Prompt**: "blurry, low quality, distorted"
+
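`app.py` enforces the divisible-by-8 rule with `validate_dimensions()`, which rejects invalid sizes. If you would rather auto-correct than reject, a hypothetical helper (not part of the app; the name and clamping range are assumptions) might look like:

```python
def snap_dimension(value: int, lo: int = 256, hi: int = 2048) -> int:
    """Clamp to the UI's slider range and round to the nearest multiple of 8."""
    snapped = round(value / 8) * 8
    return max(lo, min(hi, snapped))

print(snap_dimension(1000))  # 1000 (already divisible by 8)
print(snap_dimension(1023))  # 1024
```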
+ ## ⚙️ Configuration Options
+
+ ### Chat Parameters
+ - **System Message**: Define the AI's personality and behavior
+ - **Max New Tokens**: Control response length (1-4096)
+ - **Temperature**: Creativity level (0.1-2.0)
+ - **Top-p**: Response diversity (0.1-1.0)
+
+ ### Image Parameters
+ - **Prompt**: Detailed description of the desired image
+ - **Negative Prompt**: What to avoid in the image
+ - **Dimensions**: Width and height (256-2048, divisible by 8)
+ - **Inference Steps**: Quality vs. speed trade-off (10-100)
+ - **Guidance Scale**: Prompt adherence (1.0-20.0)
+ - **Seed**: Reproducibility (-1 for random); a full example call follows this list
+
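These parameters map one-to-one onto `generate_image()` in `app.py`, so it can also be called outside the UI; the values below are the defaults:

```python
from app import generate_image

# Returns (PIL image or None, status message) - the same pair the UI displays.
image, status = generate_image(
    prompt="A serene mountain lake at sunset, photorealistic, 8k",
    model_name="stabilityai/stable-diffusion-xl-base-1.0",
    provider="hf-inference",
    negative_prompt="blurry, low quality, distorted",
    width=1024,               # 256-2048, divisible by 8
    height=1024,
    num_inference_steps=20,   # 10-100
    guidance_scale=7.5,       # 1.0-20.0
    seed=-1,                  # -1 = random
)
print(status)
```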
+ ## 🎯 Provider-Specific Features
+
+ ### Chat Providers
+ - **Fireworks AI**: Fast and reliable inference service
+ - **Cerebras**: High-performance inference with low latency
+ - **Cohere**: Advanced language models with multilingual support
+ - **Groq**: Ultra-fast inference, optimized for speed
+ - **Together**: Collaborative AI hosting, wide model support
+ - **Featherless AI**: Specialized fine-tuned models
+
+ ### Image Providers
+ - **HF Inference**: Core API with comprehensive model support
+ - **Fal.ai**: High-quality image generation with fast processing
+ - **Nebius**: Cloud-native services with enterprise features
+ - **NScale**: Optimized inference performance
+ - **Replicate**: Collaborative AI hosting with version control
+ - **Together**: Fast inference service with wide model support
+
+ ## 💡 Tips for Better Results
+
+ ### Chat Tips
+ - **Clear Instructions**: Be specific about what you want
+ - **System Messages**: Set context and personality upfront
+ - **Model Selection**: Choose appropriate models for your task
+ - **Parameter Tuning**: Lower temperature for factual responses, higher for creativity
+
+ ### Image Tips
+ - **Detailed Prompts**: Use specific, descriptive language
+ - **Style Keywords**: Include art style, lighting, and quality descriptors
+ - **Negative Prompts**: Specify what you don't want, which helps avoid common issues
+ - **Aspect Ratios**: Consider the subject when choosing dimensions
+ - **Provider Testing**: Try different providers for varied artistic styles
+
+ ### Example Prompts
+
+ #### Chat Examples
+ ```
+ "Explain quantum computing in simple terms"
+ "Help me debug this Python code: [paste code]"
+ "Write a creative story about a time-traveling cat"
+ "What are the pros and cons of renewable energy?"
+ ```
+
+ #### Image Examples
+ ```
+ "A majestic dragon flying over a medieval castle, epic fantasy art, detailed, 8k"
+ "A serene Japanese garden with cherry blossoms, zen atmosphere, peaceful, high quality"
+ "A futuristic cityscape with flying cars and neon lights, cyberpunk style, cinematic"
+ "Portrait of a wise old wizard with flowing robes, magical aura, fantasy character art"
+ ```
+
+ ## 🔒 Security & Authentication
+
+ ### RBAC System
+ - All operations require authentication with the HF-Inferoxy proxy server
+ - API keys are managed securely through HuggingFace Space secrets
+ - No sensitive information is logged or exposed
+
+ ### Token Security
+ - Tokens are automatically rotated when they fail or reach limits
+ - Failed tokens are quarantined to prevent repeated failures
+ - Usage is tracked comprehensively for monitoring and optimization
+
+ ## 🐛 Troubleshooting
+
+ ### Common Issues
+
+ #### Setup Issues
+ 1. **PROXY_KEY Missing**: Ensure the secret is set in your HuggingFace Space settings
+ 2. **Connection Errors**: Verify the HF-Inferoxy server is accessible
+ 3. **Import Errors**: Check that all dependencies are properly installed
+
+ #### Chat Issues
+ 1. **No Response**: Check the model name format and provider availability
+ 2. **Slow Responses**: Try different providers or smaller models
+ 3. **Poor Quality**: Adjust temperature and top-p parameters
+
+ #### Image Issues
+ 1. **Generation Fails**: Verify the model supports text-to-image generation
+ 2. **Dimension Errors**: Ensure width and height are divisible by 8
+ 3. **Poor Quality**: Increase inference steps or adjust guidance scale
+
+ ### Error Types
+ - **401 Errors**: Authentication issues (handled automatically by token rotation)
+ - **402 Errors**: Credit limit exceeded (reported to proxy for token management)
+ - **Network Errors**: Connection issues (reported to proxy for monitoring)
+ - **Model Errors**: Invalid model or provider combinations (the classification logic is excerpted below)
+
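The 401/402 cases are classified before being reported to the proxy; this is the relevant excerpt from `report_token_status()` in `hf_token_utils.py`:

```python
# `error` is the raw exception text passed in by the caller.
error_type = None
if "401 Client Error" in error:
    error_type = "invalid_credentials"   # bad token -> proxy rotates it
elif "402 Client Error" in error and "exceeded your monthly included credits" in error:
    error_type = "credits_exceeded"      # quota hit -> proxy quarantines it
```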
+ ## 📚 Additional Resources
+
+ - **[HF-Inferoxy Documentation](https://nazdridoy.github.io/hf-inferoxy/)**: Complete platform documentation
+ - **[HuggingFace Hub Integration Guide](https://nazdridoy.github.io/hf-inferoxy/huggingface-hub-integration/)**: Detailed integration instructions
+ - **[Provider Examples](https://nazdridoy.github.io/hf-inferoxy/examples/)**: Code examples for different providers
+ - **[Gradio Documentation](https://gradio.app/docs/)**: Interface framework documentation
+
+ ## 🤝 Contributing
+
+ This application is part of the HF-Inferoxy ecosystem. For contributions or issues:
+
+ 1. Review the [HF-Inferoxy documentation](https://nazdridoy.github.io/hf-inferoxy/)
+ 2. Test with different models and providers
+ 3. Report any issues or suggest improvements
+ 4. Contribute examples and use cases
+
+ ## 🚀 Advanced Usage
+
+ ### Environment Variables
+
+ You can customize the proxy URL using environment variables:
+
+ ```python
+ import os
+ os.environ["HF_PROXY_URL"] = "http://your-proxy-server:8000"
+ ```
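Note that the bundled helpers take the proxy URL as an explicit `proxy_url` argument rather than reading `HF_PROXY_URL` themselves, so the variable has to be wired through by the caller; a sketch under that assumption:

```python
import os
from hf_token_utils import get_proxy_token

# Fall back to the default server if HF_PROXY_URL is unset.
proxy_url = os.getenv("HF_PROXY_URL", "http://scw.nazdev.tech:11155")
token, token_id = get_proxy_token(proxy_url=proxy_url, api_key=os.environ["PROXY_KEY"])
```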
+
+ ### Custom Providers
+
+ The app supports any provider that works with HF-Inferoxy. Simply specify the provider name when entering model information.
+
+ ### Batch Operations
+
+ For multiple operations, consider the token reuse patterns documented in the HF-Inferoxy integration guide (one variant is sketched below).
+
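A sketch of one such pattern, assuming a single provisioned token can serve several consecutive calls (consult the integration guide for the authoritative version):

```python
import os
from huggingface_hub import InferenceClient
from hf_token_utils import get_proxy_token, report_token_status

proxy_api_key = os.environ["PROXY_KEY"]
prompts = ["a red cube", "a blue sphere", "a green pyramid"]

# Provision once, reuse for the whole batch, report once at the end.
token, token_id = get_proxy_token(api_key=proxy_api_key)
client = InferenceClient(provider="hf-inference", api_key=token)
try:
    images = [
        client.text_to_image(p, model="stabilityai/stable-diffusion-xl-base-1.0")
        for p in prompts
    ]
    report_token_status(token_id, "success", api_key=proxy_api_key)
except Exception as e:
    report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
    raise
```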
+ ## 📄 License
+
+ This project is part of the HF-Inferoxy ecosystem. Please refer to the main project for licensing information.
+
+ ---
+
+ **Built with ❤️ using [HF-Inferoxy](https://nazdridoy.github.io/hf-inferoxy/) for intelligent token management**
+
+ **Ready to explore AI? Start chatting or generating images above! 🚀**
app.py ADDED
@@ -0,0 +1,432 @@
+ import gradio as gr
+ import os
+ from huggingface_hub import InferenceClient
+ from huggingface_hub.errors import HfHubHTTPError
+ from hf_token_utils import get_proxy_token, report_token_status
+ import PIL.Image
+ import io
+
+
+ def chat_respond(
+     message,
+     history: list[dict[str, str]],
+     system_message,
+     model_name,
+     max_tokens,
+     temperature,
+     top_p,
+ ):
+     """
+     Chat completion function using HF-Inferoxy token management.
+     """
+     # Get proxy API key from environment variable (set in HuggingFace Space secrets)
+     proxy_api_key = os.getenv("PROXY_KEY")
+     if not proxy_api_key:
+         yield "❌ Error: PROXY_KEY not found in environment variables. Please set it in your HuggingFace Space secrets."
+         return
+
+     try:
+         # Get token from HF-Inferoxy proxy server
+         print("🔑 Chat: Requesting token from proxy...")
+         token, token_id = get_proxy_token(api_key=proxy_api_key)
+         print(f"✅ Chat: Got token: {token_id}")
+
+         # Parse model name and provider if specified
+         if ":" in model_name:
+             model, provider = model_name.split(":", 1)
+         else:
+             model = model_name
+             provider = None
+
+         print(f"🤖 Chat: Using model='{model}', provider='{provider if provider else 'auto'}'")
+
+         # Prepare messages first
+         messages = [{"role": "system", "content": system_message}]
+         messages.extend(history)
+         messages.append({"role": "user", "content": message})
+
+         print(f"💬 Chat: Prepared {len(messages)} messages, creating client...")
+
+         # Create client with provider (auto if none specified) and always pass model
+         client = InferenceClient(
+             provider=provider if provider else "auto",
+             api_key=token
+         )
+
+         print("🚀 Chat: Client created, starting inference...")
+
+         chat_completion_kwargs = {
+             "model": model,
+             "messages": messages,
+             "max_tokens": max_tokens,
+             "stream": True,
+             "temperature": temperature,
+             "top_p": top_p,
+         }
+
+         response = ""
+
+         print("📡 Chat: Making streaming request...")
+         stream = client.chat_completion(**chat_completion_kwargs)
+         print("🔄 Chat: Got stream, starting to iterate...")
+
+         # Iterate as `chunk` so the user's `message` argument is not shadowed
+         for chunk in stream:
+             choices = chunk.choices
+             token_content = ""
+             if len(choices) and choices[0].delta.content:
+                 token_content = choices[0].delta.content
+
+             response += token_content
+             yield response
+
+         # Report successful token usage
+         report_token_status(token_id, "success", api_key=proxy_api_key)
+
+     except HfHubHTTPError as e:
+         # Report HF Hub errors
+         if 'token_id' in locals():
+             report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
+         yield f"❌ HuggingFace API Error: {str(e)}"
+
+     except Exception as e:
+         # Report other errors
+         if 'token_id' in locals():
+             report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
+         yield f"❌ Unexpected Error: {str(e)}"
+
+
+ def generate_image(
+     prompt: str,
+     model_name: str,
+     provider: str,
+     negative_prompt: str = "",
+     width: int = 1024,
+     height: int = 1024,
+     num_inference_steps: int = 20,
+     guidance_scale: float = 7.5,
+     seed: int = -1,
+ ):
+     """
+     Generate an image using the specified model and provider through HF-Inferoxy.
+     """
+     # Get proxy API key from environment variable (set in HuggingFace Space secrets)
+     proxy_api_key = os.getenv("PROXY_KEY")
+     if not proxy_api_key:
+         return None, "❌ Error: PROXY_KEY not found in environment variables. Please set it in your HuggingFace Space secrets."
+
+     try:
+         # Get token from HF-Inferoxy proxy server
+         token, token_id = get_proxy_token(api_key=proxy_api_key)
+
+         # Create client with specified provider
+         client = InferenceClient(
+             provider=provider,
+             api_key=token
+         )
+
+         # Prepare generation parameters
+         generation_params = {
+             "model": model_name,
+             "prompt": prompt,
+             "width": width,
+             "height": height,
+             "num_inference_steps": num_inference_steps,
+             "guidance_scale": guidance_scale,
+         }
+
+         # Add optional parameters if provided
+         if negative_prompt:
+             generation_params["negative_prompt"] = negative_prompt
+         if seed != -1:
+             generation_params["seed"] = seed
+
+         # Generate image
+         image = client.text_to_image(**generation_params)
+
+         # Report successful token usage
+         report_token_status(token_id, "success", api_key=proxy_api_key)
+
+         return image, f"✅ Image generated successfully using {model_name} on {provider}!"
+
+     except HfHubHTTPError as e:
+         # Report HF Hub errors
+         if 'token_id' in locals():
+             report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
+         return None, f"❌ HuggingFace API Error: {str(e)}"
+
+     except Exception as e:
+         # Report other errors
+         if 'token_id' in locals():
+             report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
+         return None, f"❌ Unexpected Error: {str(e)}"
+
+
+ def validate_dimensions(width, height):
+     """Validate that dimensions are divisible by 8 (required by most diffusion models)"""
+     if width % 8 != 0 or height % 8 != 0:
+         return False, "Width and height must be divisible by 8"
+     return True, ""
+
+
+ # Create the main Gradio interface with tabs
+ with gr.Blocks(title="HF-Inferoxy AI Hub", theme=gr.themes.Soft()) as demo:
+
+     # Main header
+     gr.Markdown("""
+     # 🚀 HF-Inferoxy AI Hub
+
+     A comprehensive AI platform combining chat and image generation capabilities with intelligent token management through HF-Inferoxy.
+
+     **Features:**
+     - 💬 **Smart Chat**: Conversational AI with streaming responses
+     - 🎨 **Image Generation**: Text-to-image creation with multiple providers
+     - 🔄 **Intelligent Token Management**: Automatic token rotation and error handling
+     - 🌐 **Multi-Provider Support**: Works with HF Inference, Cerebras, Cohere, Groq, Together, Fal.ai, and more
+     """)
+
+     with gr.Tabs() as tabs:
+
+         # ==================== CHAT TAB ====================
+         with gr.Tab("💬 Chat Assistant", id="chat"):
+             with gr.Row():
+                 with gr.Column(scale=3):
+                     # Create chat interface
+                     chatbot = gr.ChatInterface(
+                         chat_respond,
+                         type="messages",
+                         title="",
+                         description="",
+                         additional_inputs=[
+                             gr.Textbox(
+                                 value="You are a helpful and friendly AI assistant. Provide clear, accurate, and helpful responses.",
+                                 label="System Message",
+                                 lines=2,
+                                 placeholder="Define the assistant's personality and behavior..."
+                             ),
+                             gr.Textbox(
+                                 value="openai/gpt-oss-20b:fireworks-ai",
+                                 label="Model Name",
+                                 placeholder="e.g., openai/gpt-oss-20b:fireworks-ai or mistralai/Mistral-7B-Instruct-v0.2:groq"
+                             ),
+                             gr.Slider(
+                                 minimum=1, maximum=4096, value=1024, step=1,
+                                 label="Max New Tokens"
+                             ),
+                             gr.Slider(
+                                 minimum=0.1, maximum=2.0, value=0.7, step=0.1,
+                                 label="Temperature"
+                             ),
+                             gr.Slider(
+                                 minimum=0.1, maximum=1.0, value=0.95, step=0.05,
+                                 label="Top-p (nucleus sampling)"
+                             ),
+                         ],
+                     )
+
+                 with gr.Column(scale=1):
+                     gr.Markdown("""
+                     ### 💡 Chat Tips
+
+                     **Model Format:**
+                     - Single model: `openai/gpt-oss-20b`
+                     - With provider: `model:provider`
+
+                     **Popular Models:**
+                     - `openai/gpt-oss-20b` - Fast general purpose
+                     - `meta-llama/Llama-2-7b-chat-hf` - Chat optimized
+                     - `microsoft/DialoGPT-medium` - Conversation
+                     - `google/flan-t5-base` - Instruction following
+
+                     **Popular Providers:**
+                     - `fireworks-ai` - Fast and reliable
+                     - `cerebras` - High performance
+                     - `groq` - Ultra-fast inference
+                     - `together` - Wide model support
+                     - `cohere` - Advanced language models
+
+                     **Example:**
+                     `openai/gpt-oss-20b:fireworks-ai`
+                     """)
+
+         # ==================== IMAGE GENERATION TAB ====================
+         with gr.Tab("🎨 Image Generator", id="image"):
+             with gr.Row():
+                 with gr.Column(scale=2):
+                     # Image output
+                     output_image = gr.Image(
+                         label="Generated Image",
+                         type="pil",
+                         height=600,
+                         show_download_button=True
+                     )
+                     status_text = gr.Textbox(
+                         label="Generation Status",
+                         interactive=False,
+                         lines=2
+                     )
+
+                 with gr.Column(scale=1):
+                     # Model and provider inputs
+                     with gr.Group():
+                         gr.Markdown("**🤖 Model & Provider**")
+                         img_model_name = gr.Textbox(
+                             value="stabilityai/stable-diffusion-xl-base-1.0",
+                             label="Model Name",
+                             placeholder="e.g., stabilityai/stable-diffusion-xl-base-1.0"
+                         )
+                         img_provider = gr.Dropdown(
+                             choices=["hf-inference", "fal-ai", "nebius", "nscale", "replicate", "together"],
+                             value="hf-inference",
+                             label="Provider",
+                             interactive=True
+                         )
+
+                     # Generation parameters
+                     with gr.Group():
+                         gr.Markdown("**📝 Prompts**")
+                         img_prompt = gr.Textbox(
+                             value="A beautiful landscape with mountains and a lake at sunset, photorealistic, 8k, highly detailed",
+                             label="Prompt",
+                             lines=3,
+                             placeholder="Describe the image you want to generate..."
+                         )
+                         img_negative_prompt = gr.Textbox(
+                             value="blurry, low quality, distorted, deformed, ugly, bad anatomy",
+                             label="Negative Prompt",
+                             lines=2,
+                             placeholder="Describe what you DON'T want in the image..."
+                         )
+
+                     with gr.Group():
+                         gr.Markdown("**⚙️ Generation Settings**")
+                         with gr.Row():
+                             img_width = gr.Slider(
+                                 minimum=256, maximum=2048, value=1024, step=64,
+                                 label="Width", info="Must be divisible by 8"
+                             )
+                             img_height = gr.Slider(
+                                 minimum=256, maximum=2048, value=1024, step=64,
+                                 label="Height", info="Must be divisible by 8"
+                             )
+
+                         with gr.Row():
+                             img_steps = gr.Slider(
+                                 minimum=10, maximum=100, value=20, step=1,
+                                 label="Inference Steps", info="More steps = better quality"
+                             )
+                             img_guidance = gr.Slider(
+                                 minimum=1.0, maximum=20.0, value=7.5, step=0.5,
+                                 label="Guidance Scale", info="How closely to follow prompt"
+                             )
+
+                         img_seed = gr.Slider(
+                             minimum=-1, maximum=999999, value=-1, step=1,
+                             label="Seed", info="-1 for random"
+                         )
+
+                     # Generate button
+                     generate_btn = gr.Button(
+                         "🎨 Generate Image",
+                         variant="primary",
+                         size="lg",
+                         scale=2
+                     )
+
+                     # Quick model presets
+                     with gr.Group():
+                         gr.Markdown("**🎯 Popular Presets**")
+                         preset_buttons = []
+                         presets = [
+                             ("SDXL (HF)", "stabilityai/stable-diffusion-xl-base-1.0", "hf-inference"),
+                             ("FLUX.1 (Nebius)", "black-forest-labs/FLUX.1-dev", "nebius"),
+                             ("Qwen (Fal.ai)", "Qwen/Qwen-Image", "fal-ai"),
+                             ("SDXL (NScale)", "stabilityai/stable-diffusion-xl-base-1.0", "nscale"),
+                         ]
+
+                         for name, model, provider in presets:
+                             btn = gr.Button(name, size="sm")
+                             btn.click(
+                                 lambda m=model, p=provider: (m, p),
+                                 outputs=[img_model_name, img_provider]
+                             )
+
+             # Examples for image generation
+             with gr.Group():
+                 gr.Markdown("**🌟 Example Prompts**")
+                 img_examples = gr.Examples(
+                     examples=[
+                         ["A majestic dragon flying over a medieval castle, epic fantasy art, detailed, 8k"],
+                         ["A serene Japanese garden with cherry blossoms, zen atmosphere, peaceful, high quality"],
+                         ["A futuristic cityscape with flying cars and neon lights, cyberpunk style, cinematic"],
+                         ["A cute robot cat playing with yarn, adorable, cartoon style, vibrant colors"],
+                         ["A magical forest with glowing mushrooms and fairy lights, fantasy, ethereal beauty"],
+                         ["Portrait of a wise old wizard with flowing robes, magical aura, fantasy character art"],
+                         ["A cozy coffee shop on a rainy day, warm lighting, peaceful atmosphere, detailed"],
+                         ["An astronaut floating in space with Earth in background, photorealistic, stunning"]
+                     ],
+                     inputs=img_prompt
+                 )
+
+             # Event handlers for image generation
+             def on_generate_image(prompt_val, model_val, provider_val, negative_prompt_val, width_val, height_val, steps_val, guidance_val, seed_val):
+                 # Validate dimensions
+                 is_valid, error_msg = validate_dimensions(width_val, height_val)
+                 if not is_valid:
+                     return None, f"❌ Validation Error: {error_msg}"
+
+                 # Generate image
+                 return generate_image(
+                     prompt=prompt_val,
+                     model_name=model_val,
+                     provider=provider_val,
+                     negative_prompt=negative_prompt_val,
+                     width=width_val,
+                     height=height_val,
+                     num_inference_steps=steps_val,
+                     guidance_scale=guidance_val,
+                     seed=seed_val
+                 )
+
+             # Connect image generation events
+             generate_btn.click(
+                 fn=on_generate_image,
+                 inputs=[
+                     img_prompt, img_model_name, img_provider, img_negative_prompt,
+                     img_width, img_height, img_steps, img_guidance, img_seed
+                 ],
+                 outputs=[output_image, status_text]
+             )
+
+     # Footer with helpful information
+     gr.Markdown("""
+     ---
+     ### 📚 How to Use
+
+     **Chat Tab:**
+     - Enter your message and customize the AI's behavior with system messages
+     - Choose models and providers using the format `model:provider`
+     - Adjust temperature for creativity and top-p for response diversity
+
+     **Image Tab:**
+     - Write detailed prompts describing your desired image
+     - Use negative prompts to avoid unwanted elements
+     - Experiment with different models and providers for varied styles
+     - Higher inference steps = better quality but slower generation
+
+     **Supported Providers:**
+     - **hf-inference**: Core API with comprehensive model support
+     - **cerebras**: High-performance inference
+     - **cohere**: Advanced language models with multilingual support
+     - **groq**: Ultra-fast inference, optimized for speed
+     - **together**: Collaborative AI hosting, wide model support
+     - **fal-ai**: High-quality image generation
+     - **nebius**: Cloud-native services with enterprise features
+     - **nscale**: Optimized inference performance
+     - **replicate**: Collaborative AI hosting
+
+     **Built with ❤️ using [HF-Inferoxy](https://nazdridoy.github.io/hf-inferoxy/) for intelligent token management**
+     """)
+
+
+ if __name__ == "__main__":
+     demo.launch()
hf_token_utils.py ADDED
@@ -0,0 +1,83 @@
+ # hf_token_utils.py
+ import os
+ import requests
+ import json
+ from typing import Dict, Optional, Any, Tuple
+
+ def get_proxy_token(proxy_url: str = "http://scw.nazdev.tech:11155", api_key: Optional[str] = None) -> Tuple[str, str]:
+     """
+     Get a valid token from the proxy server.
+
+     Args:
+         proxy_url: URL of the HF-Inferoxy server
+         api_key: Your API key for authenticating with the proxy server
+
+     Returns:
+         Tuple of (token, token_id)
+
+     Raises:
+         Exception: If token provisioning fails
+     """
+     headers = {}
+     if api_key:
+         headers["Authorization"] = f"Bearer {api_key}"
+
+     response = requests.get(f"{proxy_url}/keys/provision", headers=headers)
+     if response.status_code != 200:
+         raise Exception(f"Failed to provision token: {response.text}")
+
+     data = response.json()
+     token = data["token"]
+     token_id = data["token_id"]
+
+     # For convenience, also set environment variable
+     os.environ["HF_TOKEN"] = token
+
+     return token, token_id
+
+ def report_token_status(
+     token_id: str,
+     status: str = "success",
+     error: Optional[str] = None,
+     proxy_url: str = "http://scw.nazdev.tech:11155",
+     api_key: Optional[str] = None
+ ) -> bool:
+     """
+     Report token usage status back to the proxy server.
+
+     Args:
+         token_id: ID of the token to report (from get_proxy_token)
+         status: Status to report ('success' or 'error')
+         error: Error message if status is 'error'
+         proxy_url: URL of the HF-Inferoxy server
+         api_key: Your API key for authenticating with the proxy server
+
+     Returns:
+         True if report was accepted, False otherwise
+     """
+     payload = {"token_id": token_id, "status": status}
+
+     if error:
+         payload["error"] = error
+
+         # Extract error classification based on actual HF error patterns
+         error_type = None
+         if "401 Client Error" in error:
+             error_type = "invalid_credentials"
+         elif "402 Client Error" in error and "exceeded your monthly included credits" in error:
+             error_type = "credits_exceeded"
+
+         if error_type:
+             payload["error_type"] = error_type
+
+     headers = {"Content-Type": "application/json"}
+     if api_key:
+         headers["Authorization"] = f"Bearer {api_key}"
+
+     try:
+         response = requests.post(f"{proxy_url}/keys/report", json=payload, headers=headers)
+         return response.status_code == 200
+     except Exception:
+         # Silently fail to avoid breaking the client application
+         # In production, consider logging this error
+         return False
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio
+ huggingface-hub
+ requests
+ Pillow