---
title: CLIP Service
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# CLIP Service
A FastAPI service that provides CLIP (Contrastive Language-Image Pre-training) embeddings for images and text using the `openai/clip-vit-large-patch14` model.
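This README does not include the app code, but a minimal sketch of how such a service can compute these embeddings with Hugging Face Transformers might look like the following (the `embed_text`/`embed_image` helper names are illustrative, not the Space's actual source):

```python
# Illustrative sketch (not the Space's actual source) of computing
# CLIP embeddings with Hugging Face Transformers.
from io import BytesIO

import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_ID = "openai/clip-vit-large-patch14"
model = CLIPModel.from_pretrained(MODEL_ID).eval()
processor = CLIPProcessor.from_pretrained(MODEL_ID)

def embed_text(text: str) -> list[float]:
    inputs = processor(text=[text], return_tensors="pt", padding=True)
    with torch.no_grad():
        features = model.get_text_features(**inputs)  # shape (1, 768)
    return features[0].tolist()

def embed_image(image_url: str) -> list[float]:
    raw = requests.get(image_url, timeout=10).content
    image = Image.open(BytesIO(raw)).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)  # shape (1, 768)
    return features[0].tolist()
```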
## Features

- **Image Encoding**: Generate 768-dimensional embeddings from image URLs
- **Text Encoding**: Generate embeddings from text descriptions
- **High Performance**: Optimized for batch processing
- **REST API**: Simple HTTP endpoints for easy integration
## API Endpoints

### POST /encode/image

Generate an embedding for an image fetched from a URL.
**Request:**

```json
{
  "image_url": "https://example.com/image.jpg"
}
```

**Response:**

```json
{
  "embedding": [0.1, -0.2, 0.3, ...], // 768 dimensions
  "dimensions": 768
}
```
### POST /encode/text

Generate an embedding for a text string.
**Request:**

```json
{
  "text": "a beautiful sunset over mountains"
}
```

**Response:**

```json
{
  "embedding": [0.1, -0.2, 0.3, ...], // 768 dimensions
  "dimensions": 768
}
```
### GET /health

Check service health and status.
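A small Python client for these three endpoints could look like the sketch below. The `encode_image`/`encode_text`/`health` helper names are mine, and the base URL is the same placeholder used in the curl examples that follow:

```python
# Minimal Python client for the endpoints above. BASE_URL is a
# placeholder; replace it with your own Space URL.
import requests

BASE_URL = "https://your-username-clip-service.hf.space"

def encode_image(image_url: str) -> list[float]:
    resp = requests.post(f"{BASE_URL}/encode/image",
                         json={"image_url": image_url}, timeout=30)
    resp.raise_for_status()
    return resp.json()["embedding"]

def encode_text(text: str) -> list[float]:
    resp = requests.post(f"{BASE_URL}/encode/text",
                         json={"text": text}, timeout=30)
    resp.raise_for_status()
    return resp.json()["embedding"]

def health() -> dict:
    resp = requests.get(f"{BASE_URL}/health", timeout=10)
    resp.raise_for_status()
    return resp.json()
```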
## Usage Examples
```bash
# Encode an image
curl -X POST "https://your-username-clip-service.hf.space/encode/image" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/image.jpg"}'

# Encode text
curl -X POST "https://your-username-clip-service.hf.space/encode/text" \
  -H "Content-Type: application/json" \
  -d '{"text": "a beautiful landscape"}'
```
## Integration

This service is designed to work with Pinterest-like applications for:

- Visual similarity search
- Content-based recommendations
- Cross-modal search (text to image, image to text); see the sketch below
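As an illustration of cross-modal search, the sketch below ranks a small image corpus against a text query by cosine similarity. It reuses the hypothetical `encode_image`/`encode_text` client helpers sketched earlier, and the corpus URLs are placeholders:

```python
# Rank images against a text query by cosine similarity of their
# CLIP embeddings. encode_image / encode_text are the client helpers
# sketched above; the URLs are placeholders.
import numpy as np

image_urls = [
    "https://example.com/sunset.jpg",
    "https://example.com/city.jpg",
]

def cosine(a: list[float], b: list[float]) -> float:
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

corpus = {url: encode_image(url) for url in image_urls}
query = encode_text("a beautiful sunset over mountains")
ranked = sorted(corpus, key=lambda url: cosine(query, corpus[url]),
                reverse=True)
print(ranked[0])  # URL of the best-matching image
```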
## Model Information

- **Model**: `openai/clip-vit-large-patch14`
- **Embedding Dimensions**: 768
- **Supported Image Formats**: JPG, PNG, GIF, WebP
- **Max Image Size**: under 10 MB recommended
## Performance

- **CPU**: ~2-5 seconds per image
- **GPU**: ~0.5-1 second per image (when available)
- **Batch Processing**: supported by sending multiple requests, as sketched below
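The README does not describe a dedicated batch endpoint, so "batch processing" is read here as issuing several single-item requests concurrently. A sketch using the hypothetical `encode_text` client helper from above:

```python
# Issue several single-item requests concurrently.
# encode_text is the client helper sketched earlier.
from concurrent.futures import ThreadPoolExecutor

texts = ["a beach at dawn", "a snowy mountain", "a city at night"]
with ThreadPoolExecutor(max_workers=4) as pool:
    embeddings = list(pool.map(encode_text, texts))
print(len(embeddings), len(embeddings[0]))  # 3 768
```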
Built with ❤️ using Transformers and FastAPI