[v0.28.0]: Third-party Inference Providers on the Hub & multiple quality of life improvements and bug fixes
Unified Inference Across Multiple Inference Providers
The InferenceClient now supports third-party providers, offering a unified interface to run inference across multiple services while leveraging models from the Hugging Face Hub. This update enables developers to:
- Switch providers seamlessly - Transition between inference providers with a single interface.
- Unified model IDs - Always reference Hugging Face Hub model IDs, even when using external providers.
- Simplified billing and access management - You can use your Hugging Face token for routing to third-party providers (billed through your HF account).
A list of supported third-party providers can be found here.
Example of text-to-image inference with Replicate:
>>> from huggingface_hub import InferenceClient
>>> replicate_client = InferenceClient(
... provider="replicate",
... api_key="my_replicate_api_key", # Using your personal Replicate key
... )
>>> image = replicate_client.text_to_image(
... "A cyberpunk cat hacking neural networks",
... model="black-forest-labs/FLUX.1-schnell"
... )
>>> image.save("cybercat.png")
Another example of chat completion with Together AI:
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
... provider="together", # Use Together AI provider
... api_key="<together_api_key>", # Pass your Together API key directly
... )
>>> client.chat_completion(
... model="deepseek-ai/DeepSeek-R1",
... messages=[{"role": "user", "content": "How many r's are there in strawberry?"}],
... )
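Because every provider sits behind the same chat_completion call, the code that reads the reply does not need to know which provider produced it. The following is a minimal sketch of that idea, not part of the library: ask and EchoClient are hypothetical names, and EchoClient is an offline stand-in that only mimics the choices[0].message.content shape of a chat completion response so the snippet runs without an API key.

```python
from types import SimpleNamespace


def ask(client, model: str, question: str) -> str:
    """Send one user message and return the assistant's reply text."""
    output = client.chat_completion(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    # The reply text lives in the first choice of the response.
    return output.choices[0].message.content


class EchoClient:
    """Hypothetical offline stand-in for InferenceClient (illustration only)."""

    def chat_completion(self, model, messages):
        # Echo the last user message back in a response-shaped object.
        reply = f"[{model}] you asked: {messages[-1]['content']}"
        message = SimpleNamespace(role="assistant", content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])
```

With a real client, such as the Together AI client created above, the same ask(client, "deepseek-ai/DeepSeek-R1", "...") call should work unchanged, assuming the chosen provider serves that model.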
When using external providers, you can choose between two access modes: either use the provider's native API key, as shown in the examples above, or route calls through Hugging Face infrastructure (billed to your HF account):
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
... provider="fal-ai",
... token="hf_****" # Your Hugging Face token
... )
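One way to pick between the two access modes at runtime is to look for a provider key first and fall back to a Hugging Face token. The sketch below is only an illustration under stated assumptions: client_kwargs is a hypothetical helper, and the <PROVIDER>_API_KEY environment variable naming is an assumption made for this example, not a convention defined by the library. The returned keyword arguments mirror the two styles shown in the examples above.

```python
import os
from typing import Mapping, Optional


def client_kwargs(provider: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Build InferenceClient keyword arguments for the given provider.

    Prefers the provider's own key (direct access); otherwise falls back to
    a Hugging Face token (calls routed through Hugging Face, billed to the HF account).
    """
    env = os.environ if env is None else env
    # Assumed naming scheme for this sketch, e.g. "fal-ai" -> "FAL_AI_API_KEY".
    key_name = provider.upper().replace("-", "_") + "_API_KEY"
    provider_key = env.get(key_name)
    if provider_key:
        return {"provider": provider, "api_key": provider_key}
    hf_token = env.get("HF_TOKEN")
    if hf_token:
        return {"provider": provider, "token": hf_token}
    raise ValueError(f"Set {key_name} or HF_TOKEN to use provider {provider!r}")
```

The result can then be passed straight to the client, for example InferenceClient(**client_kwargs("fal-ai")).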
Parameter availability may vary between providers; check each provider's documentation.
New providers, models, and tasks will be added iteratively.
You can find a list of supported tasks per provider and more details here.
- [InferenceClient] Add third-party providers support by @celinah in #2757
- Unified prepare_request method + class-based providers by @Wauplin in #2777
- [InferenceClient] Support proxy calls for 3rd party providers by @celinah in #2781
- [InferenceClient] Add text-to-video task and update supported tasks and models by @celinah in #2786
- Add type hints for providers by @Wauplin in #2788
- [InferenceClient] Update inference documentation by @celinah in #2776
- Add text-to-video to supported tasks by @Wauplin in #2790
HfApi
The following change aligns the client with server-side updates by adding two new repository properties: usedStorage and resourceGroup.
[HfApi] update list of repository properties following server side updates by @celinah in #2728
Extends empty commit prevention to file copy operations, preserving clean version histories when no changes are made.
[HfApi] prevent empty commits when copying files by @celinah in #2730
Documentation
Thanks to @WizKnight, the Hindi translation is much better!
Improved Hindi Translation in Documentation by @WizKnight in #2697
Breaking changes
The like endpoint has been removed to prevent misuse. You can still remove existing likes using the unlike endpoint.
[HfApi] remove like endpoint by @celinah in #2739
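Since unlike remains available, existing likes can still be cleaned up in bulk. The following is a minimal sketch, assuming an object that exposes an unlike(repo_id, repo_type=...) method like HfApi does; remove_likes and RecordingApi are hypothetical names, and RecordingApi is an offline stand-in that records calls instead of contacting the Hub.

```python
from typing import Iterable, List


def remove_likes(api, repo_ids: Iterable[str], repo_type: str = "model") -> List[str]:
    """Call api.unlike() on each repo and return the IDs that were processed."""
    processed = []
    for repo_id in repo_ids:
        api.unlike(repo_id, repo_type=repo_type)
        processed.append(repo_id)
    return processed


class RecordingApi:
    """Hypothetical offline stand-in for HfApi (illustration only)."""

    def __init__(self):
        self.calls = []

    def unlike(self, repo_id, *, token=None, repo_type=None):
        # Record the call instead of sending a request to the Hub.
        self.calls.append((repo_id, repo_type))
```

In real use, an authenticated HfApi() instance would take the place of RecordingApi.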
Small fixes and maintenance
QoL improvements
- [InferenceClient] flag chat_completion()'s logit_bias as UNUSED by @celinah in #2724
- Remove unused parameters from method's docstring by @celinah in #2738
- Add optional rejection_reason when rejecting a user access token by @Wauplin in #2758
- Add py.typed to be compliant with PEP-561 again by @celinah in #2752
Bug and typo fixes
- Fix super_squash_history revision not urlencoded by @Wauplin in #2795
- Replace model repo with repo in docstrings by @albertvillanova in #2715
- [BUG] Fix 404 NOT FOUND issue caused by endpoint tail slash by @Mingqi2 in #2721
- Fix typing.get_type_hints call on a ModelHubMixin by @aliberts in #2729
- fix typo by @qwertyforce in #2762
- rejection reason docstring by @Wauplin in #2764
- Add timeout to WeakFileLock by @Wauplin in #2751
- Fix CardData.get() to respect default values when None by @celinah in #2770
- Fix RepoCard.load when passing a repo_id that is also a dir path by @Wauplin in #2771
- Fix filename too long when downloading to local folder by @Wauplin in #2789