update-onnx-model #80, opened by kozistr
There is an issue with loading the ONNX model with TEI.
So, I've re-exported the ONNX file with the following code:
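from sentence_transformers import SentenceTransformer
# export=True converts the checkpoint to ONNX on load; output_dir is the local directory to save into.
onnx_model = SentenceTransformer('answerdotai/ModernBERT-base', backend='onnx', model_kwargs={'export': True})
onnx_model.save_pretrained(output_dir)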
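Before pointing TEI at the export, it can help to confirm where the ONNX file landed. The TEI log in this post shows it requesting `model.onnx` first and then `onnx/model.onnx`. A minimal sketch that checks a local directory for those two paths in that order (the helper name `find_onnx_file` is hypothetical, and this mirrors the log only, not TEI's own code):

```python
from pathlib import Path
from typing import Optional

# Candidate locations, in the order the TEI log shows them being requested.
CANDIDATES = ("model.onnx", "onnx/model.onnx")


def find_onnx_file(export_dir: str) -> Optional[Path]:
    """Return the first candidate ONNX file found under export_dir, else None."""
    root = Path(export_dir)
    for rel in CANDIDATES:
        path = root / rel
        if path.is_file():
            return path
    return None
```

Calling `find_onnx_file(output_dir)` after `save_pretrained` returns `None` if neither path exists, which would suggest the export did not produce a file where TEI looks.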
And it works! TEI loads the model from this PR's revision and starts serving:
./target/release/text-embeddings-router --model-id answerdotai/ModernBERT-base --revision refs/pr/80 --port 8888 --dtype float32 --pooling cls --auto-truncate --max-batch-tokens 1024
2025-06-09T16:11:40.471949Z INFO text_embeddings_router: router/src/main.rs:189: Args { model_id: "ans********/**********-*ase", revision: Some("refs/pr/80"), tokenization_workers: None, dtype: Some(Float32), pooling: Some(Cls), max_concurrent_requests: 512, max_batch_tokens: 1024, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hf_token: None, hostname: "0.0.0.0", port: 8888, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, disable_spans: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", prometheus_port: 9000, cors_allow_origin: None }
2025-06-09T16:11:40.472075Z INFO hf_hub: /home/zero/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hf-hub-0.4.2/src/lib.rs:72: Using token file found "/home/zero/.cache/huggingface/token"
2025-06-09T16:11:40.541460Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-06-09T16:11:42.049444Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-06-09T16:11:42.250736Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:36: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/answerdotai/ModernBERT-base/resolve/refs%2Fpr%2F80/config_sentence_transformers.json)
2025-06-09T16:11:42.250769Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-06-09T16:11:42.683011Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-06-09T16:11:43.854866Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 3.313405842s
2025-06-09T16:11:43.924153Z WARN text_embeddings_router: router/src/lib.rs:189: Could not find a Sentence Transformers config
2025-06-09T16:11:43.924187Z INFO text_embeddings_router: router/src/lib.rs:193: Maximum number of tokens per request: 8192
2025-06-09T16:11:43.924400Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:38: Starting 8 tokenization workers
2025-06-09T16:11:44.027294Z INFO text_embeddings_router: router/src/lib.rs:235: Starting model backend
2025-06-09T16:11:44.027480Z INFO text_embeddings_backend: backends/src/lib.rs:534: Downloading `model.onnx`
2025-06-09T16:11:44.233186Z WARN text_embeddings_backend: backends/src/lib.rs:538: Could not download `model.onnx`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/answerdotai/ModernBERT-base/resolve/refs%2Fpr%2F80/model.onnx)
2025-06-09T16:11:44.233217Z INFO text_embeddings_backend: backends/src/lib.rs:539: Downloading `onnx/model.onnx`
2025-06-09T16:12:45.803707Z INFO text_embeddings_backend: backends/src/lib.rs:548: Downloading `model.onnx_data`
2025-06-09T16:12:46.011374Z WARN text_embeddings_backend: backends/src/lib.rs:552: Could not download `model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/answerdotai/ModernBERT-base/resolve/refs%2Fpr%2F80/model.onnx_data)
2025-06-09T16:12:46.011411Z INFO text_embeddings_backend: backends/src/lib.rs:553: Downloading `onnx/model.onnx_data`
2025-06-09T16:12:46.217593Z WARN text_embeddings_backend: backends/src/lib.rs:557: Could not download `onnx/model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/answerdotai/ModernBERT-base/resolve/refs%2Fpr%2F80/onnx/model.onnx_data)
2025-06-09T16:12:46.217631Z INFO text_embeddings_backend: backends/src/lib.rs:349: Model ONNX weights downloaded in 60.857207318s
2025-06-09T16:12:48.101908Z INFO text_embeddings_router: router/src/lib.rs:252: Warming up model
2025-06-09T16:12:50.727835Z WARN text_embeddings_router: router/src/lib.rs:261: Backend does not support a batch size > 8
2025-06-09T16:12:50.727903Z WARN text_embeddings_router: router/src/lib.rs:262: forcing `max_batch_requests=8`
2025-06-09T16:12:50.729761Z INFO text_embeddings_router::http::server: router/src/http/server.rs:1847: Starting HTTP server: 0.0.0.0:8888
2025-06-09T16:12:50.729893Z INFO text_embeddings_router::http::server: router/src/http/server.rs:1848: Ready
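Once the server reports `Ready`, a quick request confirms that embeddings actually come back. A minimal sketch using only the standard library, assuming the server is reachable on `127.0.0.1:8888` (the port from the command above) and using TEI's `/embed` route. The function names `build_embed_request` and `embed` are hypothetical helpers for illustration:

```python
import json
import urllib.request


def build_embed_request(text, host="127.0.0.1", port=8888):
    """Build a POST request for the /embed route with a JSON body."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, host="127.0.0.1", port=8888):
    """Send text to the running server and return the parsed JSON response."""
    with urllib.request.urlopen(build_embed_request(text, host, port)) as resp:
        return json.loads(resp.read())
```

With the server from this post running, `embed("What is Deep Learning?")` should return a list containing one embedding vector for the input.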
kozistr changed pull request status to open