tuandunghcmut's picture
Add files using upload-large-folder tool
95e1e2e verified
|
raw
history blame
2.11 kB

SGLang Router v0.1.0: Dynamic Scaling and Fault Tolerance

We have released sglang-router v0.1.0 equipped with dynamic scaling and fault tolerance! It is essential for the router to be able to dynamically scale the number of workers and handle worker failures. To achieve this, we have implemented the following features:

1. Dynamic scaling: The router can dynamically scale the number of workers based on the request load.

We offer /add_worker and /remove_worker APIs to dynamically add or remove workers from the router.

  • /add_worker

Usage:

$ curl -X POST http://localhost:30000/add_worker?url=http://worker_url_1

Example:

$ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --port 30001
$ curl -X POST http://localhost:30000/add_worker?url=http://127.0.0.1:30001
Successfully added worker: http://127.0.0.1:30001
  • /remove_worker

Usage:

$ curl -X POST http://localhost:30000/remove_worker?url=http://worker_url_1

Example:

$ curl -X POST http://localhost:30000/remove_worker?url=http://127.0.0.1:30001
Successfully removed worker: http://127.0.0.1:30001

Note:

  • For cache-aware router, the worker will be removed from the tree and the queues.

2. Fault tolerance: The router can handle worker failures and automatically remove the failed worker from the router.

We provide retries based for failure tolerance.

  1. If the request to a worker fails for max_worker_retries times, the router will remove the worker from the router and move on to the next worker.
  2. If the total number of retries exceeds max_total_retries, the router will return an error.

Note:

  • max_worker_retries is 3 and max_total_retries is 6 by default.

Closing remarks:

  1. Please read the full usage at https://sgl-project.github.io/router/router.html
  2. The feature is still under active improvement, so please don't hesitate to raise issues or submit PRs if you have any suggestions or feedback.

Release Instructions

Update the version in rust/pyproject.toml and py_src/sglang_router/version.py.