Can vllm be used?
#3 by xiabo0816 - opened
I'm new to this — if I want to deploy this model with vllm, how should I go about it?
You can start with the official vllm documentation: https://docs.vllm.ai/en/v0.7.1/getting_started/examples/embedding.html
For correctness and performance, we recommend comparing the results against sentence-transformers to rule out bugs in the vllm framework itself.
Note: because this model modifies the attention to be bidirectional, the modeling.py from this repository must be loaded, so pass trust_remote_code=True when loading the model.
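The comparison against sentence-transformers suggested above can be sketched as a small numpy check: embed the same texts with both backends and verify the embeddings point in (nearly) the same direction. `max_cosine_diff` is a hypothetical helper written for this illustration, not part of either library:

```python
import numpy as np

def max_cosine_diff(a: np.ndarray, b: np.ndarray) -> float:
    """Largest per-row cosine distance between two embedding matrices.

    `a` and `b` are (n, d) arrays of embeddings for the same n texts,
    produced by two different backends (e.g. vllm vs sentence-transformers).
    Returns 0.0 when corresponding rows point in identical directions.
    """
    # L2-normalize each row so the dot product equals cosine similarity
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)        # per-row cosine similarity
    return float((1.0 - cos).max())  # worst-case cosine distance

# Identical embeddings should give a difference of (near) zero
emb = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(max_cosine_diff(emb, emb))
```

A difference well above float16 rounding error (say, more than ~1e-2) would suggest the two backends are not running the same model code — for this model, usually a missing trust_remote_code=True.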
I'm new to this — if I want to deploy this model with vllm, how should I go about it?
Hi, the KaLM-Embedding model has been adapted for vllm; see the following code for reference:
```python
import torch
from vllm import LLM

def get_detailed_instruct(task_description: str, query: str) -> str:
    # Prepend the retrieval instruction to each query
    return f'Instruct: {task_description}\nQuery:{query}'

task = 'Given a query, retrieve documents that answer the query'
queries = [
    get_detailed_instruct(task, 'What is the capital of China?'),
    get_detailed_instruct(task, 'Explain gravity'),
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
input_texts = queries + documents

# trust_remote_code=True is required so the repository's modeling.py
# (with bidirectional attention) is loaded instead of the default implementation
model = LLM(model="{MODEL_NAME_OR_PATH}", task="embed", trust_remote_code=True, dtype="float16")

outputs = model.embed(input_texts)
embeddings = torch.tensor([o.outputs.embedding for o in outputs])

# Query-document similarity scores (2 queries x 2 documents)
scores = embeddings[:2] @ embeddings[2:].T
print(scores.tolist())
```