Can this model be used with vllm?

#3
by xiabo0816 - opened

Newbie question: how should I go about deploying this model with vllm?

HITsz-Text and Multimodal Generative Intelligence Group(TMG) org

You can start with the official vllm documentation: https://docs.vllm.ai/en/v0.7.1/getting_started/examples/embedding.html
For quality and performance, we recommend comparing the results against sentence-transformers to rule out bugs in the vllm framework itself.
Note: because this model modifies the attention to be bidirectional, it needs to load the modeling.py from this repository, so pass trust_remote_code=True when loading the model.
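One quick way to do the comparison suggested above is to embed the same texts with both backends and check that the cosine-similarity matrices agree. The helper below is a minimal sketch in pure NumPy (the backend-specific loading code is omitted; the toy vectors are only for illustration):

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy sanity check: identical vectors score 1.0, orthogonal vectors 0.0.
# In practice, `q` and `d` would be embeddings of the same texts produced
# by vllm and sentence-transformers respectively.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cosine_similarity_matrix(q, d))
```

If the two backends behave the same, the similarity matrices computed from their embeddings should match to within small numerical tolerance.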

HITsz-Text and Multimodal Generative Intelligence Group(TMG) org
edited 13 days ago

Newbie question: how should I go about deploying this model with vllm?

Hi, the KaLM-Embedding model has been adapted for vllm. You can refer to the following code:

import torch
from vllm import LLM

def get_detailed_instruct(task_description: str, query: str) -> str:
    # Format a query with its task instruction, as expected by the model.
    return f'Instruct: {task_description}\nQuery: {query}'

task = 'Given a query, retrieve documents that answer the query'
queries = [
    get_detailed_instruct(task, 'What is the capital of China?'),
    get_detailed_instruct(task, 'Explain gravity')
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
]
input_texts = queries + documents

# trust_remote_code=True is required so vllm loads the repository's modeling.py.
model = LLM(model="{MODEL_NAME_OR_PATH}", task="embed", trust_remote_code=True, dtype="float16")

outputs = model.embed(input_texts)
embeddings = torch.tensor([o.outputs.embedding for o in outputs])
# Query-document similarity: queries are the first two rows, documents the rest.
scores = embeddings[:2] @ embeddings[2:].T
print(scores.tolist())
