Welcome to the Inference Acceleration Team at the Zhejiang Innovation Research Institute. We work on efficient large-model inference on NVIDIA and domestically developed GPU platforms, with a focus on inference acceleration techniques such as speculative decoding and model quantization. Feel free to explore our space!
