
Morgan Funtowicz

mfuntowicz

AI & ML interests

Model inference low-level optimization, hardware affinity and large-scale distributed training.

Organizations

Hugging Face, BigScience Workshop, AWS Inferentia and Trainium, Hugging Face Infinity, Hugging Face Optimum, Hugging Test Lab, Need4Speed, Hugging Face Smol Cluster, Optimum AMD, Optimum Nvidia, gg-hf, Optimum-TPU, hsramall, Optimum-Intel, gg-tt, Hugging Face Machine Learning Optimization, Optimum Internal Testing, blhf, Huggingface HUGS, smol-explorers, Koin Project, hf-inference, Inference Endpoints Images

mfuntowicz's activity

upvoted an article 20 days ago

The Transformers Library: standardizing model definitions

By lysandre and 3 others • 109

Thanks @vectorventures! Word-level timestamps are not yet supported, and from my reading, it's not perfect through the OpenAI platform with Whisper either.

I was looking at the CrisperWhisper model @razhan mentioned, and it seems to provide good/better word-level timestamps out of the box, so maybe this is something we can investigate in the coming weeks? Would you like to take a look and potentially open a PR for this?


@RavirajDarisi Of course, you can deploy your own endpoint, for instance, and use its URL within your Android app 👌 - let me know if you need more guidance.


We do provide a precompiled image if you want: docker pull mfuntowicz/endpoints-whisper-vllm:v1.0.2-py312

To run it with Docker, invoke something like this:

$> docker run --gpus 1 -p 8000:80 -e MODEL_ID=openai/whisper-large-v3-turbo mfuntowicz/endpoints-whisper-vllm:v1.0.2-py312

This starts the container and exposes it on port 8000 of your local machine. You can, for instance, navigate to http://localhost:8000/docs to see the API documentation.
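Once the container is up, you could query it with a request along these lines. This is a hypothetical sketch: the route (/v1/audio/transcriptions), the field names, and the sample file name are assumptions based on the common OpenAI-compatible transcription API shape, so check http://localhost:8000/docs for the actual paths and parameters.

```shell
# Hypothetical example - verify the real route and fields at http://localhost:8000/docs.
# Uploads a local audio file (sample.wav, placeholder name) as a multipart form
# to an assumed OpenAI-compatible transcription endpoint.
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@sample.wav" \
  -F "model=openai/whisper-large-v3-turbo"
```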

If you want more information, the repository @Younggun linked should provide it, and of course, feel free to reply here 🤗


If you pull the Docker image locally and have a GPU available on your computer/server, it should work, yes 👌


I can give it a try in the coming days!

upvoted an article 22 days ago

Blazingly fast whisper transcriptions with Inference Endpoints

By mfuntowicz and 5 others • 66
New activity in freddyaboulton/really-fast-whisper 22 days ago

update link to endpoint (#1, opened 22 days ago by mfuntowicz)
published an article 22 days ago

Blazingly fast whisper transcriptions with Inference Endpoints

By mfuntowicz and 5 others • 66
New activity in hfendpoints-images/text-generation-sglang-gpu about 1 month ago

Use transformers backend (#2, opened about 1 month ago by mfuntowicz)

Enable Tool Functions Call (#1, opened about 1 month ago by mfuntowicz)