
Morgan Funtowicz

mfuntowicz

AI & ML interests

Model inference low-level optimization, hardware affinity and large-scale distributed training.

Organizations

Hugging Face, BigScience Workshop, AWS Inferentia and Trainium, Hugging Face Infinity, Hugging Face Optimum, Hugging Test Lab, Need4Speed, Hugging Face Smol Cluster, Optimum AMD, Optimum Nvidia, gg-hf, Optimum-TPU, hsramall, Optimum-Intel, gg-tt, Hugging Face Machine Learning Optimization, Optimum Internal Testing, blhf, Huggingface HUGS, smol-explorers, Koin Project, hf-inference, Inference Endpoints Images

mfuntowicz's activity

upvoted an article 20 days ago

The Transformers Library: standardizing model definitions

By lysandre and 3 others • 109

Thanks @vectorventures! Word-level timestamps are not yet supported, and from my reading, it's not perfect through the OpenAI platform with Whisper either.

I was looking at the CrisperWhisper model @razhan mentioned, and it seems to provide good/better word-level timestamps out of the box, so maybe this is something we can investigate in the coming weeks? Would you like to take a look and potentially open a PR for this?


@RavirajDarisi Of course, you can deploy your own endpoint, for instance, and use its URL within your Android app 👌 - let me know if you need more guidance.


We do provide a precompiled image if you want: docker pull mfuntowicz/endpoints-whisper-vllm:v1.0.2-py312

To run it with Docker, invoke something like this:

$> docker run --gpus 1 -p 8000:80 -e MODEL_ID=openai/whisper-large-v3-turbo mfuntowicz/endpoints-whisper-vllm:v1.0.2-py312

This starts the container and exposes it on port 8000 of your local machine. You can, for instance, navigate to http://localhost:8000/docs to see the API documentation.
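Once the container is up, you could query it with a request along these lines. This is a hypothetical sketch: the route (/v1/audio/transcriptions), the field names, and the sample file name are assumptions based on the common OpenAI-compatible transcription API shape, so check http://localhost:8000/docs for the actual paths and parameters.

```shell
# Hypothetical example - verify the real route and fields at http://localhost:8000/docs.
# Uploads a local audio file (sample.wav, placeholder name) as a multipart form
# to an assumed OpenAI-compatible transcription endpoint.
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@sample.wav" \
  -F "model=openai/whisper-large-v3-turbo"
```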

If you want more information, the repository @Younggun linked should provide it, and of course, feel free to reply here 🤗


If you pull the Docker image locally and have a GPU available on your computer/server, it should work, yes 👌


I can give it a try in the coming days!

upvoted an article 22 days ago

Blazingly fast whisper transcriptions with Inference Endpoints

By mfuntowicz and 5 others • 66
New activity in freddyaboulton/really-fast-whisper 22 days ago

update link to endpoint (#1, opened 22 days ago by mfuntowicz)
published an article 22 days ago

Blazingly fast whisper transcriptions with Inference Endpoints

By mfuntowicz and 5 others • 66
New activity in hfendpoints-images/text-generation-sglang-gpu about 1 month ago

Use transformers backend (#2, opened about 1 month ago by mfuntowicz)

Enable Tool Functions Call (#1, opened about 1 month ago by mfuntowicz)