- Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference — by mfuntowicz and 1 other, Jan 16
- Hugging Face on AMD Instinct MI300 GPU — by mfuntowicz and 3 others, May 21, 2024
- CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG — by peterizsak and 5 others, Mar 15, 2024
- Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive — by sschoenmeyer and 2 others, Jan 15, 2024
- AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU — Dec 5, 2023
- Optimum-NVIDIA — Unlock blazingly fast LLM inference in just 1 line of code — by laikh-nvidia and 1 other, Dec 5, 2023
- Accelerating over 130,000 Hugging Face models with ONNX Runtime — by sschoenmeyer and 1 other, Oct 4, 2023
- Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs — by philschmid and 2 others, Jan 13, 2022
- Scaling up BERT-like model Inference on modern CPU — Part 2 — by mfuntowicz and 3 others, Nov 4, 2021
- Introducing Optimum: The Optimization Toolkit for Transformers at Scale — by mfuntowicz and 3 others, Sep 14, 2021