
Morgan Funtowicz

mfuntowicz

AI & ML interests

Low-level model inference optimization, hardware affinity, and large-scale distributed training.

Organizations

Hugging Face, BigScience Workshop, AWS Inferentia and Trainium, Hugging Face Infinity, Hugging Face Optimum, Need4Speed, Hugging Face Smol Cluster, Optimum Nvidia, Optimum AMD, gg-hf, Optimum-TPU, hsramall, Optimum-Intel, gg-tt, Hugging Face Machine Learning Optimization, Optimum Internal Testing, blhf, Huggingface HUGS, smol-explorers, Koin Project

mfuntowicz's activity

Published an article 2 months ago: Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference, by mfuntowicz and 1 other (71)

Published an article 6 months ago: Introducing the AMD 5th Gen EPYC™ CPU (6)

Published an article 10 months ago

Published an article about 1 year ago: CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG, by peterizsak and 5 others (9)

Published an article about 1 year ago: Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive, by sschoenmeyer and 2 others (4)

Published an article over 1 year ago: AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU (2)

Published an article over 1 year ago: Optimum-NVIDIA - Unlock blazingly fast LLM inference in just 1 line of code, by laikh-nvidia and 1 other (5)

Published an article over 1 year ago: Accelerating over 130,000 Hugging Face models with ONNX Runtime, by sschoenmeyer and 1 other (1)

Published an article about 3 years ago: Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs, by philschmid and 2 others (2)

Published an article over 3 years ago: Scaling up BERT-like model Inference on modern CPU - Part 2, by mfuntowicz and 3 others (1)

Published an article over 3 years ago: Introducing Optimum: The Optimization Toolkit for Transformers at Scale, by mfuntowicz and 3 others (1)

Published an article almost 4 years ago