Loading checkpoint shards: killed at 33%

#1
by awppatel - opened

Hi,

I am running an application that uses this model, and loading gets killed at the 33% mark. My setup is as follows:

Ubuntu Server 22.04 w/ drivers
1x RTX A6000 (48GB) [Premium]
6 vCPU, 96 GB RAM, 300 GB Storage

htop shows that I have almost 90 GB of RAM free; furthermore, the GPU has 48 GB of memory.

Here is a sample of the code I use to load the model:

import gradio as gr
import torch
import transformers
import librosa
import numpy as np
import tempfile
import os
from kokoro import KPipeline
import soundfile as sf
from typing import Dict, Optional, Tuple
import huggingface_hub
from huggingface_hub import login

class VoiceAssistant:
    # Available voices with their configurations
    VOICES = {
        "Bella (US Female)": {"code": "af_bella", "lang_code": "a"},
        "Nicole (US Female)": {"code": "af_nicole", "lang_code": "a"},
        "Michael (US Male)": {"code": "am_michael", "lang_code": "a"},
        "Emma (UK Female)": {"code": "bf_emma", "lang_code": "b"},
        "George (UK Male)": {"code": "bm_george", "lang_code": "b"}
    }

    def __init__(self):
        """Initialize both Ultravox and Kokoro TTS models"""
        access_token_read = "token i got from Huggingface for gated repo"
        login(token=access_token_read)
        print("Loading Ultravox model... This may take a few minutes...")
        self.pipe = transformers.pipeline(
            model='fixie-ai/ultravox-v0_5-llama-3_3-70b',  # Updated to v0_5
            # model='fixie-ai/ultravox-v0_4',  # Original v0_4
            trust_remote_code=True
        )
        print("Model loaded successfully!")
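For reference, `transformers.pipeline` accepts `torch_dtype` and `device_map` arguments that reduce load-time memory (a reduced-precision dtype plus accelerate-style offload). Whether Ultravox's `trust_remote_code` loader passes these through is an assumption, and as the maintainers note below, on this hardware the 70B weights still will not fit, but the sketch shows the standard knobs:

```python
import transformers

# Sketch only (assumption: the Ultravox remote code honors these kwargs).
# "bfloat16" halves weight memory vs fp32; device_map="auto" lets
# `accelerate` spill layers that don't fit on the GPU to CPU/disk.
load_kwargs = {
    "torch_dtype": "bfloat16",   # ~2 bytes per parameter instead of 4
    "device_map": "auto",        # requires `pip install accelerate`
}

def load_ultravox():
    # Deliberately not called here: even in bf16 the 70B weights are
    # roughly 140 GB, more than this machine's RAM and VRAM combined.
    return transformers.pipeline(
        model="fixie-ai/ultravox-v0_5-llama-3_3-70b",
        trust_remote_code=True,
        **load_kwargs,
    )
```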

Could you please let me know what I am missing?

Thanks,
Sincerely,
Arshad.

Fixie.ai org

The 70B model has higher memory requirements. Quantization could be a potential solution, but it is not currently supported.
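The arithmetic makes the failure concrete. A rough weight-only estimate (ignoring activations and KV cache, and assuming a dense 70B-parameter model):

```python
# Back-of-envelope weight-memory estimate for a 70B-parameter model.
PARAMS = 70e9

def weights_gb(bytes_per_param: float) -> float:
    """Approximate weight size in GB for a given parameter width."""
    return PARAMS * bytes_per_param / 1e9

fp32 = weights_gb(4)    # full precision: ~280 GB
bf16 = weights_gb(2)    # half precision: ~140 GB
int4 = weights_gb(0.5)  # 4-bit quantized: ~35 GB (not currently supported)

# Even in bf16 the weights alone (~140 GB) exceed the 96 GB of system
# RAM and the 48 GB A6000 combined, so the OS OOM-kills the process
# partway through loading the checkpoint shards.
```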

What is the minimum requirement for that model? Please let me know so we can configure appropriately.

Thanks

Fixie.ai org

For the 70B model, we use vLLM and multiple H100s for serving. We haven't attempted loading the model on a single GPU yet.

Thanks. Will keep that in mind in the configuration.

awppatel changed discussion status to closed
