---
library_name: diffusers
---

# LTX-Video-0-9-6-HFIE

`LTX-Video-0-9-6-HFIE` is a version of LTX-Video 0.9.6 (distilled) that can be deployed to a Hugging Face Inference Endpoint.

It is used in production by AiTube2 (https://aitube.at).

# Deployment

When you create a Hugging Face Inference Endpoint (in the web UI, or programmatically as sketched below), make sure to:

- select an Nvidia L40S (at least)
- in the "Advanced Settings" tab, select "Download Pattern" > "Download everything"

My current implementation works in either text-to-video or image-to-video mode.

This is controlled by the `SUPPORT_INPUT_IMAGE_PROMPT` environment variable set on the endpoint, which must be a truthy or falsy value.
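
For reference, a truthy/falsy flag like this is usually parsed along these lines (a hypothetical sketch; the actual handler in this repository may read the variable differently):

```python
import os

# Hypothetical sketch: treat common "truthy" strings as True, anything else as False.
SUPPORT_INPUT_IMAGE_PROMPT = os.environ.get("SUPPORT_INPUT_IMAGE_PROMPT", "").lower() in ("1", "true", "yes", "on")

if SUPPORT_INPUT_IMAGE_PROMPT:
    print("image-to-video mode enabled: payloads may include an input image")
else:
    print("text-to-video mode only")
```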



# Usage

Once deployed, you can use it like this:

```python
import requests
import base64
from PIL import Image
from io import BytesIO

API_URL = "https://<USE YOUR OWN>.aws.endpoints.huggingface.cloud"

API_TOKEN = "<USE YOUR OWN>"

def query(payload):
    response = requests.post(API_URL, headers={
        "Accept": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json" 
    }, json=payload)
    return response.json()

def save_video(json_response, filename):
    # Stop early if the endpoint returned an error message
    if json_response.get("error"):
        print(json_response["error"])
        return

    try:
        # Extract the video data URI from the response
        video_data_uri = json_response["video"]
    except Exception:
        message = str(json_response)
        print(message)
        raise ValueError(message)
    
    # Remove the data URI prefix to get just the base64 data
    # Assumes format like "data:video/mp4;base64,<actual_base64_data>"
    base64_data = video_data_uri.split(",")[1]
    
    # Decode the base64 data
    video_data = base64.b64decode(base64_data)
    
    # Write the binary data to an MP4 file
    with open(filename, "wb") as f:
        f.write(video_data)

def encode_image(image_path):
    """
    Load and encode an image file to base64
    
    Args:
        image_path (str): Path to the image file
        
    Returns:
        str: Base64 encoded image data URI
    """

    with Image.open(image_path) as img:
        # Convert to RGB if necessary
        if img.mode != "RGB":
            img = img.convert("RGB")
        
        # Save image to bytes
        img_byte_arr = BytesIO()
        img.save(img_byte_arr, format="JPEG")

        # Encode to base64
        base64_encoded = base64.b64encode(img_byte_arr.getvalue()).decode('utf-8')
        return f"data:image/jpeg;base64,{base64_encoded}"

# Example usage with image-to-video generation
image_filename = "input.jpg"  # Path to your input image
video_filename = "output.mp4"

config = {
    "inputs": {
        "prompt": "magnificent underwater footage, clownfishes swimming around coral inside the carribean sea, real gopro footage",
        # "image": encode_image(image_filename)
    },
    
    "parameters": {

        # ------------------- settings for LTX-Video -----------------------
        
        #"negative_prompt": "saturated, highlight, overexposed, highlighted, overlit, shaking, too bright, worst quality, inconsistent motion, blurry, jittery, distorted, cropped, watermarked, watermark, logo, subtitle, subtitles, lowres",

        # note about resolution:
        # we cannot use 720 since it cannot be divided by 32
        #
        # for a cinematic look:
        "width": 768,
        "height": 480,

        # for a vertical video look:
        #"width": 480,
        #"height": 768,

        # LTX-Video requires a frame number divisible by 8, plus one frame
        # note: glitches might appear if you use more than 168 frames
        "num_frames": (8 * 16) + 1,

        "num_inference_steps": 8,

        "guidance_scale": 1.0,

        #"seed": 1209877,

        # This will double the number of frames.
        # You can activate this if you want:
        # - a slow motion effect (in that case use double_num_frames=True and fps=24, 25 or 30)
        # - a HD soap / video game effect (in that case use double_num_frames=True and fps=60)
        "double_num_frames": True,

        # controls the number of frames per second
        # use this in combination with the num_frames and double_num_frames settings to control the duration and "feel" of your video
        "fps": 60, # typical values are: 24, 25, 30, 60

        # upscale the video using Real-ESRGAN.
        # This upscaling algorithm is relatively fast,
        # but might create an uncanny "3D render" or "drawing" effect.
        "super_resolution": True,
    }
}

# Make the API call
output = query(config)

# Save the video
save_video(output, video_filename)
```
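
For image-to-video mode (available when the endpoint was deployed with `SUPPORT_INPUT_IMAGE_PROMPT` enabled), send the encoded input image alongside the prompt. A minimal variation of the example above, reusing the `encode_image` helper:

```python
# Image-to-video variant: keep the same parameters, but also send an input image.
config["inputs"]["image"] = encode_image(image_filename)

output = query(config)
save_video(output, "output_image_to_video.mp4")
```

With the settings above (`num_frames = (8 * 16) + 1 = 129`, `double_num_frames = True`, `fps = 60`), the generated clip lasts roughly 129 × 2 / 60 ≈ 4.3 seconds.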