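Creating Demos with Spaces and Gradio

Author: Diego Maniloff

Introduction

In this notebook we show how to bring any machine learning model to life with Gradio, a library that lets you create a web demo from any Python function and share it with the world 🌎!

📚 This notebook covers:

A Hello, World! demo: Gradio basics
Moving your demo to Hugging Face Spaces
Making it more interesting: a real-world use case that leverages the 🤗 Hub
Some of the cool "batteries included" features of Gradio

⏭️ At the end of this notebook you will find a Further Reading list with links to help you keep exploring.

Setup

To get started, install the gradio library along with transformers.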

!pip -q install gradio==4.36.1
!pip -q install transformers==4.41.2

# the usual shorthand is to import gradio as gr
import gradio as gr

Your first demo: Gradio basics

At its core, Gradio turns any Python function into a web interface.

Say we have a simple function that takes name and intensity as parameters and returns a string, like so:

def greet(name: str, intensity: int) -> str:
    return "Hello, " + name + "!" * int(intensity)

If you call this function with the name 'Diego', you get an output string like this:

>>> print(greet("Diego", 3))

Hello, Diego!!!
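
The int(intensity) cast inside greet deserves a short note. The sketch below assumes that a UI control such as a slider may hand the function a float instead of an int (an assumption about the component, not something this notebook states) and shows why the cast keeps the string repetition valid in that case.

```python
def greet(name: str, intensity) -> str:
    # int() keeps the repetition count valid even if intensity arrives as a float
    return "Hello, " + name + "!" * int(intensity)


# Without the cast, repeating a string by a float raises TypeError.
try:
    "!" * 3.0
    float_repeat_works = True
except TypeError:
    float_repeat_works = False

print(float_repeat_works)
print(greet("Diego", 3.0))
```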

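With Gradio, we can build an interface for this function via the gr.Interface class. All we need to do is pass in the greet function we created above, along with the kinds of inputs and outputs the function expects: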

demo = gr.Interface(
    fn=greet,
    inputs=["text", "slider"],  # the inputs are a text box and a slider ("text" and "slider" are components in Gradio)
    outputs=["text"],  # the output is a text box
)

Notice that we passed ["text", "slider"] as inputs and ["text"] as outputs. In Gradio these are called components.
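
The string shortcuts stand for component classes, so the same interface can be written with explicit gr.Textbox and gr.Slider objects when you want to configure them. The sketch below is one way to do that; the labels and the 1 to 10 slider range are illustrative choices of mine, not values from the notebook, and build_demo is a hypothetical helper name.

```python
def greet(name: str, intensity: int) -> str:
    return "Hello, " + name + "!" * int(intensity)


def build_demo():
    # Imported inside the function so greet can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=greet,
        inputs=[
            gr.Textbox(label="name"),
            # Range and step are illustrative, not taken from the notebook.
            gr.Slider(minimum=1, maximum=10, step=1, label="intensity"),
        ],
        outputs=gr.Textbox(label="greeting"),
    )


# In a notebook cell you would then run: build_demo().launch()
```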

That is all we need for our first demo. Go ahead and try it out 👇🏼! Type your name into the name textbox, slide the intensity you want, and click Submit.

# the launch method will fire up the interface we just created
demo.launch()

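Let's make it more interesting: a meeting transcription tool

At this point you know how to turn a basic Python function into a web-ready demo. So far, however, we have only done this for a very simple, even boring, function!

Let's now consider a more interesting example, one that highlights what Gradio was originally built for: demoing cutting-edge machine learning models. A good friend of mine recently asked me for help with an audio recording of an interview she had done. She needed to turn the audio file into a well-organized text summary. How did I help her? I built a Gradio app!

Let's walk through how to build the meeting transcription tool. We can think of the process as two parts:

Transcribe the audio file into text
Organize the text into sections, paragraphs, lists, and so on. We could also include summarization here.

Audio-to-text

In this section we build a demo that handles the first step of the meeting transcription tool: converting audio into text.

As we learned, the key ingredient of a Gradio demo is a Python function that executes the logic we want to showcase. For the audio-to-text conversion, we will use the transformers library and its pipeline utility to run a popular audio-to-text model, distil-whisper/distil-large-v3.

Here is the transcribe function, which takes the audio we want to convert as input: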

import os
import tempfile

import torch
import gradio as gr
from transformers import pipeline

device = 0 if torch.cuda.is_available() else "cpu"

AUDIO_MODEL_NAME = (
    "distil-whisper/distil-large-v3"  # faster and very close in performance to the full-size "openai/whisper-large-v3"
)
BATCH_SIZE = 8


pipe = pipeline(
    task="automatic-speech-recognition",
    model=AUDIO_MODEL_NAME,
    chunk_length_s=30,
    device=device,
)


def transcribe(audio_input):
    """Function to convert audio to text."""
    if audio_input is None:
        raise gr.Error("No audio file submitted!")

    output = pipe(audio_input, batch_size=BATCH_SIZE, generate_kwargs={"task": "transcribe"}, return_timestamps=True)
    return output["text"]
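
The transcribe function above asks for return_timestamps=True but returns only output["text"]. If you would rather show where each passage occurs, the timestamped chunks can be rendered line by line. The sketch below assumes the pipeline result also carries a "chunks" list whose items hold a (start, end) "timestamp" tuple and a "text" string; format_chunks is a hypothetical helper, and the sample dictionary is hand-written rather than real model output.

```python
def _label(seconds, fallback: str) -> str:
    # A boundary can be None when it could not be determined.
    return f"{seconds:.1f}s" if seconds is not None else fallback


def format_chunks(output: dict) -> str:
    """Render each timestamped chunk as '[start-end] text' on its own line."""
    lines = []
    for chunk in output.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{_label(start, 'start')}-{_label(end, 'end')}] {chunk['text'].strip()}")
    return "\n".join(lines)


# Hand-written stand-in for a pipeline result; not real model output.
sample_output = {
    "text": " Good evening. Welcome to the meeting.",
    "chunks": [
        {"timestamp": (0.0, 1.2), "text": " Good evening."},
        {"timestamp": (1.2, None), "text": " Welcome to the meeting."},
    ],
}

print(format_chunks(sample_output))
```

Returning format_chunks(output) instead of output["text"] from transcribe would surface the timestamps in the demo's textbox.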

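Now that we have our Python function, we can demo it by passing it to gr.Interface. Notice that in this case the input the function expects is the audio we want to convert. Gradio includes many useful components, one of which is Audio, exactly what our demo needs 🎶 😎.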

part_1_demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"),  # "filepath" passes a str path to a temporary file containing the audio
    outputs=gr.Textbox(show_copy_button=True),  # give users the option to copy the results
    title="Transcribe Audio to Text",  # give our demo a title :)
)

part_1_demo.launch()

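Go ahead and try it out 👆! You can upload an .mp3 file or click the 🎤 button to record your own voice.

If you want a sample file with an actual meeting recording, check out the MeetingBank_Audio dataset, which contains audio of city council meetings from 6 major U.S. cities. For my own testing, I tried a couple of the Denver meetings.

Also, check out the from_pipeline constructor of Interface, which builds an Interface directly from a pipeline.

Organize & summarize text

For the second part of the meeting transcription tool, we need to organize the text transcribed in the previous step.

Once again, to build a Gradio demo we need a Python function with the logic we care about. For text organization and summarization, we will use an "instruction-tuned" model, one trained to follow a broad range of tasks. There are many options to pick from, such as meta-llama/Meta-Llama-3-8B-Instruct or mistralai/Mistral-7B-Instruct-v0.3. For our example, we will use microsoft/Phi-3-mini-4k-instruct.

Just like in part 1, we could use the pipeline utility from transformers for this, but we will take the opportunity to showcase the Serverless Inference API instead. It is an API within the Hugging Face Hub that lets us use thousands of publicly accessible (or your own privately permissioned) machine learning models for free! See the Serverless Inference API section of the cookbook.

Using the Serverless Inference API means that instead of calling a model via a pipeline (as we did for the audio conversion part), we call it through InferenceClient, which is part of the huggingface_hub library (the Hub Python library). To use InferenceClient, we need to log in to the 🤗 Hub with notebook_login(), which opens a dialog box asking for your User Access Token to authenticate.

You can manage your tokens from your personal settings page. Please use fine-grained tokens wherever possible for better security.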

from huggingface_hub import notebook_login, InferenceClient

# running this will prompt you to enter your Hugging Face credentials
notebook_login()

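Now that we are logged in to the Hub, we can write our text processing function using the Serverless Inference API via InferenceClient.

The code for this part is structured into two functions:

build_messages, which formats the message prompt into the structure the LLM expects;
organize_text, which actually passes the raw meeting text to the LLM for organization (and summarization, depending on the prompt we provide).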

# sample meeting transcript from huuuyeah/MeetingBank_Audio
# this is just a copy-paste from the output of part 1 using one of the Denver meetings
sample_transcript = """
 Good evening. Welcome to the Denver City Council meeting of Monday, May 8, 2017. My name is Kelly Velez. I'm your Council Secretary. According to our rules of procedure, when our Council President, Albus Brooks, and Council President Pro Tem, JoLynn Clark, are both absent, the Council Secretary calls the meeting to order. Please rise and join Councilman Herndon in the Pledge of Allegiance. Madam Secretary, roll call. Roll call. Here. Mark. Espinosa. Here. Platt. Delmar. Here. Here. Here. Here. We have five members present. There is not a quorum this evening. Many of the council members are participating in an urban exploration trip in Portland, Oregon, pursuant to Section 3.3.4 of the city charter. Because there is not a quorum of seven council members present, all of tonight's business will move to next week, to Monday, May 15th. Seeing no other business before this body except to wish Councilwoman Keniche a very happy birthday this meeting is adjourned Thank you. A standard model and an energy efficient model likely will be returned to you in energy savings many times during its lifespan. Now, what size do you need? Air conditioners are not a one-size-or-type fits all. Before you buy an air conditioner, you need to consider the size of your home and the cost to operate the unit per hour. Do you want a room air conditioner, which costs less but cools a smaller area, or do you want a central air conditioner, which cools your entire house but costs more? Do your homework. Now, let's discuss evaporative coolers. In low humidity areas, evaporating water into the air provides a natural and energy efficient means of cooling. Evaporative coolers, also called swamp coolers, cool outdoor air by passing it over water saturated pads, causing the water to evaporate into it. Evaporative coolers cost about one half as much to install as central air conditioners and use about one-quarter as much energy. 
However, they require more frequent maintenance than refrigerated air conditioners, and they're suitable only for areas with low humidity. Watch the maintenance tips at the end of this segment to learn more. And finally, fans. When air moves around in your home, it creates a wind chill effect. A mere two-mile-an-hour breeze will make your home feel four degrees cooler and therefore you can set your thermostat a bit higher. Ceiling fans and portable oscillating fans are cheap to run and they make your house feel cooler. You can also install a whole house fan to draw the hot air out of your home. A whole house fan draws cool outdoor air inside through open windows and exhausts hot room air through the attic to the outside. The result is excellent ventilation, lower indoor temperatures, and improved evaporative cooling. But remember, there are many low-cost, no-cost ways that you can keep your home cool. You should focus on these long before you turn on your AC or even before you purchase an AC. But if you are going to purchase a new cooling system, remember to get one that's energy efficient and the correct size for your home. Wait, wait, don't go away, there's more. After this segment of the presentation is over, you're going to be given the option to view maintenance tips about air conditioners and evaporative coolers. Now all of these tips are brought to you by the people at Xcel Energy. Thanks for watching.
"""

from huggingface_hub import InferenceClient

TEXT_MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"

client = InferenceClient()


def organize_text(meeting_transcript):
    messages = build_messages(meeting_transcript)
    response = client.chat_completion(messages, model=TEXT_MODEL_NAME, max_tokens=250, seed=430)
    return response.choices[0].message.content


def build_messages(meeting_transcript) -> list:
    system_input = "You are an assistant that organizes meeting minutes."
    user_input = """Take this raw meeting transcript and return an organized version.
    Here is the transcript:
    {meeting_transcript}
    """.format(
        meeting_transcript=meeting_transcript
    )

    messages = [
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
    ]
    return messages

Now that we have our text organization function organize_text, we can build a demo for it as well:

part_2_demo = gr.Interface(
    fn=organize_text,
    inputs=gr.Textbox(value=sample_transcript),
    outputs=gr.Textbox(show_copy_button=True),
    title="Clean Up Transcript Text",
)
part_2_demo.launch()

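Go ahead and try it out 👆! If you hit "Submit" in the demo above, you will see that the output text is a clearer, better-organized version of the transcript, with a title and sections for the different parts of the meeting.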

Try editing the user_input variable, which controls the LLM prompt, to see if you can get a summary.
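
One possible way to approach that exercise is a variant of build_messages whose instruction asks for a summary. This is only a sketch: build_summary_messages and its max_bullets parameter are hypothetical names introduced here, and the prompt wording is a starting point rather than a tested recipe.

```python
def build_summary_messages(meeting_transcript: str, max_bullets: int = 5) -> list:
    system_input = "You are an assistant that summarizes meeting minutes."
    # The instruction replaces "return an organized version" with a summary request.
    user_input = (
        f"Summarize this raw meeting transcript in at most {max_bullets} bullet points.\n"
        "Here is the transcript:\n"
        f"{meeting_transcript}\n"
    )
    return [
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
    ]


messages = build_summary_messages(
    "Roll call. Five members present. Meeting adjourned.", max_bullets=3
)
print(messages[1]["content"])
```

Swapping it in for build_messages inside organize_text is the only other change needed to try it.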

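Putting it all together

At this point we have a function for each of the two steps of the meeting transcription tool:

Convert the audio into text, and
Organize that text into a well-formatted meeting document.

All we have to do next is stitch these two functions together and build a demo for the combined steps. In other words, our complete meeting transcription tool is just a new function (which we will creatively call meeting_transcript_tool 😀) that passes the output of transcribe into organize_text: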

def meeting_transcript_tool(audio_input):
    meeting_text = transcribe(audio_input)
    organized_text = organize_text(meeting_text)
    return organized_text


full_demo = gr.Interface(
    fn=meeting_transcript_tool,
    inputs=gr.Audio(type="filepath"),
    outputs=gr.Textbox(show_copy_button=True),
    title="The Complete Meeting Transcription Tool",
)
full_demo.launch()

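Go ahead and try it out 👆! This is now the full demo of our transcription tool. If you give it an audio file, the output will be an already-organized (and potentially summarized) version of the meeting. Super cool 😎.

Move your demo into 🤗 Spaces

If you made it this far, congratulations: you now know the basics of creating a demo of your machine learning model with Gradio 👏!

Up next, we show how to move your brand-new demo to Hugging Face Spaces. On top of the ease of use and power of Gradio, moving your demo to 🤗 Spaces gives you permanent hosting, easy deployment each time you update your app, and the ability to share your work with anyone! Note that your Space will go to sleep if you do not use or change it for a long time.

The first step is to head over to https://huggingface.co/new-space, select "Gradio" from the templates, and leave the rest of the options at their defaults (you can change these later).

This creates a new Space that you can populate with your demo code. As an example, I created the 🤗 Space dmaniloff/meeting-transcript-tool, which you can visit on the Hugging Face Hub.

There are two files we need to edit:

app.py: this is where the demo code lives. It should look something like this: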

# outline of app.py:

def meeting_transcript_tool(...):
   ...

def transcribe(...):
   ...

def organize_text(...):
   ...

requirements.txt: this is where we tell our Space which libraries it will need. It should look something like this:
```
# contents of requirements.txt:
torch
transformers
```

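Gradio comes with batteries included 🔋

Gradio comes with lots of cool functionality out of the box. We cannot cover all of it in this notebook, but here are three features we will look at:

Access as an API
Sharing via a public URL
Flagging

Access as an API

One benefit of building your web demo with Gradio is that you automatically get an API 🙌! This means you can access the functionality of your Python function with a standard HTTP client such as curl or the Python requests library.

If you look closely at the demos we created above, you will see a link at the bottom that says "Use via API". Clicking that link in the Space I created (dmaniloff/meeting-transcript-tool) opens a page with a code snippet for calling the Space.

Let's copy and paste that code to use our Space as an API: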

!pip install gradio_client

>>> from gradio_client import Client, handle_file

>>> client = Client("dmaniloff/meeting-transcript-tool")
>>> result = client.predict(
...     audio_input=handle_file("https://github.com/gradio-app/gradio/raw/main/test/test_files/audio_sample.wav"),
...     api_name="/predict",
... )
>>> print(result)

Loaded as API: https://dmaniloff-meeting-transcript-tool.hf.space ✔
Certainly! Below is an organized version of a hypothetical meeting transcript. Since the original transcript you've provided is quite minimal, I'll create a more detailed and structured example featuring a meeting summary.

---

# Meeting Transcript: Project Alpha Kickoff

**Date:** April 7, 2023

**Location:** Conference Room B, TechCorp Headquarters


**Attendees:**

- John Smith (Project Manager)

- Emily Johnson (Lead Developer)

- Michael Brown (Marketing Lead)

- Lisa Green (Design Lead)


**Meeting Duration:** 1 hour 30 minutes


## Opening Remarks

**John Smith:**

Good morning everyone, and thank you for joining this kickoff meeting for Project Alpha. Today, we'll discuss our project vision, milestones, and roles. Let's get started.


## Vision and Goals

**Emily Johnson:**

The main goal of Project Alpha is to

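Wow! What just happened? Let's break it down:

We installed gradio_client, a package designed specifically for interacting with APIs built with Gradio.
We instantiated the client by providing the name of the 🤗 Space we want to query.
We called the client's predict method and passed in a sample audio file.

The Gradio client takes care of making the HTTP POST for us, and it also provides functionality such as reading the input audio file (via the handle_file function) that our meeting transcription tool will process.

Again, using this client is a choice: you can just as well run curl -X POST https://dmaniloff-meeting-transcript-tool.hf.space/call/predict [...] and pass in all the parameters the request needs.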
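
To make the raw HTTP route a little more concrete, here is a sketch that builds (but does not send) such a POST request with the Python standard library. The body shape is my assumption: a "data" list with one entry per function input, and a {"path": ...} object for the audio file. Compare it with the snippet on your Space's "Use via API" page before relying on it. As far as I know, the POST response carries an event ID and the result is fetched with a follow-up request; treat that as an assumption too.

```python
import json
from urllib.request import Request

SPACE_URL = "https://dmaniloff-meeting-transcript-tool.hf.space"
AUDIO_URL = "https://github.com/gradio-app/gradio/raw/main/test/test_files/audio_sample.wav"


def build_predict_request(audio_url: str) -> Request:
    # Assumed payload shape: one entry in "data" per function input,
    # with a file input described by a {"path": ...} object.
    payload = {"data": [{"path": audio_url}]}
    return Request(
        url=f"{SPACE_URL}/call/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(AUDIO_URL)
print(request.get_method(), request.full_url)
# Sending it (not done here) would be: urllib.request.urlopen(request)
```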

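The output of the call above is a made-up meeting generated by the LLM we use for text organization, because the sample input file is not an actual meeting recording. You can tweak the LLM's prompt to handle this case.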
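
One way to attempt that tweak is to tell the model, in the system prompt, to stay within the transcript and to say so when the input does not look like a meeting. The sketch below uses a hypothetical build_grounded_messages helper; the wording is untested, and whether a given model follows it is not guaranteed.

```python
def build_grounded_messages(meeting_transcript: str) -> list:
    system_input = (
        "You are an assistant that organizes meeting minutes. "
        "Use only content that appears in the transcript. "
        "If the transcript is too short or is not a meeting, "
        "say so instead of inventing one."
    )
    user_input = (
        "Take this raw meeting transcript and return an organized version.\n"
        "Here is the transcript:\n"
        f"{meeting_transcript}\n"
    )
    return [
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
    ]


grounded = build_grounded_messages("Testing, one, two, three.")
print(grounded[0]["content"])
```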

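Sharing via a public URL

Another cool Gradio feature is that even if your demo runs on your local computer (before you move it to a 🤗 Space), you can still share it with the world by passing share=True to launch, like so: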

demo.launch(share=True)

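You might have noticed that in this Google Colab environment this behaviour is enabled by default, so the demos we created earlier already have a public URL that you can share with anyone 🌎. Scroll back ⬆ and look in the logs for Running on public URL: to find it 🔎!

Flagging

Flagging is a feature built into Gradio that lets the users of your demo provide feedback. You may have noticed that the first demo we created had a Flag button at the bottom.

Under the default options, when a user clicks that button, the input and output samples are saved to a CSV log file that you can review later. If the demo involves audio (as in our case), the audio files are saved separately in a parallel directory and their paths are recorded in the CSV file.

Now go back and play with our first demo once more, then click the Flag button. You will see that a new log file is created in the flagged directory: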

>>> !cat flagged/log.csv

name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707

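In this case, I set name=Diego and intensity=4 as inputs and then clicked Flag. You can see that the log file includes the function's inputs, the output ("Hello, Diego!!!!"), and a timestamp.

While a list of inputs and outputs that users found problematic is better than nothing, Gradio's flagging feature can do more. For example, you can provide a flagging_options parameter to customize the kinds of feedback or errors users can report, such as ["Incorrect", "Ambiguous"]. Note that this requires allow_flagging to be set to "manual", like so: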

demo_with_custom_flagging = gr.Interface(
    fn=greet,
    inputs=["text", "slider"],  # the inputs are a text box and a slider ("text" and "slider" are components in Gradio)
    outputs=["text"],  # the output is a text box
    allow_flagging="manual",
    flagging_options=["Incorrect", "Ambiguous"],
)
demo_with_custom_flagging.launch()

Go ahead and try the code above 👆! You will see that the flag buttons are now Flag as Incorrect and Flag as Ambiguous, and the new log file will reflect those options:

>>> !cat flagged/log.csv

name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707
Diego,5,"Hello, Diego!!!!!",Ambiguous,,2024-06-29 22:08:04.281030
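
Once a few samples have been flagged, the log can be tallied with the standard csv module. The sketch below reads the two rows shown above from an in-memory string so it runs anywhere; count_flags is a hypothetical helper, and it assumes the column names match the header in that log.

```python
import csv
import io
from collections import Counter

# Copied from the log shown above; on disk this would be flagged/log.csv.
LOG_TEXT = '''name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707
Diego,5,"Hello, Diego!!!!!",Ambiguous,,2024-06-29 22:08:04.281030
'''


def count_flags(log_file) -> Counter:
    """Count rows per flag label; rows flagged without a label count as '(none)'."""
    reader = csv.DictReader(log_file)
    return Counter(row["flag"] or "(none)" for row in reader)


flag_counts = count_flags(io.StringIO(LOG_TEXT))
print(flag_counts)
```

For the real file, open it with open("flagged/log.csv", newline="") and pass the file object to count_flags.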

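Wrapping up & next steps

In this notebook we learned how to demo any machine learning model using Gradio.

First, we learned how to set up an interface for a simple Python function; second, we covered Gradio's true strength: building demos for machine learning models.

Along the way we learned how easy it is to leverage models on the 🤗 Hub via the transformers library and its pipeline function, and how to use multimedia inputs like gr.Audio.

Third, we covered how to host your Gradio demo on 🤗 Spaces, which keeps your demo running in the cloud and gives you flexibility in the compute your demo requires.

Finally, we showcased some of the cool batteries-included features of Gradio, such as API access, public URLs, and flagging.

Next steps

If you want to go deeper, check out the Further Reading list at the end of this notebook to keep exploring more features and applications.

⏭️ Further reading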
