GLaDOS voice model, trained on German Portal 1 and Portal 2 game files
Model description
This model uses a checkpoint of the Thorsten (high) model as a base and was fine-tuned on voice lines coming directly from the game files of Portal 1 and Portal 2 to replicate the German GLaDOS voice for piper.
Training has been performed on an RTX 4000 for more than 3000 epochs.
You will find two voice models in here:
- de_DE-glados-high.onnx and de_DE-glados-high.onnx.json: GLaDOS herself
- de_DE-glados-turret-high.onnx and de_DE-glados-turret-high.onnx.json: Fine-tuned on the above model to sound like the turret voice
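To synthesize speech with one of these models, pipe text into piper. A minimal sketch, assuming piper is installed and the model files sit in the working directory (the output file name is just an example):
echo 'Das war ein Triumph.' | piper --model de_DE-glados-high.onnx --output_file triumph.wav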
Dataset & Training
This repo also includes hints on how to build the training dataset, as well as the toolchain used for preparing and training the model. The reasons are:
- The training data is the intellectual property of Valve and copyrighted (I cannot include it here for obvious reasons)
- Training a model for piper (as of early 2025) relies on old/outdated tools from 2021, and getting everything up and running can be super frustrating
Requirements
- A PC with an NVIDIA GPU and the proprietary NVIDIA drivers, CUDA, Docker and Docker Compose as well as the nvidia-container-toolkit installed
- Ideally use a Linux system (WSL is untested but might work)
- Basic Linux and Python knowledge
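Before you start, it is worth verifying that Docker can actually see the GPU. A quick smoke test (the CUDA image tag is only an example, any recent one works):
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi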
Build the training dataset
Extract the files from the game
The training dataset has been extracted from the Portal 1 and Portal 2 game files. For legal reasons, they are not included in this repo, but you can easily extract them from the game files via VPKEdit.
Portal 1:
- Switch the game to the desired language (here: German) via Steam
- Navigate to <steam>/steamapps/common/Portal/portal and open portal_pak_dir.vpk with VPKEdit
- Inside portal_pak_dir.vpk, navigate to sound/vo/aperture_ai and extract all *.wav files into the folder raw inside this git repo
Portal 2:
- Switch the game to the desired language (here: German) via Steam
- Navigate to <steam>/steamapps/common/portal 2. Select the subfolder matching the language (here portal2_german) and open pak01_dir.vpk via VPKEdit
- Inside pak01_dir.vpk, navigate to sound/vo/glados and extract all *.wav files (but no subfolders) to the folder raw inside this git repo
Portal 2 DLC 1:
- Repeat steps 1 and 2 for Portal 2 above, but now select the portal2_dlc1_<your language> folder (if it exists). Here, portal2_dlc1_german does exist. Open pak01_dir.vpk with VPKEdit
- Repeat step 3 of Portal 2 above but copy the files to raw in this git repo
Transcode the files
We need to transcode the files: the Portal 1 files are 44.1 kHz WAV while the Portal 2 files are MP3. For training, we need 16-bit (LE) mono PCM WAV files with the sample rates shown below, depending on the model quality we want to train.
- x-low, low: 16000 Hz
- medium, high: 22050 Hz
NOTE: In principle, we could also train on 44100 Hz, however piper-train would then need to be modified for training and inference, as it only supports the sample rates listed above.
Run the following command (needs ffmpeg to be installed):
# Before running the script, first edit the sample rate that you want
./0_transcode.sh
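Under the hood, this boils down to one ffmpeg call per file. A rough sketch of what the conversion amounts to for a high-quality model (a hedged approximation, not the actual script; the output directory name is an assumption):
# Approximate equivalent of 0_transcode.sh at 22050 Hz; directory names are assumptions
mkdir -p transcoded
for f in raw/*; do
  # convert to 22050 Hz, mono, 16-bit little-endian PCM WAV
  ffmpeg -i "$f" -ar 22050 -ac 1 -c:a pcm_s16le "transcoded/$(basename "${f%.*}").wav"
done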
Sort by good/bad samples
Now the annoying part: listen to all voice samples, one by one, and sort them into good (same voice style, no degradation in quality, no additional non-voice parts or mumbling etc.) and bad (the opposite). I have written a helper script for this purpose: 1_sort_good_bad.py (read the comments in it). But hold your horses before you perform this annoying job, which can take several hours: I expect the quality of the voice lines to be similar across languages, so you can use my script 1_from_good.py, which uses the good.txt file to tag voice samples as good or bad, based on the decisions I made while listening to GLaDOS myself. Run the following command:
./1_from_good.py
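Conceptually, the script just moves every file whitelisted in good.txt into the good pile. A rough bash equivalent (a sketch only; the actual Python script and the exact format of good.txt may differ):
# Hedged sketch of what 1_from_good.py effectively does
mkdir -p raw_good
while IFS= read -r name; do
  # move each whitelisted file from raw/ to raw_good/
  [ -f "raw/$name" ] && mv "raw/$name" raw_good/
done < good.txt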
Transcribe
Now we need to transcribe the files. For this, we need faster-whisper. The easiest way to install and use it is via Docker. But before you do that, you should edit the file 2_transcribe.py and select the language and model you want to use. Run this to build the docker container(s):
docker compose up --build -d
docker exec -it transcribe bash
You should now be in the transcribe docker container. Run
./2_transcribe.py
This will yield a new file metadata.csv. Copy this file to raw_good once transcription has finished.
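For reference, piper's preprocessing expects metadata.csv in ljspeech format: one utterance per line, with the file id and the transcription separated by a pipe. The entries below are purely illustrative:
sample_001|Das war ein Triumph.
sample_002|Ich mache hier eine Notiz: Riesenerfolg.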
Training
For this, you should use the Docker container provided by this repo. But before you do that, you need to configure the following files:
- 3_gen_traindata.sh: Edit the sample rate (16000 for x-low and low, 22050 for medium and high models) and the language code (en, de, ru, fr, …)
- 4_train.sh: Edit the QUALITY, BATCHSIZE and PHONEMEMAX parameters to suit your training hardware. Also select the CHKPOINT to start from: you ideally do not want to train from scratch but rather from an already existing checkpoint. Grab one from the piper people that fits the model quality (x-low, low, medium, high) and language that you want to train. Copy it to checkpoints within this repo. A hypothetical configuration is sketched below.
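For orientation, a configuration block in 4_train.sh could look like this (the variable names come from the list above; all values and the checkpoint file name are examples only):
QUALITY=high                                   # x-low, low, medium or high
BATCHSIZE=16                                   # lower this if you run out of VRAM
PHONEMEMAX=400                                 # skip utterances with too many phonemes
CHKPOINT=checkpoints/de_DE-thorsten-high.ckpt  # existing checkpoint to fine-tune from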
Now run the following within this repo (if you haven't already done so for transcription):
docker compose up --build -d
docker exec -it training bash
This will build and enter the training container and also export training metrics via tensorboard at http://127.0.0.1:6006
From inside the container, you now need to generate your training data for the training process:
./3_gen_traindata.sh
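This presumably wraps piper's preprocessing module; the underlying call looks roughly like this (all paths are examples, the flags are the standard ones of piper_train.preprocess):
python3 -m piper_train.preprocess \
  --language de \
  --input-dir /path/to/raw_good \
  --output-dir /path/to/traindata \
  --dataset-format ljspeech \
  --single-speaker \
  --sample-rate 22050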
And now, you are ready for training. Simply run
./4_train.sh
inside the container.
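Internally, 4_train.sh will invoke something along these lines (a hedged sketch built from piper_train's standard flags; all paths and values are examples):
python3 -m piper_train \
  --dataset-dir /path/to/traindata \
  --accelerator gpu \
  --devices 1 \
  --batch-size 16 \
  --max_phoneme_ids 400 \
  --quality high \
  --checkpoint-epochs 1 \
  --max_epochs 6000 \
  --resume_from_checkpoint checkpoints/de_DE-thorsten-high.ckpt \
  --precision 32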
In case you need to stop and later resume training, you just have to change the path to the checkpoint by setting the CHKPOINT variable in ./4_train.sh.
Export the final model for inference
After training has finished (either the loss has flattened off or you hit the max epoch limit), you need to export the model to the ONNX format.
First, edit 5_export.sh and set the name and also the checkpoint (generally the last checkpoint written by 4_train.sh) that you want to export the model from.
Still inside the training docker container, run this command:
./5_export.sh
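The export presumably wraps piper's ONNX exporter; roughly (checkpoint and config paths are examples):
python3 -m piper_train.export_onnx /path/to/last.ckpt de_DE-glados-high.onnx
cp /path/to/traindata/config.json de_DE-glados-high.onnx.json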
This will generate a <model_name>.onnx and a <model_name>.onnx.json file. The latter needs to be adjusted: open it in a text editor and navigate to the line where it reads
"dataset": "",
and replace "" with this model's name (here: "de_DE-glados-high"):
"dataset": "de_DE-glados-high"
These two files can now be used by piper.