Thank you!
I have been researching for days to find a better (a much better) SER model than the standard one from SpeechBrain. Happy that I found this one! :-)
By the way: moving the model to CUDA isn't possible, I think?
Thank you for the positive feedback!
If you have a GPU with enough VRAM and CUDA/PyTorch installed, you should be able to run on the GPU with a simple `model = model.cuda()` after loading the model.
Also, be sure to move your data to the GPU as well, for example:
```python
import torch

# load model, then move it to the GPU
model = model.cuda()

# load data, then run inference with the inputs on the GPU as well
with torch.no_grad():
    wavs = wavs.cuda(non_blocking=True).float()
    mask = mask.cuda(non_blocking=True).float()
    pred = model(wavs, mask)
```
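If you need the predictions back on the CPU afterwards (e.g. to convert to numpy), you can move them back with `pred = pred.cpu()`.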
Thanks again, 3loi! In the meantime I solved it with

```python
mask = torch.ones(1, len(norm_wav)).to(device)
wavs = torch.tensor(norm_wav).unsqueeze(0).to(device)
```

which seems to work.
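For anyone landing here later, here is a minimal end-to-end sketch of how `norm_wav` could be produced for single-file inference. The file path and the mean/std normalization are my assumptions, so substitute whatever preprocessing the model card actually specifies; `model` is assumed to be loaded and moved to `device` as discussed above:

```python
import torch
import torchaudio

device = "cuda" if torch.cuda.is_available() else "cpu"

# load a single audio file (hypothetical path) and take the first channel
raw_wav, sr = torchaudio.load("example.wav")
raw_wav = raw_wav[0]

# mean/std normalization -- an assumption here; use the exact
# preprocessing the model card specifies
norm_wav = (raw_wav - raw_wav.mean()) / (raw_wav.std() + 1e-6)

# add a batch dimension, build an all-ones mask, and move both to the device
wavs = norm_wav.unsqueeze(0).to(device)
mask = torch.ones(1, len(norm_wav)).to(device)

with torch.no_grad():
    pred = model(wavs, mask)  # model assumed loaded per the model card
```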
Do you think I could use runpod.io to run this model? Curious to hear your thoughts.
I am unfamiliar with runpod.io, but the model should run on any cloud computing service, assuming it's set up correctly. So I don't see any reason why it shouldn't work.
It seems they offer GPUs with 24 GB up to 192 GB of VRAM, which is more than enough. I am able to run this model on an RTX 3090 with 24 GB of VRAM just fine, with single-file inference.
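If you do try a cloud instance, a quick sanity check (standard PyTorch calls, nothing specific to this model) that CUDA is visible and how much VRAM the rented GPU actually has:

```python
import torch

# verify the instance exposes a CUDA GPU and report its total VRAM
assert torch.cuda.is_available(), "No CUDA device visible"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
```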