Fine-tuned model: FransXav/ConvTasNet-IF-Itera-SepNoisy8k-FT

This model is a fine-tuned version of JorisCos/ConvTasNet_Libri2Mix_sepnoisy_8k.

Description:

This model was fine-tuned by researchers from Informatics Engineering (Teknik Informatika), Institut Teknologi Sumatera (ITERA). Fine-tuning used the scripts available in the project's GitHub repository. The model was trained on a custom dataset of Indonesian-language speech mixed with a variety of noise types.

Fine-tuning config:

# Configuration used during fine-tuning
data:
  root: "data/processed/"
  sample_rate: 8000
  segment_seconds: 4
  num_workers: 4

training:
  project_name: "itera-speech-separation-ft"
  model_name: "ConvTasNet-ITERA-FT" # Name used during training
  epochs: 50
  batch_size: 8
  learning_rate: 0.0005
  gradient_clip_val: 0.5
  precision: "16-mixed"
  early_stopping_patience: 5

model:
  freeze_encoder_decoder: false

remix:
  dynamic: true
  snr_low: 0.0
  snr_high: 10.0
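
The `remix` settings above suggest that speech and noise are re-mixed on the fly at a signal-to-noise ratio between `snr_low` and `snr_high`. The project's actual remixing code is not shown here, so the following is only a minimal sketch of one common approach: draw an SNR uniformly from the range (an assumption) and scale the noise to reach it. The names `mix_at_snr`, `remix`, and `measured_snr_db` are hypothetical and do not come from the project scripts.

```python
import math
import random

SAMPLE_RATE = 8000              # data.sample_rate
SEGMENT_SECONDS = 4             # data.segment_seconds
SNR_LOW, SNR_HIGH = 0.0, 10.0   # remix.snr_low, remix.snr_high


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches `snr_db`, then add."""
    noise_power = power(noise)
    if noise_power == 0.0:
        # Silent noise cannot be scaled to a target SNR; return the inputs unchanged.
        return list(speech), list(noise)
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def remix(speech, noise, rng):
    """Assumed behaviour of `dynamic: true`: a fresh SNR is drawn for every mixture."""
    target_snr_db = rng.uniform(SNR_LOW, SNR_HIGH)
    mixture, scaled_noise = mix_at_snr(speech, noise, target_snr_db)
    return mixture, scaled_noise, target_snr_db


def measured_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(power(speech) / power(noise))


# Synthetic stand-ins for one 4-second segment at 8 kHz (32000 samples).
rng = random.Random(0)
n_samples = SAMPLE_RATE * SEGMENT_SECONDS
speech = [math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(n_samples)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
mixture, scaled_noise, target_snr_db = remix(speech, noise, rng)
```

Drawing a new SNR for each mixture exposes the model to a range of noise levels from the same source recordings, which is the usual motivation for dynamic mixing.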

Results

Evaluation on our internal test set gave the following results:

si_sdr:
    baseline_score: -30.2842
    fine_tuned_score: -24.9016
    improvement: +5.3826
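
For readers unfamiliar with the metric, the sketch below shows one standard way to compute scale-invariant SDR (SI-SDR) for a single estimate and target, and re-derives the reported improvement as the difference between the two scores. The evaluation code that produced the numbers above is not shown in this card, so the function `si_sdr` and its `eps` parameter are illustrative assumptions, not the project's implementation.

```python
import math


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between an estimated and a reference signal."""
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    # Remove the mean so a DC offset does not affect the score.
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the scaled reference.
    alpha = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    projection = [alpha * b for b in t]
    residual = [a - p for a, p in zip(e, projection)]
    signal_energy = sum(p * p for p in projection) + eps
    residual_energy = sum(r * r for r in residual) + eps
    return 10 * math.log10(signal_energy / residual_energy)


# Improvement is the fine-tuned score minus the baseline score.
baseline_score = -30.2842
fine_tuned_score = -24.9016
improvement = fine_tuned_score - baseline_score
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate by a constant leaves the score essentially unchanged, which is what makes the metric scale-invariant.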

License Notice

This work, "[NAMA_USERNAME_ANDA]/itera-informatics-convtasnet-ft", is a derivative of JorisCos/ConvTasNet_Libri2Mix_sepnoisy_8k. The original work is a derivative of:

The original work is licensed under Attribution-ShareAlike 3.0 Unported by Joris Cosentino.

This derivative work is licensed under the MIT License by the project authors at Institut Teknologi Sumatera.
