jaeyong2 committed on
Commit 18c81e4 · verified · 1 Parent(s): 28c855d

Update README.md

Files changed (1)
  1. README.md +24 -32
README.md CHANGED
@@ -1,39 +1,31 @@
- # Fixed Speaker Segmentation Model
-
- This model is a version of `jaeyong2/speaker-segmentation-merge` with its key-mapping issue fixed.
-
- ## Fix
- - Original model: keys lack the `model.` prefix
- - Current model: keys carry the `model.` prefix
- - Fix: prefix mapping matches 100% of the keys
-
- ## Usage
-
- ```python
- from diarizers import SegmentationModel
- import torch
-
- # load the model
- model = SegmentationModel()
- state_dict = torch.load('pytorch_model.bin', map_location='cpu')
- model.load_state_dict(state_dict)
-
- # inference
- model.eval()
- with torch.no_grad():
-     # audio input: (batch_size, audio_length)
-     audio = torch.randn(1, 16000)  # example: 1 second of audio
-     output = model(audio)
-     print(f"Output shape: {output.shape}")
  ```
-
- ## Model details
- - Parameter layers: 54
- - Architecture: SincNet + LSTM + Linear + Classifier
- - Input: raw audio waveform
- - Output: speaker segmentation results
-
- ## Original model
- - Repository: jaeyong2/speaker-segmentation-merge
- - Key mapping 100% complete
- - All pretrained weights preserved
 
+ ---
+ pipeline_tag: automatic-speech-recognition
+ ---
+ # How to use
+ ```python
+ from pyannote.audio import Pipeline
+ from diarizers import SegmentationModel
+ import torch
+
+ device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
+
+ # load the model via diarizers
+ segmentation_model = SegmentationModel().from_pretrained('jaeyong2/speaker-segmentation-merged')
+ # convert it to a pyannote-compatible model
+ model3 = segmentation_model.to_pyannote_model()
+
+ # instantiate the pipeline and swap in the converted segmentation model
+ pipeline = Pipeline.from_pretrained(
+     "pyannote/speaker-diarization-3.1",
+     use_auth_token=<auth_token>)
+ pipeline._segmentation.model = model3.to(device)
+
+ # run the pipeline on an audio file
+ diarization = pipeline("output.wav")
+
+ # dump the diarization output to disk using RTTM format
+ with open("audio.rttm", "w") as rttm:
+     diarization.write_rttm(rttm)
  ```
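
For context, the key-mapping fix described in the removed README amounts to renaming every entry in the checkpoint's state dict. A minimal sketch of that idea, assuming the original weights live in a local `pytorch_model.bin` (the actual conversion script is not part of this commit, so the file names here are illustrative):

```python
import torch

# load the original checkpoint, whose keys lack the "model." prefix
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# hypothetical remap: prepend "model." so every key matches the new layout
remapped = {f"model.{key}": value for key, value in state_dict.items()}

# save the fixed checkpoint under an illustrative name
torch.save(remapped, "pytorch_model_fixed.bin")
```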
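
The `diarization` object returned by the pipeline is a `pyannote.core.Annotation`, so besides dumping RTTM (where each `SPEAKER` line records the file id, turn onset, duration, and speaker label) you can iterate the speaker turns directly. A short usage sketch, reusing the names from the snippet above:

```python
# print each speaker turn with its start/end time and label
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{speaker}: {turn.start:.1f}s -> {turn.end:.1f}s")
```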