---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B
library_name: peft
tags:
- text-to-speech
- ssml
- qwen2.5
- lora
- peft
language:
- en
- fr
pipeline_tag: text-generation
---

# Qwen2.5-7B SSML LoRA Adapter

This is a LoRA (Low-Rank Adaptation) adapter for Qwen2.5-7B, fine-tuned to convert plain text into SSML (Speech Synthesis Markup Language) with predicted pause placements.
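
For illustration, an input such as "Hello, how are you today?" might come back as something like `<speak>Hello, <break time="300ms"/> how are you today?</speak>`; the exact `<break>` placements and durations are predicted by the model and will vary with the input (this sample is illustrative, not a guaranteed output).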

## Model Details

- **Base Model**: Qwen/Qwen2.5-7B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Text-to-SSML conversion with pause prediction
- **Languages**: English, French (and other languages supported by the base model)

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "jonahdvt/qwen-ssml-lora")

# Prepare input in the prompt format used during fine-tuning
instruction = "Convert text to SSML with pauses:"
text = "Hello, how are you today? I hope everything is going well."
formatted_input = f"### Task:\n{instruction}\n\n### Text:\n{text}\n\n### SSML:\n"

# Generate
inputs = tokenizer(formatted_input, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
ssml_output = response.split("### SSML:\n")[-1]
print(ssml_output)
```
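
If you want to serve the model without a runtime PEFT dependency, the adapter can be merged into the base weights using standard PEFT functionality. A minimal sketch (the output directory name is arbitrary):

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("qwen2.5-7b-ssml-merged")  # arbitrary local path
tokenizer.save_pretrained("qwen2.5-7b-ssml-merged")
```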

## Training Details

- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Training Epochs**: 5
- **Batch Size**: 1 (with gradient accumulation)
- **Learning Rate**: 3e-4
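
For reference, these hyperparameters correspond roughly to the PEFT configuration sketched below. The actual training script is not included in this repository; `lora_dropout`, `bias`, and `task_type` are assumptions, not values stated in this card:

```python
from peft import LoraConfig

# LoRA configuration mirroring the hyperparameters listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,      # assumed; not stated in this card
    bias="none",            # assumed; common default
    task_type="CAUSAL_LM",
)
```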

## License

This adapter is released under the Apache 2.0 license, the same license as the base Qwen2.5-7B model.