Could you share the training recipe briefly?
Hi, I'm exploring your models and am very impressed with their performance.
I was hoping you might be able to share some more details about your model's backbone and training recipe.
Understanding the architecture and training process would be very helpful for a research project I'm working on; I'd like to try building on your work :)
Thanks in advance!
Hello, and thank you very much for your interest in PIXIE.
The backbone of PIXIE-Spell-Preview is based on Qwen3, and we adopted a two-phase training recipe to optimize performance for retrieval tasks.
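To make the setup concrete, here is a minimal sketch of using a Qwen3 checkpoint as an embedding backbone. Note that the checkpoint name and the last-token pooling strategy below are illustrative assumptions, not necessarily what we use in PIXIE:

```python
# Minimal sketch: a Qwen3 checkpoint as an embedding backbone.
# The checkpoint name and last-token pooling are illustrative choices.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # illustrative backbone size
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.padding_side = "right"  # so the last real token is at index len-1
backbone = AutoModel.from_pretrained(model_name)

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = backbone(**batch).last_hidden_state       # (B, T, H)
    # Pool the hidden state of each sequence's last non-padding token.
    last = batch["attention_mask"].sum(dim=1) - 1          # (B,)
    pooled = hidden[torch.arange(hidden.size(0)), last]    # (B, H)
    return F.normalize(pooled, dim=-1)
```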
Phase 1 – Weakly Supervised Pre-training
- We first trained the model on a large-scale, weakly supervised dataset, allowing it to acquire a broad understanding of the relationships between queries and documents. This phase focused on building the model’s general representation learning ability for retrieval.
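A common objective for this kind of weakly supervised pre-training is InfoNCE over (query, document) pairs with in-batch negatives; the sketch below illustrates that setup. Treat the loss form and temperature as assumptions rather than our exact implementation:

```python
# Minimal sketch of a Phase 1-style objective: InfoNCE over weakly
# supervised (query, document) pairs, using in-batch negatives.
import torch
import torch.nn.functional as F

def info_nce_loss(q_emb: torch.Tensor, d_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    # q_emb, d_emb: (B, H) L2-normalized embeddings of paired queries/docs.
    logits = q_emb @ d_emb.T / temperature              # (B, B) similarities
    # The matching document for query i is on the diagonal; every other
    # document in the batch serves as a negative.
    targets = torch.arange(q_emb.size(0), device=q_emb.device)
    return F.cross_entropy(logits, targets)
```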
Phase 2 – Supervised Fine-tuning with High-quality Data
- In the second phase, we fine-tuned the model with carefully curated, high-quality labeled datasets. This step enabled the model to more precisely distinguish relevant documents from irrelevant ones, ensuring stronger accuracy and consistency in real-world retrieval scenarios.
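Curated fine-tuning data for retrieval typically comes as (query, positive, hard negatives) triples. Under that assumed data format, a Phase 2-style loss could look like the following sketch; the triple structure, number of negatives, and temperature are all assumptions here:

```python
# Minimal sketch of a Phase 2-style objective: contrastive fine-tuning on
# curated triples with explicitly mined hard negatives (assumed data format).
import torch
import torch.nn.functional as F

def hard_negative_loss(q: torch.Tensor, pos: torch.Tensor, negs: torch.Tensor,
                       temperature: float = 0.05) -> torch.Tensor:
    # q: (B, H), pos: (B, H), negs: (B, K, H); all L2-normalized.
    pos_sim = (q * pos).sum(dim=-1, keepdim=True)       # (B, 1)
    neg_sim = torch.einsum("bh,bkh->bk", q, negs)       # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    # Class 0 is the positive document for every query.
    targets = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, targets)
```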
Through this approach, the model first learns general retrieval patterns from large-scale data and is then refined with high-quality supervision, maximizing task-specific performance.
I hope this explanation is helpful for your research project, and I wish you great success with your experiments. Thank you again for your interest!