
Model Card for AQuilt

AQuilt is an open-source model for translating unlabeled text into instruction-tuning data for specific tasks. It also integrates a logic module and a self-inspection module to further enhance the quality of the synthetic data.

Model Details

Model Description

AQuilt supports 10 different task types and is bilingual in Chinese and English. In our paper, we demonstrate that AQuilt also has task generalization ability. Users can customize task types as needed by adding an instruction prefix.

To further enhance the quality of synthetic data, AQuilt includes a Logic module and a Self-Inspection module. Note that Self-Inspection requires loading the corresponding LoRA adapter: https://huggingface.co/xiapk7/AQuilt_Eval_lora

  • Model type: Qwen2ForCausalLM
  • Language(s) (NLP): English, Chinese
  • License: Apache 2.0
  • Finetuned from model: Qwen2.5-7B-Base
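
If you want to load the model outside of the provided scripts, the sketch below shows one way to attach the Self-Inspection adapter using Hugging Face transformers and peft. This is a minimal illustration, not part of the official pipeline (dataGen.py handles loading for you); the repo IDs follow the links in this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "xiapk7/AQuilt"                # base AQuilt weights (or a local path)
adapter_id = "xiapk7/AQuilt_Eval_lora"   # Self-Inspection LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# The adapter is only needed when scoring synthetic samples (Self-Inspection);
# instruction generation from unlabeled text uses the base weights.
model = PeftModel.from_pretrained(model, adapter_id)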

Model Sources

  • Repository: https://github.com/Krueske/AQuilt
  • Paper: https://arxiv.org/abs/2507.18584

Basic Usage

Please refer to https://github.com/Krueske/AQuilt. You can use the following script to generate synthetic instruction data from unlabeled text.

CUDA_VISIBLE_DEVICES=0 python ./dataGen.py \
  --model_path /path/to/AQuilt \
  --eval_lora_path /path/to/AQuilt_eval_lora \
  --eval true \
  --input_file input.txt \
  --output_file output.json \
  --task_type "natural language inference" \
  --language "en" \
  --task_prefix "" \
  --num_gen_per_text 1 \
  --temperature 0.7 \
  --top_p 0.95 \
  --seed 42

Parameter Explanation:

  • --model_path: Path to AQuilt model

  • --eval_lora_path: Path to LoRA adapters for Self-Inspection when --eval=true

  • --eval: Enable Self-Inspection mode (true/false)

  • --input_file: Text file containing raw unlabeled data

  • --output_file: JSON output file for generated instructions

  • --task_type: Target task type

  • --language: Task language (en/zh)

  • --task_prefix: Optional prefix describing a customized task; add it when the built-in task types do not cover your use case

  • --num_gen_per_text: Number of instructions to generate per input unlabeled data

  • --temperature: Sampling temperature (higher values yield more diverse generations)

  • --top_p: Nucleus sampling threshold

  • --seed: Random seed for reproducibility

πŸ“„ Input File Format (input.txt)

Plain text file with one unlabeled text per line:

Direct pressure applied on the inner ear cannot …… the other group.
We evaluate the outcome vesicoureteral reflux (VUR) in …… respectively (p>0.05).
Radio frequency ablation (RFA) has been used as a …… border of ablation.
To investigate the rates of Staphylococcus aureus carriage …… was documented for 7 (50%).
Lichen aureus is localized variant of persistent pigmented purpuric dermatitis that …… intraepithelial vessels.

πŸ’Ύ Output Format (output.json)

JSON file with structured instructions:

[
  {
        "context": "Direct pressure applied on the inner ear cannot …… the other group.",
        "task_type": "natural language inference",
        "qa_pair": {
            "question": "Does direct pressure applied on the inner ear during scuba diving induce hearing loss, based on the findings that perforating the eardrums of guinea pigs before diving resulted in no variation in auditory threshold and cochlear spontaneous activity, even during extreme diving conditions?Yes, no, or maybe?",
            "thinking_steps": "1. Understand the Question: The question asks whether direct pressure on the inner ear during scuba diving causes hearing loss, based on experimental findings with guinea pigs. 2. Analyze the Text: The text describes an experiment where guinea pigs with perforated eardrums were subjected to diving conditions, including extreme depths and durations. No changes in auditory threshold or cochlear spontaneous activity were observed in this group. 3. Logical Reasoning: Since the experiment did not observe any hearing loss or changes in auditory function in the group with perforated eardrums, it suggests that direct pressure on the inner ear does not induce hearing loss. 4. Choose the Best Answer: The text supports the conclusion that direct pressure on the inner ear does not induce hearing loss, as evidenced by the lack of variation in auditory parameters in the experimental group.",
            "answer": "No"
        },
        "analysis_steps": "1. The question is clear and specific, asking whether direct pressure on the inner ear during scuba diving induces hearing loss based on experimental findings. 2. The response is relevant and directly addresses the question by referencing the experimental findings with guinea pigs. 3. The response provides a logical reasoning process, explaining how the lack of observed changes in auditory parameters supports the conclusion that direct pressure does not induce hearing loss. 4. The grammar and expression in the response are fluent and clear, with no spelling errors. 5. The response provides a comprehensive solution by not only answering the question but also explaining the reasoning behind the conclusion, which helps the user understand the context and implications of the findings.",
        "score": 4
  }
]
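
Once generation finishes, you will typically want to keep only records that pass Self-Inspection. The snippet below is a minimal sketch (not part of the official scripts) that filters output.json by score and flattens the qa_pair fields into instruction/answer pairs; the field names follow the example above, and the threshold of 2 follows the notes below.

import json

THRESHOLD = 2  # per the notes below, a score of 2 meets the basic quality requirements

with open("output.json", encoding="utf-8") as f:
    records = json.load(f)

kept = [
    {
        "context": r["context"],
        "instruction": r["qa_pair"]["question"],
        "answer": r["qa_pair"]["answer"],
    }
    for r in records
    if r.get("score", 0) >= THRESHOLD  # "score" is absent when --eval is false
]

with open("filtered.json", "w", encoding="utf-8") as f:
    json.dump(kept, f, ensure_ascii=False, indent=2)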

⚠️ Important Notes

  • When using eval=true, you must provide eval_lora_path
  • The types of tasks you can choose include: single choice question answering, multi choice question answering, close question answering, open question answering, text summarization, text generation, natural language inference, text classification, extractive question answering, natural language understanding, as well as their corresponding Chinese versions.
  • If you want to generate customized tasks (by adding task_prefix), it is recommended to set the task type to close question answering or open question answering.
  • Self-Inspection scores range from 1 to 5; a score of 2 meets the basic quality requirements.

Training Details

Training Data

We built a training dataset of roughly 700K examples: https://huggingface.co/datasets/xiapk7/AQuilt_trainingset. It contains two phases of training data. Phase one focuses on fine-tuning for task-specific instruction generation from unlabeled data. Phase two centers on training AQuilt's Self-Inspection ability.
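
A minimal way to inspect the released training set is via the datasets library; the split and column names are not documented here, so check the dataset card before relying on them.

from datasets import load_dataset

ds = load_dataset("xiapk7/AQuilt_trainingset")
print(ds)  # inspect the available splits/columns for the two training phases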

Training Hyperparameters

We use the following hyperparameters:

  • LoRA rank (r): 64
  • LoRA scaling factor (alpha): 4
  • LoRA dropout: 0
  • Optimizer: AdamW
  • Learning rate scheduler: cosine
  • Max. learning rate: 1e-04
  • Min. learning rate: 0
  • Weight decay: 0.1
  • Dropout: 0
  • Effective batch size: 16
  • Epoch: 2
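
For reference, the hyperparameters above map roughly onto the following peft/transformers configuration. This is a sketch under the assumption of a standard PEFT + Trainer setup (target modules and other details are not specified here), not the exact training script.

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,             # LoRA rank
    lora_alpha=4,     # LoRA scaling factor
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="aquilt-sft",
    learning_rate=1e-4,               # max LR, decayed toward 0 by the cosine schedule
    lr_scheduler_type="cosine",
    weight_decay=0.1,
    per_device_train_batch_size=16,   # effective batch size 16
    num_train_epochs=2,
    optim="adamw_torch",
    bf16=True,
)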

πŸ“œ Citation

If you find this model useful, please cite:

@misc{ke2025aquiltweavinglogicselfinspection,
      title={AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs}, 
      author={Xiaopeng Ke and Hexuan Deng and Xuebo Liu and Jun Rao and Zhenxi Song and Jun Yu and Min Zhang},
      year={2025},
      eprint={2507.18584},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.18584}, 
}