|
## A Lossless Syntax Tree Generator with Zero-shot Error Correction |
|
|
|
- We follow [jam](https://huggingface.co/apcl/jam)'s pretraining procedure and use the same pretraining data, except that we also use srcML output to pretrain the models.
|
- In the finetuning stage, we finetune our models for 3 epochs. |
|
- Our [GitHub repo](https://github.com/apcl-research/autorepair) contains the code for reproduction using the same [data](https://huggingface.co/datasets/apcl/autorepair). |
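
A minimal sketch for pulling the data and checkpoints from the Hugging Face Hub is below. The dataset repo id comes from the link above; using `apcl/autorepair` as the model repo id is an assumption, so replace it with this model card's actual repo id.

```python
# Sketch: fetch the finetuning data and a checkpoint from the Hugging Face Hub.
# The dataset repo id comes from the link above; "apcl/autorepair" as the model
# repo id is an assumption -- replace it with this model card's actual repo id.
from huggingface_hub import hf_hub_download, snapshot_download

data_dir = snapshot_download(repo_id="apcl/autorepair", repo_type="dataset")
ckpt_path = hf_hub_download(repo_id="apcl/autorepair", filename="ckpt_base.pt")  # assumed model repo id

print("dataset files in:", data_dir)
print("checkpoint at:", ckpt_path)
```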
|
|
|
|
|
## Pretraining hyperparameters
|
| Hyperparameter | Description | Value |
| ----------- | ----------- |------------|
| e | embedding dimensions | 1024 |
| L | number of layers | 24 |
| h | attention heads | 16 |
| c | block size / context length | 256 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-5 |
| iter | iterations | 570000 |
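
For reference, the table above can be written as a nanoGPT-style training config. Since jam builds on nanoGPT, the key names below (`n_embd`, `n_layer`, and so on) are assumptions rather than the exact names used in the training scripts; the values are taken from the table.

```python
# Sketch of the hyperparameter table as a nanoGPT-style config (key names assumed).
pretrain_config = dict(
    n_embd=1024,                     # e: embedding dimensions
    n_layer=24,                      # L: number of layers
    n_head=16,                       # h: attention heads
    block_size=256,                  # c: block size / context length
    batch_size=4,                    # b: per-step batch size
    gradient_accumulation_steps=32,  # a: accumulation steps
    learning_rate=3e-5,              # r: learning rate
    weight_decay=1e-5,               # y: weight decay
    max_iters=570_000,               # iter: iterations
)
```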
|
|
|
## Model files |
|
|
|
| Filename | Description |
| ------- | ------- |
| ckpt.pt | Model checkpoint used for finetuning |
| ckpt_base.pt | Model checkpoint for generating syntax trees with error correction in a zero-shot setting |
| ckpt_finetune.pt | Model checkpoint finetuned on the syntactic error dataset |
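
A quick way to inspect a downloaded checkpoint is sketched below. It assumes a nanoGPT-style checkpoint layout (a dict with `model` and `model_args` entries), which jam-based checkpoints typically use; the actual keys may differ.

```python
# Sketch: inspect one of the checkpoints above (nanoGPT-style layout assumed).
import torch

ckpt = torch.load("ckpt_base.pt", map_location="cpu")
print(list(ckpt.keys()))
if "model_args" in ckpt:
    print(ckpt["model_args"])  # embedding size, layers, heads, block size, ...
```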
|
|
|
- Note that you can adjust the batch size and accumulation steps to fit your GPU memory, but their product (batch size * accumulation steps) should remain 128 so that the effective batch size is unchanged.
|
- If you finetune your models with multiple GPUs, you can reduce the accumulation steps accordingly. For example, if you finetune with 2 GPUs, you should halve the accumulation steps; see the sketch below.
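
A small sketch of the rule above: per-GPU batch size, accumulation steps, and GPU count should multiply to 128. The helper name is only for illustration.

```python
# Effective batch size = batch_size * accumulation_steps * n_gpus, kept at 128.
def accumulation_steps(effective_batch=128, batch_size=4, n_gpus=1):
    assert effective_batch % (batch_size * n_gpus) == 0
    return effective_batch // (batch_size * n_gpus)

print(accumulation_steps(n_gpus=1))  # 32
print(accumulation_steps(n_gpus=2))  # 16, i.e. halve the accumulation steps with 2 GPUs
```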
|
|
|
|