Mir-2002 committed · Commit 2786a65 · verified · 1 Parent(s): b1a76c5

Update README.md

Files changed (1)
  1. README.md +21 -0
README.md CHANGED
@@ -59,7 +59,28 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  # b (int): The second number.

  ```
+ # Fine-tuning
+
+ When fine-tuning the model, I used the special token `<tdec>`. According to the CodeT5+ paper:
+
+ "Specifically, when the input is a text sample, we prepend a [CDec] token to the input sequence to the decoder. In this case, the decoder operates under code generation functionality. Alternatively, when the input is a code sample, we prepend a [TDec] token to the input sequence to the decoder. The decoder operates under text generation functionality in this case. This type of Causal LM has been shown to be an effective learning objective to close the pretrain-finetune gap for generative downstream tasks."
+
+ In short, the `<tdec>` token was prepended to the target (the docstring) to signal to the decoder that it should operate under its text generation functionality. A sample row looks like this:
+
+ ```
+ <s><tdec> Creates a task that to retry a previously abandoned task.
+
+ Returns:
+     Task: a task that was abandoned but should be retried or None if there are
+     no abandoned tasks that should be retried.</s>
+ ```
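For illustration, here is a minimal preprocessing sketch of how a (code, docstring) pair could be built with `<tdec>` prepended to the target, following the description above. The base checkpoint name, helper name, and MAX_TARGET_LENGTH are assumptions for the example, not taken from this repository; only MAX_SOURCE_LENGTH = 256 matches the hyperparameters below.

```python
# Illustrative sketch only: preparing a (code, docstring) pair for CodeT5+
# fine-tuning with the <tdec> decoder-control token prepended to the target.
# The checkpoint and helper names below are assumptions, not this repo's
# actual training script.
from transformers import AutoTokenizer

MAX_SOURCE_LENGTH = 256   # matches the hyperparameters section
MAX_TARGET_LENGTH = 128   # assumed value for this example

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5p-220m")
# Register <tdec> as a single special token; if it is new to the vocabulary,
# the model also needs model.resize_token_embeddings(len(tokenizer)).
tokenizer.add_special_tokens({"additional_special_tokens": ["<tdec>"]})

def preprocess(code: str, docstring: str) -> dict:
    # Encoder input: the source code to summarize.
    model_inputs = tokenizer(
        code, max_length=MAX_SOURCE_LENGTH, truncation=True
    )
    # Decoder target: docstring with <tdec> prepended so the decoder runs in
    # text-generation mode; the tokenizer adds <s>/</s>, yielding sequences
    # like "<s><tdec> ...</s>" as in the sample row above.
    labels = tokenizer(
        "<tdec> " + docstring, max_length=MAX_TARGET_LENGTH, truncation=True
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```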
  # Hyperparameters

  MAX_SOURCE_LENGTH = 256 <br>