Update README.md
README.md
```
# b (int): The second number.
```

# Fine tuning

When fine-tuning the model, I used the special token `<tdec>`. According to the CodeT5+ paper:

> "Specifically, when the input is a text sample, we prepend a [CDec] token to the input sequence to the decoder. In this case, the decoder operates under code generation functionality. Alternatively, when the input is a code sample, we prepend a [TDec] token to the input sequence to the decoder. The decoder operates under text generation functionality in this case. This type of Causal LM has been shown to be an effective learning objective to close the pretrain-finetune gap for generative downstream tasks."

Concretely, the `<tdec>` token was prepended to the target (the docstring) to signal to the decoder that it should operate in text generation mode. A sample row looks like this:

```
<s><tdec> Creates a task that to retry a previously abandoned task.

Returns:
    Task: a task that was abandoned but should be retried or None if there are
    no abandoned tasks that should be retried.</s>
```
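
As a minimal sketch of how such rows can be built with the Hugging Face tokenizer (the checkpoint name, the `code`/`docstring` column names, and `MAX_TARGET_LENGTH` are illustrative assumptions, not the exact values used here):

```python
from transformers import AutoTokenizer

# Assumed checkpoint; the tokenizer wraps targets in <s> ... </s> automatically.
checkpoint = "Salesforce/codet5p-220m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Ensure <tdec> is treated as a single special token rather than being split.
# (If it is newly added, the model's embeddings must be resized to match.)
tokenizer.add_special_tokens({"additional_special_tokens": ["<tdec>"]})

MAX_SOURCE_LENGTH = 256   # matches the value below
MAX_TARGET_LENGTH = 128   # assumption for illustration

def preprocess(example):
    # Source: the code snippet to summarize.
    model_inputs = tokenizer(
        example["code"],
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,
    )
    # Target: the docstring with <tdec> prepended, yielding "<s><tdec> ...</s>".
    labels = tokenizer(
        "<tdec> " + example["docstring"],
        max_length=MAX_TARGET_LENGTH,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

At inference time the decoder can be started with the same token (for example via `decoder_input_ids`) so that it stays in text generation mode; how generation is driven here is likewise an assumption.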

# Hyperparameters

MAX_SOURCE_LENGTH = 256 <br>