Update README.md
README.md CHANGED
@@ -39,7 +39,7 @@ learns contextualized word embeddings by predicting missing words within sentences

process known as masked language modeling. This allows BERT to understand words in the
context of their surrounding words, leading to more meaningful and context-aware embeddings.

This model is based on the BERT-Base architecture with 12 layers, a hidden size of 768, 12 attention heads, and 110 million parameters.
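For reference, the figures above describe a standard BERT-Base configuration. A minimal sketch with the Hugging Face `transformers` library is given below; the 50k vocabulary size comes from the Training Details section, and the intermediate size of 3072 is the usual BERT-Base default rather than a value stated in this README.

```python
from transformers import BertConfig, BertForMaskedLM

# BERT-Base shape as described above
config = BertConfig(
    vocab_size=50_000,        # 50k-token vocabulary (see Training Details below)
    hidden_size=768,          # hidden size 768
    num_hidden_layers=12,     # 12 transformer layers
    num_attention_heads=12,   # 12 attention heads
    intermediate_size=3072,   # usual BERT-Base feed-forward size (assumed, not stated here)
)

# Masked-language-modeling head on top, matching the pretraining objective described above
model = BertForMaskedLM(config)
```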

## How to use

@@ -57,6 +57,11 @@ print(outputs)
```

## Training Details

The model was trained on a 36 GB corpus of Bangla text with a vocabulary of 50k tokens. Training ran for 1 million steps with a batch size of 440 and a learning rate of 5e-5 on two NVIDIA A40 GPUs.
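As a rough sketch, these hyperparameters correspond to a Hugging Face `TrainingArguments` setup like the one below; the output path and the split of the 440 global batch across devices and accumulation steps are illustrative assumptions, not values taken from this README.

```python
from transformers import TrainingArguments

# Hyperparameters quoted above. The global batch size of 440 has to be split
# across the two GPUs; 110 per device x 2 GPUs x 2 accumulation steps = 440
# is one possible split, not necessarily the one actually used.
training_args = TrainingArguments(
    output_dir="bangla-bert-base",    # placeholder path
    max_steps=1_000_000,              # 1 million training steps
    learning_rate=5e-5,
    per_device_train_batch_size=110,
    gradient_accumulation_steps=2,
)
```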

## Results

| **Metric** | **Train Loss** | **Eval Loss** | **Perplexity** | **NER** | **POS** | **Shallow Parsing** | **QA** |