starmage520
/

Coderbert_finetuned_detect_vulnerability_on_MSR


starmage520 committed on Dec 2, 2023

Commit c2cbb97 · 1 Parent(s): b543426

Update README.md

Files changed (1)

README.md +21 -0


@@ -1,3 +1,24 @@
 ---
 license: mit
+tags:
+- Text Classification
+- Transformers
+- PyTorch
+- JAX
+- MSR
+- English
+- roberta
+- Inference Endpoints
+metrics:
+- accuracy
+pipeline_tag: text-classification
 ---
+I fine-tuned a RobertaForSequenceClassification model, initialized
+from CodeBERT [https://huggingface.co/microsoft/codebert-base], to judge whether a piece of code is vulnerable or not.
+I selected balanced samples from the MSR dataset [https://github.com/ZeoVan/MSR_20_Code_vulnerability_CSV_Dataset] for training, validation, and testing.
+The "func_before" column is used as the input for code classification. All the data is in the file "msr.csv".
+Functions shorter than 50 or longer than 512 (the CodeBERT window size) are dropped.
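
The selection steps above could be sketched roughly as follows. This is a minimal illustration and not the author's actual preprocessing script. Several details are assumptions: the label column name `vul`, the use of downsampling to balance the classes, and the length measure. The README does not say what unit the 50 and 512 bounds are in, so `whitespace_length` is a hypothetical stand-in that can be swapped for a CodeBERT tokenizer count.

```python
import random
from collections import defaultdict


def whitespace_length(code: str) -> int:
    """Hypothetical stand-in length measure: whitespace-separated pieces."""
    return len(code.split())


def select_balanced(rows, length_fn=whitespace_length, min_len=50, max_len=512,
                    code_key="func_before", label_key="vul", seed=0):
    """Drop functions outside [min_len, max_len], then downsample every
    class to the size of the smallest one.

    `label_key="vul"` is an assumed column name, not taken from the README.
    """
    by_label = defaultdict(list)
    for row in rows:
        if min_len <= length_fn(row[code_key]) <= max_len:
            by_label[row[label_key]].append(row)
    if not by_label:
        return []

    # Size of the smallest class after filtering.
    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced
```

In practice `rows` might come from `csv.DictReader` over "msr.csv", with `length_fn` replaced by a function that counts tokenizer output instead of whitespace-separated pieces.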
+Test results:
+accuracy 0.7022935779816514, F1 0.6482384823848238, precision 0.7920529801324503, recall 0.5486238532110091
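
As a quick sanity check on the numbers above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The snippet below only uses the figures from the README. The variable names are illustrative.

```python
import math

# Values copied from the reported test results.
precision = 0.7920529801324503
recall = 0.5486238532110091
reported_f1 = 0.6482384823848238

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"recomputed F1 = {f1:.6f}")
print("matches reported F1:", math.isclose(f1, reported_f1, rel_tol=1e-9))
```

The gap between precision and recall means the model misses a sizeable share of vulnerable functions but is right more often than not when it does flag one.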