Fix model format with proper config and weights
- README.md +8 -17
- config.json +0 -0
- model.safetensors +3 -0
- special_tokens_map.json +7 -0
- tokenizer.json +0 -0
- tokenizer_config.json +56 -0
- vocab.txt +0 -0
README.md
CHANGED
@@ -6,18 +6,13 @@ library_name: transformers
 pipeline_tag: text-classification
 tags:
 - medical
--
--
-source_space: sshan95/medicoder-ai-v2
+- icd10
+- multilabel-classification
 ---
 
-#
+# MediCoder AI v2 - Fixed Model
 
-This
-
-## Model Description
-
-MediCoder AI v2 converted from Space to proper model repository for easier integration and deployment.
+This is a properly formatted version of the MediCoder model for medical code classification.
 
 ## Usage
 
@@ -26,14 +21,10 @@ from transformers import AutoTokenizer, AutoModelForSequenceClassification
 
 tokenizer = AutoTokenizer.from_pretrained("sshan95/medicoder-ai-v2-model")
 model = AutoModelForSequenceClassification.from_pretrained("sshan95/medicoder-ai-v2-model")
-
-# Your inference code here
 ```
 
-##
-
-This model was originally deployed as a Space at: https://huggingface.co/spaces/sshan95/medicoder-ai-v2
-
-## Conversion
+## Configuration
 
-
+- **Labels**: 25,719 ICD-10 codes
+- **Architecture**: BERT-based
+- **Task**: Multi-label medical code classification
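The Usage snippet above only loads the tokenizer and model. Since the Configuration section describes a multi-label head over 25,719 ICD-10 codes, inference would normally apply a per-label sigmoid and a threshold rather than a softmax/argmax. A minimal sketch, assuming the id2label mapping from config.json and an illustrative 0.5 threshold (neither the threshold nor the example note comes from this commit):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sshan95/medicoder-ai-v2-model")
model = AutoModelForSequenceClassification.from_pretrained("sshan95/medicoder-ai-v2-model")
model.eval()

note = "Patient presents with type 2 diabetes mellitus and essential hypertension."
inputs = tokenizer(note, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits              # shape: (1, 25719)

# Multi-label: score each code independently and keep everything above a threshold.
probs = torch.sigmoid(logits)[0]
threshold = 0.5                                  # illustrative; tune on validation data
predicted_ids = (probs > threshold).nonzero(as_tuple=True)[0].tolist()
predicted_codes = [model.config.id2label[i] for i in predicted_ids]
print(predicted_codes)
```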
config.json
ADDED
The diff for this file is too large to render.
See raw diff
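config.json is too large to render inline, presumably because it carries the id2label / label2id maps for all 25,719 codes. A quick way to sanity-check the uploaded config without opening the raw file is to load it with transformers; the comments note what the README implies (BERT backbone, 25,719 labels), not values copied from the file:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("sshan95/medicoder-ai-v2-model")
print(config.model_type)                   # "bert" for a BERT-based backbone
print(config.num_labels)                   # 25719 per the README
print(config.problem_type)                 # "multi_label_classification" if set (enables BCE loss)
print(list(config.id2label.items())[:3])   # first few id-to-label mappings
```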
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:592d9849e5d1488ccc511c77c11444975a10fa9572c77749ebf7ff110ec7da6f
+size 517064148
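The model.safetensors entry above is a Git LFS pointer, not the weights themselves: the real file is about 517 MB and is identified by the sha256 oid. If a clone ends up with the three-line pointer instead of the weights (a common LFS pitfall), a local hash check makes the mismatch obvious. This is a generic sanity check, not part of the commit:

```python
import hashlib
from pathlib import Path

EXPECTED_OID = "592d9849e5d1488ccc511c77c11444975a10fa9572c77749ebf7ff110ec7da6f"
EXPECTED_SIZE = 517064148  # bytes, from the LFS pointer

path = Path("model.safetensors")
print(f"size on disk: {path.stat().st_size} bytes (expected {EXPECTED_SIZE})")

# Stream the file so the ~517 MB of weights are never held in memory at once.
sha256 = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)

print("sha256 matches LFS oid:", sha256.hexdigest() == EXPECTED_OID)
```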
special_tokens_map.json
ADDED
@@ -0,0 +1,7 @@
+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}
tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer_config.json
ADDED
@@ -0,0 +1,56 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "unk_token": "[UNK]"
+}
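tokenizer_config.json pins a lowercasing BertTokenizer with a 512-token limit and the usual BERT special-token ids ([PAD]=0, [UNK]=100, [CLS]=101, [SEP]=102, [MASK]=103). A short check that the repaired tokenizer files load together as intended; this is a verification sketch, not part of the commit:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("sshan95/medicoder-ai-v2-model")

# Ids should match added_tokens_decoder above: 0, 100, 101, 102, 103.
print(tok.pad_token_id, tok.unk_token_id, tok.cls_token_id, tok.sep_token_id, tok.mask_token_id)

# do_lower_case=true and model_max_length=512 come from tokenizer_config.json.
print(tok.model_max_length)                      # 512
print(tok.tokenize("Type 2 Diabetes Mellitus"))  # lowercased WordPiece tokens
```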
vocab.txt
ADDED
The diff for this file is too large to render.
See raw diff