Update README.md
README.md
CHANGED
@@ -35,7 +35,7 @@ Academic and Personal Use. You may use the Work for academic research and person
Commercial Use. You may not use the Work for commercial purposes without prior written authorization from the Contributor(s). Any commercial use authorized by the Contributor(s) must not involve charging fees above the model inference cost without express written permission from the Contributor(s).
Medical Application. The Work is provided for academic research purposes only and not for commercial use. It must not be used in clinical practice or in any scenario with potential medical intent without permission. The capabilities of this Traditional Chinese Medicine (TCM) Language Model, including syndrome classification and prescription generation, are experimental and not intended for clinical diagnosis or treatment. Outputs are for internal reference and testing only and should not be considered as medical advice. All medical diagnoses and treatments should be performed by experienced physicians through a standardized clinical process.
Distribution. Redistribution of the Work or derivative works must comply with all the terms and conditions of this License.
-
## Training data

#### 1.1 Multi-task Therapeutic Behavior Decomposition Instruction Construction Strategy
@@ -51,14 +51,14 @@ Human memory and understanding require the construction of various scenarios and

#### 1.2 Regular TCM Instruction Data Construction Strategy
In addition, we added instructions based on ancient Chinese medicine texts, term explanations, symptom synonyms and antonyms, syndromes, symptoms, treatment methods, and so on. To form a control experiment, we use only one instruction template for this part of the data. It contains 80,000 instructions, significantly more than the number constructed by the strategy above. The specific instruction and token counts follow.
- Data Source and Instruction Quantity Table:

```
{
"instruction": "请回答以下有关于中医疾病名词解释的相关问题:",
"input": "(肺风)粉刺属于哪个分类?",
"output": "因肺风、胃热或肝瘀所致。以面及背部见黑头或白头粉刺、丘疹、脓疱、结节、囊肿及疤痕为主要表现的皮肤疾病。"
- }
```

# Train Details & Inference Capability Statement
@@ -67,22 +67,21 @@ Our model, a meticulously fine-tuned version of Qwen1.5-1.8B-Chat, has been opti

## Disclaimer
This work is for academic research only; commercial use is not allowed without permission, and it must not be used in clinical practice or in any scenario with potential medical intent. This large language model for Traditional Chinese Medicine is still in the laboratory testing stage. Its emerging syndrome classification and prescription generation capabilities are still rudimentary, and it does not yet offer highly reliable clinical diagnostic or therapeutic capability for gynecology or other clinical specialties. Outputs are for internal reference and testing only. Real medical diagnoses and decisions must still be made by experienced physicians through a strictly regulated diagnostic and therapeutic process.
-
## Collaboration
Data processing and annotation are an important step in training the model. We sincerely welcome Traditional Chinese Medicine practitioners with strong TCM thinking and an innovative spirit to join us, and we will acknowledge the corresponding data contributions. We look forward to the day when we can achieve a reliable General Artificial Intelligence for Traditional Chinese Medicine, allowing ancient Chinese medicine to blend with modern technology and shine anew. This is also the ultimate mission of this project. If interested, please send an email to [email protected].

## Team Introduction
-
-
## Citation
If you find this work useful in your research, please cite our repository:
```
@misc{CMLM-ZhongJing,
- author = {
- title = {CMLM-ZhongJing:
year = {2023},
- publisher = {
journal = {GitHub Repository},
howpublished = {\url{https://github.com/pariskang/CMLM-ZhongJing}}
- }
```

Commercial Use. You may not use the Work for commercial purposes without prior written authorization from the Contributor(s). Any commercial use authorized by the Contributor(s) must not involve charging fees above the model inference cost without express written permission from the Contributor(s).
Medical Application. The Work is provided for academic research purposes only and not for commercial use. It must not be used in clinical practice or in any scenario with potential medical intent without permission. The capabilities of this Traditional Chinese Medicine (TCM) Language Model, including syndrome classification and prescription generation, are experimental and not intended for clinical diagnosis or treatment. Outputs are for internal reference and testing only and should not be considered as medical advice. All medical diagnoses and treatments should be performed by experienced physicians through a standardized clinical process.
Distribution. Redistribution of the Work or derivative works must comply with all the terms and conditions of this License.
+
## Training data

#### 1.1 Multi-task Therapeutic Behavior Decomposition Instruction Construction Strategy

#### 1.2 Regular TCM Instruction Data Construction Strategy
In addition, we added instructions based on ancient Chinese medicine texts, term explanations, symptom synonyms and antonyms, syndromes, symptoms, treatment methods, and so on. To form a control experiment, we use only one instruction template for this part of the data. It contains 80,000 instructions, significantly more than the number constructed by the strategy above. The specific instruction and token counts follow.
+ Data Source and Instruction Quantity Table:

```
{
"instruction": "请回答以下有关于中医疾病名词解释的相关问题:",
"input": "(肺风)粉刺属于哪个分类?",
"output": "因肺风、胃热或肝瘀所致。以面及背部见黑头或白头粉刺、丘疹、脓疱、结节、囊肿及疤痕为主要表现的皮肤疾病。"
+ }
```
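
The single-template construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual pipeline: the function names `build_record` and `to_jsonl` are hypothetical, and the template string and field names are copied from the sample record above on the assumption that it shows the one template used for this subset.

```python
import json

# Template copied from the sample record above; treating it as the single
# template for this data subset is an assumption made for illustration.
TEMPLATE = "请回答以下有关于中医疾病名词解释的相关问题:"


def build_record(question: str, answer: str, template: str = TEMPLATE) -> dict:
    """Wrap one question/answer pair in the fixed instruction template."""
    return {"instruction": template, "input": question, "output": answer}


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as one JSON object per line, keeping Chinese text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Rebuild the sample record shown above from its question and answer.
record = build_record(
    "(肺风)粉刺属于哪个分类?",
    "因肺风、胃热或肝瘀所致。以面及背部见黑头或白头粉刺、丘疹、脓疱、结节、囊肿及疤痕为主要表现的皮肤疾病。",
)
print(to_jsonl([record]))
```

Because every record in this subset shares the same `instruction` string, only `input` and `output` vary from one record to the next.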

# Train Details & Inference Capability Statement

## Disclaimer
This work is for academic research only; commercial use is not allowed without permission, and it must not be used in clinical practice or in any scenario with potential medical intent. This large language model for Traditional Chinese Medicine is still in the laboratory testing stage. Its emerging syndrome classification and prescription generation capabilities are still rudimentary, and it does not yet offer highly reliable clinical diagnostic or therapeutic capability for gynecology or other clinical specialties. Outputs are for internal reference and testing only. Real medical diagnoses and decisions must still be made by experienced physicians through a strictly regulated diagnostic and therapeutic process.
+
## Collaboration
Data processing and annotation are an important step in training the model. We sincerely welcome Traditional Chinese Medicine practitioners with strong TCM thinking and an innovative spirit to join us, and we will acknowledge the corresponding data contributions. We look forward to the day when we can achieve a reliable General Artificial Intelligence for Traditional Chinese Medicine, allowing ancient Chinese medicine to blend with modern technology and shine anew. This is also the ultimate mission of this project. If interested, please send an email to [email protected].

## Team Introduction
+ Led by the non-profit organization FulPhil-医哲未来 (Future Medicine Philosophy), the CMLM (Chinese Medicine Language Models) initiative on HuggingFace is dedicated to advancing healthcare AI by integrating traditional Chinese medicine with state-of-the-art machine learning. Our mission includes curating valuable medical datasets, developing AI models for medical assistance, and ensuring ethical AI use in healthcare, fostering collaboration between global experts in Chinese and Western medicine and AI.
+
## Citation
If you find this work useful in your research, please cite our repository:
```
@misc{CMLM-ZhongJing,
+ author = {Liu Lin Ju Shi},
+ title = {CMLM-ZhongJing-2-1_8b: A State-of-the-Art Edge Computing Language Model for Traditional Chinese Medicine},
year = {2023},
+ publisher = {FulPhil-医哲未来 (Future Medicine Philosophy).},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/pariskang/CMLM-ZhongJing}}
```