MiaoshouAI committed (verified)
Commit 79ed3c5 · 1 parent: dfb68da

Update README.md

Files changed (1): README.md (+20 −10)

README.md CHANGED
@@ -2,8 +2,8 @@
  license: mit
  ---
  # Florence-2-base-PromptGen v.15
- This is a major version update for PromptGen. In this new version, two new caption instructions are added: \<GENERATE_TAGS\> and \<MIXED_CAPTION\>
- Also, you will notice a much improved accuracy improved by a new set of training dataset for this version. This version no longer uses Civitai Data due to lora trigger words and inaccurate tags caused for misinterpretation.
 
  # About PromptGen
  Florence-2-base-PromptGen is a model trained for [MiaoshouAI Tagger for ComfyUI](https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger).
@@ -14,19 +14,29 @@ Most vision models today are trained mainly for general vision recognition purpo
  Florence-2-base-PromptGen is trained on such a purpose as aiming to improve the tagging experience and accuracy of the prompt and tagging job. The model is trained based on images and cleaned tags from Civitai so that the end result for tagging the images are the prompts you use to generate these images.
  You won't get annoying captions like "This is image is about a girl..." or
 
  ## Instruction prompt:
- \<GENERATE_TAGS\> generate prompt as danbooru style tags
- \<CAPTION\> a one line caption for the image
- \<DETAILED_CAPTION\> a structured caption format which detects the position of the subjects in the image
- \<MORE_DETAILED_CAPTION\> a very detailed description for the image
- \<MIXED_CAPTION\> a Mixed caption style of more detailed caption and tags, this is extremely useful for FLUX model when using T5XXL and CLIP_L together. A new node in MiaoshouTagger ComfyUI is added to support this instruction.
 
  ## Version History:
  For version 1.5, you will notice the following
  1. \<GENERATE_PROMPT\> is deprecated and replace by \<GENERATE_TAGS\>
- 2. A new mode for \<MIXED_CAPTION\>
- 2. A much improve accuracy for \<DETAILED_CAPTION\> and \<MORE_DETAILED_CAPTION\>
- 3. Improved ability for recognizing watermarks on images.
 
 
  ## How to use:

  license: mit
  ---
  # Florence-2-base-PromptGen v.15
+ This is a major version upgrade for PromptGen. In this new version, two new caption instructions are added: \<GENERATE_TAGS\> and \<MIXED_CAPTION\>.
+ You'll also notice significantly improved accuracy in this version, thanks to a new training dataset. It no longer relies on Civitai data, avoiding the issues of LoRA trigger words and the inaccurate tags they caused through misinterpretation.
 
  # About PromptGen
  Florence-2-base-PromptGen is a model trained for [MiaoshouAI Tagger for ComfyUI](https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger).
 
14
  Florence-2-base-PromptGen is trained on such a purpose as aiming to improve the tagging experience and accuracy of the prompt and tagging job. The model is trained based on images and cleaned tags from Civitai so that the end result for tagging the images are the prompts you use to generate these images.
15
  You won't get annoying captions like "This is image is about a girl..." or
16
 
17
+ ## Features:
18
+ * Describes image in much detail when using \<MORE_DETAILED_CAPTION\> instruction.
19
+ <img style="width:70%; hight:70%" src="https://msdn.miaoshouai.com/miaoshou/bo/2024-09-05_12-40-34.png" />
20
+ * When using \<DETAILED_CAPTION\> instruction, it creates a structured caption with infomation on subject's position and also reads the text from the image, which can be super useful when recreate a scene.
21
+ <img style="width:70%; hight:70%" src="https://msdn.miaoshouai.com/miaoshou/bo/2024-09-05_13-07-54.png" />
22
+ * Memory efficient compare to other models! This is a really light weight caption model that allows you to use a little more than 1G of VRAM and produce lightening fast and high quality image captions.
23
+ <img style="width:70%; hight:70%" src="https://msdn.miaoshouai.com/miaoshou/bo/2024-09-05_12-56-39.png" />
24
+ * Designed to handle image captions for Flux model for both T5XXL CLIP and CLIP_L, the Miaoshou Tagger new node called "Flux CLIP Text Encode" which eliminates the need to run two separate tagger tools for caption creation. You can easily populate both CLIPs in a single generation, significantly boosting speed when working with Flux models.
25
+ <img style="width:70%; hight:70%" src="https://msdn.miaoshouai.com/miaoshou/bo/2024-09-05_14-11-02.png" />
26
+
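+ To make the dual-encoder idea concrete: a \<MIXED_CAPTION\> result combines a tag block (well suited to CLIP_L) with detailed prose (well suited to T5XXL). The sketch below is a purely illustrative, hypothetical helper for splitting such output in your own scripts; the actual \<MIXED_CAPTION\> layout is handled for you by the "Flux CLIP Text Encode" node, and this heuristic is an assumption, not the node's implementation.

```python
def split_mixed_caption(mixed: str) -> tuple[str, str]:
    """Heuristically split a mixed caption into (tag_text, prose_text).

    Assumption: tag lines are comma-separated short fragments without sentence
    punctuation, while prose lines are full sentences. Adjust to the real output.
    """
    tag_lines, prose_lines = [], []
    for line in mixed.splitlines():
        line = line.strip()
        if not line:
            continue
        # A line made of several short comma-separated fragments is treated as tags.
        fragments = [f.strip() for f in line.split(",") if f.strip()]
        avg_len = sum(len(f) for f in fragments) / max(len(fragments), 1)
        if len(fragments) >= 3 and avg_len < 20 and not line.endswith("."):
            tag_lines.append(line)
        else:
            prose_lines.append(line)
    return " ".join(tag_lines), " ".join(prose_lines)
```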
  ## Instruction prompt:
+ \<GENERATE_TAGS\> generates the prompt as danbooru-style tags<br>
+ \<CAPTION\> a one-line caption for the image<br>
+ \<DETAILED_CAPTION\> a structured caption format which detects the position of the subjects in the image<br>
+ \<MORE_DETAILED_CAPTION\> a very detailed description of the image<br>
+ \<MIXED_CAPTION\> a mixed caption style of more detailed caption and tags, which is extremely useful for the FLUX model when using T5XXL and CLIP_L together. A new node in MiaoshouTagger ComfyUI has been added to support this instruction.<br>
 
  ## Version History:
  For version 1.5, you will notice the following:
  1. \<GENERATE_PROMPT\> is deprecated and replaced by \<GENERATE_TAGS\>
+ 2. A new instruction, \<MIXED_CAPTION\>
+ 3. Much improved accuracy for \<DETAILED_CAPTION\> and \<MORE_DETAILED_CAPTION\>
+ 4. Improved ability to recognize watermarks on images.
 
 
  ## How to use:
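
The usage section is truncated in this diff. As a minimal sketch of calling the model through Hugging Face `transformers` (the repo id `MiaoshouAI/Florence-2-base-PromptGen-v1.5` and the generation settings are assumptions here; follow the authoritative snippet on the model card):

```python
# Minimal usage sketch; requires transformers, torch, and pillow at runtime.
MODEL_ID = "MiaoshouAI/Florence-2-base-PromptGen-v1.5"  # assumed repo id

# The five instruction prompts listed above.
INSTRUCTIONS = (
    "<GENERATE_TAGS>",
    "<CAPTION>",
    "<DETAILED_CAPTION>",
    "<MORE_DETAILED_CAPTION>",
    "<MIXED_CAPTION>",
)

def caption_image(image_path: str, instruction: str = "<MORE_DETAILED_CAPTION>") -> str:
    """Run one PromptGen instruction on an image file and return the caption text."""
    if instruction not in INSTRUCTIONS:
        raise ValueError(f"unknown instruction: {instruction}")
    # Heavy imports are deferred so the instruction table is usable without torch installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=instruction, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
        do_sample=False,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Florence-2's custom processor provides post_process_generation for its tasks.
    parsed = processor.post_process_generation(
        raw, task=instruction, image_size=(image.width, image.height)
    )
    return parsed[instruction]
```

Pass any of the five instruction strings as `instruction` to switch caption styles.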