---
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- embedder
- embedding
- models
- GGUF
- Bert
- Nomic
- Gist
- BGE
- text-embeddings-inference
misc:
- text-embeddings-inference
language:
- en
- de
architecture:
- GIST
---
# All models tested with ALLM (AnythingLLM) with LM as server
They work more or less well (sometimes the results are more truthful if the “chat with document only” option is used).
My short impression:
- nomic-embed-text
- mxbai-embed-large
- mug-b-1.6
- Ger-RAG-BGE-M3 (german)
These work well; all others are up to you!
Short usage hints:
Set the context length (Max Tokens) of your main model to 16000t, set the Max Embedding Chunk Length of your embedder model to 1024t, and set Max Context Snippets to 14 as a usual value.
But ALLM cuts everything into 1024-character parts, so you need approximately twice as many or a bit more, ~30!
-> OK, what does that mean?
You can receive 14 snippets of 1024t each (14336t) from your document, ~10000 words, which leaves ~1600t for the answer, ~1000 words (2 pages).
You can play with the settings to fit your needs, e.g. 7 snippets of 2048t or 28 snippets of 512t ...
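
The budget described above is plain arithmetic, so here is a minimal sketch of it. The function name `snippet_budget` and its fields are only illustrative, not part of ALLM; the numbers are the ones from the hints above.

```python
def snippet_budget(context_tokens: int, snippets: int, chunk_tokens: int) -> dict:
    """Tokens used by the retrieved snippets and tokens left for the answer."""
    used = snippets * chunk_tokens
    return {
        "used": used,
        "left": context_tokens - used,
        "fits": used <= context_tokens,
    }


# Two of the settings mentioned above, both with a 16000t context.
for snippets, chunk_tokens in [(14, 1024), (28, 512)]:
    budget = snippet_budget(16000, snippets, chunk_tokens)
    print(snippets, "x", chunk_tokens, "->", budget)
```

Both settings use 14336t for snippets and leave 1664t. If `fits` is `False`, the snippets alone are already larger than the context length.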
8000t ~0.8GB VRAM usage /
16000t ~1.5GB VRAM usage /
32000t ~3GB VRAM usage
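
For context lengths between the listed values, a rough estimate can be interpolated from the three points above. This is only a sketch: `estimate_vram_gb` is a hypothetical helper, and the assumption that usage grows linearly between the points may not hold for your model or hardware.

```python
# (context tokens, approx. VRAM in GB), taken from the list above.
VRAM_POINTS = [(8000, 0.8), (16000, 1.5), (32000, 3.0)]


def estimate_vram_gb(context_tokens: int) -> float:
    """Linear interpolation between the listed points (assumption, not measured)."""
    first_x, first_y = VRAM_POINTS[0]
    if context_tokens <= first_x:
        # Below the first point: scale proportionally from zero.
        return first_y * context_tokens / first_x
    for (x0, y0), (x1, y1) in zip(VRAM_POINTS, VRAM_POINTS[1:]):
        if context_tokens <= x1:
            return y0 + (y1 - y0) * (context_tokens - x0) / (x1 - x0)
    # Above the last point: continue with the slope of the last segment.
    (x0, y0), (x1, y1) = VRAM_POINTS[-2], VRAM_POINTS[-1]
    return y1 + (y1 - y0) * (context_tokens - x1) / (x1 - x0)


print(estimate_vram_gb(24000))
```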
...
How embedding and search work for now:
You have a txt/pdf file of maybe 90000 words (~300 pages). You ask the model, let's say, "what is described in the chapter called XYZ in relation to person ZYX".
Now it searches the document for keywords or semantically similar terms. If it finds them, let's say the words and meaning around “XYZ and ZYX”,
a piece of text of 1024 tokens around this word “XYZ/ZYX” is cut out at this point.
This text snippet is then used for your answer. If, for example, the word “XYZ” occurs 100 times in one file, not all 100 are found (usually only 4 to 32 snippets are used).
Asking for a "summary of the document" is usually not useful; if the document has an introduction or summaries, it searches there if you are lucky.
If the document is small, like 10-20 pages, it is better to copy the whole text into the prompt.
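
The search described above can be sketched in a few lines. This is a toy stand-in under loud assumptions: a word-count vector replaces the real embedder model, chunks are counted in words instead of tokens, and all function names are made up for illustration. It is not how ALLM is implemented.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, chunk_size: int) -> list:
    """Cut the document into consecutive pieces of chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': lower-cased word counts (a real embedder returns dense vectors)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(document: str, question: str, chunk_size: int = 50, max_snippets: int = 4) -> list:
    """Return up to max_snippets chunks that are most similar to the question."""
    query = bag_of_words(question)
    scored = [(cosine(bag_of_words(c), query), c) for c in chunk_words(document, chunk_size)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:max_snippets] if score > 0]


document = " ".join(["filler"] * 100 + ["chapter", "XYZ", "describes", "person", "ZYX"] + ["filler"] * 100)
snippets = retrieve(document, "what is described in chapter XYZ about person ZYX", max_snippets=2)
print(len(snippets))
```

Only the chunk that shares words with the question is returned; the rest of the document never reaches the model, which is why a question like "summary of the document" often fails.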
...
My impression is that at the moment only the first results of the embedder are used (don't ask what happens if you use more than one document at once), so there should be some kind of ranking.
If the embedder has found 100 snippets, only the snippets with the highest rank should be sent to the model ... but it seems that does not happen.
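
The kind of ranking wished for here could look roughly like the sketch below: collect the scored snippets from all documents and keep only the best ones overall. The function `top_ranked`, the file names, and the scores are invented for illustration; this does not describe what ALLM does internally.

```python
import heapq


def top_ranked(snippets_per_document: dict, max_snippets: int) -> list:
    """Merge (score, text) pairs from all documents and keep the highest-scoring ones."""
    merged = [
        (score, name, text)
        for name, scored in snippets_per_document.items()
        for score, text in scored
    ]
    # nlargest returns the results sorted from highest to lowest score.
    return heapq.nlargest(max_snippets, merged, key=lambda item: item[0])


# Hypothetical similarity scores for snippets from two documents.
found = {
    "a.txt": [(0.2, "a1"), (0.9, "a2")],
    "b.pdf": [(0.7, "b1"), (0.1, "b2")],
}
print(top_ranked(found, 2))
```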
...
#
#
...
on discord (sevenof9)
...
#
#
...
# (ALL licenses and terms of use go to original author)
...
avemio/German-RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI (German, English)
maidalun1020/bce-embedding-base_v1 (English and Chinese)
maidalun1020/bce-reranker-base_v1 (English, Chinese, Japanese and Korean)
BAAI/bge-reranker-v2-m3 (English and Chinese)
BAAI/bge-reranker-v2-gemma (English and Chinese)
avsolatorio/GIST-large-Embedding-v0 (English)
ibm-granite/granite-embedding-278m-multilingual (English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese)
Labib11/MUG-B-1.6 (?)
mixedbread-ai/mxbai-embed-large-v1 (multi)
nomic-ai/nomic-embed-text-v1.5 (English, multi)
Snowflake/snowflake-arctic-embed-l-v2.0 (English, multi)
intfloat/multilingual-e5-large-instruct (100 languages)