Upload folder using huggingface_hub
- README.md +23 -1
- SETUP.md +1 -1
- USAGE_EXAMPLES.md +1 -1
README.md
CHANGED
@@ -125,6 +125,26 @@ model-index:
 
 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) specifically for **Indonesian language** text embedding tasks. It maps Indonesian sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
+## 🚀 Quick Start
+
+```python
+from sentence_transformers import SentenceTransformer
+
+# Load the model (requires trust_remote_code=True)
+model = SentenceTransformer("asmud/nomic-embed-indonesian", trust_remote_code=True)
+
+# Indonesian text examples
+texts = [
+    "search_query: Apa itu kecerdasan buatan?",
+    "search_document: Kecerdasan buatan adalah teknologi yang memungkinkan mesin belajar",
+    "classification: Produk ini sangat berkualitas (sentimen: positif)"
+]
+
+# Generate embeddings
+embeddings = model.encode(texts)
+print(f"Embedding shape: {embeddings.shape}")  # (3, 768)
+```
+
 ## 🇮🇩 **Specialized for Indonesian Language**
 
 This model is optimized for Indonesian text understanding across multiple domains including:
@@ -175,12 +195,14 @@ First install the Sentence Transformers library:
 pip install -U sentence-transformers
 ```
 
+⚠️ **Important**: This model requires `trust_remote_code=True` due to custom model architecture.
+
 Then you can load this model and run inference.
 ```python
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
-model = SentenceTransformer("asmud/nomic-embed-indonesian")
+model = SentenceTransformer("asmud/nomic-embed-indonesian", trust_remote_code=True)
 # Run inference with Indonesian text
 sentences = [
     'search_query: Apa itu kecerdasan buatan?',
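
Note on the README examples: every input string carries a task prefix (`search_query:`, `search_document:`, `classification:`). The following is a minimal sketch of how such prefixes could be applied consistently before encoding. The helper `with_prefix` and the `load_model` wrapper are hypothetical names introduced here for illustration and are not part of the model or of sentence-transformers; the set of accepted prefixes is assumed from the three that appear in the diff.

```python
# Task prefixes as they appear in the README examples (assumed to be the full set).
TASK_PREFIXES = ("search_query", "search_document", "classification")


def with_prefix(task: str, text: str) -> str:
    """Return `text` prefixed with `task: ` (hypothetical helper, not a library API)."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task prefix: {task!r}")
    return f"{task}: {text.strip()}"


def load_model():
    """Load the model as shown in the diff; imported lazily so the helper has no dependencies."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer("asmud/nomic-embed-indonesian", trust_remote_code=True)


texts = [
    with_prefix("search_query", "Apa itu kecerdasan buatan?"),
    with_prefix("search_document", "Kecerdasan buatan adalah teknologi yang memungkinkan mesin belajar"),
]
print(texts[0])
# embeddings = load_model().encode(texts)  # needs network access and sentence-transformers
```

Rejecting unknown prefixes early is a design choice of this sketch; it surfaces typos such as `search_querry` before any text reaches the model.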
SETUP.md
CHANGED
@@ -74,7 +74,7 @@ After uploading, verify the model works:
 from sentence_transformers import SentenceTransformer
 
 # Load the uploaded model
-model = SentenceTransformer("asmud/nomic-embed-indonesian")
+model = SentenceTransformer("asmud/nomic-embed-indonesian", trust_remote_code=True)
 
 # Test Indonesian text
 texts = [
USAGE_EXAMPLES.md
CHANGED
@@ -7,7 +7,7 @@ from sentence_transformers import SentenceTransformer
 from sklearn.metrics.pairwise import cosine_similarity
 import numpy as np
 
-model = SentenceTransformer("asmud/nomic-embed-indonesian")
+model = SentenceTransformer("asmud/nomic-embed-indonesian", trust_remote_code=True)
 
 # Indonesian search example
 query = "search_query: Bagaimana cara memasak rendang?"
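
Note on the search example: USAGE_EXAMPLES.md imports `cosine_similarity` from scikit-learn, which suggests documents are ranked by cosine similarity to the query embedding. The sketch below shows that ranking step in plain Python, assuming embeddings are available as lists of floats. The vectors are made-up 3-dimensional placeholders, not output of the model (real embeddings have 768 dimensions), and the names `cosine` and `rank` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs):
    """Return (index, score) pairs sorted by descending similarity to the query."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder vectors standing in for model.encode(...) output.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [[0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [0.5, 0.5, 0.5]]

ranking = rank(query_vec, doc_vecs)
print([index for index, _ in ranking])
```

With real embeddings, `query_vec` would come from encoding the `search_query:` string and `doc_vecs` from encoding the `search_document:` strings.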