SentenceTransformer based on yahyaabd/allstats-search-mini-v1-1-mnrl

This is a sentence-transformers model finetuned from yahyaabd/allstats-search-mini-v1-1-mnrl on the bps-sts-dataset-v1 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: yahyaabd/allstats-search-mini-v1-1-mnrl
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Number of Parameters: ~118M
  • Similarity Function: Cosine Similarity
  • Training Dataset: bps-sts-dataset-v1

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers
Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
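
The Pooling layer above averages the token embeddings (mean pooling) into a single 384-dimensional sentence vector. As a rough sketch of what that layer does, the snippet below reproduces mean pooling by hand with transformers; it assumes the repository's files load directly as a standard BERT checkpoint, which is how Sentence Transformers repositories are usually laid out.

import torch
from transformers import AutoModel, AutoTokenizer

repo = "yahyaabd/allstats-search-mini-v1-1-mnrl-sts-fold-3"
tokenizer = AutoTokenizer.from_pretrained(repo)
bert = AutoModel.from_pretrained(repo)

batch = tokenizer(["contoh kalimat"], padding=True, truncation=True,
                  max_length=128, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**batch).last_hidden_state   # (1, seq_len, 384)

# Average over real tokens only, mirroring pooling_mode_mean_tokens=True
mask = batch["attention_mask"].unsqueeze(-1).float()     # (1, seq_len, 1)
sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 384])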

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl-sts-fold-3")
# Run inference
sentences = [
    'Rata-rata pengeluaran untuk konsumsi rokok per kapita di Provinsi Sumatera Barat termasuk yang tertinggi secara nasional.',
    'Prevalensi perokok dewasa di Indonesia masih tinggi meskipun berbagai kampanye anti-rokok telah dilakukan.',
    'Produksi kopi robusta dari dataran tinggi Semendo, Sumsel, dikenal memiliki kualitas yang baik.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
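
The same embeddings can back a small semantic-search loop, ranking a corpus against a query by cosine similarity. A minimal sketch (the query and corpus strings here are illustrative, not drawn from the dataset):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl-sts-fold-3")

query = "berapa tingkat pengangguran di DKI Jakarta?"
corpus = [
    "Tingkat Pengangguran Terbuka (TPT) Provinsi DKI Jakarta, Agustus 2024",
    "Produksi kopi robusta dari dataran tinggi Semendo, Sumsel",
    "Laju pertumbuhan penduduk Kota Makassar, 2020-2024",
]

query_emb = model.encode([query])
corpus_emb = model.encode(corpus)

# model.similarity defaults to cosine similarity for this model
scores = model.similarity(query_emb, corpus_emb)[0]
best = int(scores.argmax())
print(corpus[best], float(scores[best]))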

Evaluation

Metrics

Semantic Similarity

| Metric          | sts-dev-fold-1 | sts-dev-fold-2 | sts-dev-fold-3 |
|:----------------|---------------:|---------------:|---------------:|
| pearson_cosine  |         0.8581 |         0.8988 |         0.9341 |
| spearman_cosine |         0.8517 |         0.8960 |         0.9279 |
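
These scores correlate the cosine similarity of each embedding pair with the gold score, as computed by Sentence Transformers' EmbeddingSimilarityEvaluator. A sketch of re-running the evaluation; the repository id yahyaabd/bps-sts-dataset-v1 and the validation split name are assumptions based on the dataset description below:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl-sts-fold-3")
# Assumed repo id and split name; the card only names "bps-sts-dataset-v1"
eval_ds = load_dataset("yahyaabd/bps-sts-dataset-v1", split="validation")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=eval_ds["sentence1"],
    sentences2=eval_ds["sentence2"],
    scores=eval_ds["score"],
    main_similarity=SimilarityFunction.COSINE,
    name="sts-dev-fold-3",
)
print(evaluator(model))  # includes pearson_cosine and spearman_cosine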

Training Details

Training Dataset

bps-sts-dataset-v1

  • Dataset: bps-sts-dataset-v1 at 5c8f96e
  • Size: 1,972 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:

    |         | sentence1                                         | sentence2                                          | score                          |
    |:--------|:--------------------------------------------------|:---------------------------------------------------|:-------------------------------|
    | type    | string                                            | string                                             | float                          |
    | details | min: 6 tokens, mean: 20.73 tokens, max: 38 tokens | min: 10 tokens, mean: 20.87 tokens, max: 42 tokens | min: 0.0, mean: 0.51, max: 1.0 |
  • Samples:

    | sentence1                                                                                           | sentence2                                                                                                     | score |
    |:----------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------|------:|
    | bagaimana capaian Tujuan Pembangunan Berkelanjutan di Indonesia?                                    | Laporan Pencapaian Indikator Tujuan Pembangunan Berkelanjutan (TPB/SDGs) Indonesia, Edisi 2024                | 0.8   |
    | Jumlah sekolah negeri jenjang SMP di Kota Bandar Lampung adalah 30 sekolah.                         | Laju deforestasi di Provinsi Kalimantan Tengah masih mengkhawatirkan.                                         | 0.0   |
    | Laju pertumbuhan penduduk Kota Makassar, Sulawesi Selatan, periode 2020-2024 adalah 1,5% per tahun. | Populasi di ibukota Provinsi Sulsel, Makassar, bertambah rata-rata 1,5% setiap tahunnya antara 2020 dan 2024. | 1.0   |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
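
CosineSimilarityLoss pushes the cosine similarity of each pair's embeddings toward its gold score, using torch.nn.MSELoss as the regression objective. A minimal fine-tuning sketch under the same dataset-repo assumption as above:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl")
# Assumed repo id; the (sentence1, sentence2, score) columns match the card
train_ds = load_dataset("yahyaabd/bps-sts-dataset-v1", split="train")

# MSE between cosine(embed(sentence1), embed(sentence2)) and the gold score
loss = CosineSimilarityLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_ds, loss=loss)
trainer.train()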
    

Evaluation Dataset

bps-sts-dataset-v1

  • Dataset: bps-sts-dataset-v1 at 5c8f96e
  • Size: 986 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 986 samples:

    |         | sentence1                                         | sentence2                                         | score                         |
    |:--------|:--------------------------------------------------|:--------------------------------------------------|:-------------------------------|
    | type    | string                                            | string                                            | float                         |
    | details | min: 7 tokens, mean: 20.47 tokens, max: 39 tokens | min: 8 tokens, mean: 20.56 tokens, max: 49 tokens | min: 0.0, mean: 0.5, max: 1.0 |
  • Samples:

    | sentence1                                                                         | sentence2                                                                                         | score |
    |:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|------:|
    | Jumlah perpustakaan umum di Indonesia tahun 2022 sebanyak 170.000 unit.           | Minat baca masyarakat Indonesia masih perlu ditingkatkan melalui berbagai program literasi.       | 0.4   |
    | Indeks Kualitas Tutupan Lahan nasional mengalami sedikit perbaikan pada tahun 2024. | Terjadi peningkatan minor pada IKTL secara nasional di tahun 2024.                                | 0.8   |
    | berapa banyak sih anak muda yang nganggur di Jakarta?                             | Tingkat Pengangguran Terbuka (TPT) Penduduk Usia 15-24 Tahun di Provinsi DKI Jakarta, Agustus 2024 | 0.8   |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • label_smoothing_factor: 0.01
  • eval_on_start: True
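
A sketch of how these values map onto SentenceTransformerTrainingArguments (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-5,
    warmup_ratio=0.1,
    fp16=True,
    load_best_model_at_end=True,
    label_smoothing_factor=0.01,
    eval_on_start=True,
)

Everything not listed above keeps its Hugging Face default; the full dump follows.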

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.01
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

| Epoch  | Step | Training Loss | Validation Loss | sts-dev-fold-1_spearman_cosine | sts-dev-fold-2_spearman_cosine | sts-dev-fold-3_spearman_cosine |
|:-------|-----:|--------------:|----------------:|-------------------------------:|-------------------------------:|-------------------------------:|
| 0      | 0    | -             | 0.0527          | 0.7789                         | -                              | -                              |
| 0.1613 | 10   | 0.0538        | 0.0513          | 0.7842                         | -                              | -                              |
| 0.3226 | 20   | 0.0576        | 0.0465          | 0.8011                         | -                              | -                              |
| 0.4839 | 30   | 0.0555        | 0.0422          | 0.8171                         | -                              | -                              |
| 0.6452 | 40   | 0.0511        | 0.0397          | 0.8269                         | -                              | -                              |
| 0.8065 | 50   | 0.0438        | 0.0384          | 0.8332                         | -                              | -                              |
| 0.9677 | 60   | 0.0443        | 0.0374          | 0.8374                         | -                              | -                              |
| 1.1290 | 70   | 0.0393        | 0.0366          | 0.8405                         | -                              | -                              |
| 1.2903 | 80   | 0.0355        | 0.0361          | 0.8430                         | -                              | -                              |
| 1.4516 | 90   | 0.0379        | 0.0357          | 0.8445                         | -                              | -                              |
| 1.6129 | 100  | 0.0338        | 0.0354          | 0.8461                         | -                              | -                              |
| 1.7742 | 110  | 0.0358        | 0.0350          | 0.8473                         | -                              | -                              |
| 1.9355 | 120  | 0.0327        | 0.0348          | 0.8488                         | -                              | -                              |
| 2.0968 | 130  | 0.0303        | 0.0346          | 0.8496                         | -                              | -                              |
| 2.2581 | 140  | 0.0336        | 0.0345          | 0.8502                         | -                              | -                              |
| 2.4194 | 150  | 0.03          | 0.0344          | 0.8510                         | -                              | -                              |
| 2.5806 | 160  | 0.0272        | 0.0342          | 0.8512                         | -                              | -                              |
| 2.7419 | 170  | 0.0331        | 0.0341          | 0.8515                         | -                              | -                              |
| 2.9032 | 180  | 0.03          | 0.0341          | 0.8517                         | -                              | -                              |
| -1     | -1   | -             | -               | 0.8517                         | -                              | -                              |
| 0      | 0    | -             | 0.0270          | -                              | 0.8872                         | -                              |
| 0.1613 | 10   | 0.0327        | 0.0269          | -                              | 0.8876                         | -                              |
| 0.3226 | 20   | 0.0364        | 0.0268          | -                              | 0.8890                         | -                              |
| 0.4839 | 30   | 0.0302        | 0.0268          | -                              | 0.8905                         | -                              |
| 0.6452 | 40   | 0.0318        | 0.0269          | -                              | 0.8911                         | -                              |
| 0.8065 | 50   | 0.0313        | 0.0264          | -                              | 0.8918                         | -                              |
| 0.9677 | 60   | 0.0313        | 0.0261          | -                              | 0.8928                         | -                              |
| 1.1290 | 70   | 0.0285        | 0.0258          | -                              | 0.8931                         | -                              |
| 1.2903 | 80   | 0.0274        | 0.0257          | -                              | 0.8934                         | -                              |
| 1.4516 | 90   | 0.0297        | 0.0258          | -                              | 0.8936                         | -                              |
| 1.6129 | 100  | 0.0235        | 0.0260          | -                              | 0.8939                         | -                              |
| 1.7742 | 110  | 0.0246        | 0.0257          | -                              | 0.8944                         | -                              |
| 1.9355 | 120  | 0.0227        | 0.0255          | -                              | 0.8948                         | -                              |
| 2.0968 | 130  | 0.0211        | 0.0254          | -                              | 0.8951                         | -                              |
| 2.2581 | 140  | 0.0231        | 0.0253          | -                              | 0.8954                         | -                              |
| 2.4194 | 150  | 0.0253        | 0.0252          | -                              | 0.8958                         | -                              |
| 2.5806 | 160  | 0.0218        | 0.0252          | -                              | 0.8959                         | -                              |
| 2.7419 | 170  | 0.025         | 0.0252          | -                              | 0.8960                         | -                              |
| 2.9032 | 180  | 0.0183        | 0.0252          | -                              | 0.8961                         | -                              |
| -1     | -1   | -             | -               | -                              | 0.8960                         | -                              |
| 0      | 0    | -             | 0.0169          | -                              | -                              | 0.9283                         |
| 0.1613 | 10   | 0.0257        | 0.0169          | -                              | -                              | 0.9281                         |
| 0.3226 | 20   | 0.0256        | 0.0170          | -                              | -                              | 0.9273                         |
| 0.4839 | 30   | 0.023         | 0.0171          | -                              | -                              | 0.9271                         |
| 0.6452 | 40   | 0.023         | 0.0173          | -                              | -                              | 0.9270                         |
| 0.8065 | 50   | 0.0284        | 0.0170          | -                              | -                              | 0.9273                         |
| 0.9677 | 60   | 0.0246        | 0.0169          | -                              | -                              | 0.9276                         |
| 1.1290 | 70   | 0.0208        | 0.0168          | -                              | -                              | 0.9281                         |
| 1.2903 | 80   | 0.0228        | 0.0168          | -                              | -                              | 0.9278                         |
| 1.4516 | 90   | 0.0182        | 0.0169          | -                              | -                              | 0.9278                         |
| 1.6129 | 100  | 0.0189        | 0.0169          | -                              | -                              | 0.9279                         |
| 1.7742 | 110  | 0.0179        | 0.0169          | -                              | -                              | 0.9280                         |
| 1.9355 | 120  | 0.0212        | 0.0169          | -                              | -                              | 0.9280                         |
| 2.0968 | 130  | 0.0155        | 0.0168          | -                              | -                              | 0.9280                         |
| 2.2581 | 140  | 0.0202        | 0.0168          | -                              | -                              | 0.9279                         |
| 2.4194 | 150  | 0.0199        | 0.0168          | -                              | -                              | 0.9279                         |
| 2.5806 | 160  | 0.0169        | 0.0168          | -                              | -                              | 0.9280                         |
| 2.7419 | 170  | 0.0162        | 0.0168          | -                              | -                              | 0.9279                         |
| 2.9032 | 180  | 0.0163        | 0.0168          | -                              | -                              | 0.9279                         |
| -1     | -1   | -             | -               | -                              | -                              | 0.9279                         |
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}