SentenceTransformer based on yahyaabd/allstats-search-mini-v1-1-mnrl

This is a sentence-transformers model finetuned from yahyaabd/allstats-search-mini-v1-1-mnrl on the bps-pub-cosine-pairs dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: yahyaabd/allstats-search-mini-v1-1-mnrl
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 118M parameters (F32 safetensors)
  • Training Dataset: bps-pub-cosine-pairs

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
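
The Pooling module mean-pools token embeddings over the attention mask to produce one 384-dimensional vector per input. For illustration, a minimal sketch of the equivalent computation using the underlying BertModel directly via transformers; SentenceTransformer performs this internally, and the snippet assumes the repository exposes transformers-compatible weights, as Sentence Transformers repositories do:

from transformers import AutoModel, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("yahyaabd/allstats-search-mini-v1-1-mnrl-v3")
bert = AutoModel.from_pretrained("yahyaabd/allstats-search-mini-v1-1-mnrl-v3")

inputs = tokenizer(
    ["Statistik penduduk berdasarkan kelompok umur dan jenis kelamin"],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = bert(**inputs).last_hidden_state  # (batch, seq_len, 384)

# Mean pooling: average token vectors, masking out padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 384])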

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl-v3")
# Run inference
sentences = [
    'Statistik penduduk berdasarkan kelompok umur dan jenis kelamin',
    'Direktori Perusahaan Industri Pengolahan Skala Kecil Buku II Hasil Se 2006',
    'Indikator Ekonomi Desember 2004',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
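
Building on the snippet above, a typical semantic-search pattern ranks candidate publication titles against a query by cosine similarity. A small sketch; the corpus titles are illustrative:

# Rank candidate titles against a query by cosine similarity.
query = "Statistik penduduk berdasarkan kelompok umur dan jenis kelamin"
corpus = [
    "Direktori Perusahaan Industri Pengolahan Skala Kecil Buku II Hasil Se 2006",
    "Indikator Ekonomi Desember 2004",
    "Keadaan Angkatan Kerja di Indonesia Februari 2021",
]
query_emb = model.encode([query])
corpus_emb = model.encode(corpus)
scores = model.similarity(query_emb, corpus_emb)[0]  # cosine similarity by default
for title, score in sorted(zip(corpus, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.4f}  {title}")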

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev  sts-test
pearson_cosine   0.9663   0.9698
spearman_cosine  0.8560   0.8591
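
Scores of this kind are produced by the library's EmbeddingSimilarityEvaluator, which correlates the model's cosine similarities with gold scores via Pearson and Spearman. A minimal sketch on placeholder pairs; the actual sts-dev/sts-test splits are not reproduced here:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl-v3")

# Placeholder pairs and scores, standing in for a real held-out split.
queries = ["Sosek Desember 2021"]
titles = ["Laporan Bulanan Data Sosial Ekonomi Desember 2021"]
gold_scores = [0.9]

evaluator = EmbeddingSimilarityEvaluator(queries, titles, gold_scores, name="sts-dev")
results = evaluator(model)  # dict of pearson/spearman correlation metrics
print(results)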

Training Details

Training Dataset

bps-pub-cosine-pairs

  • Dataset: bps-pub-cosine-pairs at 3347a5e
  • Size: 8,126 training samples
  • Columns: query, title, and score
  • Approximate statistics based on the first 1000 samples:

                 query               title               score
        type     string              string              float
        details  min: 4 tokens       min: 5 tokens       min: 0.1
                 mean: 11.04 tokens  mean: 13.02 tokens  mean: 0.55
                 max: 30 tokens      max: 43 tokens      max: 0.9

  • Samples:

        query: Nilai Tukar Nelayan
        title: Statistik Hotel dan Akomodasi Lainnya di Indonesia 2013
        score: 0.1

        query: Berapa angka statistik pertambangan non migas Indonesia periode 2012?
        title: Statistik Pertambangan Non Minyak dan Gas Bumi 2011-2015
        score: 0.9

        query: Bagaimana situasi angkatan kerja Indonesia di bulan Februari 2021?
        title: Keadaan Angkatan Kerja di Indonesia Februari 2021
        score: 0.9
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    
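The CosineSimilarityLoss above, with an MSELoss loss_fct, fits the cosine similarity of each (query, title) embedding pair to the gold score. A minimal sketch of the objective on toy tensors:

import torch
import torch.nn.functional as F

# Toy embeddings standing in for encoded (query, title) pairs, plus gold scores.
query_emb = torch.randn(4, 384)
title_emb = torch.randn(4, 384)
gold = torch.tensor([0.1, 0.9, 0.9, 0.55])

# CosineSimilarityLoss: MSE between predicted cosine similarity and gold score.
pred = F.cosine_similarity(query_emb, title_emb, dim=1)
loss = F.mse_loss(pred, gold)
print(loss.item())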

Evaluation Dataset

bps-pub-cosine-pairs

  • Dataset: bps-pub-cosine-pairs at 3347a5e
  • Size: 1,019 evaluation samples
  • Columns: query, title, and score
  • Approximate statistics based on the first 1000 samples:

                 query               title               score
        type     string              string              float
        details  min: 4 tokens       min: 5 tokens       min: 0.1
                 mean: 11.19 tokens  mean: 13.24 tokens  mean: 0.56
                 max: 31 tokens      max: 44 tokens      max: 0.9

  • Samples:

        query: Sosek Desember 2021
        title: Laporan Bulanan Data Sosial Ekonomi Desember 2021
        score: 0.9

        query: Ekspor Indonesia menurut SITC 2019-2020
        title: Statistik Perdagangan Luar Negeri Indonesia Ekspor Menurut Kode SITC, 2019-2020
        score: 0.9

        query: Pengeluaran konsumsi penduduk Indonesia Maret 2018
        title: Pengeluaran untuk Konsumsi Penduduk Indonesia, Maret 2018
        score: 0.9
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • label_smoothing_factor: 0.01
  • eval_on_start: True
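
A sketch of how these non-default values map onto a sentence-transformers training run. The dataset repository id, split names, and output path are assumptions (the card only names "bps-pub-cosine-pairs at 3347a5e"); the trainer pairs the first two dataset columns as inputs and uses score as the label for CosineSimilarityLoss:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-1-mnrl")

# Hypothetical dataset id; only the name and revision appear on the card.
dataset = load_dataset("yahyaabd/bps-pub-cosine-pairs", revision="3347a5e")

args = SentenceTransformerTrainingArguments(
    output_dir="allstats-search-mini-v1-1-mnrl-v3",  # assumed output path
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-5,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    load_best_model_at_end=True,
    label_smoothing_factor=0.01,
    eval_on_start=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],  # split names are assumptions
    loss=CosineSimilarityLoss(model),
)
trainer.train()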

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.01
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
0 0 - 0.0372 0.8428 -
0.0394 10 0.0437 0.0367 0.8430 -
0.0787 20 0.0382 0.0351 0.8436 -
0.1181 30 0.0392 0.0327 0.8447 -
0.1575 40 0.0343 0.0304 0.8460 -
0.1969 50 0.0286 0.0287 0.8469 -
0.2362 60 0.0289 0.0271 0.8480 -
0.2756 70 0.0272 0.0257 0.8492 -
0.3150 80 0.0289 0.0243 0.8501 -
0.3543 90 0.0232 0.0228 0.8509 -
0.3937 100 0.0251 0.0216 0.8515 -
0.4331 110 0.0202 0.0205 0.8520 -
0.4724 120 0.0229 0.0198 0.8525 -
0.5118 130 0.0195 0.0191 0.8531 -
0.5512 140 0.0191 0.0185 0.8533 -
0.5906 150 0.0238 0.0179 0.8536 -
0.6299 160 0.0193 0.0175 0.8538 -
0.6693 170 0.0174 0.0171 0.8540 -
0.7087 180 0.0189 0.0169 0.8541 -
0.7480 190 0.0192 0.0167 0.8542 -
0.7874 200 0.0161 0.0164 0.8543 -
0.8268 210 0.0173 0.0160 0.8545 -
0.8661 220 0.0143 0.0156 0.8547 -
0.9055 230 0.0119 0.0155 0.8547 -
0.9449 240 0.0183 0.0154 0.8548 -
0.9843 250 0.0149 0.0152 0.8548 -
1.0236 260 0.0157 0.0147 0.8550 -
1.0630 270 0.0141 0.0146 0.8550 -
1.1024 280 0.0127 0.0146 0.8550 -
1.1417 290 0.0163 0.0144 0.8550 -
1.1811 300 0.012 0.0142 0.8550 -
1.2205 310 0.0138 0.0140 0.8551 -
1.2598 320 0.0112 0.0139 0.8551 -
1.2992 330 0.0119 0.0136 0.8552 -
1.3386 340 0.0115 0.0133 0.8553 -
1.3780 350 0.0109 0.0131 0.8553 -
1.4173 360 0.0157 0.0129 0.8553 -
1.4567 370 0.0119 0.0129 0.8553 -
1.4961 380 0.0129 0.0129 0.8553 -
1.5354 390 0.0094 0.0127 0.8554 -
1.5748 400 0.0142 0.0127 0.8554 -
1.6142 410 0.0115 0.0125 0.8555 -
1.6535 420 0.0135 0.0123 0.8555 -
1.6929 430 0.01 0.0122 0.8556 -
1.7323 440 0.0109 0.0121 0.8556 -
1.7717 450 0.0148 0.0119 0.8557 -
1.8110 460 0.0126 0.0117 0.8558 -
1.8504 470 0.0104 0.0116 0.8558 -
1.8898 480 0.0095 0.0116 0.8559 -
1.9291 490 0.0098 0.0115 0.8558 -
1.9685 500 0.0118 0.0115 0.8558 -
2.0079 510 0.0092 0.0114 0.8558 -
2.0472 520 0.0113 0.0114 0.8558 -
2.0866 530 0.0103 0.0113 0.8558 -
2.1260 540 0.0107 0.0112 0.8558 -
2.1654 550 0.009 0.0111 0.8558 -
2.2047 560 0.0095 0.0110 0.8559 -
2.2441 570 0.0091 0.0110 0.8559 -
2.2835 580 0.008 0.0110 0.8559 -
2.3228 590 0.0108 0.0109 0.8559 -
2.3622 600 0.008 0.0110 0.8559 -
2.4016 610 0.008 0.0109 0.8559 -
2.4409 620 0.0082 0.0109 0.8560 -
2.4803 630 0.0084 0.0108 0.8560 -
2.5197 640 0.0076 0.0108 0.8560 -
2.5591 650 0.01 0.0107 0.8560 -
2.5984 660 0.0101 0.0107 0.8560 -
2.6378 670 0.0089 0.0107 0.8560 -
2.6772 680 0.01 0.0107 0.8560 -
2.7165 690 0.0097 0.0106 0.8560 -
2.7559 700 0.0092 0.0106 0.8560 -
2.7953 710 0.0085 0.0106 0.8560 -
2.8346 720 0.0119 0.0106 0.8560 -
2.8740 730 0.0096 0.0106 0.8560 -
2.9134 740 0.008 0.0106 0.8560 -
2.9528 750 0.0078 0.0106 0.8560 -
2.9921 760 0.0093 0.0106 0.8560 -
-1 -1 - - - 0.8591
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.4.0
  • Transformers: 4.48.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
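
To reproduce this environment, the listed versions can be pinned at install time; note the card's +cu124 PyTorch build may additionally require the PyTorch CUDA wheel index:

pip install sentence-transformers==3.4.0 transformers==4.48.1 torch==2.5.1 accelerate==1.3.0 datasets==3.2.0 tokenizers==0.21.0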

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}