matrix-game-embeddings-ft-v1 / README.md

CalebMaresca

metadata

tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:370
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: ' What is the proposed alternative to imposing random events purely by chance on players?'
    sentences:
      - >-
        half a day). 

        • They are perceived to be new and innovative (despite being around
        since 1987). 

        • They are easy to transport, requiring only pen and paper – with
        perhaps a few maps and 

        counters. 

        • They work well in multi-domain, multi-agency contexts allowing all
        Actors to participate 

        equally. 

        A few Words of Warning 

        • The fact that a Matrix Game requires little infrastructure can be a
        problem – it just 

        doesn't look sexy and the strengths that it can be done quickly with the
        minimum of 

        fuss, can be reduced by efforts to make it look cool/expensive. 

        • The non-quantitative nature of the game can frustrate analysts. 

        • Matrix Games require an experience facilitator to run them.
      - >-
        inform the other players of their stated intentions. In many cases these
        are not really 

        "arguments" as part of the game, so shouldn't count as their action for
        the turn, unless they 

        wish to specify a measurable effect (such as increasing their approval
        ratings). 

        Trade Agreements 

        In some games, trade forms a very important part of the game narrative.
        In most cases this 

        can be treated simply as part of the normal ebb and flow of the argument
        process. 

        However, in some circumstances, particularly when timescales are long,
        trade can require 

        greater attention as to the nuances of the economic benefits and
        impacts. In these cases, it 

        may be necessary to get the two sides to make additional arguments as to
        what they expect
      - >-
        possible throughout the game, having “random events” happen completely
        at random is 

        problematic. An Actor may be disadvantaged purely by chance, more than
        once during the 

        game, which can reduce their immersion and engagement. The narrative
        develops during 

        the game based on the decisions of the players and their reactions to
        the decisions of other 

        players. Having random events imposed on them by chance breaks this
        “cause and effect” 

        cycle and degrades the game flow. 

        The alternative is to give the random event to the participants. They
        will then make a 

        decision as to how this can contribute to the narrative being developed
        by the players. They
  - source_sentence: ' What is the primary purpose of the game described in the context?'
    sentences:
      - "If you are using voting systems, either as Diceless Adjudication or as Estimative Probability, \nyou should take great care to ensure that the players are being as professional as possible, \nand not merely \"voting for themselves\" in a competitive manner. Many players can be quite \nvery competitive, so it may be necessary to not allow them to vote on their argument – and \nequally it may be necessary to keep an eye on players who are in direct competition. The \nintention is to develop a narrative, generating insights – rather than trying to win at all \ncosts. \n \n7 An example is https://www.turningtechnologies.eu/turningpoint/ \n8 An example is https://www.polleverywhere.com/\n\fVersion 15 \nPage 14 of 52  \n© Tom Mouat 2019, 2020, 2022, 2023"
      - >-
        spend the time piling markers on counters. Tracks can be generic (in
        that they simply record 

        the number of plusses or minuses applied) or they might have specific
        "trigger levels" (in 

        that when the morale of the infantry is reduced to -3, the "raw" units
        will desert and return 

        to their homes. 

        It can also be useful to have a "Press" actor whose job it is to record
        the results of 

        arguments (both visible to the public and those not), as well as putting
        the "Press spin" on 

        the events. This role can be useful in looking after the "Consequence
        Management" 

        elements mentioned earlier. 

        The Components (and Characters) Affect the Game 

        When participants are thinking on their feet, what they can see will
        affect what they argue
      - >-
        materials, a short game, and small numbers of participants. If they want
        to conduct a "deep 

        dive", this isn't the appropriate game - the purpose is to identify the
        insights – so make a 

        note and move on. The "deep dive" should follow later or in a different
        type of game. You 

        should, therefore, make sure you include this point in your introductory
        briefing so that the 

        players are clear from the outset. 

        When dealing with dominant people, who continually interrupt and
        dominate the 

        Arguments, you need to take a harder line. You should interrupt them
        when they interrupt 

        another player making a point. Point out to them that they had their
        chance. This isn't a
  - source_sentence: ' Why should Big Projects or Long-Term Plans require no more than three successful arguments in the game?'
    sentences:
      - "much on this single thing.  \nThis does not mean that arguments have to only be about things that can happen within \nthe turn length of the game. It is possible to make \"long term\" arguments like anything else. \nIf, in a Baltic game with week-long turns, you want to argue that an electricity cable \nbetween Sweden and Lithuania is to be built with the aim of reducing Lithuania's \ndependence on Russian energy, this would be judged as normal. It just would not come to \n \n9 I am indebted to Prof Rex Brynen for this suggestion.\n\fVersion 15 \nPage 23 of 52  \n© Tom Mouat 2019, 2020, 2022, 2023 \nfruition in the length of the game – but, assuming the argument was successful, it would"
      - "games.\n\fVersion 15 \nPage 36 of 52  \n© Tom Mouat 2019, 2020, 2022, 2023 \nWhy I like Matrix Games \n• Designing a Matrix Game can be done quickly with the minimum of fuss. \n• Participating in a Matrix Game does not require an understanding of complex and \nunfamiliar rules. \n• Matrix games can cover a wide variety of possible scenarios, including conceptual \nconflicts like Cyber. \n• They are especially good in the non-kinetic, effects based, domain. \n• Matrix games deal with qualitative outputs so are especially useful for non-analysts. \n• The games work best with small groups, increasing immersion and buy-in to the game. \n• Matrix games are extremely inexpensive (and they work best with short sessions lasting \nhalf a day)."
      - >-
        protection: Its hidden location, its boundary fence, and the security
        guards, all of which 

        must be overcome by successful arguments before the base can be
        penetrated. 

        As a rule of thumb, nothing should have more than 3 levels of protection
        as it will simply 

        take too long and dominate the game to the exclusion of everything
        else. 

        Big Projects or Long-Term Plans 

        Depending on the level of the game, some actions and events represent
        such a large 

        investment in time and effort that they require multiple arguments in
        order to bring them 

        to fruition. As a rule of thumb, a Big Project should also take no more
        than 3 successful 

        arguments (like protected and hidden things above); otherwise, the game
        is focussed too
  - source_sentence: ' Which associations related to wargaming and simulation are mentioned in the context?'
    sentences:
      - >-
        out their objectives and explain why they though they succeeded or
        failed can be most 

        instructive. Also, if you then ask the assembled group "who won?" and
        they all agree, then 

        this can be a very powerful indicator of things that might need to be
        looked at more closely 

        as a result of the game. 

        Finally, the insights from the game can take a little time to come out.
        They might not be 

        immediately obvious, so taking time to consider what happened in the
        game and whether 

        individual events are noteworthy, is very useful. I am continually
        surprised at the predictive 

        power of such a simple game. 
         
         
         
        11 See: Game theory, simulated interaction, and unaided judgement for
        forecasting decisions in conflicts. Kesten C. Green.
      - >-
        gaming vignettes


        job opportunities/positions vacant


        latest links


        methodology


        not-so-serious


        playtesters needed


        reader survey


        request for proposals


        scholarships and fellowships


        simulation and game reports


        simulation and game reviews


        simulation and gaming debacles


        simulation and gaming history


        simulation and gaming ideas


        simulation and gaming journals


        simulation and gaming materials


        simulation and gaming miscellany


        simulation and gaming news


        simulation and gaming publications


        simulation and gaming software


        Archives


        M T W T F S S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
        23 24 25 26 27 28 29 30


        Associations


        Australian Defence Force Wargaming Group


        Connections Netherlands


        Connections North (Canada)
      - >-
        Senior Officers, Dominant People and Contentious Arguments 

        It is not uncommon in a Matrix Game that the participants want to
        "debate" the arguments. 

        To a limited extent this is ok, but as stated elsewhere, the game needs
        to move at a pace, 

        creating an immersive narrative and forcing the players to have to live
        with the 

        consequences of their earlier decisions.  

        It can happen that a Senior Officer, used to "seminar wargames", will
        interrupt when you 

        want to move on and say "wait a minute - this is a really valuable
        debate - let's just dig 

        down..." You should try to point out that this is not that sort of game
        - Matrix Games are to 

        gain an insight and understanding in a specific way. Short notice,
        minimal preparation and
  - source_sentence: ' Why is it important for player roles in a Matrix Game to operate at broadly similar levels?'
    sentences:
      - >-
        The Basic Rule.   The basic rule is as follows: 1 x 6-Sided Dice = 1 x
        Combat Unit The size of that Combat Unit will, of course, vary from game
        to game. In the boarding action it may be as little as 5-10 men; in a
        Map Game, it could be as much as an entire Brigade, or even a Corps.


        The Method.   The dice on the opposing sides are rolled as follows: Roll
        the Dice. Line them up, Highest vs Highest If one side has more dice
        than the other, any dice that are extra, and score less than the lowest
        dice of the side with the fewer dice, are ignored.
      - >-
        Matrix Game Checklist
        ....................................................................................
        38 

        Sample Spendable Bonus Cards
        ......................................................................
        40 

        Sample Random Events
        ...................................................................................
        41 

        Sample Voting Cards for Diceless Adjudication
        ............................................... 43 

        Sample Estimative Probability Cards
        ............................................................... 44 

        Sample Turn Order Cards
        ................................................................................
        45 

        Sample Markers for Matrix Games for Effects and Conventional Forces
        ........ 46
      - >-
        When you are designing a Matrix Game it is worth thinking about the
        level at which the players roles will be operating in the game. In is
        usually better, and produces a more balanced game, when the level on
        which the player roles are operating are broadly similar. It would be
        difficult to get a balanced game if 3 of the players are playing
        Generals in command of vast Armies, and another player is playing a
        simple individual soldier.


        Levels of Protection and Hidden Things.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.9347826086956522
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9347826086956522
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33333333333333337
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.1999999999999999
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999995
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9347826086956522
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.97023760333851
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9601449275362318
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.960144927536232
            name: Cosine Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: Snowflake/snowflake-arctic-embed-l
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
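
Modules (1) and (2) of the architecture amount to CLS-token pooling followed by L2 normalization. The sketch below illustrates that step in plain Python on made-up 2-dimensional token vectors rather than real 1024-dimensional BERT outputs. The function name `cls_pool_and_normalize` and the toy values are illustrative assumptions, not part of the library.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Take the first ([CLS]) token vector and scale it to unit length."""
    cls_vector = token_embeddings[0]
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0:
        return list(cls_vector)
    return [x / norm for x in cls_vector]

# Toy "token embeddings" for one input: the first row plays the [CLS] role.
tokens = [[3.0, 4.0], [10.0, -2.0], [0.5, 0.5]]
sentence_embedding = cls_pool_and_normalize(tokens)
print(sentence_embedding)  # [0.6, 0.8]
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to a plain dot product.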

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("CalebMaresca/matrix-game-embeddings-ft-v1")
# Run inference
sentences = [
    ' Why is it important for player roles in a Matrix Game to operate at broadly similar levels?',
    'When you are designing a Matrix Game it is worth thinking about the level at which the players roles will be operating in the game. In is usually better, and produces a more balanced game, when the level on which the player roles are operating are broadly similar. It would be difficult to get a balanced game if 3 of the players are playing Generals in command of vast Armies, and another player is playing a simple individual soldier.\n\nLevels of Protection and Hidden Things.',
    'Matrix Game Checklist .................................................................................... 38 \nSample Spendable Bonus Cards ...................................................................... 40 \nSample Random Events ................................................................................... 41 \nSample Voting Cards for Diceless Adjudication ............................................... 43 \nSample Estimative Probability Cards ............................................................... 44 \nSample Turn Order Cards ................................................................................ 45 \nSample Markers for Matrix Games for Effects and Conventional Forces ........ 46',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Information Retrieval

Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.9348
cosine_accuracy@3	1.0
cosine_accuracy@5	1.0
cosine_accuracy@10	1.0
cosine_precision@1	0.9348
cosine_precision@3	0.3333
cosine_precision@5	0.2
cosine_precision@10	0.1
cosine_recall@1	0.9348
cosine_recall@3	1.0
cosine_recall@5	1.0
cosine_recall@10	1.0
cosine_ndcg@10	0.9702
cosine_mrr@10	0.9601
cosine_map@100	0.9601
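
The values above are consistent with every query having exactly one relevant passage: precision@k equals 1/k wherever accuracy@k is 1.0, and recall@k matches accuracy@k. The sketch below reproduces the reported numbers from a hypothetical rank list. The evaluation set size is not stated in this card, so the 46 queries and the rank distribution (43 at rank 1, one at rank 2, two at rank 3) are an assumption chosen because it matches the reported metrics, not a confirmed description of the data.

```python
import math

# Hypothetical rank of the single relevant passage for each of 46 queries.
ranks = [1] * 43 + [2] * 1 + [3] * 2

def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant passage per query, a hit contributes 1/k."""
    return sum(r <= k for r in ranks) / (k * len(ranks))

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits inside the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

def ndcg_at_k(ranks, k):
    """NDCG with binary relevance; the ideal DCG is 1 for a single relevant passage."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)

print(round(accuracy_at_k(ranks, 1), 4))   # 0.9348
print(round(precision_at_k(ranks, 3), 4))  # 0.3333
print(round(mrr_at_k(ranks, 10), 4))       # 0.9601
print(round(ndcg_at_k(ranks, 10), 4))      # 0.9702
```

Under the same single-relevant-passage assumption, MAP@100 coincides with MRR, which agrees with the two identical 0.9601 entries in the table.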

Training Details

Training Dataset

Unnamed Dataset

Size: 370 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 370 samples:
	sentence_0	sentence_1
type	string	string
details	min: 11 tokens mean: 20.19 tokens max: 34 tokens	min: 8 tokens mean: 150.83 tokens max: 512 tokens

Samples:

sentence_0	sentence_1
`What distinguishes "established facts" from other types of facts in the game briefings or play?`	Forces soldiers are going to be much more effective in combat than untrained protestors; and "established facts" which are facts that have been specifically mentioned in the game briefings or have become established during play as the result of successful arguments. The latter can be immediately deployed as supporting reasons (Pros and Cons), but the former need to have been argued successfully in order for them to be specifically included. Many inexperienced players will make vast all-encompassing arguments full of assumptions that are not reasonable. For example: It is not a reasonable assumption that unarmed Protestors could fight off trained Police. It is reasonable to assume that the Police are
`Why is it unreasonable to assume that unarmed protestors could fight off trained police according to the context?`	Forces soldiers are going to be much more effective in combat than untrained protestors; and "established facts" which are facts that have been specifically mentioned in the game briefings or have become established during play as the result of successful arguments. The latter can be immediately deployed as supporting reasons (Pros and Cons), but the former need to have been argued successfully in order for them to be specifically included. Many inexperienced players will make vast all-encompassing arguments full of assumptions that are not reasonable. For example: It is not a reasonable assumption that unarmed Protestors could fight off trained Police. It is reasonable to assume that the Police are
`What was the outcome of the initial Russian attack against the German units, and how did it affect the ammunition status of both sides?`	The Russians succeed in pushing back one of the German units and forcing and already depleted unit to use up ammunition, (but are pushed back themselves and 2 units use a lot of ammo (one of which becomes combat ineffective on -3)). Overall, as the success is matched by failure, the line itself holds. The Russians attack again, the next day: Initial Dice Throw: RUSSIAN: 6 5 5 4 2 4 GERMAN: 1 2 4 4 Lined Up and Modified: RUSSIAN: 5 4 3 3 2 1 (two of the Russians = -2) GERMAN: 4 3 3 2 (one of the Germans = -1) Result of Third Day; lose: (one of the Germans = +0) RUSSIAN: 5 4 3 3 GERMAN: 3 lose: 4 3 2

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
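
As a rough illustration of what these parameters mean, the sketch below computes an in-batch-negatives ranking loss at several truncated dimensionalities and sums the results with the given weights. It is a plain-Python approximation on toy 4-dimensional vectors, not the library's implementation. The similarity scale of 20 and the re-normalization after truncation are assumptions about the library's behavior, and all function names are illustrative.

```python
import math

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def mnr_loss(queries, passages, scale=20.0):
    """Cross-entropy over scaled cosine similarities; passage i is the
    positive for query i and every other passage in the batch is a negative.
    Inputs are expected to be unit-length vectors."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * sum(a * b for a, b in zip(q, p)) for p in passages]
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)

def matryoshka_loss(queries, passages, dims, weights):
    """Weighted sum of the base loss over prefixes of each embedding."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        q = [l2_normalize(v[:dim]) for v in queries]
        p = [l2_normalize(v[:dim]) for v in passages]
        total += weight * mnr_loss(q, p)
    return total

# Toy batch of two (query, passage) pairs; real training used dims 768 to 64.
queries = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
passages = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]
dims, weights = [4, 2], [1, 1]

aligned = matryoshka_loss(queries, passages, dims, weights)
shuffled = matryoshka_loss(queries, passages[::-1], dims, weights)
print(aligned < shuffled)  # True
```

Note that the full 1024-dimensional output is not among the listed `matryoshka_dims`. In practice, this training setup is what makes it reasonable to keep only the first 768, 512, 256, 128, or 64 components of an embedding; Sentence Transformers exposes a `truncate_dim` argument on the `SentenceTransformer` constructor for that purpose.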

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 10
per_device_eval_batch_size: 10
num_train_epochs: 10
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 10
per_device_eval_batch_size: 10
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 10
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin

Training Logs

Epoch	Step	cosine_ndcg@10
1.0	37	0.9273
1.3514	50	0.9490
2.0	74	0.9462
2.7027	100	0.9527
3.0	111	0.9527
4.0	148	0.9783
4.0541	150	0.9811
5.0	185	0.9622
5.4054	200	0.9622
6.0	222	0.9702
6.7568	250	0.9622
7.0	259	0.9622
8.0	296	0.9702
8.1081	300	0.9702
9.0	333	0.9702
9.4595	350	0.9702
10.0	370	0.9702
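
The fractional epochs in this log follow from the dataset and batch sizes given earlier: 370 samples at a batch size of 10 give 37 optimizer steps per epoch. The sketch below checks that arithmetic and picks out the best-scoring logged step. It assumes a single device, which the card does not state explicitly; `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` are taken from the hyperparameter list.

```python
import math

train_samples = 370
batch_size = 10
# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(train_samples / batch_size)

# (step, cosine_ndcg@10) pairs copied from the training log.
log = [
    (37, 0.9273), (50, 0.9490), (74, 0.9462), (100, 0.9527),
    (111, 0.9527), (148, 0.9783), (150, 0.9811), (185, 0.9622),
    (200, 0.9622), (222, 0.9702), (250, 0.9622), (259, 0.9622),
    (296, 0.9702), (300, 0.9702), (333, 0.9702), (350, 0.9702),
    (370, 0.9702),
]

# Epoch value for each logged step, rounded as in the table.
epoch_of = {step: round(step / steps_per_epoch, 4) for step, _ in log}

best_step, best_ndcg = max(log, key=lambda row: row[1])
print(steps_per_epoch)       # 37
print(epoch_of[50])          # 1.3514
print(best_step, best_ndcg)  # 150 0.9811
```

The highest logged score occurs at step 150, while the final step scores 0.9702, the same value reported in the Evaluation section. Since `load_best_model_at_end` is `False`, the published weights appear to correspond to the final step rather than the peak.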

Framework Versions

Python: 3.13.2
Sentence Transformers: 4.1.0
Transformers: 4.51.3
PyTorch: 2.7.0+cu126
Accelerate: 1.6.0
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}