Skill Assignment SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences and paragraphs to a 1024-dimensional dense vector space and has been fine-tuned to match essay texts with relevant skills for pedagogical evaluation.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-large-en-v1.5
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: 11,779 triplets of (anchor, positive, negative), i.e. (essay text, relevant skill, irrelevant skill)
- Training Loss: Triplet loss
- Final evaluation: 100% accuracy with the TripletEvaluator (margin 0) on 619 validation triplets; a sketch of this evaluation follows the list.
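The snippet below is a minimal sketch of how accuracy under sentence-transformers' TripletEvaluator is computed; the three example strings are invented placeholders, not items from the actual validation set. By default the evaluator counts a triplet as correct when the anchor embedding is closer to the positive than to the negative, which corresponds to a margin of 0.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("dpanea/skill-assignment-transformer")

# Placeholder validation triplets: (essay text, relevant skill, irrelevant skill)
anchors = ["My Holiday\nLast summer we went to the beach and it was realy fun..."]
positives = ["Spell Words: I can spell commonly used words accurately."]
negatives = ["Dialogue Tagging Skills: I can use dialogue tags successfully."]

# Accuracy = fraction of triplets where the anchor is closer to the positive
evaluator = TripletEvaluator(anchors=anchors, positives=positives, negatives=negatives)
print(evaluator(model))
```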
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load the model and run inference to find matching skills for a given essay. The essay should be plain text, and each skill should ideally have the form "Short skill name: detailed skill description".
```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Download from the 🤗 Hub
model = SentenceTransformer("dpanea/skill-assignment-transformer")

# Prepare data
essay_text = ['Fighter Jet\nGreetings my fellow friends. I am going to talk about my greatest passion fighter jets...']
skills = [
    'Noun Consistency Skills: I can use nouns, pronouns, plurals and tenses accurately and consistently throughout.',
    'Adventurous Vocabulary Skills: I can select from a range of known adventurous vocabulary. (tier 2 and tier 3 words).',
    'Descriptive Language Skills: I can use appropriate, interesting and varied word choice (adjectives, adverbs and descriptive phrases).',
    'Dialogue Tagging Skills: I can use dialogue tags successfully (eg correct positioning, new line for new speaker).',
    'Spell Words: I can spell commonly used words accurately.',
    # ...
]

# Get embeddings
essay_embedding = model.encode(essay_text)
skill_embeddings = model.encode(skills)

# Get the k most relevant skills for the given essay
k = 3  # number of skills to return
similarities = cos_sim(essay_embedding, skill_embeddings).flatten().numpy()
top_indices = np.argsort(similarities)[-k:][::-1]
top_skills = [skills[i] for i in top_indices]
print(top_skills)
```
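Depending on the application, a fixed k may be less useful than a similarity cutoff. Continuing from the snippet above, a minimal sketch; the 0.5 threshold is an illustrative assumption, not a tuned value:

```python
# Keep every skill whose cosine similarity clears a cutoff (illustrative value;
# tune on held-out data for real use)
threshold = 0.5
selected_skills = [skill for skill, sim in zip(skills, similarities) if sim >= threshold]
```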
Training Details
Training Dataset
Unnamed Dataset
- Size: 11,779 training samples
- Columns: Essay text, Relevant skill, and Irrelevant skill
- Approximate statistics based on the first 1000 samples:

| | Essay text | Relevant skill | Irrelevant skill |
|---|---|---|---|
| type | string | string | string |
| min | 124 tokens | 7 tokens | 6 tokens |
| mean | 615.96 tokens | 19.72 tokens | 19.55 tokens |
| max | 1566 tokens | 69 tokens | 53 tokens |
- Samples:

Sample 1
- Essay text:
  2024 POETRY FEATURE ARTICLE – SCAFFOLD - blank
  Name:
  Song Chosen: SET IT ALL FREE
  Poem Chosen: STILL, I RISE
  Common theme: These form together to give the message of overcoming challenges and rising above difficulties with confidence and strength.
  THIS Scaffold could be submitted as your draft.
  HEADLINE: It needs to be strong, catchy and stimulate the reader. Try for ‘ear appeal’ or ‘brain appeal’ if you can. Possibly use alliteration or a pun. Just use the title of your poem until you can think of a title for the article. FOCUS BLUB: A brief, gripping sentence or two that lets readers know more specifically what the article is about. It gives a sense of the style of your piece. / Voiceworks - Whispers Of Wisdom Discover the themes of resilience and empowerment in Scarlett Johanssons “set it all Free” and mya Angelou’s “still I rise” I will explore how these works help us to overcome adversity and embrace our true strength...
- Relevant skill: Emotionally Engaging Language: I can evoke an emotional response through emotive language.
- Irrelevant skill: Reference Formatting Skills: Formats the reference list/bibliography correctly.

Sample 2
- Essay text:
  Why is there no fuel for the next 500 kilometers? We need fuel and there is no way to turn back.This is such a bad time.We need fuel and i am gonna rage quit and drive us off the bridge if we can't get fuel any time soon pull over it's my turn, to drive you have been driving for the last hour and i want t go speeding, down this hill and get to the fuel station quicker, you drive way to slow and it is annoying me.Ok fine i'm pulling over.Finally ok i see that red car coming ,he wants to race and im racing him.ya i beat him but now we only have enough fuel for the next 200 km and the next fuel station is 250 km away i will drive until we run out of fuel then we will have to push and i'm paying for the fuel don't even think about paying for the fuel little brother.Ok time to push.No i am not pushing the car and you can not make me just because u are 1 year older than me does no mean can boss me around.Fine i will push lazy boy.What Why is the gas station shut down and the next one is 300k...
- Relevant skill: Essay Organization Skills: Essay Writing
- Irrelevant skill: Case Evaluation Skills: Does the student include discerning evaluation of ideas to support their case for positive change?

Sample 3
- Essay text:
  What is the artefact? the artefact is a gold armband. What are the features of the artefacts? the features on the arte fact it's a gold amband it looks like it beendigging to look like a snake rap around ur arm. you can see the snake scale's and and snake head on the amberd. Question 2 What aspect of Ancient Roman society does this artefact represent? the artefacts represent partion partion partian partian head tate were the richest people in human society it tells us that partions were the richest people in Aome Home society. patients were on of social What does the artefact tell us about Ancient Roman society? pyramid. they had all theexpertsn suf and they had Slaves How does this artefact give us an understanding about Ancient Roman society? the artefact gives us a understanding their were rich people and Cparthers) they had a late more money then all the others people in home society. 7
- Relevant skill: Spelling Visuals: Spelling visual - 4
- Irrelevant skill: Event Setting Visualization Skills: I can use technical vocabulary, contemporary language and images to create a sense of the event and the setting
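For training, triplets like the samples above can be assembled into a datasets.Dataset. A minimal sketch, assuming the (anchor, positive, negative) column convention that sentence-transformers' triplet losses expect; the strings are placeholders:

```python
from datasets import Dataset

# Placeholder rows: (essay text, relevant skill, irrelevant skill)
train_dataset = Dataset.from_dict({
    "anchor": ["Why is there no fuel for the next 500 kilometers? ..."],
    "positive": ["Essay Organization Skills: Essay Writing"],
    "negative": ["Case Evaluation Skills: Does the student include discerning evaluation of ideas?"],
})
```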
- Loss: TripletLoss with these parameters (a configuration sketch follows): { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
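With the Euclidean metric and margin 5, the loss per triplet is max(||a − p|| − ||a − n|| + 5, 0). Below is a minimal sketch of configuring this loss in sentence-transformers, assuming training starts from the raw base model (which needs trust_remote_code=True for its custom architecture):

```python
from sentence_transformers import SentenceTransformer, losses

# The base model ships custom modeling code, hence trust_remote_code=True
model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

# Penalize a triplet unless the positive is at least `triplet_margin` closer
# than the negative: loss = max(||a - p|| - ||a - n|| + margin, 0)
train_loss = losses.TripletLoss(
    model=model,
    distance_metric=losses.TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)
```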
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- multi_dataset_batch_sampler: round_robin
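A minimal sketch of expressing these overrides with SentenceTransformerTrainingArguments; output_dir and eval_steps are illustrative assumptions (the training logs below suggest evaluation every 100 steps):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="skill-assignment-transformer",  # assumed name
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    eval_strategy="steps",
    eval_steps=100,  # illustrative
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)
```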
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0340 | 100 | - |
0.0679 | 200 | - |
0.1019 | 300 | - |
0.1358 | 400 | - |
0.1698 | 500 | 1.7346 |
0.2037 | 600 | - |
0.2377 | 700 | - |
0.2716 | 800 | - |
0.3056 | 900 | - |
0.3396 | 1000 | 0.8428 |
0.3735 | 1100 | - |
0.4075 | 1200 | - |
0.4414 | 1300 | - |
0.4754 | 1400 | - |
0.5093 | 1500 | 0.4421 |
0.5433 | 1600 | - |
0.5772 | 1700 | - |
0.6112 | 1800 | - |
0.6452 | 1900 | - |
0.6791 | 2000 | 0.3366 |
0.7131 | 2100 | - |
0.7470 | 2200 | - |
0.7810 | 2300 | - |
0.8149 | 2400 | - |
0.8489 | 2500 | 0.2568 |
0.8829 | 2600 | - |
0.9168 | 2700 | - |
0.9508 | 2800 | - |
0.9847 | 2900 | - |
1.0 | 2945 | - |
1.0187 | 3000 | 0.1666 |
1.0526 | 3100 | - |
1.0866 | 3200 | - |
1.1205 | 3300 | - |
1.1545 | 3400 | - |
1.1885 | 3500 | 0.1027 |
1.2224 | 3600 | - |
1.2564 | 3700 | - |
1.2903 | 3800 | - |
1.3243 | 3900 | - |
1.3582 | 4000 | 0.0657 |
1.3922 | 4100 | - |
1.4261 | 4200 | - |
1.4601 | 4300 | - |
1.4941 | 4400 | - |
1.5280 | 4500 | 0.0788 |
1.5620 | 4600 | - |
1.5959 | 4700 | - |
1.6299 | 4800 | - |
1.6638 | 4900 | - |
1.6978 | 5000 | 0.0648 |
1.7317 | 5100 | - |
1.7657 | 5200 | - |
1.7997 | 5300 | - |
1.8336 | 5400 | - |
1.8676 | 5500 | 0.0413 |
1.9015 | 5600 | - |
1.9355 | 5700 | - |
1.9694 | 5800 | - |
2.0 | 5890 | - |
2.0034 | 5900 | - |
2.0374 | 6000 | 0.0293 |
2.0713 | 6100 | - |
2.1053 | 6200 | - |
2.1392 | 6300 | - |
2.1732 | 6400 | - |
2.2071 | 6500 | 0.0158 |
2.2411 | 6600 | - |
2.2750 | 6700 | - |
2.3090 | 6800 | - |
2.3430 | 6900 | - |
2.3769 | 7000 | 0.0183 |
2.4109 | 7100 | - |
2.4448 | 7200 | - |
2.4788 | 7300 | - |
2.5127 | 7400 | - |
2.5467 | 7500 | 0.0079 |
2.5806 | 7600 | - |
2.6146 | 7700 | - |
2.6486 | 7800 | - |
2.6825 | 7900 | - |
2.7165 | 8000 | 0.007 |
2.7504 | 8100 | - |
2.7844 | 8200 | - |
2.8183 | 8300 | - |
2.8523 | 8400 | - |
2.8862 | 8500 | 0.0057 |
2.9202 | 8600 | - |
2.9542 | 8700 | - |
2.9881 | 8800 | - |
3.0 | 8835 | - |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 4.1.0
- Transformers: 4.53.0
- PyTorch: 2.1.0+cu118
- Accelerate: 1.8.1
- Datasets: 3.6.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}