SentenceTransformer based on allenai/specter2_aug2023refresh_base

This is a sentence-transformers model finetuned from allenai/specter2_aug2023refresh_base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: allenai/specter2_aug2023refresh_base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
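
The Pooling (mean over tokens) and Normalize modules listed above can be illustrated with a small NumPy sketch. It assumes the token embeddings and attention mask have already been produced by the transformer; the function name mean_pool_and_normalize is hypothetical and is not part of the Sentence Transformers API.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize.

    token_embeddings: float array of shape (batch, seq_len, dim)
    attention_mask:   0/1 array of shape (batch, seq_len); 0 marks padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input: 2 sequences, 3 token positions, 4 dimensions (the model uses 768).
tokens = np.array([
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],
    [[2.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])  # last token of sequence 0 is padding
sentence_vectors = mean_pool_and_normalize(tokens, mask)
```

Because the output vectors have unit length, their dot product equals their cosine similarity, which matches the similarity function listed in the model description.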

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("m7n/discipline-tuned_specter_2_019")
# Run inference
sentences = [
    'Abstract The movement toward open education is requiring educators to expand and update their practice in order to keep up with the new demands being placed on them. This study explored how educators can engage in meaningful learning opportunities, which will facilitate the creation of expertise and knowledge, through the use of open education resources (OER). The article describes the design of the instrument employed to measure workplace learning through OER activity of adult educators ( n = ) and to report its internal reliability and convergent validity. Results indicate engagement with OER promote three levels of learning, each connected to the different types of knowledge educators require to integrate OER into their teaching practice.',
    "Abstract Predictors of academic success at university are of great interest to educators, researchers and policymakers. With more students studying online, it is important to understand whether traditional predictors of academic outcomes in facetoface settings are relevant to online learning. This study modelled selfregulatory and demographic predictors of subject grades in online and facetoface undergraduate students. Predictors were effort regulation, grade goal, academic selfefficacy, performance selfefficacy, age, sex, socioeconomic status (SES) and firstinfamily status. A multigroup path analysis indicated that the models were significantly different across learning modalities. For facetoface students, none of the model variables significantly predicted grades. For online students, only performance selfefficacy significantly predicted grades (small effect). Findings suggest that learner characteristics may not function in the same way across learning modes. Further factor analytic and hierarchical research is needed to determine whether selfregulatory predictors of academic success continue to be relevant to modern student cohorts. Practitioner Notes What is already known about this topic Selfregulatory and demographic variables are important predictors of university outcomes like grades. It is unclear whether the relationships between predictor variables and outcomes are the same across learning modalities, as research findings are mixed. What this paper adds Models predicting university students' grades by demographic and selfregulatory predictors differed significantly between facetoface and online learning modalities. Performance selfefficacy significantly predicted grades for online students. No selfregulatory variables significantly predicted grades for facetoface students, and no demographic variables significantly predicted grades in either cohort. Overall, traditional predictors of grades showed no/small unique effects in both cohorts. Implications for practice and/or policy The learner characteristics that predict success may not be the same across learning modalities. Approaches to enhancing success in facetoface settings are not automatically applicable to online settings. Selfregulatory variables may not predict university outcomes as strongly as previously believed, and more research is needed.",
    'Abstract OVERVIEW: This paper provides an overview of some fundamental aspects of electrochemical oxidation and gives updated information on the application of this technology to wastewater treatment. In recent years, electrochemical oxidation has gained increasing interest due to its outstanding technical characteristics for eliminating a wide variety of pollutants normally present in wastewaters such as refractory organic matter, nitrogen species and microorganisms. IMPACT: The strict disposal limits and health quality standards set by legislation may be met by applying electrochemical oxidation. However, treatment costs have to be cut down before fullscale application of this technology. Deployment of electrochemical oxidation in combination with other technologies and the use of renewable sources to power this process are two steps in this direction. APPLICATIONS: Effluents from landfill and a wide diversity of industrial effluents including the agroindustry, chemical, textile, tannery and food industry, have been effectively treated by this technology. Its high efficiency together with its disinfection capabilities makes electrooxidation a suitable technology for water reuse programs. Copyright ©️ Society of Chemical Industry',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
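
For semantic search, the embeddings can be ranked against a query. The following is a minimal sketch, assuming the vectors are L2-normalized (as the Normalize module in the architecture suggests), so a dot product gives cosine similarity. The helper rank_by_similarity and the toy vectors are illustrative only; in practice the arrays would come from model.encode(...).

```python
import numpy as np

def rank_by_similarity(query_embedding, corpus_embeddings, top_k=3):
    """Return (index, score) pairs for the top_k most similar corpus rows.

    Assumes all embeddings are L2-normalized, so the dot product equals
    the cosine similarity.
    """
    scores = corpus_embeddings @ query_embedding
    order = np.argsort(-scores)[:top_k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

# Stand-in unit vectors in 2 dimensions (the model produces 768 dimensions).
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query = np.array([0.8, 0.6])
ranking = rank_by_similarity(query, corpus, top_k=2)
```

Each entry of ranking pairs a corpus index with its similarity to the query, highest first.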

Evaluation

Metrics

Triplet

  • Datasets: specter_2_ and discipline-tuned_specter_2_019
  • Evaluated with TripletEvaluator
Metric           specter_2_  discipline-tuned_specter_2_019
cosine_accuracy  0.9548      0.974

Triplet

Metric           Value
cosine_accuracy  0.9739
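
As a rough illustration of what cosine_accuracy measures, the sketch below counts the fraction of triplets whose anchor is more similar to the positive than to the negative. This is a simplified re-implementation for explanation, not the TripletEvaluator code itself; the function names and toy vectors are assumptions.

```python
import numpy as np

def cosine_similarity_rows(a, b):
    """Row-wise cosine similarity between two (n, dim) arrays."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def triplet_cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    pos = cosine_similarity_rows(anchors, positives)
    neg = cosine_similarity_rows(anchors, negatives)
    return float(np.mean(pos > neg))

# Four toy triplets; the last one is deliberately ranked the wrong way round.
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
positives = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 0.9], [0.0, 1.0]])
negatives = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, -1.0], [1.0, 0.1]])
accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
```

Three of the four toy triplets are ordered correctly, so accuracy here is 0.75.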

Training Details

Training Dataset

Unnamed Dataset

  • Size: 43,494 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
                    anchor          positive        negative
    type            string          string          string
    details (min)   83 tokens       83 tokens       79 tokens
    details (mean)  232.13 tokens   231.05 tokens   226.87 tokens
    details (max)   512 tokens      512 tokens      512 tokens
  • Samples:
    Sample 1
      anchor: Nowadays consumers are bombarded with different ads and the sheer abundance of advertisements causes marketers to be increasingly concerned with advertising effectiveness. Consequently, marketers and advertising companies exploring advertising effectiveness are always looking for more effective and newer communication media and evaluation methods of advertising effectiveness that technological development could provide. This study aims to incorporate AIDA model as hierarchy effect models for measuring the effectiveness of the TV advertisements for electric conservation in Isfahan city. Specifically this study aimed to evaluate the effects of TV advertisement on audience's attention, interest, desire for action and eventually changes made in conservation behavior of audience. The study revealed that the electric conservation TV advertisements were effective. In fact, TV advertisement was successful in taking attention of audience, creating interest and desire for action, and eventually ...
      positive: E-retailing is entering into the Indian retail scenario in a noticeable way and online grocery retailing holds a promise of acceptance by the Indian customers. This paper attempts to discover the market potential of online grocery retailing in India and consumers' perception towards its different aspects. Confirmatory factor analysis proposes that there are five underlying dimensions (convenience, value for money, variety, loyalty and ambient factors) governing the selection of mode for grocery purchase. Thereafter Binary-Logistic Regression has been employed to analyze the impact of these five broad perceptual dimensions upon the acceptance/rejection of online grocery retailing. The respondents accorded the highest importance to the factors value for money and convenience. The study suggested that issues like meeting customer expectations and preferences in terms of delivering value for money, quick and convenient purchasing, smooth delivery process, and reducing risk perceptions are ...
      negative: Conserved proteins preferentially expressed in synaptic terminals of the nervous system are likely to play a significant role in brain function. We have previously identified and molecularly characterized the Sap00 gene which codes for a novel synapse associated protein of kDa in Drosophila. Sequence comparison identifies homologous proteins in numerous species including C. elegans, fish, mouse and human. First hints as to the function of this novel protein family can be obtained by generating mutants for the Sap00 gene in Drosophila.Attempts to eliminate the Sap00 gene through targeted mutagenesis by homologous recombination were unsuccessful. However, several mutants were generated by transposon remobilization after an appropriate insertion line had become available from the Drosophila P-element screen of the Bellen/Hoskins/Rubin/Spradling labs. Characterization of various deletions in the Sap00 gene due to imprecise excision of the P-element identified three null mutants and three h...
    Sample 2
      anchor: General Agreement on Tariffs and Trade (GATT) was notable in largely excluding agriculture whereas the World Trade Organization (WTO) brought agriculture into the world trade rules. This article aims to evaluate the impacts of trade on agriculture production and productivity, especially the changes between the GATT and WTO periods. Using a panel of countries from , this article derives not only spillover effects that were overlooked, but also provides more accurate productivity than was estimated with bias in literature for both periods. We find that trade hindered agriculture production and productivity in the GATT period but improved agriculture production and productivity in the WTO period.
      positive: Nigeria is arguably the largest importer of dairy products in Africa. Available statistics shows that up to % of the total dairy products consumed in the country are imported; and that about % of the entire dairy market is controlled by FrieslandCampina WAMCO (FCW). The purpose of this study is to examine the basis for the prevailing import orientation in the dairy industry since . Is the orientation traceable to operations of multinational companies or the institutional and governance challenges in the country? Using triangulated data collected from FCW official reports and other relevant sources, and a content analytical technique, the study finds that the problem in the industry is multifaceted. Central to the challenges are persistent institutional and infrastructural defects, as well as faulty integration designs adopted by FCW. Based on this, the paper recommends that reversing the current trend requires government's policies that dis-incentivizes importation. However, such polic...
      negative: Questionnaires were mailed to all persons who had been granted the B.S. degree in psychology at Iowa State University during the period to . After one follow-up letter, ( %) returns had been received, these coming from about twice as many males as fema0es.l Respondents were in different states, the District of Columbia, and Ontario. Iowa accounted for almost one-fourth, with Iowa and bordering states accounting for about one-half. The questionnaire called for information concerning advanced training and education, vocational experiences, extent of use of psychological training, membership in professional organizations, professional and/or trade publications read regularly, and relative value of the psychology courses taken. Additional comments were invited on the psychology program and the total undergraduate curriculum at Iowa State. Forty-five per cent (mostly males) had taken, or were taking, graduate work in psychology. This agrees with Gustav's ( ) findings but differs from Dole's...
    Sample 3
      anchor: As well as being the name of the physical symptom of shivering, shuddering, or goosebumps, the Greek word phrike names an emotion that is particularly associated with automatic responses to sudden visual or auditory stimuli. This makes it especially at home in a number of specialized (ritual and other) scenarios, and helps explain its recurrent role in the ancient Greek aesthetics and literary theory, a role that illustrates the importance of the visual and the physical in ancient theories of audiences' emotional responses to the portrayal of suffering in both dramatic performance and non-dramatic narrative.
      positive: Oedipus Tyrannos (Sophocles, BCE) is examined psychoanalytically considering the elements of the story as unfolding in such a way as to leave what Freud ( ) called, "gaps unfilled and riddles unanswered" in the history, that is to say, devoid of a truly complex portrayal of human motivation and feeling. The quality of innocence that is dependent on simultaneously knowing and not knowing, both external reality and the internal realities that accompany it, is lost during Oedipus' zealous investigation. Elements of the history, as Sophocles presents it, reveal gaps and riddles that become resolved as the play moves inexorably to its tragic conclusion, with the identification of Oedipus as the parricidal polluter. The solicitations of supernatural consultation from the Delphian oracle betoken a disowned knowing of a frightening or a shameful aspect of human nature without actually acknowledging that knowledge. As the investigation proceeds, the play devolves virtually into a play within a ...
      negative: Objcetive To study the advantages of treating large serious bed sores by gentamicin and infrared rays.Methods First clean the surface of the wound of the bed sores and trim the dead tissue and siuns.Then wash it with gentamicin liquid, light it with -watt infrared rays for minutes. Cover it with a wet gentamicin gauze, then bind it up with a vaseline gauze and dressing. Do this once a day.Results The treatment of bed sores by gentamicin and infrared rays could obviously shorten the treating time (P0.00),with high cure rate (P0.00) and low death rate (P0.00).Conclusion The gentamicin and infrared rays to treat bed sores can control local infections,accelerate the reproduction of the tissue and resumption,which is more effctive than the traditional treatment.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.6
    }
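
The loss configuration above (cosine distance, margin 0.6) can be sketched in NumPy as follows. This is an explanatory approximation that assumes cosine distance is defined as 1 minus cosine similarity and that the per-triplet losses are averaged over the batch; the function names and toy vectors are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    """Row-wise cosine distance (1 - cosine similarity) for (n, dim) arrays."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - (a * b).sum(axis=1)

def triplet_loss(anchors, positives, negatives, margin=0.6):
    """Mean of max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    d_pos = cosine_distance(anchors, positives)
    d_neg = cosine_distance(anchors, negatives)
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())

# Triplet 0 is well separated (contributes no loss); triplet 1 is inverted.
anchors = np.array([[1.0, 0.0], [1.0, 0.0]])
positives = np.array([[1.0, 0.0], [0.0, 1.0]])
negatives = np.array([[0.0, 1.0], [1.0, 0.0]])
loss = triplet_loss(anchors, positives, negatives)
```

A triplet stops contributing once the negative is at least 0.6 further from the anchor, in cosine distance, than the positive.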
    

Evaluation Dataset

Unnamed Dataset

  • Size: 2,174 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
                    anchor          positive        negative
    type            string          string          string
    details (min)   88 tokens       87 tokens       77 tokens
    details (mean)  231.0 tokens    225.6 tokens    229.89 tokens
    details (max)   512 tokens      512 tokens      512 tokens
  • Samples:
    Sample 1
      anchor: In response to mounting community concern about the use of workplace agreements to erode employee conditions, the Howard Government has introduced substantial changes to the legal framework. These changes include a new Fairness Test and the rebadging of the two main institutions responsible for supervising agreement-making. This article outlines the main elements of these reforms and considers the extent to which the redesigned 'safety net' will offer protection to employees who enter into federal workplace agreements.
      positive: In Canada, the question of the meaning of the constitutional guarantee of freedom of association has been raised most frequently and persistently in the labour relations context The dual nature of the core labour rights and freedoms of collective organization, collective bargaining, and collective withdrawal of labour as both fundamental human rights and key components of economic policy makes it very difficult for courts to grapple with them in the constitutional context. The Supreme Court of Canada has been badly divided over the scope of collective activities by workers and unions protected by constitutional guarantees of freedom of association over the past years. Precedents have been overturned, and reasons that appeared in dissent subsequently figured in majority decisions. Changes in the composition of the bench as well as the economic and political climate have undermined the achievement of a principled consensus over the constitutional interpretation of section (d) in the cont...
      negative: It is shown that the presence of an arbitrary body buried near a dielectric highly rough random surface produces a remarkable enhanced backscattering peak in the angular distribution of mean scattered intensity. This is in contrast with the distribution that the dielectric rough surface yields in the absence of the body. In order for the peak to appear, the surface must be very rough and the contrast between the dielectric constants of the body and the medium in which it is immersed must be at least. We illustrate the results with a two-dimensional ( -D) calculation of a cylinder in front of a -D rough profile immersed in a dielectric medium. Different cases have been addressed in order to investigate the dependence of the backscattering enhancement on several physical parameters such as the width of the incident beam; the size, position, and optical constant of the buried cylinder; and the surface correlation function, as well as the difficult task of performing averages that resemble...
    Sample 2
      anchor: Issues surrounding white privilege have been in continuous debate. In Japan, the subject of white privilege is also not straightforward. Past research has been conducted about white privileged males in Japanese universities. We decided to take a different standpoint and examine the presence of white privilege in Japan through the alternative voices of non-Japanese Asian female university English teachers. By interviewing and analyzing their experiences and identities, we were able to examine incidences of white privilege that happened and influenced their lives as non-Japanese Asian female English teachers in Japan. We hope that our work generates interest and attention to the current gender and racial imbalance of native-speaker university English teachers in Japanan issue that directly or indirectly relates to all students, teachers, administrators and policy makers.
      positive: The study aims to explore the register variation in Chinese English and language variation between Chinese English and American English. A corpus-based and comparative methodology was used to analyse the discourse features of Chinese English in the use of the lexical items perhaps and maybe. The major findings of the study can be stated as follows: ) the more formal word perhaps is used more frequently than the informal word maybe in all the four genres in Chinese English. This shows that the text of Chinese English is generally in a more formal style. ) In the Chinese English text, the ratios of the standard frequency of perhaps to maybe are greater than those in American English in the all the four genres. This indicates that the text in Chinese English is generally in a more formal style than that in American English. ) In the Chinese English text, the informal word maybe is used less frequently than in the American English text. This is a sign that Chinese English is more formal th...
      negative: To present a case of a -year-old male patient with primary enuresis refractory to conservative treatment.Radiologic and urodynamic tests revealed posterior urethral valves that were treated by transurethral fulguration. The patient was cured of both enuresis and infravesical obstruction and remains disease-free years after the operation with no impact on his sexual function.Posterior urethral valves are very rarely diagnosed in adolescents and adults. Very few cases have been published in the literature. To our knowledge, the case described herein is the first case presenting with persistent primary enuresis.
    Sample 3
      anchor: Abstract Reduced temperature and increased bulk density associated with conservation tillage systems cause lower seed germination, seedling emergence, and early growth rates resulting in reduced plant stands. Prediction of the influence of soil condition on seed imbibition through simple soil measurements would help make agronomic decisions such as planting date and/or density. Our objectives were to evaluate the influence of soil waterfilled pore space on winter wheat ( Triticum aestivum L.) seed imbibition and to assess the possibility of describing the relationship through simple mathematical models. We measured the rate of water uptake by heatkilled wheat seeds at three levels of waterfilled pore space (WFPS: , , and ) and temperature ( T : , , and K) and two levels of bulk density ( b : and Mg m ) in a Sharpsburg silty clay loam topsoil. The model proposed in by Blacklow to estimate seed water content ( s ) after imbibing water for time t , s(t) = ( m + ot ) ( m s( ) ) e qt , was ...
      positive: Abstract Camellia oleifera Abel. is an important woody oil plant that could solve the disparity between the supply and demand of edible oil in China. Although numerous excellent clones have been developed and introduced to enhance production, the interactions between high yield clones and soil ecosystem sustainability remains poorly understood. The highyielding period of a C. oleifera plant is approximately yr; therefore, appropriate clone selection is crucial. We evaluated the differences between four major clones based on soil nutrient and microbial community structure, following cultivation for yr to infer the influence of clone selection on soil sustainable utilization. The results showed significant differences in soil nutrient status and rhizosphere microorganism populations among clones. The bacterial communities in the XL0 plots had the highest species richness. According to the results, clone XL0 was found to be suitable for sustainable cultivation considering the high organic...
      negative: AbstractIn the wake of the present effort to shape American education to produce large numbers of scientists and engineers, education in the arts is becoming an increasingly peripheral interest of the schools. The time that public school pupils have at their disposal is limited, and so is the money available for financing schools. To a public rightly concerned with these limitations, educators in the arts must be able to present convincing arguments to show that the arts are in fact worth teaching. And to this same public, weary and perhaps disillusioned with what it takes to be a chaos of relativistic aesthetic standards, educators must be able to say that taste is not wholly an individual matter, but that some standards do in fact exist and are available for teaching. Without being able to argue in some way for the importance of education in the arts and for the existence of aesthetic standards, educators who propose that the schools teach the arts are in effect asking the public to ...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.6
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_ratio: 0.2
  • fp16: True
  • batch_sampler: no_duplicates
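
To make the learning-rate settings concrete, the sketch below shows how learning_rate 2e-05, warmup_ratio 0.2, and a linear scheduler combine into a per-step learning rate. It is an illustrative re-implementation, not the trainer's own code. It assumes a single device and no gradient accumulation, so one epoch over 43,494 triplets at batch size 12 is 3,625 optimizer steps, with a warmup of about 725 steps. The function name linear_warmup_lr is hypothetical.

```python
import math

def linear_warmup_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.2):
    """Linear ramp from 0 to base_lr during warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: one device and gradient_accumulation_steps = 1, so the step
# count per epoch is the dataset size divided by the batch size, rounded up.
total_steps = math.ceil(43494 / 12)
warmup_steps = math.ceil(total_steps * 0.2)
peak_lr = linear_warmup_lr(warmup_steps, total_steps)
```

Under these assumptions the learning rate reaches its peak of 2e-05 at the end of warmup and falls linearly to zero by the last step of the epoch.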

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss  specter_2__cosine_accuracy  discipline-tuned_specter_2_019_cosine_accuracy
0       0     -              -                0.9548                      -
0.0028  10    0.4852         -                -                           -
0.0138  50    0.4828         0.4496           0.9609                      -
0.0276  100   0.3925         0.3053           0.9667                      -
0.0414  150   0.2343         0.1939           0.9675                      -
0.0552  200   0.1748         0.1473           0.9718                      -
0.0690  250   0.1333         0.1148           0.9726                      -
0.0828  300   0.11           0.1008           0.9703                      -
0.0966  350   0.1109         0.0944           0.9692                      -
0.1103  400   0.0946         0.0918           0.9725                      -
0.1241  450   0.1052         0.0897           0.9707                      -
0.1379  500   0.0942         0.0836           0.9721                      -
0.1517  550   0.0813         0.0870           0.9654                      -
0.1655  600   0.091          0.0901           0.9691                      -
0.1793  650   0.0942         0.0874           0.9690                      -
0.1931  700   0.0825         0.0958           0.9617                      -
0.2069  750   0.0971         0.0850           0.9733                      -
0.2207  800   0.0872         0.0806           0.9702                      -
0.2345  850   0.0801         0.0824           0.9682                      -
0.2483  900   0.0851         0.0809           0.9695                      -
0.2621  950   0.0914         0.0790           0.9708                      -
0.2759  1000  0.0847         0.0799           0.9720                      -
0.2897  1050  0.0895         0.0754           0.9717                      -
0.3034  1100  0.0756         0.0802           0.9706                      -
0.3172  1150  0.0814         0.0786           0.9694                      -
0.3310  1200  0.0997         0.0744           0.9734                      -
0.3448  1250  0.0943         0.0762           0.9730                      -
0.3586  1300  0.0805         0.0782           0.9718                      -
0.3724  1350  0.079          0.0748           0.9732                      -
0.3862  1400  0.0818         0.0755           0.9737                      -
0.4     1450  0.0671         0.0729           0.9734                      -
0.4138  1500  0.0567         0.0737           0.9720                      -
0.4276  1550  0.0747         0.0746           0.9726                      -
0.4414  1600  0.0793         0.0735           0.9717                      -
0.4552  1650  0.0812         0.0762           0.9739                      -
0.4690  1700  0.075          -                -                           0.9740

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.49.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}