metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:46338
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-m-v2.0
widget:
- source_sentence: >-
What role does ESMA play in the development of guidelines and regulatory
technical standards related to cooperation arrangements with third
countries as mentioned in the text?
sentences:
- >-
If a planned change is implemented notwithstanding the first and second
subparagraphs, or if an unplanned change has taken place pursuant to
which the AIFM’s management of the AIF would no longer comply with this
Directive or the AIFM otherwise would no longer comply with this
Directive, the competent authorities of the home Member State of the
AIFM shall take all due measures in accordance with Article 46,
including, if necessary, the express prohibition of marketing of the
AIF.
If the changes are acceptable because they do not affect the compliance
of the AIFM’s management of the AIF with this Directive, or the
compliance by the AIFM with this Directive otherwise, the competent
authorities of the home Member State of the AIFM shall, without delay,
inform ESMA in so far as the changes concern the termination of the
marketing of certain AIFs or additional AIFs marketed and, if
applicable, the competent authorities of the host Member States of the
AIFM of those changes.
11.
The Commission shall adopt, by means of delegated acts in accordance
with Article 56 and subject to the conditions of Articles 57 and 58,
measures regarding the cooperation arrangements referred to in point (a)
of paragraph 2 in order to design a common framework to facilitate the
establishment of those cooperation arrangements with third countries.
12.
In order to ensure uniform application of this Article, ESMA may develop
guidelines to determine the conditions of application of the measures
adopted by the Commission regarding the cooperation arrangements
referred to in point (a) of paragraph 2.
13.
ESMA shall develop draft regulatory technical standards to determine the
minimum content of the cooperation arrangements referred to in point (a)
of paragraph 2 so as to ensure that both the competent authorities of
the home and the host Member States receive sufficient information in
order to be able to exercise their supervisory and investigatory powers
under this Directive.
Power is delegated to the Commission to adopt the regulatory technical
standards referred to in the first subparagraph in accordance with
Article 10 to 14 of Regulation (EU) No 1095/2010.
14.
- >-
(23) This Regulation should also apply to Union institutions, bodies,
offices and agencies when acting as a provider or deployer of an AI
system.
- >-
An operator that is a natural person or a microenterprise may mandate
the next operator or trader further down the supply chain that is not a
natural person or a microenterprise to act as an authorised
representative. Such next operator or trader further down the supply
chain shall not place or make available relevant products on the market
or export them without submitting the due diligence statement pursuant
to Article 4(2) on behalf of that operator. In such cases, the operator
that is a natural person or a microenterprise shall retain
responsibility for compliance of the relevant product with Article 3,
and shall communicate to that next operator or trader further down the
supply chain all information necessary to confirm that due
- source_sentence: >-
A review is scheduled for June 2019 to determine if the regulations
regarding hazardous substances should be broadened, based on practical
experiences. Additionally, the Commission aims to promote alternatives to
animal testing by reassessing testing requirements, potentially leading to
amendments that prioritize health and environmental safety.
sentences:
- >-
18 June 1994, until such plant and machinery is disposed of; (b) in the
case of the maintenance of plant and machinery already in service within
a Member State on 18 June 1994. For the purposes of point (a) Member
States may, on grounds of human health protection and environmental
protection, prohibit within their territory the use of such plant or
machinery before it is disposed of. 25. Monomethyl-dichloro-diphenyl
methane Trade name: Ugilec 121 Ugilec 21 Shall not be placed on the
market, or used, as a substance or in mixtures. Articles containing the
substance shall not be placed on the market. 26.
Monomethyl-dibromo-diphenyl methane bromobenzylbromotoluene, mixture of
isomers Trade name: DBBT CAS No 99688-47-8 Shall not be placed on
- >-
(35) | The fight against litter is a shared effort between competent
authorities, producers and consumers. Public authorities, including the
Union institutions, should lead by example.
- >-
7.
By 1 June 2013 the Commission shall carry out a review to assess whether
or not, taking into account latest developments in scientific knowledge,
to extend the scope of Article 60(3) to substances identified under
Article 57(f) as having endocrine disrupting properties. On the basis of
that review the Commission may, if appropriate, present legislative
proposals.
8.
By 1 June 2019, the Commission shall carry out a review to assess
whether or not to extend the scope of Article 33 to cover other
dangerous substances, taking into account the practical experience in
implementing that Article. On the basis of that review, the Commission
may, if appropriate, present legislative proposals to extend that
obligation.
9.
In accordance with the objective of promoting non-animal testing and the
replacement, reduction or refinement of animal testing required under
this Regulation, the Commission shall review the testing requirements of
Section 8.7 of Annex VIII by 1 June 2019. On the basis of this review,
while ensuring a high level of protection of health and the environment,
the Commission may propose an amendment in accordance with the procedure
referred to in Article 133(4).
Article 139
Repeals
Directive 91/155/EEC shall be repealed.
Directives 93/105/EC and 2000/21/EC and Regulations (EEC) No 793/93 and
(EC) No 1488/94 shall be repealed with effect from 1 June 2008.
Directive 93/67/EEC shall be repealed with effect from 1 August 2008.
Directive 76/769/EEC shall be repealed with effect from 1 June 2009.
References to the repealed acts shall be construed as references to this
Regulation.
Article 140
Amendment of Directive 1999/45/EC
Article 14 of Directive 1999/45/EC shall be deleted.
Article 141
Entry into force and application
1.
This Regulation shall enter into force on 1 June 2007.
2.
Titles II, III, V, VI, VII, XI and XII as well as Articles 128 and 136
shall apply from 1 June 2008.
3.
Article 135 shall apply from 1 August 2008.
4.
Title VIII and Annex XVII shall apply from 1 June 2009.
This Regulation shall be binding in its entirety and directly applicable
in all Member States.
LIST OF ANNEXES
ANNEX I GENERAL PROVISIONS FOR ASSESSING SUBSTANCES AND PREPARING
CHEMICAL SAFETY REPORTS ANNEX II REQUIREMENTS FOR THE COMPILATION OF
SAFETY DATA SHEETS ANNEX III CRITERIA FOR SUBSTANCES REGISTERED IN
QUANTITIES BETWEEN 1 AND 10 TONNES ANNEX IV EXEMPTIONS FROM THE
OBLIGATION TO REGISTER IN ACCORDANCE WITH ARTICLE 2(7)(a) ANNEX V
EXEMPTIONS FROM THE OBLIGATION TO REGISTER IN ACCORDANCE WITH ARTICLE
2(7)(b) ANNEX VI INFORMATION REQUIREMENTS REFERRED TO IN ARTICLE 10
ANNEX VII STANDARD INFORMATION REQUIREMENTS FOR SUBSTANCES MANUFACTURED
OR IMPORTED IN QUANTITIES OF ONE TONNE OR MORE ANNEX VIII STANDARD
INFORMATION REQUIREMENTS FOR SUBSTANCES MANUFACTURED OR IMPORTED IN
QUANTITIES OF 10 TONNES OR MORE ANNEX IX STANDARD INFORMATION
REQUIREMENTS FOR SUBSTANCES MANUFACTURED OR IMPORTED IN QUANTITIES OF
100 TONNES OR MORE ANNEX X STANDARD INFORMATION REQUIREMENTS FOR
SUBSTANCES MANUFACTURED OR IMPORTED IN QUANTITIES OF 1 000 TONNES OR
MORE ANNEX XI GENERAL RULES FOR ADAPTATION OF THE STANDARD TESTING
REGIME SET OUT IN ANNEXES VII TO X ANNEX XII GENERAL PROVISIONS FOR
DOWNSTREAM USERS TO ASSESS SUBSTANCES AND PREPARE CHEMICAL SAFETY
REPORTS ANNEX XIII CRITERIA FOR THE IDENTIFICATION OF PERSISTENT,
BIOACCUMULATIVE AND TOXIC SUBSTANCES, AND VERY PERSISTENT AND VERY
BIOACCUMULATIVE SUBSTANCES ANNEX XIV LIST OF SUBSTANCES SUBJECT TO
AUTHORISATION ANNEX XV DOSSIERS ANNEX XVI SOCIO-ECONOMIC ANALYSIS ANNEX
XVII RESTRICTIONS ON THE MANUFACTURE, PLACING ON THE MARKET AND USE OF
CERTAIN DANGEROUS SUBSTANCES, MIXTURES AND ARTICLES
ANNEX I
GENERAL PROVISIONS FOR ASSESSING SUBSTANCES AND PREPARING CHEMICAL
SAFETY REPORTS
0. INTRODUCTION
▼M51
- source_sentence: >-
What actions must the Commission take if the economic operator does not
provide commitments or if the provided commitments are deemed
inappropriate or insufficient to address the distortion?
sentences:
- >-
2.
Where the economic operator concerned does not offer commitments or
where the Commission considers that the commitments referred to in
paragraph 1 are neither appropriate nor sufficient to fully and
effectively remedy the distortion, the Commission shall adopt an
implementing act in the form of a decision prohibiting the award of the
contract to the economic operator concerned (‘decision prohibiting the
award of the contract’). That implementing act shall be adopted in
accordance with the advisory procedure referred to in Article 48(2).
Following that decision, the contracting authority or contracting entity
shall reject the tender.
3.
- >-
6,5 8,9 (1) The values for biogas production from manure include
negative emissions for emissions saved from raw manure management. The
value of esca considered is equal to – 45 g CO2eq/MJ manure used in
anaerobic digestion. (2) Maize whole plant means maize harvested as
fodder and ensiled for preservation. (3) Transport of agricultural raw
materials to the transformation plant is, according to the methodology
provided in the Commission's report of 25 February 2010 on
sustainability requirements for the use of solid and gaseous biomass
sources in electricity, heating and cooling, included in the
‘cultivation’ value. The value for transport of maize silage accounts
for 0,4 g CO2eq/MJ biogas.
- >-
reduction in the consumption of lightweight plastic carrier bags. It
should be possible for Member States, while observing the general rules
laid down in the TFEU and acting in accordance with this Regulation, to
adopt provisions which go beyond the minimum waste prevention targets
set out in this Regulation. When implementing such measures, Member
States should be aware of the risk of a shift from heavier to lighter
packaging materials and should prioritise measures that minimise that
risk.
- source_sentence: >-
The content provides a comprehensive overview of numerous chemical
substances, including their structural formulas and potential
applications. It emphasizes the significance of specific compounds like
acrylamide and thioacetamide, while also addressing mixtures derived from
coal tar. The information reflects the intricate nature of chemical
synthesis and the importance of understanding the properties and uses of
these compounds in various industrial contexts.
sentences:
- >-
2.
Each Member State shall ensure that a producer as defined in Article
3(1)(f)(iv) and established on its territory, which sells EEE to another
Member State in which it is not established, appoints an authorised
representative in that Member State as the person responsible for
fulfilling the obligations of that producer, pursuant to this Directive,
on the territory of that Member State.
3.
Appointment of an authorised representative shall be by written mandate.
Article 18
Administrative cooperation and exchange of information
- >-
(a) display to customers and potential customers, in a visible manner,
the labels provided in accordance with Article 32(1), point (b) or (c);
(b) make reference to the information included on the labels provided in
accordance with Article 32(1), point (b) or (c), in visual
advertisements or in technical promotional material for a specific
model, in accordance with the applicable delegated acts adopted pursuant
to Article 4; and --- --- (c) not provide or display other labels,
marks, symbols or inscriptions that are likely to mislead or confuse
customers and potential customers with regard to the information
included on the label regarding ecodesign requirements. --- ---
Article 32
Obligations related to labels
- >-
[2] 612-196-00-0 202-441-6 [1] 221-627-8 [2] 95-69-2 [1] 3165-93-3 [2]
►M5 — ◄ 2,4,5-Trimethylaniline [1] 2,4,5-trimethylaniline hydrochloride
[2] 612-197-00-6 205-282-0 [1] -[2] 137-17-7 [1] 21436-97-5 [2] ►M5 — ◄
4,4'-Thiodianiline [1] and its salts 612-198-00-1 205-370-9 [1] 139-65-1
[1] ►M5 — ◄ 4,4'-Oxydianiline [1] and its salts p-Aminophenyl ether [1]
612-199-00-7 202-977-0 [1] 101-80-4 [1] ►M5 — ◄ 2,4-Diaminoanisole [1]
4-methoxy-m-phenylenediamine 2,4-diaminoanisole sulphate [2]
612-200-00-0 210-406-1 [1] 254-323-9 [2] 615-05-4 [1] 39156-41-7 [2] N,
N,N',N'-tetramethyl-4,4'-methylendianiline 612-201-00-6 202-959-2
101-61-1 C.I. Basic Violet 3 with ≥ 0,1 % of Michler's ketone (EC No
202-027-5) 612-205-00-8 208-953-6 548-62-9 ►M5 — ◄ 6-Methoxy-m-toluidine
p-cresidine 612-209-00-X 204-419-1 120-71-8 ►M5 — ◄
[▼M14](./../../../legal-content/EN/AUTO/?uri=celex:32012R0109
"32012R0109: INSERTED") Biphenyl-3,3′,4,4′-tetrayltetraamine;
Diaminobenzidine 612-239-00-3 202-110-6 91-95-2
(2-chloroethyl)(3-hydroxypropyl)ammonium chloride 612-246-00-1 429-740-6
40722-80-3 3-Amino-9-ethyl carbazole; 9-Ethylcarbazol-3-ylamine
612-280-00-7 205-057-7 132-32-1
[▼M49](./../../../legal-content/EN/AUTO/?uri=celex:32018R0675
"32018R0675: INSERTED") Reaction products of paraformaldehyde and
2-hydroxypropylamine (ratio 3:2); [formaldehyde released from
3,3′-methylenebis[5-methyloxazolidine]; formaldehyde released from
oxazolidin]; [MBO] 612-290-00-1 — — Reaction products of
paraformaldehyde with 2-hydroxypropylamine (ratio 1:1); [formaldehyde
released from
α,α,α-trimethyl-1,3,5-triazine-1,3,5(2H,4H,6H)-triethanol]; [HPT]
612-291-00-7 — — Methylhydrazine 612-292-00-2 200-471-4 60-34-4
[▼C1](./../../../legal-content/EN/AUTO/?uri=celex:32006R1907R%2801%29
"32006R1907R(01): REPLACED") Ethyleneimine; aziridine 613-001-00-1
205-793-9 151-56-4 2-Methylaziridine; propyleneimine 613-033-00-6
200-878-7 75-55-8 ►M5 — ◄ Captafol (ISO);
1,2,3,6-tetrahydro-N-(1,1,2,2-tetrachloroethylthio) phthalimide
613-046-00-7 219-363-3 2425-06-1 Carbadox (INN); methyl
3-(quinoxalin-2-ylmethylene)carbazate 1,4-dioxide;
2-(methoxycarbonylhydrazonomethyl) quinoxaline 1,4-dioxide 613-050-00-9
229-879-0 6804-07-5 A mixture of:
1,3,5-tris(3-aminomethylphenyl)-1,3,5-(1H,3H,5H)-triazine-2,4,6-trione;
a mixture of oligomers of
3,5-bis(3-aminomethylphenyl)-1-poly[3,5-bis(3-aminomethylphenyl)-2,4,6-trioxo-1,3,5-(1H,3H,5H)-triazin-1-yl]-1,3,5-(1H,3H,5H)-triazine-2,4,6-trione
613-199-00-X 421-550-1 —
[▼M14](./../../../legal-content/EN/AUTO/?uri=celex:32012R0109
"32012R0109: INSERTED") Quinoline 613-281-00-5 202-051-6 91-22-5
[▼C1](./../../../legal-content/EN/AUTO/?uri=celex:32006R1907R%2801%29
"32006R1907R(01): REPLACED") Acrylamide 616-003-00-0 201-173-7 79-06-1
[▼M69](./../../../legal-content/EN/AUTO/?uri=celex:32021R2204
"32021R2204: INSERTED") Butanone oxime; ethyl methyl ketoxime; ethyl
methyl ketone oxime 616-014-00-0 202-496-6 96-29-7
[▼C1](./../../../legal-content/EN/AUTO/?uri=celex:32006R1907R%2801%29
"32006R1907R(01): REPLACED") Thioacetamide 616-026-00-6 200-541-4
62-55-5 A mixture of:
N-[3-hydroxy-2-(2-methylacryloylamino-methoxy)propoxymethyl]-2-methylacrylamide;
N-[2,3-Bis-(2-methylacryloylamino-methoxy)propoxymethyl]-2-methylacrylamide;
methacrylamide;
2-methyl-N-(2-methyl-acryloylaminomethoxymethyl)-acrylamide;
N-2,3-dihydroxypropoxymethyl)-2-methylacrylamide 616-057-00-5 412-790-8
— [▼M14](./../../../legal-content/EN/AUTO/?uri=celex:32012R0109
"32012R0109: INSERTED")
N-[6,9-dihydro-9-[[2-hydroxy-1-(hydroxymethyl)ethoxy]methyl]-6-oxo-1H-purin-2-yl]acetamide
616-148-00-X 424-550-1 84245-12-5
[▼M69](./../../../legal-content/EN/AUTO/?uri=celex:32021R2204
"32021R2204: INSERTED") N-(hydroxymethyl)acrylamide; methylolacrylamide;
[NMA] 616-230-00-5 213-103-2 924-42-5
[▼C1](./../../../legal-content/EN/AUTO/?uri=celex:32006R1907R%2801%29
"32006R1907R(01): REPLACED") Distillates (coal tar), benzole fraction;
Light oil (A complex combination of hydrocarbons obtained by the
distillation of coal tar. It consists of hydrocarbons having carbon
numbers primarily in the range of C4 to C10 and distilling in the
approximate range of 80 to 160 °C.) 648-001-00-0 283-482-7 84650-02-2
Tar oils, brown-coal; Light oil (The distillate from lignite tar boiling
in the range of approximately 80 to 250 °C. Composed primarily of
aliphatic and aromatic hydrocarbons and monobasic phenols.) 648-002-00-6
302-674-4 94114-40-6 J Benzol forerunnings (coal); Light oil
redistillate, low boiling
- source_sentence: >-
How does the new Eurostat methodology differ in scope from the indicators
used in this Directive for calculating energy consumption?
sentences:
- >-
(29) The methodology for calculation of primary energy consumption and
final energy consumption is aligned with the new Eurostat methodology,
but the indicators used for the purpose of this Directive have a
different scope, in that they exclude ambient energy and include energy
consumption in international aviation for the targets in primary energy
consumption and final energy consumption. The use of new indicators also
implies that any changes in energy consumption of blast furnaces are now
only reflected in primary energy consumption.
- >-
(92) InvestEU is the Union flagship programme to boost investment,
especially the green and digital transition, by providing financing and
technical assistance, for instance through blending mechanisms. Such an
approach contributes to crowd in additional public and private capital.
Moreover, Member States are encouraged to contribute to the InvestEU
Member State compartment to support financial products available to
net-zero technology manufacturing, without prejudice to applicable State
aid rules.
- >-
be used, filled or transported through the system; --- --- (iii) specify
the terms and conditions for proper handling and packaging use; --- ---
(iv) specify detailed requirements for packaging reconditioning; --- ---
(v) specify the requirements for packaging collection; --- --- (vi)
specify the requirements for packaging storage; --- --- (vii) specify
the requirements for packaging filling or uploading; --- --- (viii)
specify rules to ensure the effective and efficient collection of
reusable packaging, including by providing for incentives for end users
to return the packaging to the collection points or grouped collection
system; --- --- (ix) specify rules to ensure equal and fair access to
the re-use system, including for vulnerable
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v2.0
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.7136198860693941
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9243915069911963
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9589159330226135
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.981874676333506
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7136198860693941
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.30813050233039874
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1917831866045227
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09818746763335057
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.7136198860693941
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9243915069911963
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9589159330226135
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.981874676333506
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8626251072928146
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8227635844026309
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8236564067385257
name: Cosine Map@100
SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v2.0
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m-v2.0. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m-v2.0
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: GteModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'How does the new Eurostat methodology differ in scope from the indicators used in this Directive for calculating energy consumption?',
'(29) The methodology for calculation of primary energy consumption and final energy consumption is aligned with the new Eurostat methodology, but the indicators used for the purpose of this Directive have a different scope, in that they exclude ambient energy and include energy consumption in international aviation for the targets in primary energy consumption and final energy consumption. The use of new indicators also implies that any changes in energy consumption of blast furnaces are now only reflected in primary energy consumption.',
'(92) InvestEU is the Union flagship programme to boost investment, especially the green and digital transition, by providing financing and technical assistance, for instance through blending mechanisms. Such an approach contributes to crowd in additional public and private capital. Moreover, Member States are encouraged to contribute to the InvestEU Member State compartment to support financial products available to net-zero technology manufacturing, without prejudice to applicable State aid rules.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
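Because this model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, the leading components of each embedding are intended to stay useful on their own. The following is a minimal sketch of truncating an embedding and re-normalizing it, so that cosine similarity is again a plain dot product. The vectors are random stand-ins rather than real model output, and the helper names (`l2_normalize`, `truncate_embedding`, `cosine`) are hypothetical.

```python
import math
import random

# Dimensions listed in the MatryoshkaLoss configuration of this card.
MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize the result."""
    if dim > len(vec):
        raise ValueError("dim exceeds embedding size")
    return l2_normalize(vec[:dim])


def cosine(a, b):
    """Cosine similarity of two unit-length vectors is their dot product."""
    return sum(x * y for x, y in zip(a, b))


# Stand-in vectors (hypothetical data, NOT real model output).
rng = random.Random(0)
query = l2_normalize([rng.gauss(0, 1) for _ in range(768)])
doc = l2_normalize([rng.gauss(0, 1) for _ in range(768)])

for dim in MATRYOSHKA_DIMS:
    q, d = truncate_embedding(query, dim), truncate_embedding(doc, dim)
    print(dim, len(q), round(cosine(q, d), 4))
```

With the library itself, passing `truncate_dim` when constructing `SentenceTransformer` should give a similar effect on real embeddings. Retrieval quality at the smaller sizes is not reported in this card.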
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.7136 |
cosine_accuracy@3 | 0.9244 |
cosine_accuracy@5 | 0.9589 |
cosine_accuracy@10 | 0.9819 |
cosine_precision@1 | 0.7136 |
cosine_precision@3 | 0.3081 |
cosine_precision@5 | 0.1918 |
cosine_precision@10 | 0.0982 |
cosine_recall@1 | 0.7136 |
cosine_recall@3 | 0.9244 |
cosine_recall@5 | 0.9589 |
cosine_recall@10 | 0.9819 |
cosine_ndcg@10 | 0.8626 |
cosine_mrr@10 | 0.8228 |
cosine_map@100 | 0.8237 |
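In the table above, accuracy@k equals recall@k and precision@k equals recall@k divided by k. That pattern is what one would expect if every evaluation query has exactly one relevant document, which is an assumption here and is not stated in this card. The sketch below shows the relationship on hypothetical ranks; the function names are illustrative and are not the evaluator's API.

```python
def metrics_at_k(ranks, k):
    """Compute accuracy@k, precision@k and recall@k.

    `ranks` holds, per query, the 1-based rank of its single relevant
    document (or None if it was not retrieved). One relevant document
    per query is an assumption of this sketch.
    """
    hits = sum(1 for r in ranks if r is not None and r <= k)
    n = len(ranks)
    accuracy = hits / n          # fraction of queries with a hit in the top k
    recall = hits / n            # one relevant doc, so found/1 per query
    precision = hits / (n * k)   # at most one of the k results is relevant
    return accuracy, precision, recall


def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)


# Hypothetical ranks for five queries (illustrative, not from the evaluation).
example_ranks = [1, 1, 2, 4, None]

for k in (1, 3, 5):
    acc, prec, rec = metrics_at_k(example_ranks, k)
    print(f"k={k}: accuracy={acc:.2f} precision={prec:.3f} recall={rec:.2f}")
print(f"mrr@10={mrr_at_k(example_ranks, 10):.4f}")
```

The reported values fit this reading: 0.9244 / 3 ≈ 0.3081, 0.9589 / 5 ≈ 0.1918, and 0.9819 / 10 ≈ 0.0982.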
Training Details
Training Dataset
Unnamed Dataset
- Size: 46,338 training samples
- Columns: query_text and doc_text
- Approximate statistics based on the first 1000 samples:
  - query_text (string): min 9 tokens, mean 39.44 tokens, max 311 tokens
  - doc_text (string): min 7 tokens, mean 233.15 tokens, max 1900 tokens
- Samples:

Sample 1
query_text: The regulation's applicability extends to various stakeholders involved in AI systems, including providers, deployers, importers, and manufacturers, regardless of their location. It specifically addresses high-risk AI systems and outlines the limitations of its scope, particularly concerning national security and military applications. Additionally, it clarifies that it does not interfere with the responsibilities of member states regarding national security or the operations of public authorities and international organizations in specific contexts.
doc_text:
(180) The European Data Protection Supervisor and the European Data Protection Board were consulted in accordance with Article 42(1) and (2) of Regulation (EU) 2018/1725 and delivered their joint opinion on 18 June 2021,
HAVE ADOPTED THIS REGULATION:
CHAPTER I
GENERAL PROVISIONS
Article 1
Subject matter
1. The purpose of this Regulation is to improve the functioning of the internal market and promote the uptake of human-centric and trustworthy artificial intelligence (AI), while ensuring a high level of protection of health, safety, fundamental rights enshrined in the Charter, including democracy, the rule of law and environmental protection, against the harmful effects of AI systems in the Union and supporting innovation.
2. This Regulation lays down:
(a) harmonised rules for the placing on the market, the putting into service, and the use of AI systems in the Union; (b) prohibitions of certain AI practices; --- --- (c) specific requirements for high-risk AI systems and oblig...

Sample 2
query_text: How should loans with unknown use of proceeds be allocated in terms of sectors and alignment metrics?
doc_text:
instruments. For loans whose use of proceeds is known, the value shall be included for the relevant sector and alignment metric. For loans whose use of proceeds is unknown, the gross carrying amount of the exposure shall be allocated to the relevant sectors and alignment metrics based on the counterparties’ activity distribution, including by counterparties’ turnover by activity. Institutions shall add a row in the template for each relevant combination of sectors disclosed in column (b) and alignment metrics included in column (d). ---

Sample 3
query_text: What measures must AIFMs implement to ensure they do not rely solely on credit ratings for assessing the creditworthiness of AIFs' assets?
doc_text:
▼M1
The measures specifying the risk-management systems referred to in point (a) of the first subparagraph shall ensure that the AIFMs are prevented from relying solely or mechanistically on credit ratings, as referred to in the first subparagraph of paragraph 2, for assessing the creditworthiness of the AIFs’ assets.
▼B
Article 16
Liquidity management
1.
AIFMs shall, for each AIF that they manage which is not an unleveraged closed- ended AIF, employ an appropriate liquidity management system and adopt procedures which enable them to monitor the liquidity risk of the AIF and to ensure that the liquidity profile of the investments of the AIF complies with its underlying obligations.

- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
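The loss configuration above can be read as follows: an in-batch ranking loss (each query's paired document is the positive, every other document in the batch is a negative) is computed on each truncated prefix of the embeddings, and the per-dimension losses are combined with the listed weights. The sketch below illustrates that idea in plain Python under several assumptions: a similarity scale of 20, cosine similarity, and `n_dims_per_step` of -1 meaning that all dimensions are used on every step. It is not the library's implementation, and the function names are hypothetical.

```python
import math
import random

MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]
MATRYOSHKA_WEIGHTS = [1, 1, 1, 1, 1]
SCALE = 20.0  # assumed similarity scale; not stated in this card


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def in_batch_ranking_loss(queries, docs, scale=SCALE):
    """Cross-entropy over scaled cosine similarities, where docs[i] is the
    positive for queries[i] and every other doc in the batch is a negative."""
    q = [normalize(v) for v in queries]
    d = [normalize(v) for v in docs]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [scale * sum(a * b for a, b in zip(qi, dj)) for dj in d]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(q)


def matryoshka_loss(queries, docs, dims=MATRYOSHKA_DIMS, weights=MATRYOSHKA_WEIGHTS):
    """Weighted sum of the ranking loss on each truncated embedding prefix."""
    return sum(
        w * in_batch_ranking_loss([v[:k] for v in queries], [v[:k] for v in docs])
        for k, w in zip(dims, weights)
    )


# Stand-in embeddings (hypothetical data, NOT real model output).
rng = random.Random(0)
queries = [[rng.gauss(0, 1) for _ in range(768)] for _ in range(4)]
aligned_docs = [list(v) for v in queries]            # each doc matches its query
shifted_docs = aligned_docs[1:] + aligned_docs[:1]   # every pair is mismatched

print(matryoshka_loss(queries, aligned_docs))
print(matryoshka_loss(queries, shifted_docs))
```

The aligned batch should yield a much smaller loss than the shifted one, because the positive pair dominates the softmax at every truncation size.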
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- learning_rate: 2e-05
- num_train_epochs: 4
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
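The hyperparameters above combine a linear scheduler with a warmup ratio of 0.1, a peak learning rate of 2e-05, a per-device batch size of 8 and 4 epochs over 46,338 samples. The sketch below works out the implied step counts and the learning rate at a given step. It assumes a single training device and rounds the warmup length up; the function name `linear_schedule_lr` is hypothetical and this is not the trainer's own code.

```python
import math

TRAIN_SAMPLES = 46338
BATCH_SIZE = 8          # per_device_train_batch_size; one device is assumed
EPOCHS = 4
BASE_LR = 2e-05
WARMUP_RATIO = 0.1

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)  # rounding up is assumed


def linear_schedule_lr(step):
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, 500, warmup_steps, 12000, total_steps):
    print(step, f"{linear_schedule_lr(step):.3e}")
```

Under these assumptions one epoch is 5793 steps, which is consistent with the training log, where step 500 corresponds to epoch 0.0863.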
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
-1 | -1 | - | 0.7763 |
0.0863 | 500 | 0.2343 | - |
0.1726 | 1000 | 0.1259 | 0.814 |
0.2589 | 1500 | 0.1027 | - |
0.3452 | 2000 | 0.0757 | 0.8288 |
0.4316 | 2500 | 0.0617 | - |
0.5179 | 3000 | 0.0651 | 0.8288 |
0.6042 | 3500 | 0.0863 | - |
0.6905 | 4000 | 0.06 | 0.8376 |
0.7768 | 4500 | 0.0579 | - |
0.8631 | 5000 | 0.0593 | 0.8342 |
0.9494 | 5500 | 0.0485 | - |
1.0357 | 6000 | 0.0465 | 0.8384 |
1.1220 | 6500 | 0.0276 | - |
1.2084 | 7000 | 0.0353 | 0.8392 |
1.2947 | 7500 | 0.0335 | - |
1.3810 | 8000 | 0.0292 | 0.8436 |
1.4673 | 8500 | 0.0276 | - |
1.5536 | 9000 | 0.0404 | 0.8485 |
1.6399 | 9500 | 0.0476 | - |
1.7262 | 10000 | 0.0265 | 0.8601 |
1.8125 | 10500 | 0.017 | - |
1.8988 | 11000 | 0.0217 | 0.8549 |
1.9852 | 11500 | 0.0329 | - |
2.0715 | 12000 | 0.0207 | 0.8577 |
2.1578 | 12500 | 0.0199 | - |
2.2441 | 13000 | 0.015 | 0.8544 |
2.3304 | 13500 | 0.0143 | - |
2.4167 | 14000 | 0.0117 | 0.8574 |
2.5030 | 14500 | 0.0204 | - |
2.5893 | 15000 | 0.0141 | 0.8595 |
2.6756 | 15500 | 0.0123 | - |
2.7620 | 16000 | 0.0211 | 0.8538 |
2.8483 | 16500 | 0.0207 | - |
2.9346 | 17000 | 0.0134 | 0.8562 |
3.0209 | 17500 | 0.0276 | - |
3.1072 | 18000 | 0.0106 | 0.8552 |
3.1935 | 18500 | 0.0129 | - |
3.2798 | 19000 | 0.0157 | 0.8582 |
3.3661 | 19500 | 0.0164 | - |
3.4524 | 20000 | 0.0192 | 0.8614 |
3.5388 | 20500 | 0.0138 | - |
3.6251 | 21000 | 0.0141 | 0.8601 |
3.7114 | 21500 | 0.0109 | - |
3.7977 | 22000 | 0.0178 | 0.8605 |
3.8840 | 22500 | 0.0088 | - |
3.9703 | 23000 | 0.0255 | 0.8626 |
- The saved checkpoint corresponds to the final row (step 23000), whose cosine_ndcg@10 of 0.8626 matches the evaluation metrics reported above.
Framework Versions
- Python: 3.10.15
- Sentence Transformers: 4.0.2
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu126
- Accelerate: 0.26.0
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}