metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2752
  - loss:TripletLoss
base_model: sentence-transformers/all-MiniLM-L12-v2
widget:
  - source_sentence: >-
      The First Trust Financials AlphaDEX ETF (FXO) employs a strategic
      management approach aimed at delivering investment results that align with
      the StrataQuant® Financials Index. The ETF focuses primarily on large- and
      mid-cap U.S. financial stocks, investing at least 90% of its net assets in
      securities derived from the Russell 1000® Index. Utilizing the AlphaDEX®
      selection methodology, FXO identifies and targets stocks poised to
      generate positive alpha by applying a multi-factor, quantitative model.
      This model assesses potential outperformers on a risk-adjusted basis,
      which facilitates the selection of securities that are then tiered and
      equal-weighted, leading to a mid-cap bias and occasional tilts toward
      non-financial sectors. The ETF undergoes a reconstitution and rebalancing
      process on a quarterly basis, with the objective of outperforming
      traditional passive indices, thereby enhancing returns for investors while
      maintaining a focus on the financial sector.
    sentences:
      - >-
        The SPDR S&P Software & Services ETF (XSW) employs a strategic
        management approach aimed at closely tracking the performance of the S&P
        Software & Services Select Industry Index. By utilizing a sampling
        strategy, XSW invests a minimum of 80% of its total assets in securities
        that fall within this index, which represents a focused segment of the
        broader S&P Total Market Index, specifically targeting the software and
        services sectors. To address the concentration risks often associated
        with large-cap companies in the software industry, XSW adopts an
        equal-weighted methodology. This approach mitigates the influence of
        larger firms and allows for greater exposure to smaller, growth-oriented
        companies. Consequently, the ETF encompasses a diverse array of software
        and services firms, with a particular emphasis on the services sector.
        The index undergoes quarterly rebalancing, ensuring that the portfolio
        remains diversified and aligned with its investment objectives, thereby
        providing investors with a balanced exposure to this dynamic industry.
      - >-
        The Direxion Energy Bull 2X Shares (ERX) ETF is strategically designed
        to provide investors with 200% of the daily performance of the S&P
        Energy Select Sector Index. This index encompasses large-cap U.S. energy
        companies, focusing on sectors such as oil, gas, consumable fuels, and
        energy equipment and services. To achieve its leveraged exposure, the
        fund allocates at least 80% of its net assets into financial instruments
        like swap agreements and securities that directly track the performance
        of the index. As a non-diversified and market-cap-weighted fund, ERX is
        concentrated in a limited number of dominant firms within the energy
        sector. The ETF is primarily intended for short-term trading, as it
        rebalances daily to maintain its leverage. Investors should be aware
        that the returns of ERX can be volatile and unpredictable over longer
        time frames due to factors like compounding and path dependency, making
        it suitable for those with a high risk tolerance and a short investment
        horizon.
      - >-
        The Direxion Financial Bull 3X Shares ETF (FAS) is strategically
        designed to deliver 300% of the daily performance of the Financials
        Select Sector Index, utilizing a 3x leveraged exposure framework. This
        ETF is managed with a focus on short-term tactical opportunities,
        employing daily rebalancing to align with the index's movements. FAS
        allocates at least 80% of its net assets in a range of financial
        instruments, including swap agreements, direct securities of the index,
        and ETFs that mirror the index's composition. The targeted sectors
        encompass a broad spectrum of the financial industry, such as financial
        services, insurance, banking, capital markets, mortgage real estate
        investment trusts (REITs), and consumer finance. Given its
        non-diversified nature and reliance on leverage, FAS is primarily
        suitable for investors seeking short-term gains and is not recommended
        for long-term holding due to the potential compounding effects and path
        dependency associated with leveraged investments.
  - source_sentence: >-
      The ProShares Big Data Refiners ETF (DAT) aims to track the performance of
      the FactSet Big Data Refiners Index, focusing on global companies involved
      in managing, storing, using, and analyzing large structured and
      unstructured datasets. The fund invests at least 80% of its assets in
      index components or similar instruments, targeting companies that derive
      at least 75% of their revenue from big data activities, with adjustments
      if fewer than 25 companies meet this threshold. It employs a
      market-cap-weighted approach, capping individual securities at 4.5%, and
      includes firms from developed and emerging markets with a minimum market
      cap of $500 million and a three-month average daily trading value of at
      least $1 million. The index is reconstituted and rebalanced semiannually
      in June and December, and the fund is non-diversified.
    sentences:
      - >-
        The Invesco S&P SmallCap Information Technology ETF (PSCT) is designed
        to replicate the investment performance of the S&P SmallCap 600 Capped
        Information Technology Index, allocating a minimum of 90% of its total
        assets to the securities within this index. This index, curated by S&P
        Dow Jones Indices, evaluates the performance of U.S. small-cap firms in
        the information technology sector, as categorized by the Global Industry
        Classification Standard. PSCT provides focused exposure to small-cap
        technology companies across various industries, including computer
        hardware, software, internet services, electronics, semiconductors, and
        communication technologies. The fund employs a market-cap-weighted
        approach, with individual security weights capped at 22.5% and the total
        weight of securities exceeding 4.5% limited to 45% of the portfolio. To
        preserve its focus on size, liquidity, and financial viability, the
        index is rebalanced quarterly, ensuring an adaptive investment strategy
        that aligns with evolving market conditions.
      - >-
        The ALPS Active REIT ETF (ticker: REIT) is a type of investment fund
        that aims to make money through both income from dividends and increases
        in the value of its investments. It primarily invests at least 80% of
        its money in stocks of U.S. Real Estate Investment Trusts (REITs), which
        are companies that own and manage real estate properties. The fund
        mainly focuses on common stocks of these REITs but also puts some money
        into other types of real estate-related stocks, like preferred stocks
        and companies that operate in real estate. The fund's managers use a
        special method to assess the true value of the properties and the REITs
        to make informed investment choices. It's important to note that this
        ETF is non-diversified, meaning it doesn't spread its investments across
        many different areas. Additionally, it changed its structure to a more
        transparent format on August 22, 2023.
      - >-
        The First Trust Amex Biotech Index ETF (FBT) aims to replicate the
        performance of the NYSE Arca Biotechnology Index by investing at least
        90% of its net assets in the index's securities. This equal-dollar
        weighted index comprises 30 leading biotechnology companies, offering
        exposure to firms involved in biological processes for product
        development and services. FBT's portfolio, reconstituted and rebalanced
        quarterly, provides a concentrated yet broad exposure to the biotech
        sector, potentially including pharmaceuticals and medical technology.
        The ETF's strategy ensures a diversified investment in the dynamic
        biotech industry, reflecting both price and yield movements before fees
        and expenses.
  - source_sentence: >-
      The First Trust Utilities AlphaDEX ETF (FXU) seeks to achieve investment
      results that correspond to the StrataQuant® Utilities Index, focusing on
      large- and mid-cap utility firms in the US. The fund invests at least 90%
      of its net assets in securities from the index, which is a modified
      equal-dollar weighted index derived from the Russell 1000® Index. FXU
      employs the AlphaDEX® selection methodology, using a quant-based model to
      select stocks based on growth and value metrics, aiming to generate
      positive alpha. This smart beta approach results in a portfolio with a
      significant tilt toward mid-caps and includes a notable allocation to
      telecom companies. The index is reconstituted and rebalanced quarterly,
      offering a strategic alternative to traditional market-like sector
      exposure.
    sentences:
      - >-
        The Goldman Sachs Future Consumer Equity ETF (GBUY) is an actively
        managed investment vehicle aimed at delivering long-term capital
        appreciation by allocating a minimum of 80% of its net assets to equity
        securities of both U.S. and international companies. This ETF
        strategically targets global equities that resonate with the evolving
        preferences and spending patterns of younger consumers, with a strong
        emphasis on key themes such as technology adoption and lifestyle
        choices. GBUY utilizes a fundamental investment approach, where the
        adviser plays a pivotal role in identifying companies with robust growth
        potential and attractive valuations, without limitations on market
        capitalization or geographic location. As a non-diversified fund, GBUY
        possesses the flexibility to adjust its thematic investments over time,
        ensuring responsiveness to the ever-changing landscape of consumer
        trends. This dynamic approach allows investors to gain exposure to
        innovative sectors that are shaping the future of consumer behavior.
      - >-
        The Fidelity MSCI Utilities Index ETF (FUTY) is strategically designed
        to mirror the performance of the MSCI USA IMI Utilities 25/50 Index,
        which encompasses the U.S. utilities sector. The management strategy
        emphasizes a market-cap-weighted approach, directing at least 80% of the
        fund's assets into securities that align with this index. While the ETF
        may not replicate every security within the index, it adheres to strict
        diversification guidelines mandated by the U.S. Internal Revenue Code.
        This includes a limit where no single issuer exceeds 25% of the fund's
        assets and the combined weight of issuers over 5% is capped at 50%. By
        focusing exclusively on the utilities sector, FUTY targets companies
        involved in essential services such as electric, gas, and water
        utilities, as well as renewable energy providers. This sector
        concentration allows for a nuanced investment strategy that can
        capitalize on the specific dynamics of the utilities market. FUTY
        competes with similar offerings, such as Vanguard's VPU, providing
        investors with liquidity and the potential for modest trading spreads.
      - >-
        The Global X U.S. Infrastructure Development ETF (PAVE) aims to
        replicate the performance of the Indxx U.S. Infrastructure Development
        Index by allocating a minimum of 80% of its assets to the index's
        underlying securities. This market-cap-weighted index targets
        U.S.-listed companies that generate over 50% of their revenue from
        domestic infrastructure development. PAVE encompasses a diverse range of
        sectors, including construction, engineering, raw materials production,
        industrial transportation, and heavy construction equipment, while
        deliberately excluding Master Limited Partnerships (MLPs), Real Estate
        Investment Trusts (REITs), and Business Development Companies (BDCs).
        The ETF employs a strategy of diversification through annual
        reconstitution and rebalancing, maintaining a single security cap of 3%
        and a minimum allocation of 0.3%. This approach ensures exposure to a
        balanced mix of large-, mid-, and small-cap companies, aligning with key
        investment themes in the U.S. infrastructure landscape.
  - source_sentence: >-
      The First Trust Nasdaq Transportation ETF (FTXR) seeks to replicate the
      performance of the Nasdaq US Smart Transportation TM Index by allocating a
      minimum of 90% of its net assets to the securities within the index. This
      non-diversified fund strategically targets 30 U.S. transportation
      companies, carefully selected for their liquidity and ranked based on key
      criteria such as growth, value, and volatility. The ETF encompasses a
      diverse range of sectors within transportation, including delivery,
      shipping, railroads, trucking, and airlines. The weighting of each stock
      in the portfolio is based on its growth potential, value proposition, and
      historical price stability, ensuring that no single investment exceeds 8%
      of total holdings. To maintain its strategic alignment, the index is
      reconstituted annually and rebalanced quarterly, reinforcing FTXR's focus
      on capturing essential trends in the transportation sector.
    sentences:
      - >-
        The WisdomTree Trust WisdomTree Bat ETF (WBAT) utilizes a passive
        management approach to replicate the performance of the WisdomTree
        Battery Value Chain and Innovation Index. This index provides
        comprehensive global exposure to firms primarily engaged in battery and
        energy storage solutions (BESS) and related innovations. The ETF
        strategically targets four critical sectors of the value chain: raw
        materials, manufacturing, enabling technologies, and emerging
        innovations. To qualify for inclusion, companies must generate at least
        50% of their revenue from these areas or from innovative activities. The
        index employs a multi-factor methodology, assessing companies based on
        their level of involvement in the sector and a composite risk score,
        while imposing a 3.5% cap on individual issuers to mitigate
        concentration risk. As a non-diversified fund, WBAT rebalances
        semi-annually, ensuring its alignment with the index's tier-weighted
        framework.
      - >-
        The Invesco Pharmaceuticals ETF (PJP) is an investment fund that focuses
        on U.S. pharmaceutical companies. These are businesses involved in
        making and selling medications. The goal of the ETF is to follow the
        performance of a specific index that tracks these pharmaceutical
        companies. 


        The fund puts at least 90% of its money into stocks from this index,
        which includes around 30 companies. To choose which stocks to invest in,
        it uses a special method that looks at factors like how well a stock's
        price is doing, how companies are performing financially, and their
        overall value. This approach often favors smaller and mid-sized
        companies rather than very large ones, which helps spread out the risk.


        The ETF is re-evaluated and adjusted every few months (in February, May,
        August, and November) to keep it aligned with the index. It is
        considered non-diversified, meaning it focuses on a specific area rather
        than a wide range of sectors. Before August 28, 2023, this ETF was
        called the Invesco Dynamic Pharmaceuticals ETF.
      - >-
        The Vanguard Real Estate ETF (VNQ) employs a strategic management
        approach aimed at generating substantial income and moderate long-term
        capital appreciation by closely tracking the MSCI US Investable Market
        Real Estate 25/50 Index. This index encompasses a diverse range of
        publicly traded equity Real Estate Investment Trusts (REITs) and other
        real estate-related entities within the United States. VNQ's investment
        strategy involves allocating nearly all of its assets to the stocks that
        comprise the index, meticulously maintaining each stock's proportional
        weighting to ensure alignment with index performance. The fund primarily
        targets the commercial REIT sector, displaying a notable bias toward
        this area over specialized REITs, which allows for focused exposure to
        income-generating properties such as office buildings, retail spaces,
        and industrial facilities. Despite the minor inconvenience of monthly
        holdings disclosure, VNQ is recognized for its efficient management
        practices, often resulting in actual costs that fall below its stated
        expense ratio. It is important to note that distributions from the fund
        are taxed as ordinary income, consistent with typical REIT investment
        structures.
  - source_sentence: >-
      The KraneShares Emerging Markets Consumer Technology ETF (KEMQ) aims to
      track the Solactive Emerging Market Consumer Technology Index, investing
      at least 80% of its net assets in instruments within or similar to its
      underlying index. This index comprises the equity securities of the 50
      largest companies by market capitalization, primarily from emerging and
      frontier markets, focusing on the consumer and technology sectors. KEMQ
      offers concentrated exposure to emerging market tech companies, selected
      by a committee and tier-weighted based on market cap. The largest 10
      securities are weighted at 3.5% each, the next 20 at 2.5% each, and the
      remaining 20 at 0.75% each. The index is reviewed and adjusted quarterly
      to ensure it reflects the most relevant market opportunities.
    sentences:
      - >-
        The First Trust Consumer Discretionary AlphaDEX® ETF (FXD) is designed
        to outperform the US consumer discretionary sector by tracking the
        StrataQuant® Consumer Discretionary Index. This index is a modified
        equal-dollar weighted benchmark that selects stocks from the Russell
        1000® using the innovative AlphaDEX® methodology. This approach
        incorporates both value and growth criteria to identify stocks with the
        potential for positive alpha. FXD strategically invests at least 90% of
        its net assets in these selected securities, resulting in notable
        mid-cap exposure and distinct industry tilts that differentiate it from
        traditional sector-focused investments. The fund employs a quasi-active
        selection process, reconstituted and rebalanced on a quarterly basis,
        making it an appealing choice for investors seeking higher returns
        rather than mere sector replication.
      - >-
        The Invesco S&P 500 Equal Weight Health Care ETF (RSPH) is an investment
        fund that aims to match the performance of a specific group of health
        care companies in the S&P 500. This ETF puts most of its money—at least
        90%—into stocks of these health care companies. The goal is to give
        investors a way to invest in the health care sector, which includes
        everything from pharmaceuticals to medical devices. 


        What makes this ETF special is its equal weight strategy. This means
        that each company in the fund has the same importance in the performance
        of the ETF, regardless of how big or small it is. This approach helps to
        spread risk, as it prevents any one company from having too much
        influence on how the ETF performs. Overall, RSPH offers a balanced way
        to invest in health care stocks without being overly dependent on a few
        large companies.
      - >-
        The SPDR S&P Global Infrastructure ETF (GII) employs a strategic
        management approach aimed at closely tracking the S&P Global
        Infrastructure Index. To achieve this, the ETF allocates a minimum of
        80% of its assets to the securities included in the index and their
        related depositary receipts. The index comprises 75 of the largest
        publicly listed infrastructure companies worldwide, selected based on
        stringent investability criteria. GII specifically targets firms within
        the energy, transportation, and utility sectors, maintaining a
        diversified portfolio with a composition of 30 transportation companies,
        30 utility companies, and 15 energy companies. To enhance
        diversification and mitigate concentration risk, sector weights are
        capped at 40% for transportation and utilities, and 20% for energy.
        Furthermore, the fund limits the weight of any single security to a
        maximum of 5%. Within each sector, stocks are weighted according to
        market capitalization. GII undergoes substantial adjustments during its
        semi-annual rebalancing, ensuring alignment with the evolving market
        landscape while adhering to its investment strategy.
datasets:
  - suhwan3/stage1_v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L12-v2 on the stage1_v1 dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
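The three modules above form a pipeline: the Transformer produces one vector per token, Pooling averages the token vectors (only `pooling_mode_mean_tokens` is enabled), and Normalize scales the result to unit length. The following is a minimal pure-Python sketch of the last two steps, not the library's implementation. The helper names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are illustrative assumptions; the real model works with 384 dimensions.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an input that is all padding
    return [value / count for value in total]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (what a Normalize step does)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Hypothetical toy input: three token positions, the last one is padding.
token_embeddings = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 2.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, masked out below
]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_embeddings, attention_mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the dot product of two embeddings equals their cosine similarity.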

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'The KraneShares Emerging Markets Consumer Technology ETF (KEMQ) aims to track the Solactive Emerging Market Consumer Technology Index, investing at least 80% of its net assets in instruments within or similar to its underlying index. This index comprises the equity securities of the 50 largest companies by market capitalization, primarily from emerging and frontier markets, focusing on the consumer and technology sectors. KEMQ offers concentrated exposure to emerging market tech companies, selected by a committee and tier-weighted based on market cap. The largest 10 securities are weighted at 3.5% each, the next 20 at 2.5% each, and the remaining 20 at 0.75% each. The index is reviewed and adjusted quarterly to ensure it reflects the most relevant market opportunities.',
    'The First Trust Consumer Discretionary AlphaDEX® ETF (FXD) is designed to outperform the US consumer discretionary sector by tracking the StrataQuant® Consumer Discretionary Index. This index is a modified equal-dollar weighted benchmark that selects stocks from the Russell 1000® using the innovative AlphaDEX® methodology. This approach incorporates both value and growth criteria to identify stocks with the potential for positive alpha. FXD strategically invests at least 90% of its net assets in these selected securities, resulting in notable mid-cap exposure and distinct industry tilts that differentiate it from traditional sector-focused investments. The fund employs a quasi-active selection process, reconstituted and rebalanced on a quarterly basis, making it an appealing choice for investors seeking higher returns rather than mere sector replication.',
    'The SPDR S&P Global Infrastructure ETF (GII) employs a strategic management approach aimed at closely tracking the S&P Global Infrastructure Index. To achieve this, the ETF allocates a minimum of 80% of its assets to the securities included in the index and their related depositary receipts. The index comprises 75 of the largest publicly listed infrastructure companies worldwide, selected based on stringent investability criteria. GII specifically targets firms within the energy, transportation, and utility sectors, maintaining a diversified portfolio with a composition of 30 transportation companies, 30 utility companies, and 15 energy companies. To enhance diversification and mitigate concentration risk, sector weights are capped at 40% for transportation and utilities, and 20% for energy. Furthermore, the fund limits the weight of any single security to a maximum of 5%. Within each sector, stocks are weighted according to market capitalization. GII undergoes substantial adjustments during its semi-annual rebalancing, ensuring alignment with the evolving market landscape while adhering to its investment strategy.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
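For semantic search, the similarity scores can be used to rank candidate descriptions against a query. The sketch below shows only the ranking step, in pure Python. It assumes the embeddings are already unit length (the model ends with a Normalize() module), so a dot product serves as cosine similarity. The function `rank_by_similarity` and the 2-dimensional toy vectors are hypothetical stand-ins for real `model.encode` output.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query_vec, corpus_vecs):
    """Return corpus indices sorted from most to least similar, plus raw scores."""
    scores = [dot(query_vec, vec) for vec in corpus_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical unit-length vectors standing in for encoded ETF descriptions.
query_vec = [1.0, 0.0]
corpus_vecs = [
    [0.6, 0.8],
    [1.0, 0.0],
    [0.0, 1.0],
]

order, scores = rank_by_similarity(query_vec, corpus_vecs)
print(order)
print(scores)
```

With real data, `query_vec` and `corpus_vecs` would come from encoding the query and the candidate texts with the loaded model.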

Training Details

Training Dataset

stage1_v1

  • Dataset: stage1_v1 at 9be9e9c
  • Size: 2,752 training samples
  • Columns: query, positive, and negative
  • Approximate statistics based on the first 1000 samples:

    |         | query | positive | negative |
    |---------|-------|----------|----------|
    | type    | string | string | string |
    | details | min: 123 tokens, mean: 128.0 tokens, max: 128 tokens | min: 123 tokens, mean: 128.0 tokens, max: 128 tokens | min: 128 tokens, mean: 128.0 tokens, max: 128 tokens |
  • Samples:

    Sample 1
    query: The Global X Aging Population ETF (AGNG) is a fund designed to invest in companies that benefit from the growing number of older people in the world. It focuses on businesses in developed countries that help improve and extend the lives of seniors. This includes companies that work in areas like biotechnology, medical devices, pharmaceuticals, senior living facilities, and healthcare services. The fund aims to support the aging population trend by investing over 80% of its money in these sectors.

    AGNG uses a special method to choose its investments, looking at a variety of businesses, including those in insurance and consumer products. The fund is updated once a year to make sure it stays balanced and diverse, meaning it spreads its investments across different kinds of companies. Before April 2021, it was called the Global X Longevity Thematic ETF and went by the ticker LNGR. This ETF is a way for investors to tap into the growing market of services and products for seniors.
    positive: The Amplify High Income ETF (YYY) is a fund of funds that aims to replicate the performance of the ISE High Income™ Index by investing at least 80% of its net assets in securities of the index. This index comprises the top 60 U.S. exchange-listed closed-end funds (CEFs), selected and weighted based on yield, discount to NAV, and trading volume. YYY typically holds about 30 CEFs, with a maximum weight of 4.25% per fund at rebalance, and can include funds across major asset classes. The ETF's strategy focuses on acquiring discounted CEFs with high yields and sufficient liquidity to minimize trading costs. YYY's fee structure includes the expenses of its constituent funds. The fund was reorganized under Amplify ETFs in 2019, maintaining its investment objectives and index.
    negative: The iShares Copper and Metals Mining ETF (ICOP) is strategically designed to replicate the performance of the STOXX Global Copper and Metals Mining Index, concentrating on equities from both U.S. and international companies primarily involved in copper and metal ore extraction. The fund commits at least 80% of its assets to the index's component securities, allowing for up to 20% allocation to derivatives such as futures, options, and swaps, as well as cash and equivalents. ICOP employs a market-capitalization weighted strategy, categorizing companies into three tiers based on their revenue exposure to copper mining: Tier 1 encompasses firms with over 50% revenue from copper, Tier 2 includes those with 25-50%, and Tier 3 comprises companies determined by market share. The index undergoes quarterly rebalancing, implementing caps of 8% on individual holdings and limiting those exceeding 4.5% to a combined weight of 45%. This non-diversified fund provides concentrated exposure specificall...

    Sample 2
    query: The Global X Aging Population ETF (AGNG) seeks to track the performance of the Indxx Aging Population Thematic Index, investing over 80% of its assets in securities from developed markets that support the demographic trend of longer life spans. The fund targets companies involved in biotechnology, medical devices, pharmaceuticals, senior living facilities, and specialized healthcare services, focusing on enhancing and extending the lives of senior citizens. AGNG employs a proprietary research and analysis process, crossing traditional sector lines to include diverse businesses such as insurance and consumer products. The ETF is reconstituted and rebalanced annually, using a modified market-cap weighting with specific caps and floors to ensure diversification. Prior to April 2021, it was known as the Global X Longevity Thematic ETF under the ticker LNGR.
    positive: The iShares Biotechnology ETF (IBB) aims to track the performance of the NYSE Biotechnology Index, which comprises U.S.-listed biotechnology companies. These companies are involved in the research and development of therapeutic treatments and the production of tools or systems for biotechnology processes, excluding those focused on mass pharmaceutical production. IBB invests at least 80% of its assets in the index's component securities and up to 20% in futures, options, swap contracts, cash, and equivalents. The fund employs a modified market-cap-weighted methodology, capping the five largest constituents at 8% and others at 4%. It is non-diversified, rebalances quarterly, and fully reconstitutes annually in December. Prior to June 21, 2021, it was known as the iShares Nasdaq Biotechnology ETF.
    negative: The Invesco Global Clean Energy ETF (PBD) is designed to track the WilderHill New Energy Global Innovation Index, dedicating a minimum of 90% of its assets to securities within this index, which includes American Depositary Receipts (ADRs) and Global Depositary Receipts (GDRs). The index predominantly features companies committed to clean energy technologies, conservation, efficiency, and the advancement of renewable energy. While PBD is passively managed, it employs a strategy akin to active management by focusing on companies with significant capital appreciation potential, particularly emphasizing pure-play small- and mid-cap firms. The fund boasts a global diversification, with approximately half of its assets allocated internationally, while maintaining a limit of 5% on its largest holdings. The index undergoes quarterly rebalancing and reconstitution, ensuring a dynamic and varied portfolio that reflects the evolving landscape of the clean energy s...

    Sample 3
    query: The Global X Aging Population ETF (AGNG) is strategically designed to track the performance of the Indxx Aging Population Thematic Index, focusing on the investment potential arising from the global demographic shift towards longer life spans. The ETF allocates over 80% of its assets to securities primarily in developed markets that are aligned with this trend. Target sectors include biotechnology, medical devices, pharmaceuticals, senior living facilities, and specialized healthcare services, all aimed at improving the quality of life for senior citizens. Additionally, AGNG incorporates a broader investment approach by including companies from diverse sectors such as insurance and consumer products, which are relevant to aging populations. The fund employs a proprietary research and analysis methodology that transcends traditional sector boundaries. It is reconstituted and rebalanced annually, utilizing a modified market-cap weighting approach that includes specific caps and floors to...
    positive: The iShares U.S. Health Care Providers ETF (IHF) employs a strategy aimed at closely tracking the performance of the Dow Jones U.S. Select Health Care Providers Index. This ETF is managed by investing at least 80% of its assets in the securities of companies that constitute the index, which primarily includes U.S. firms operating within the healthcare services sector. The remaining 20% of the fund's assets may be allocated to various financial instruments such as futures, options, swaps, cash, and cash equivalents to enhance liquidity and manage risk. IHF strategically targets key sectors within the healthcare provider landscape, focusing on managed healthcare, healthcare facilities, and health insurance companies, while deliberately excluding pharmaceutical firms. This approach allows IHF to offer cap-weighted exposure tailored to the healthcare provider space, providing investors with a concentrated yet comprehensive investment vehicle that captures the dynamics of health insurance a...
    negative: The First Trust Indxx NextG ETF (NXTG) seeks to replicate the performance of the Indxx 5G & NextG Thematic Index by investing at least 90% of its net assets in the index's securities. This index tracks global equities of companies that are significantly investing in the research, development, and application of fifth generation (5G) and next generation digital cellular technologies. NXTG includes mid- and large-cap companies from two main sub-themes: 5G infrastructure & hardware, which encompasses data center REITs, cell tower REITs, equipment manufacturers, network testing and validation equipment, and mobile phone manufacturers; and telecommunication service providers operating cellular and wireless communication networks with 5G access. Prior to May 29, 2019, NXTG was known as the First Trust NASDAQ Smartphone Index Fund (ticker FONE), focusing more broadly on the cellular phone industry.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.5
    }
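To make these parameters concrete, here is a minimal sketch of the per-triplet loss they describe, written in plain Python rather than with the library itself. It assumes that the cosine distance is 1 minus the cosine similarity and that the loss is max(d(anchor, positive) - d(anchor, negative) + margin, 0). The function names and toy vectors are illustrative only.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, taken here as 1 - cosine similarity (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Toy 2-d "embeddings" (hypothetical values, not model outputs).
anchor = [1.0, 0.0]
same_direction = [1.0, 0.0]
orthogonal = [0.0, 1.0]

# Positive is closer than negative by more than the margin: zero loss.
easy = triplet_loss(anchor, same_direction, orthogonal)
# Roles swapped: the positive is farther away, so the loss is large.
hard = triplet_loss(anchor, orthogonal, same_direction)
print(easy, hard)
```

With a margin of 0.5, a triplet stops contributing to the loss once the negative is at least 0.5 farther from the query (in cosine distance) than the positive.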
    

Evaluation Dataset

stage1_v1

  • Dataset: stage1_v1 at 9be9e9c
  • Size: 688 evaluation samples
  • Columns: query, positive, and negative
  • Approximate statistics based on the first 688 samples:
    • query (string): min 123 tokens, mean 127.99 tokens, max 128 tokens
    • positive (string): min 123 tokens, mean 127.99 tokens, max 128 tokens
    • negative (string): min 120 tokens, mean 127.99 tokens, max 128 tokens
  • Samples:
    Sample 1
    • query: The Global X Aging Population ETF (AGNG) aims to replicate the performance of the Indxx Aging Population Thematic Index by investing over 80% of its assets in securities from developed markets that capitalize on the trend of increasing life expectancies. The fund primarily focuses on companies engaged in biotechnology, medical devices, pharmaceuticals, senior living facilities, and specialized healthcare services, all aimed at enhancing and extending the quality of life for senior citizens. AGNG employs a proprietary research methodology that transcends traditional sector boundaries, incorporating a diverse range of industries, including insurance and consumer products. The ETF is reconstituted and rebalanced annually, utilizing a modified market-cap weighting approach with specific caps and floors to maintain diversification. Previously known as the Global X Longevity Thematic ETF under the ticker LNGR until April 2021, AGNG continues to align its investments with key demographic shif...
    • positive: The SPDR S&P Biotech ETF (XBI) employs a strategic management approach aimed at closely tracking the performance of the S&P Biotechnology Select Industry Index through a sampling strategy. By investing a minimum of 80% of its total assets in the securities of this index, XBI focuses specifically on the biotechnology sector, which is a subset of the broader S&P Total Market Index. The ETF is distinguished by its equal-weighted methodology, which ensures diversified exposure across U.S. biotech stocks, particularly emphasizing small- and micro-cap companies. This approach mitigates single-name risk by reducing the influence of larger companies, resulting in a lower weighted-average market capitalization relative to its competitors. Additionally, the ETF's structure limits overlap with the pharmaceutical industry, allowing for a more concentrated investment in innovative biotech firms. The index undergoes quarterly rebalancing, which supports its commitment to maintaining a focused invest...
    • negative: The VanEck Mortgage REIT Income ETF (MORT) employs a strategic approach to replicate the performance of the MVIS® US Mortgage REITs Index, focusing on a diverse range of mortgage real estate investment trusts (REITs). By allocating at least 80% of its total assets to securities within this benchmark, MORT targets companies across various market capitalizations, including small-, medium-, and large-cap mortgage REITs. The ETF is managed with a market-cap-weighted strategy, ensuring that larger companies have a more significant influence on its performance. While MORT features a lower expense ratio compared to its peer, the iShares Mortgage Real Estate Capped ETF (REM), it does experience challenges with liquidity. The fund maintains a concentrated portfolio, heavily aligned with its top holdings, which allows for targeted exposure to the mortgage REIT sector. This management strategy positions MORT as a compelling choice for investors seeking specialized investments in the mortgage REIT...

    Sample 2
    • query: The Global X Aging Population ETF (AGNG) aims to replicate the performance of the Indxx Aging Population Thematic Index by investing over 80% of its assets in securities from developed markets that capitalize on the trend of increasing life expectancies. The fund primarily focuses on companies engaged in biotechnology, medical devices, pharmaceuticals, senior living facilities, and specialized healthcare services, all aimed at enhancing and extending the quality of life for senior citizens. AGNG employs a proprietary research methodology that transcends traditional sector boundaries, incorporating a diverse range of industries, including insurance and consumer products. The ETF is reconstituted and rebalanced annually, utilizing a modified market-cap weighting approach with specific caps and floors to maintain diversification. Previously known as the Global X Longevity Thematic ETF under the ticker LNGR until April 2021, AGNG continues to align its investments with key demographic shif...
    • positive: The Range Cancer Therapeutics ETF (CNCR) is designed to track the Range Oncology Therapeutics Index, targeting U.S. exchange-listed pharmaceutical and biotechnology stocks, as well as American Depository Receipts (ADRs) with market capitalizations exceeding $250 million. Launched in 2023 by Range Fund Holdings, CNCR strategically allocates a minimum of 80% of its assets to the securities within the index. This ETF provides equal-weighted exposure to companies engaged in the research, development, and commercialization of oncology drugs, placing a spotlight on smaller firms with significant growth potential. CNCR is particularly appealing to investors focused on the cancer therapeutics sector. The ETF, formerly known as the Loncar Cancer Immunotherapy ETF, broadened its investment scope in October 2023 by merging with the Loncar China BioPharma ETF, thereby enhancing its exposure to promising opportunities in the Chinese markets.
    • negative: The Invesco S&P 500 Equal Weight Energy ETF (RSPG) is designed to replicate the performance of the S&P 500® Equal Weight Energy Index by investing a minimum of 90% of its total assets in securities that compose this index. This index includes all companies within the S&P 500® Energy Index that fall under the energy sector, as defined by the Global Industry Classification Standard (GICS). As a large-cap sector fund, RSPG offers equal-weight exposure to a diverse array of U.S. energy companies across various sub-industries, enhancing portfolio diversification. The fund is rebalanced quarterly to ensure a minimum inclusion of 22 companies, and it may also incorporate leading firms from the S&P MidCap 400 Index if necessary to maintain this threshold. Notably, prior to June 7, 2023, RSPG was traded under the ticker RYE.

    Sample 3
    • query: The First Trust RBA American Industrial Renaissance ETF (AIRR) is designed to closely track the performance of the Richard Bernstein Advisors American Industrial Renaissance® Index. This passively managed fund allocates a minimum of 90% of its net assets to equity securities within the index, emphasizing small and mid-cap U.S. companies primarily in the industrial and community banking sectors. Key industries targeted include Commercial Services & Supplies, Construction & Engineering, Electrical Equipment, Machinery, and Banks. The index utilizes a multifactor selection approach, systematically excluding firms with more than 25% of sales from outside the U.S. and community banks situated outside traditional Midwestern manufacturing regions. A proprietary optimization method is applied for weighting, limiting the banking sector to 10% and individual issuers to 4%. The index undergoes quarterly reconstitution and rebalancing, maintaining a focus on companies with a favorable 12-month for...
    • positive: The Invesco Global Water ETF (PIO) aims to track the investment results of the NASDAQ OMX Global Water Index, investing at least 90% of its assets in securities within the index, including ADRs and GDRs. This index comprises global exchange-listed companies from the U.S., developed, and emerging markets that produce water conservation and purification products for homes, businesses, and industries. PIO employs a liquidity-weighted strategy, resulting in a concentrated portfolio dominated by large- to mid-cap firms. Eligible companies must participate in the Green Economy, as determined by SustainableBusiness.com LLC. The fund uses full replication to track its index, with quarterly rebalancing and annual reconstitution, while maintaining country and issuer diversification limits.
    • negative: The Jacob Funds Inc. Jacob Forward ETF (JFWD) is actively managed with a focus on achieving long-term capital growth by investing in equity securities of U.S. companies engaged in innovative and disruptive technologies. The fund primarily holds common stocks but may also include other equity securities like preferred stocks, rights, or warrants. It targets companies of all sizes, with a significant emphasis on those in the early stages of development, particularly within the healthcare and information technology sectors. JFWD employs a forward-looking investment strategy, selecting securities based on a qualitative and quantitative assessment of companies' potential for above-average growth. The fund may also gain up to 25% foreign market exposure through global operations of U.S. companies. Notably, JFWD is non-diversified and will be delisted, with its last trading day on December 23, 2024.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_drop_last: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
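
As a rough sanity check on these settings, the sketch below works out how many optimizer steps such a run would take. It assumes the 2,752 training samples stated in the card metadata, a single device, and no gradient accumulation. It also assumes that the warmup length is the ceiling of warmup_ratio times the total step count. The no_duplicates batch sampler can yield slightly fewer batches in a given epoch, so treat the figures as approximate.

```python
import math

train_samples = 2752               # from the card metadata (dataset_size:2752)
per_device_train_batch_size = 16
num_train_epochs = 10
warmup_ratio = 0.1
num_devices = 1                    # assumption: single-device training

# dataloader_drop_last=True discards a trailing partial batch,
# which is why floor division is used here.
steps_per_epoch = train_samples // (per_device_train_batch_size * num_devices)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions an epoch is 172 steps, which is consistent with the first training-log entry: step 10 is reported at epoch 0.0581, and 10 / 172 ≈ 0.0581.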

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
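
Taken together, the learning_rate, lr_scheduler_type, and warmup_ratio entries above describe a learning rate that ramps up linearly during warmup and then decays linearly to zero. The following plain-Python sketch illustrates that shape. The totals of 1,720 training steps and 172 warmup steps are assumptions (roughly 2,752 samples at batch size 16 for 10 epochs on one device), and linear_schedule_lr is a hypothetical helper, not a library function.

```python
def linear_schedule_lr(step, base_lr=5e-05, warmup_steps=172, total_steps=1720):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: scale from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale from base_lr down to 0 at total_steps.
    remaining = max(total_steps - step, 0)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 86, 172, 946, 1720):
    print(step, linear_schedule_lr(step))
```

Under these assumed totals the rate peaks at 5e-05 when warmup ends (step 172) and is back to half of that by step 946, the midpoint of the decay phase.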

Training Logs

Epoch Step Training Loss Validation Loss
0.0581 10 0.4273 -
0.1163 20 0.3954 -
0.1744 30 0.2946 -
0.2326 40 0.2368 -
0.2907 50 0.1625 -
0.3488 60 0.1752 -
0.4070 70 0.1091 -
0.4651 80 0.1102 -
0.5233 90 0.0671 -
0.5814 100 0.0753 0.0678
0.6395 110 0.061 -
0.6977 120 0.0218 -
0.7558 130 0.0676 -
0.8140 140 0.0591 -
0.8721 150 0.0454 -
0.9302 160 0.0554 -
0.9884 170 0.0344 -
1.0523 180 0.0295 -
1.1105 190 0.0347 -
1.1686 200 0.032 0.0274
1.2267 210 0.0163 -
1.2849 220 0.0346 -
1.3430 230 0.0209 -
1.4012 240 0.0209 -
1.4593 250 0.0112 -
1.5174 260 0.0095 -
1.5756 270 0.016 -
1.6337 280 0.0123 -
1.6919 290 0.0173 -
1.75 300 0.0144 0.0171
1.8081 310 0.0182 -
1.8663 320 0.0223 -
1.9244 330 0.0103 -
1.9826 340 0.0071 -
2.0407 350 0.0085 -
2.0988 360 0.0045 -
2.1570 370 0.0058 -
2.2151 380 0.001 -
2.2733 390 0.0053 -
2.3314 400 0.0108 0.0093
2.3895 410 0.0017 -
2.4477 420 0.0024 -
2.5058 430 0.0075 -
2.5640 440 0.0022 -
2.6221 450 0.0044 -
2.6802 460 0.0001 -
2.7384 470 0.0022 -
2.7965 480 0.0016 -
2.8547 490 0.0078 -
2.9128 500 0.0 0.0045
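
For readers who want to work with these numbers programmatically, here is a small sketch that parses rows in the four-column layout above and collects the validation checkpoints. Only a handful of rows are copied in for brevity, and the helper name parse_log_rows is made up for this example.

```python
def parse_log_rows(text):
    """Parse 'epoch step train_loss val_loss' rows; '-' means no validation."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, train_loss, val_loss = line.split()
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "train_loss": float(train_loss),
            "val_loss": None if val_loss == "-" else float(val_loss),
        })
    return rows


# A few rows copied from the log above.
sample = """
0.5814 100 0.0753 0.0678
1.1686 200 0.032 0.0274
1.75 300 0.0144 0.0171
2.3314 400 0.0108 0.0093
2.8547 490 0.0078 -
2.9128 500 0.0 0.0045
"""

rows = parse_log_rows(sample)
validated = [(r["step"], r["val_loss"]) for r in rows if r["val_loss"] is not None]
best_step, best_val = min(validated, key=lambda pair: pair[1])
print(validated)
print(best_step, best_val)
```

In the log as shown, validation loss is recorded every 100 steps and is lowest at step 500 (0.0045).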

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}