---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- transformers
---

## LGAI-Embedding-Preview

We have trained the **LGAI-Embedding-Preview** model based on the [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) LLM. The initial goal is to reproduce the baseline model and verify the workflow for uploading results:

- [x] Checkpoint
- [x] Technical report

## MTEB

Inference is performed with in-context examples for MTEB evaluation.

## Model Information

- Model Size: 7B
- Embedding Dimension: 4096
- Max Input Tokens: 32k

## Requirements

```
transformers>=4.48.3
```

## Citation

If you find this repository useful, please consider citing it.

```
@misc{choi2025lgaiembeddingpreviewtechnicalreport,
      title={LGAI-EMBEDDING-Preview Technical Report},
      author={Jooyoung Choi and Hyun Kim and Hansol Jang and Changwook Jun and Kyunghoon Bae and Hyewon Choi and Stanley Jungkyu Choi and Honglak Lee and Chulmin Yun},
      year={2025},
      eprint={2506.07438},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.07438},
}
```
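## Usage

The card does not yet include a usage snippet, so the following is a minimal sketch of encoding text through the `sentence-transformers` integration advertised in the tags above. The repository id used here is an assumption (replace it with the actual Hub path of this checkpoint), and any task-specific instruction or prompt format required by the model is not shown; consult the technical report for the exact conventions.

```python
# Minimal sketch, assuming the checkpoint loads directly with sentence-transformers.
# The repo id below is hypothetical; replace it with the real Hub path.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LGAI-EXAONE/LGAI-Embedding-Preview")

queries = ["How do sentence embeddings work?"]
documents = [
    "Sentence embeddings map text to fixed-size vectors.",
    "Mistral-7B is a 7-billion-parameter language model.",
]

# Encode both sides; with normalized embeddings, a dot product is cosine similarity.
query_emb = model.encode(queries, normalize_embeddings=True)
doc_emb = model.encode(documents, normalize_embeddings=True)
scores = query_emb @ doc_emb.T  # shape (1, 2); embedding dimension is 4096 per this card
print(scores)
```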