Adrian Lucas Malec

adlumal

AI & ML interests

None yet

Recent Activity

posted an update 1 day ago

I benchmarked embedding APIs for speed, compared local vs hosted models, and tuned USearch for sub-millisecond retrieval on 143k chunks using only CPU. The post walks through the results, trade-offs, and what I learned about embedding API terms of service. The main motivation for using USearch is that CPU compute is cheap and easy to scale. Blog post: https://huggingface.co/blog/adlumal/lightning-fast-vector-search-for-legal-documents

published an article 1 day ago

How I Built Lightning-Fast Vector Search for Legal Documents

reacted to abdurrahmanbutler's post with ❤️ 5 days ago

🎉 I am excited to share news of a project my brother, Umar Butler, and I have been working on for what feels like an eternity now. Introducing MLEB, the Massive Legal Embedding Benchmark. A suite of 10 high-quality English legal IR datasets, designed by legal experts to set a new standard for comparing embedding models. Whether you’re exploring legal RAG on your home computer, or running enterprise-scale retrieval, apples-to-apples evaluation is crucial. That’s why we’ve open-sourced everything, including our 7 brand-new, hand-crafted retrieval datasets. All of these datasets are now live on Hugging Face. Any guesses which embedding model leads on legal retrieval? Hint: it’s not OpenAI or Google; they place 7th and 9th on our leaderboard. To do well on MLEB, embedding models must demonstrate both extensive legal domain knowledge and strong legal reasoning skills. https://huggingface.co/blog/isaacus/introducing-mleb

adlumal's datasets

None public yet