Luis Vasquez

LuisVasquezBSC

AI & ML interests

NLP @ BSC

Recent Activity

updated a Space 22 days ago
langtech-annotators/redteam
liked a dataset about 2 months ago
langtech-languagemodeling/veritasQA
liked a Space 4 months ago
CohereLabsCommunity/m-rewardbench

Organizations

Language Technologies Unit @ Barcelona Supercomputing Center
Plan de Tecnologías del Lenguaje - Gobierno de España
Projecte Aina
Language Modeling Group
Langtech Annotators

LuisVasquezBSC's activity

New activity in projecte-aina/WikiCAT_ca 8 months ago
reacted to Tonic's post with 👀 8 months ago
🙋🏻‍♂️ Hey there folks,

🦎 Salamandra release by @mvillegas and team
@BSC_CNS BSC-LT is absolutely impressive so far!

perhaps the largest single training dataset of high-quality text to date: 7.8 trillion tokens in 35 European languages and code.

the best part: the data was correctly licensed, so it's actually future-proof!

the completions model is really creative, and the instruct fine-tuned version is very good too.

now you can use such models for multilingual enterprise applications with further fine-tunes; long response generation and structured outputs (coding) also work.

check out 👇🏻
the collection: BSC-LT/salamandra-66fc171485944df79469043a
the repo: https://github.com/langtech-bsc/salamandra
7B-Instruct demo: Tonic/Salamandra-7B