🚀 Multidimensional Affective Analysis for Guarani/Jopara! 🌎
This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).
Highlights:
🧵 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
💻 Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base, and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
📓 Baseline Notebooks:
- Fine-tuning BERT-based models (see the sketch below)
- NCRF++ models via GitHub
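For a taste of what the fine-tuning notebooks do, here is a minimal sketch with transformers. The checkpoint and corpus ids below are placeholders, not the project's actual names; swap in the real models and corpora from the release:

```python
# Minimal sketch: fine-tune a Guarani BERT checkpoint for emotion recognition.
# "your-org/..." ids are placeholders for the project's actual artifacts.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "your-org/guarani-bert-base"          # placeholder checkpoint id
dataset = load_dataset("your-org/guarani-emotion")  # placeholder corpus id

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=6)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gn-emotion", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # assumes the corpus ships a split
)
trainer.train()
```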
How does DeepSeek R1 work? I was wondering the same, so I wrote a thread explaining from scratch everything you need to know about it! Breaking down the paper here: https://x.com/AlexBodner_/status/1883602267317927965
I just dropped a detailed guide on deploying ML models to Google Cloud Run with GPU support, completely serverless and auto-scaling. If you’re curious about seamlessly deploying your models to the cloud, give it a read! https://medium.com/@alexbodner/deployment-of-serverless-machine-learning-models-with-gpus-using-google-cloud-cloud-run-573b836475b5
Just published a post explaining Monte Carlo Tree Search: the magic behind AlphaZero, now also used to tackle reasoning benchmarks with LLMs. Check it out, it's a must-know nowadays!
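If you want a feel for the algorithm before reading, here is a toy single-agent MCTS skeleton. The game interface (legal_moves, play, is_terminal, result) is hypothetical, and two-player games additionally need reward sign flips during backpropagation; the post covers the full picture:

```python
import math
import random

# Toy MCTS skeleton: the four classic phases over a hypothetical game API.
class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def ucb1(self, c=1.4):
        # UCB1: exploit (mean value) + explore (visit-count bonus).
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root, n_iters=1_000):
    for _ in range(n_iters):
        node = root
        # 1. Selection: walk down by UCB1 until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: grow one level if the leaf isn't terminal.
        if not node.state.is_terminal():
            node.children = [Node(node.state.play(m), parent=node)
                             for m in node.state.legal_moves()]
            node = random.choice(node.children)
        # 3. Simulation: random rollout to the end of the game.
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        reward = state.result()
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Best move = most-visited child of the root.
    return max(root.children, key=lambda n: n.visits)
```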
Discover how we replaced the classic game engine with DIAMOND, a Neural Network that predicts every frame based on actions, noise, and past states. From training on human and RL gameplay to generating surreal hallucinations, this project shows the potential of diffusion models in creating amazing simulations. 🎮
When you come across an interesting dataset, you often wonder: Which topics frequently appear in these documents? 🤔 What is this data really about? 📊
Topic modeling helps answer these questions by identifying recurring themes within a collection of documents. This process enables quick and efficient exploratory data analysis.
I’ve been working on an app that leverages BERTopic, a flexible framework designed for topic modeling. Its modularity is what makes BERTopic powerful: you can swap components for your preferred algorithms. It also handles large datasets efficiently by merging models with the BERTopic.merge_models approach. 🔗
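That merging step is a one-liner; a minimal sketch, with 20 Newsgroups standing in for your sharded data:

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Stand-in data: split one corpus into two shards, as you would to keep
# peak memory bounded on a large dataset.
docs = fetch_20newsgroups(subset="all",
                          remove=("headers", "footers", "quotes")).data
shard_1, shard_2 = docs[: len(docs) // 2], docs[len(docs) // 2 :]

# Fit one topic model per shard.
model_1 = BERTopic(min_topic_size=50).fit(shard_1)
model_2 = BERTopic(min_topic_size=50).fit(shard_2)

# Merge into a single model; topics from model_2 that are sufficiently
# different from model_1's get appended as new topics.
merged = BERTopic.merge_models([model_1, model_2], min_similarity=0.7)
print(merged.get_topic_info().head())
```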
🔍 How do we make this work? Here’s the stack we’re using:
📂 Data Source ➡️ Hugging Face datasets with DuckDB for retrieval
🧠 Text Embeddings ➡️ Sentence Transformers (all-MiniLM-L6-v2)
⚡ Dimensionality Reduction ➡️ RAPIDS cuML UMAP for GPU-accelerated performance
🔍 Clustering ➡️ RAPIDS cuML HDBSCAN for fast clustering
✂️ Tokenization ➡️ CountVectorizer
🔧 Representation Tuning ➡️ KeyBERTInspired + Hugging Face Inference Client with Meta-Llama-3-8B-Instruct
🌍 Visualization ➡️ Datamapplot library

Check out the space and see how you can quickly generate topics from your dataset: datasets-topics/topics-generator
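Wired together in BERTopic, the stack looks roughly like this minimal sketch. The cuML imports assume a RAPIDS install; the Llama-3 labeling via the Inference Client and the DuckDB retrieval are omitted for brevity, and ag_news stands in for your dataset:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from cuml.cluster import HDBSCAN      # GPU-accelerated (RAPIDS)
from cuml.manifold import UMAP        # GPU-accelerated (RAPIDS)
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import CountVectorizer

# Stand-in for your Hugging Face dataset column.
docs = load_dataset("ag_news", split="train[:5000]")["text"]

topic_model = BERTopic(
    embedding_model=SentenceTransformer("all-MiniLM-L6-v2"),
    umap_model=UMAP(n_components=5, n_neighbors=15, min_dist=0.0),
    hdbscan_model=HDBSCAN(min_cluster_size=15),
    vectorizer_model=CountVectorizer(stop_words="english"),
    representation_model=KeyBERTInspired(),
)
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info().head())
```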
💾🧠 How much VRAM will you need to train your AI model? 💾🧠 Check out this app, which converts: PyTorch/TensorFlow summary -> required VRAM, or parameter count -> required VRAM.
And everything is open source! Ask for new functionalities or contribute at https://github.com/AlexBodner/How_Much_VRAM If it's useful to you, leave a star 🌟 and share it with someone who will find the tool useful!
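The core back-of-envelope math the tool automates looks like this; a minimal sketch assuming fp32 weights and Adam, not the app's exact formula:

```python
def training_vram_gb(param_count: int, bytes_per_param: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough training-VRAM estimate: fp32 weights (1x) + gradients (1x)
    + Adam moment buffers (2x) = 4x the parameters, times a fudge factor
    for activations and framework overhead."""
    states = 4  # weights + grads + Adam m and v
    return param_count * bytes_per_param * states * overhead / 1e9

# Example: a 7B-parameter model trained in fp32 with Adam.
print(f"{training_vram_gb(7_000_000_000):.0f} GB")  # ~134 GB
```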
Hello friends, I want to share my latest Kaggle notebook for creating a podcast from papers (or any PDF) in more than 21 languages. It works with any LLM (I use Gemma2-9b-it) and edge-tts for the TTS. I hope it helps you catch up with the papers that are coming out faster and faster every day!
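The TTS half of that pipeline is only a few lines with edge-tts; a minimal sketch (the notebook adds the PDF parsing and the LLM script generation on top):

```python
import asyncio
import edge_tts

async def synthesize(text: str, voice: str = "en-US-AriaNeural",
                     out_file: str = "podcast.mp3") -> None:
    # Stream the generated speech straight to an MP3 file.
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(out_file)

# `script` would be the dialogue your LLM generated from the paper.
script = "Welcome to today's episode, where we break down a new paper..."
asyncio.run(synthesize(script))
```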
🚀 Excited to share the latest update to the Notebook Creator Tool!
Now with basic Supervised Fine-Tuning (SFT) support! 🎯
How it works:
1️⃣ Choose your Hugging Face dataset and notebook type (SFT)
2️⃣ Automatically generate your training notebook
3️⃣ Start fine-tuning with your data!
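Under the hood, a generated SFT notebook boils down to something like this sketch with trl's SFTTrainer. The model and dataset here are the examples from the trl docs, the tool plugs in your own choices, and trl's API details vary by version:

```python
# Minimal SFT sketch using trl; swap in the dataset you picked in the tool.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-output"),
)
trainer.train()
```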
Link to the app 👉 https://lnkd.in/e_3nmWrB
💡 Want to contribute with new notebooks? 👉 https://lnkd.in/eWcZ92dS
I've been working on a Space that makes it super easy to create notebooks and helps users quickly understand and manipulate their data! With just a few clicks, automatically generate notebooks for:
📊 Exploratory Data Analysis 🧠 Text Embeddings 🤖 Retrieval-Augmented Generation (RAG)
✨ Automatic training is coming soon! Check it out here: asoria/auto-notebook-creator. Appreciate any feedback to improve this tool 🤗
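As a taste, the Text Embeddings notebook amounts to roughly this minimal sketch, with ag_news and the query standing in for whatever dataset and column you pick:

```python
# Sketch of a generated Text Embeddings notebook: embed a dataset column
# and run a quick cosine-similarity query over it.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

ds = load_dataset("ag_news", split="train[:1000]")
model = SentenceTransformer("all-MiniLM-L6-v2")

embeddings = model.encode(ds["text"], show_progress_bar=True)

# Sanity check: find the document most similar to a free-text query.
query = model.encode("stock markets and interest rates")
scores = embeddings @ query / (
    (embeddings ** 2).sum(axis=1) ** 0.5 * (query @ query) ** 0.5)
print(ds["text"][int(scores.argmax())])
```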