🚀 Multidimensional Affective Analysis for Guarani/Jopara! 🌎
This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).
Highlights:
🧵 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
💻 Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base, and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
📓 Baseline Notebooks:
- Fine-tuning BERT-based models (see the sketch below)
- NCRF++ models via GitHub
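For a taste of what the fine-tuning notebooks do, here is a minimal sketch with transformers. The checkpoint and corpus ids below are placeholders, not the project's actual names; swap in the real models and corpora from the release:

```python
# Minimal sketch: fine-tune a Guarani BERT checkpoint for emotion recognition.
# "your-org/..." ids are placeholders for the project's actual artifacts.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "your-org/guarani-bert-base"          # placeholder checkpoint id
dataset = load_dataset("your-org/guarani-emotion")  # placeholder corpus id

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=6)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gn-emotion", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # assumes the corpus ships a split
)
trainer.train()
```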
How does DeepSeek R1 work? I was wondering the same, so I wrote a thread explaining from scratch everything you need to know about it! Breaking down the paper here: https://x.com/AlexBodner_/status/1883602267317927965
I just dropped a detailed guide on deploying ML models to Google Cloud Run with GPU support, completely serverless and auto-scaling. If you’re curious about seamlessly deploying your models to the cloud, give it a read! https://medium.com/@alexbodner/deployment-of-serverless-machine-learning-models-with-gpus-using-google-cloud-cloud-run-573b836475b5
Just published a post explaining Monte Carlo Tree Search: the magic behind AlphaZero, now also used to tackle reasoning benchmarks with LLMs. Check it out, it's a must-know nowadays!
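If you want a feel for the algorithm before reading, here is a toy single-agent MCTS skeleton. The game interface (legal_moves, play, is_terminal, result) is hypothetical, and two-player games additionally need reward sign flips during backpropagation; the post covers the full picture:

```python
import math
import random

# Toy MCTS skeleton: the four classic phases over a hypothetical game API.
class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def ucb1(self, c=1.4):
        # UCB1: exploit (mean value) + explore (visit-count bonus).
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root, n_iters=1_000):
    for _ in range(n_iters):
        node = root
        # 1. Selection: walk down by UCB1 until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: grow one level if the leaf isn't terminal.
        if not node.state.is_terminal():
            node.children = [Node(node.state.play(m), parent=node)
                             for m in node.state.legal_moves()]
            node = random.choice(node.children)
        # 3. Simulation: random rollout to the end of the game.
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        reward = state.result()
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Best move = most-visited child of the root.
    return max(root.children, key=lambda n: n.visits)
```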
Discover how we replaced the classic game engine with DIAMOND, a Neural Network that predicts every frame based on actions, noise, and past states. From training on human and RL gameplay to generating surreal hallucinations, this project shows the potential of diffusion models in creating amazing simulations. 🎮
When you come across an interesting dataset, you often wonder: Which topics frequently appear in these documents? 🤔 What is this data really about? 📊
Topic modeling helps answer these questions by identifying recurring themes within a collection of documents. This process enables quick and efficient exploratory data analysis.
I’ve been working on an app that leverages BERTopic, a flexible framework designed for topic modeling. Its modularity is what makes BERTopic powerful: you can swap components for your preferred algorithms. It also handles large datasets efficiently by merging models with the BERTopic.merge_models approach. 🔗
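That merging step is a one-liner; a minimal sketch, with 20 Newsgroups standing in for your sharded data:

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Stand-in data: split one corpus into two shards, as you would to keep
# peak memory bounded on a large dataset.
docs = fetch_20newsgroups(subset="all",
                          remove=("headers", "footers", "quotes")).data
shard_1, shard_2 = docs[: len(docs) // 2], docs[len(docs) // 2 :]

# Fit one topic model per shard.
model_1 = BERTopic(min_topic_size=50).fit(shard_1)
model_2 = BERTopic(min_topic_size=50).fit(shard_2)

# Merge into a single model; topics from model_2 that are sufficiently
# different from model_1's get appended as new topics.
merged = BERTopic.merge_models([model_1, model_2], min_similarity=0.7)
print(merged.get_topic_info().head())
```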
🔍 How do we make this work? Here’s the stack we’re using:
📂 Data Source ➡️ Hugging Face datasets with DuckDB for retrieval
🧠 Text Embeddings ➡️ Sentence Transformers (all-MiniLM-L6-v2)
⚡ Dimensionality Reduction ➡️ RAPIDS cuML UMAP for GPU-accelerated performance
🔍 Clustering ➡️ RAPIDS cuML HDBSCAN for fast clustering
✂️ Tokenization ➡️ CountVectorizer
🔧 Representation Tuning ➡️ KeyBERTInspired + Hugging Face Inference Client with Meta-Llama-3-8B-Instruct
🌍 Visualization ➡️ Datamapplot library

Check out the space and see how you can quickly generate topics from your dataset: datasets-topics/topics-generator
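Wired together in BERTopic, the stack looks roughly like this minimal sketch. The cuML imports assume a RAPIDS install; the Llama-3 labeling via the Inference Client and the DuckDB retrieval are omitted for brevity, and ag_news stands in for your dataset:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from cuml.cluster import HDBSCAN      # GPU-accelerated (RAPIDS)
from cuml.manifold import UMAP        # GPU-accelerated (RAPIDS)
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import CountVectorizer

# Stand-in for your Hugging Face dataset column.
docs = load_dataset("ag_news", split="train[:5000]")["text"]

topic_model = BERTopic(
    embedding_model=SentenceTransformer("all-MiniLM-L6-v2"),
    umap_model=UMAP(n_components=5, n_neighbors=15, min_dist=0.0),
    hdbscan_model=HDBSCAN(min_cluster_size=15),
    vectorizer_model=CountVectorizer(stop_words="english"),
    representation_model=KeyBERTInspired(),
)
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info().head())
```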
💾🧠 How much VRAM will you need to train your AI model? 💾🧠 Check out this app, which converts: PyTorch/TensorFlow summary -> required VRAM, or parameter count -> required VRAM.
And everything is open source! Ask for new functionalities or contribute at https://github.com/AlexBodner/How_Much_VRAM If it's useful to you, leave a star 🌟 and share it with someone who will find the tool useful!
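The core back-of-envelope math the tool automates looks like this; a minimal sketch assuming fp32 weights and Adam, not the app's exact formula:

```python
def training_vram_gb(param_count: int, bytes_per_param: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough training-VRAM estimate: fp32 weights (1x) + gradients (1x)
    + Adam moment buffers (2x) = 4x the parameters, times a fudge factor
    for activations and framework overhead."""
    states = 4  # weights + grads + Adam m and v
    return param_count * bytes_per_param * states * overhead / 1e9

# Example: a 7B-parameter model trained in fp32 with Adam.
print(f"{training_vram_gb(7_000_000_000):.0f} GB")  # ~134 GB
```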
Hello friends, I want to share my latest Kaggle notebook for creating a podcast from papers (or any PDF) in more than 21 languages. It works with any LLM (I use Gemma2-9b-it) and edge-tts for the TTS. I hope it helps you catch up with the papers that are coming out faster and faster every day!
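The TTS half of that pipeline is only a few lines with edge-tts; a minimal sketch (the notebook adds the PDF parsing and the LLM script generation on top):

```python
import asyncio
import edge_tts

async def synthesize(text: str, voice: str = "en-US-AriaNeural",
                     out_file: str = "podcast.mp3") -> None:
    # Stream the generated speech straight to an MP3 file.
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(out_file)

# `script` would be the dialogue your LLM generated from the paper.
script = "Welcome to today's episode, where we break down a new paper..."
asyncio.run(synthesize(script))
```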
🚀 Excited to share the latest update to the Notebook Creator Tool!
Now with basic Supervised Fine-Tuning (SFT) support! 🎯
How it works:
1️⃣ Choose your Hugging Face dataset and notebook type (SFT)
2️⃣ Automatically generate your training notebook
3️⃣ Start fine-tuning with your data!
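Under the hood, a generated SFT notebook boils down to something like this sketch with trl's SFTTrainer. The model and dataset here are the examples from the trl docs, the tool plugs in your own choices, and trl's API details vary by version:

```python
# Minimal SFT sketch using trl; swap in the dataset you picked in the tool.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-output"),
)
trainer.train()
```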
Link to the app 👉 https://lnkd.in/e_3nmWrB
💡 Want to contribute with new notebooks? 👉 https://lnkd.in/eWcZ92dS
I've been working on a Space that makes it super easy to create notebooks and helps users quickly understand and manipulate their data! With just a few clicks, automatically generate notebooks for:
📊 Exploratory Data Analysis 🧠 Text Embeddings 🤖 Retrieval-Augmented Generation (RAG)
✨ Automatic training is coming soon! Check it out here: asoria/auto-notebook-creator. Appreciate any feedback to improve this tool 🤗
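As a taste, the Text Embeddings notebook amounts to roughly this minimal sketch, with ag_news and the query standing in for whatever dataset and column you pick:

```python
# Sketch of a generated Text Embeddings notebook: embed a dataset column
# and run a quick cosine-similarity query over it.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

ds = load_dataset("ag_news", split="train[:1000]")
model = SentenceTransformer("all-MiniLM-L6-v2")

embeddings = model.encode(ds["text"], show_progress_bar=True)

# Sanity check: find the document most similar to a free-text query.
query = model.encode("stock markets and interest rates")
scores = embeddings @ query / (
    (embeddings ** 2).sum(axis=1) ** 0.5 * (query @ query) ** 0.5)
print(ds["text"][int(scores.argmax())])
```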