Natalia

natalika

AI & ML interests

None yet

natalika's activity

reacted to ZennyKenny's post with 🚀👍🤗 about 1 month ago

Post

442

On-demand audio transcription is an often-requested service without many good options on the market.

Using Hugging Face Spaces with the Gradio SDK and the OpenAI Whisper model, I've put together a simple interface that supports the transcription and summarisation of audio files up to five minutes in length. It is completely open source and runs on the CPU Upgrade hardware tier. The cool thing is that it's built without a dedicated inference endpoint, entirely on public infrastructure.

Check it out: https://huggingface.co/spaces/ZennyKenny/AudioTranscribe

I wrote a short article about the backend mechanics for those who are interested: https://huggingface.co/blog/ZennyKenny/on-demand-public-transcription

reacted to lewtun's post with 🔥 about 1 month ago

Post

10398

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1

5 replies

reacted to ZennyKenny's post with 🚀🔥👍 about 1 month ago

Post

460

GradientBoostingClassifier is an estimator in Python's scikit-learn library, and now you can quickly train an ML model using this powerful technique on any (viable) dataset on the Hugging Face Hub without writing a line of code.

Love finishing a project right when the late night starts to turn into the early morning: sklearn-docs/GradientBoostingClassifier

Long-time listener, first-time caller, but always pleased to contribute, even if only adjacently, to the power of scikit-learn.
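For readers who do want to see the code behind the no-code interface, here is a minimal sketch of training a GradientBoostingClassifier directly with scikit-learn. It is not the Space's implementation: the synthetic dataset, the train/test split, and the hyperparameter values are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption); a real run would load a tabular
# dataset with a label column instead.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameter values are illustrative, not tuned.
model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# Mean accuracy on the held-out split.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

In practice the features and labels would come from a Hub dataset rather than `make_classification`, which is presumably where the "(viable)" caveat in the post comes in: the data needs numeric features and a usable target column.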

reacted to ZennyKenny's post with ❤️👍🤗🔥 about 1 month ago

Post

3476

I've completed the first unit of the just-launched Hugging Face Agents Course. I would highly recommend it, even for experienced builders, because it is a great walkthrough of the smolagents library and toolkit.

reacted to ZennyKenny's post with 🔥🤗🚀 about 1 month ago

Post

2235

Really excited to start contributing to the SWE Arena project: https://swe-arena.com/

Led by IBM PhD fellow @terryyz, our goal is to advance research in code generation and app development by frontier LLMs.

reacted to ZennyKenny's post with 👍🔥🚀 about 1 month ago

Post

1916

I've spent most of my time working with AI on user-facing apps like Chatbots and TextGen, but today I decided to work on something that I think has a lot of applications for Data Science teams: ZennyKenny/comment_classification

This Space supports uploading a user CSV and categorizing the fields based on user-defined categories. The applications of AI in production are truly endless. 🚀
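The post does not say how the Space assigns categories, so the following is only a rough sketch of the general idea of labeling CSV rows against user-defined categories. It swaps in plain keyword matching as a stand-in for whatever model the Space actually uses. The function name `classify_rows`, the sample CSV, and the keyword lists are all hypothetical.

```python
import csv
import io


def classify_rows(csv_text, column, categories, default="uncategorized"):
    """Label each row's `column` text with the category that has the most keyword hits.

    `categories` maps a category name to a list of lowercase keywords.
    Keyword matching is a simple stand-in here, not the Space's method.
    """
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = row[column].lower()
        # Count how many of each category's keywords occur in the text.
        scores = {
            name: sum(keyword in text for keyword in keywords)
            for name, keywords in categories.items()
        }
        best = max(scores, key=scores.get, default=None)
        label = best if best is not None and scores[best] > 0 else default
        results.append((row[column], label))
    return results


# Hypothetical sample data and user-defined categories.
sample = (
    "id,comment\n"
    "1,The app crashes on login\n"
    "2,Please add dark mode\n"
    "3,Great work\n"
)
categories = {
    "bug": ["crash", "error"],
    "feature request": ["please add", "would like"],
}
labeled = classify_rows(sample, "comment", categories)
for comment, label in labeled:
    print(f"{label}: {comment}")
```

Rows that match no keyword fall back to the `default` label, which keeps unmatched comments visible instead of forcing them into the first category.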

reacted to ZennyKenny's post with 🚀🔥❤️ about 1 month ago

Post

3137

After hearing the news that Marc Andreessen thinks the only job safe from AI replacement is venture capital (https://gizmodo.com/marc-andreessen-says-one-job-is-mostly-safe-from-ai-venture-capitalist-2000596506) 🧠🧠🧠, the Reasoned Capital synthetic dataset suddenly feels much more topical: ZennyKenny/synthetic_vc_financial_decisions_reasoning_dataset 🔥🔥🔥

Really looking forward to potentially expanding this architecture and seeing just how algorithmic clever investing truly is! 💰💰💰