Prithiv Sakthi

prithivMLmods

AI & ML interests

computer vision, nlp, multimodality @strangerzonehf @strangerguardhf

Recent Activity

updated a model about 8 hours ago
prithivMLmods/Blitzar-Coder-4B-F.1
updated a model about 8 hours ago
prithivMLmods/Blitzar-Coder-4B-F.1-GGUF
liked a model about 8 hours ago
prithivMLmods/Blitzar-Coder-4B-F.1

Organizations

DFCI-Hale; Stanford AI; DataScienceEngineering; AI FILMS; MISATO-dataset; Masakhane NLP; GEM benchmark; shareAI; OpenGVLab; The Chinese University of Hong Kong; MusicAI; BigScience Biomedical Datasets; Speech and Language Processing Lab - Sharif University Of Technology; OpenVINO Toolkit; Tecnologico de Monterrey; LLMs; Text Mining Group, Nanjing University of Science and Technology; ONNXConfig for all; Gradio-Themes-Party; Georgia Tech (Georgia Institute of Technology); scikit-learn; lora concepts library; DeepGHS; Open-Source AI Meetup; TorchGeo; Chinese-Vicuna; Literally Me FRFR Research Society; East China Normal University; Kornia AI; Université Dauphine-PSL; Platzi Community; Tune a video concepts library; Keras Dreambooth Event; Stable Diffusion Dreambooth Concepts Library; The Waifu Research Department; Musika; Binghamton University; Blog-explorers; dnagpt; OpenSky; AI Tamil Nadu; OpenLLM France; huggingPartyParis; Team Tonic; Johns Hopkins University; MLX Vision; That Time I got Reincarnated as a Hugging Face Organization; SímboloAI; LocalLLaMA; Major TOM; MLX Community; Cohere Labs Community; M4-ai; Chinese LLMs on Hugging Face; ONNX Community; Swarms; Dataset Tools; Nerdy Face; Académie Du Numérique; Stranger Zone; open/ acc; Data Is Better Together Contributor; None yet; Taiwan Llama; Doge Face; LiteRT Community (FKA TFLite); Stranger Guard; Text Analysis, Understanding, and Reasoning Development; Twinkle AI; PowergenAI; Hugging Face MCP Course; Agents-MCP-Hackathon

prithivMLmods's activity

replied to their post 4 days ago

I had added cookbooks to the list of courses by HF but missed mentioning smolagents 😭. Updated now.
@merve


replied to ginipick's post 5 days ago
reacted to their post with 👍❤️ 6 days ago
OpenAI, Google, Hugging Face, and Anthropic have released guides and courses on building agents, prompting techniques, scaling AI use cases, and more. Below are 10+ minimalistic guides and courses that may help you in your progress. 📖

⤷ Agents Companion : https://www.kaggle.com/whitepaper-agent-companion
⤷ Building Effective Agents : https://www.anthropic.com/engineering/building-effective-agents
⤷ Guide to building agents by OpenAI : https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf
⤷ Prompt engineering by Google : https://www.kaggle.com/whitepaper-prompt-engineering
⤷ Google: 601 real-world gen AI use cases : https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders
⤷ Prompt engineering by IBM : https://www.ibm.com/think/topics/prompt-engineering-guide
⤷ Prompt Engineering by Anthropic : https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview
⤷ Scaling AI use cases : https://cdn.openai.com/business-guides-and-resources/identifying-and-scaling-ai-use-cases.pdf
⤷ Prompting Guide 101 : https://services.google.com/fh/files/misc/gemini-for-google-workspace-prompting-guide-101.pdf
⤷ AI in the Enterprise by OpenAI : https://cdn.openai.com/business-guides-and-resources/ai-in-the-enterprise.pdf

by HF🤗 :
⤷ AI Agents Course by Hugging Face : https://huggingface.co/learn/agents-course/unit0/introduction
⤷ smolagents Docs : https://huggingface.co/docs/smolagents/en/tutorials/building_good_agents
⤷ MCP Course by Hugging Face : https://huggingface.co/learn/mcp-course/unit0/introduction
⤷ Other Courses (LLM, Computer Vision, Deep RL, Audio, Diffusion, Cookbooks, etc.) : https://huggingface.co/learn
posted an update 6 days ago
reacted to their post with 🤗❤️ 7 days ago
Just made a demo for Cosmos-Reason1, a physical AI model that understands physical common sense and generates appropriate embodied decisions in natural language through long chain-of-thought reasoning. Also added video understanding support to it. 🤗🚀

✦ Try the demo here : prithivMLmods/DocScope-R1

⤷ Cosmos-Reason1-7B : nvidia/Cosmos-Reason1-7B
⤷ docscopeOCR-7B-050425-exp : prithivMLmods/docscopeOCR-7B-050425-exp
⤷ Captioner-Relaxed : Ertugrul/Qwen2.5-VL-7B-Captioner-Relaxed

⤷ Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0

⤷ GitHub :
https://github.com/PRITHIVSAKTHIUR/Cosmos-x-DocScope
https://github.com/PRITHIVSAKTHIUR/Nvidia-Cosmos-Reason1-Demo

To learn more, visit the model card of the respective model.
posted an update 7 days ago
reacted to AdinaY's post with 🚀 10 days ago
Orsta 🔥 vision language models trained with V-Triune, a unified reinforcement learning system by MiniMax AI

One-RL-to-See-Them-All/one-rl-to-see-them-all-6833d27abce23898b2f9815a

✨ 7B & 32B with MIT license
✨ Masters 8 visual tasks: math, science QA, charts, puzzles, object detection, grounding, OCR, and counting
✨ Uses Dynamic IoU rewards for better visual understanding
✨ Strong performance in visual reasoning and perception
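
For readers unfamiliar with the IoU reward mentioned in the list above, here is a minimal sketch of an intersection-over-union score and a thresholded reward built on it. The post does not say how the reward is made "dynamic", so the `threshold` parameter (which a training loop could vary over time) and the function name `thresholded_iou_reward` are assumptions for illustration, not the V-Triune implementation.

```python
from typing import Tuple

# Box layout is an assumption: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholded_iou_reward(pred: Box, target: Box, threshold: float) -> float:
    """Hypothetical reward: the IoU itself once it clears `threshold`, else 0."""
    score = iou(pred, target)
    return score if score >= threshold else 0.0
```

Raising `threshold` as training progresses would make the reward stricter over time, which is one plausible reading of "dynamic"; the model cards are the place to check the actual scheme.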
replied to clem's post 15 days ago

Prompt : The word "HF" is made of soft, flowy fur on a vibrant-colored floor, well-lit by sunlight on a bright afternoon. The movement is subtle and soft. The camera doesn't move.

reacted to cfahlgren1's post with ❤️ 16 days ago
reacted to their post with ❤️🔥 16 days ago
Got access to Google's all-new Gemini Diffusion, a state-of-the-art text diffusion model. It delivers the performance of Gemini 2.0 Flash-Lite at 5x the speed, generating over 1000 tokens in a fraction of a second and producing impressive results. Below are some initial outputs generated using the model. ♊🔥

Gemini Diffusion Playground ✦ : https://deepmind.google.com/frontiers/gemini-diffusion

Get Access Here : https://docs.google.com/forms/d/1aLm6J13tAkq4v4qwGR3z35W2qWy7mHiiA0wGEpecooo/viewform?edit_requested=true

🔗 To know more, visit: https://deepmind.google/models/gemini-diffusion/
replied to their post 16 days ago

Prompts Used :

create an interactive web-based color picker with a saturation/value box and a hue slider. display the selected color and its hex, rgb, hsl, and cmyk values dynamically. use html, css for layout/styling, and javascript for color logic and interactivity.

a boat running upstream takes 8 hours 48 minutes to cover a certain distance, while it takes 4 hours to cover the same distance running downstream. what is the ratio between the speed of the boat and speed of the water current respectively? solve & generate the result in a web page.

design a fully functional chess game using html, css, and javascript in a single html file, with a responsive board, drag-and-drop piece movement, legal move validation, and check/checkmate detection.

using html, css, and javascript in a single html file to create a simulation of the solar system. pay extreme attention to the ui to make it as intuitive as possible. ensure that every planet appears as a sphere and is labeled with its corresponding name.

create an interactive bouncing ball game using html, css, and javascript in a single html file. the game should feature stunning animations, a controllable ball speed, and a slider brick. if the ball falls or goes down, the game is over.
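
As a reference for checking a generated answer to the boat prompt above, here is a small worked sketch. It assumes the usual model (boat speed b in still water, current speed c, same distance both ways), so that (b - c) * t_up = (b + c) * t_down; the function name is made up for illustration.

```python
from fractions import Fraction


def boat_to_current_ratio(upstream_hours: Fraction, downstream_hours: Fraction) -> Fraction:
    """Ratio b / c of boat speed to current speed for equal distances.

    From (b - c) * t_up = (b + c) * t_down it follows that
    b / c = (t_up + t_down) / (t_up - t_down).
    """
    return (upstream_hours + downstream_hours) / (upstream_hours - downstream_hours)


t_up = Fraction(8) + Fraction(48, 60)  # 8 hours 48 minutes
t_down = Fraction(4)                   # 4 hours
ratio = boat_to_current_ratio(t_up, t_down)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints 8:3
```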

posted an update 16 days ago
reacted to their post with 🤗 17 days ago
More optimized explicit content filters: lightweight 𝙜𝙪𝙖𝙧𝙙 models based on siglip2 patch16 512 and vit patch16 224 for illustration and explicit content classification, intended for content moderation on social media and forums and for parental controls that make browsing safer. This version fixes the issues in the previous release, which lacked sufficient resources. 🚀

⤷ Models :
→ siglip2 mini explicit content : prithivMLmods/siglip2-mini-explicit-content [recommended]
→ vit mini explicit content : prithivMLmods/vit-mini-explicit-content

⤷ Building image safety-guard models : strangerguardhf

⤷ Datasets :
→ nsfw multidomain classification : strangerguardhf/NSFW-MultiDomain-Classification
→ nsfw multidomain classification v2.0 : strangerguardhf/NSFW-MultiDomain-Classification-v2.0

⤷ Collection :
→ Updated Versions [05192025] : prithivMLmods/explicit-content-filters-682aaa4733e378561925ca2b
→ Previous Versions : prithivMLmods/siglip2-content-filters-042025-final-680fe4aa1a9d589bf2c915ff

Find the collections inside the collection. 👆

To know more about it, visit the model card of the respective model.
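
A minimal sketch of how one of these filters might be wired into a moderation check. It assumes the checkpoint loads through the generic `image-classification` pipeline in `transformers` and returns the usual list of `{"label", "score"}` dicts; the blocked label names and the 0.5 threshold are placeholders, not values taken from the model card, so read the real label set there before using this.

```python
from typing import Dict, Iterable, List


def is_flagged(predictions: List[Dict], blocked_labels: Iterable[str],
               threshold: float = 0.5) -> bool:
    """Return True if any blocked label scores at or above `threshold`.

    `predictions` is expected in the image-classification pipeline format:
    [{"label": str, "score": float}, ...].
    """
    blocked = set(blocked_labels)
    return any(p["label"] in blocked and p["score"] >= threshold
               for p in predictions)


def classify_image(image_path: str) -> List[Dict]:
    """Run the recommended model on one image (downloads weights on first use)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "image-classification",
        model="prithivMLmods/siglip2-mini-explicit-content",
    )
    return classifier(image_path)
```

Usage would look like `is_flagged(classify_image("photo.jpg"), blocked_labels=[...])`, with the blocked labels filled in from the model card.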
posted an update 17 days ago
reacted to cbensimon's post with 🔥 20 days ago
🚀 ZeroGPU medium size is now available as a power-user feature

Nothing too fancy for now—ZeroGPU Spaces still default to large (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (medium will offer significantly more usage than large)
- 🦣 the upcoming xlarge size (141GB VRAM)

You can now control GPU size via a Space variable. Accepted values:
- auto (future default)
- medium
- large (current default)

The auto mode checks total CUDA tensor size during startup:
- More than 30GB → large
- Otherwise → medium
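
The auto rule above can be sketched as a single threshold check. This only illustrates the decision described in the post, not ZeroGPU's actual code: the function name is hypothetical, and treating "30GB" as 30 × 10^9 bytes is an assumption, since the post does not say whether GB is decimal or binary.

```python
GB = 10**9  # assumption: decimal gigabytes


def auto_gpu_size(total_cuda_tensor_bytes: int, threshold_bytes: int = 30 * GB) -> str:
    """Pick a ZeroGPU size from the total CUDA tensor size seen at startup.

    Strictly more than the threshold selects "large"; anything else,
    including exactly 30GB, selects "medium".
    """
    return "large" if total_cuda_tensor_bytes > threshold_bytes else "medium"
```

Under this reading, a Space holding 16GB of tensors would land on medium, while one holding 40GB would land on large.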
reacted to burtenshaw's post with 🚀 20 days ago
We're thrilled to announce the launch of our comprehensive Model Context Protocol (MCP) Course! This free program is designed to take learners from foundational understanding to practical application of MCP in AI.

Follow the course on the hub: mcp-course

In this course, you will:
📖 Study Model Context Protocol in theory, design, and practice.
🧑‍💻 Learn to use established MCP SDKs and frameworks.
💾 Share your projects and explore applications created by the community.
🏆 Participate in challenges and evaluate your MCP implementations.
🎓 Earn a certificate of completion.

At the end of this course, you'll understand how MCP works and how to build your own AI applications that leverage external data and tools using the latest MCP standards.