BigScience Biomedical Datasets

non-profit

AI & ML interests

We aim to unify the schema across many different biomedical NLP resources.

Recent Activity

bigbio's activity

prithivMLmods 
posted an update 1 day ago
view post
Post
1922
Dropping Downstream tasks using newly initialized parameters and weights ([classifier.bias & weights]) support domain-specific 𝗶𝗺𝗮𝗴𝗲 𝗰𝗹𝗮𝘀𝘀𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻. Based on siglip2-base-patch16-224 and DomainNet (single-domain, multi-source adaptation), with Fashion-MNIST for experimental testing. 🧤☄️

Fashion-Mnist : prithivMLmods/Fashion-Mnist-SigLIP2
Multisource-121 : prithivMLmods/Multisource-121-DomainNet
Painting-126 : prithivMLmods/Painting-126-DomainNet
Sketch-126 : prithivMLmods/Sketch-126-DomainNet
Clipart-126 : prithivMLmods/Clipart-126-DomainNet

Models are trained with different parameter settings for experimental purposes only, with the intent of further development. Refer to the model page below for instructions on running it with Transformers 🤗.

Collection : prithivMLmods/domainnet-0324-67e0e3c934c03cc40c6c8782

Citations : SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features https://arxiv.org/pdf/2502.14786 & Moment Matching for Multi-Source Domain Adaptation : https://arxiv.org/pdf/1812.01754

prithivMLmods 
posted an update 6 days ago
view post
Post
2181
Play with Orpheus TTS, a Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been fine-tuned to deliver human-level speech synthesis 🔥🗣️

👉GitHub [ Demo ] : https://github.com/PRITHIVSAKTHIUR/Orpheus-TTS-Edge

Demo supporting both text-to-speech and text-to-llm responses in speech.

> voice: tara, dan, emma, josh
> emotion: <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>.

🥠Orpheus-3b-0.1-ft
Model Page: canopylabs/orpheus-3b-0.1-ft

🥠Orpheus-3b-0.1-ft
Colab Inference Notebook: https://colab.research.google.com/drive/1KhXT56UePPUHhqitJNUxq63k-pQomz3N?usp=sharing

🥠Finetune [ orpheus-3b-0.1-pretrained ]
Resource: https://github.com/canopyai/Orpheus-TTS/tree/main/finetune

🥠Model-releases:
https://canopylabs.ai/model-releases
  • 1 reply
·
AtAndDev 
posted an update 10 days ago
view post
Post
4103
There seems to multiple paid apps shared here that are based on models on hf, but some ppl sell their wrappers as "products" and promote them here. For a long time, hf was the best and only platform to do oss model stuff but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to oss stuff. Please dont do this, or try finetuning the models you use...
Sorry for filling yall feed with this bs but yk...
  • 6 replies
·
prithivMLmods 
posted an update 12 days ago
view post
Post
922
Hey Guys! One Small Announcement 🤗
Stranger Zone now accepts LoRA requests!

✍️Request : strangerzonehf/Request-LoRA [ or ] strangerzonehf/Request-LoRA#1

Page : https://huggingface.co/strangerzonehf

Describe the artistic properties by posting sample images or links to similar images in the request discussion. If the adapters you're asking for are truly creative and safe for work, I'll train and upload the LoRA to the Stranger Zone repo!

Thank you!
AtAndDev 
posted an update 14 days ago
view post
Post
1548
Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
prithivMLmods 
posted an update 14 days ago
view post
Post
2473
Gemma-3-4B : Image and Video Inference 🖼️🎥

🧤Space: prithivMLmods/Gemma-3-Multimodal
🥠Git : https://github.com/PRITHIVSAKTHIUR/Gemma-3-Multimodal

@gemma3 : {Tag + Space_+ 'prompt'}
@video-infer : {Tag + Space_+ 'prompt'}

+ Gemma3-4B : google/gemma-3-4b-it
+ By default, it runs : prithivMLmods/Qwen2-VL-OCR-2B-Instruct

Gemma 3 Technical Report : https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
  • 1 reply
·
not-lain 
posted an update 14 days ago
mkurman 
posted an update 14 days ago
prithivMLmods 
posted an update 15 days ago
mkurman 
posted an update 18 days ago
view post
Post
830
Just released NVAMP Loss!

✔️ modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance.
✔️ more stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss
albertvillanova 
posted an update 19 days ago
view post
Post
3683
🚀 New smolagents update: Safer Local Python Execution! 🦾🐍

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒

Here's why this matters & what you need to know! 🧵👇

1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)

3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5️⃣ The Best Practice: Use Sandboxed Execution 🔐
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6️⃣ Upgrade Now & Stay Safe! 🚀
Check out the latest smolagents release and start building safer AI agents today.

🔗 https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Let’s discuss! 👇

#AI #smolagents #Python #Security
  • 2 replies
·
Tonic 
posted an update 19 days ago
view post
Post
1162
🙋🏻‍♂️Hey there folks,

Did you know that you can use ModernBERT to detect model hallucinations ?

Check out the Demo : Tonic/hallucination-test

See here for Medical Context Demo : MultiTransformer/tonic-discharge-guard

check out the model from KRLabs : KRLabsOrg/lettucedect-large-modernbert-en-v1

and the library they kindly open sourced for it : https://github.com/KRLabsOrg/LettuceDetect

👆🏻if you like this topic please contribute code upstream 🚀

  • 2 replies
·
mkurman 
posted an update 20 days ago
albertvillanova 
posted an update 20 days ago
view post
Post
3839
🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒

Here's why this is a game-changer for agent-based systems: 🧵👇

1️⃣ Security First 🔐
Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.

2️⃣ Deterministic & Reproducible Runs 📦
By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!

3️⃣ Resource Control & Limits 🚦
Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.

4️⃣ Safer Code Execution in Production 🏭
Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.

5️⃣ Easy to Integrate 🛠️
With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend—no need for complex security setups!

6️⃣ Perfect for Autonomous AI Agents 🤖
If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.

⚡ Get started now: https://github.com/huggingface/smolagents

What will you build with smolagents? Let us know! 🚀💡
Tonic 
posted an update 20 days ago
view post
Post
701
Powered by KRLabsOrg/lettucedect-large-modernbert-en-v1 from KRLabsOrg.

Detect hallucinations in answers based on context and questions using ModernBERT with 8192-token context support!

### Model Details
- **Model Name**: [lettucedect-large-modernbert-en-v1]( KRLabsOrg/lettucedect-large-modernbert-en-v1)
- **Organization**: [KRLabsOrg](https://huggingface.co/KRLabsOrg)
- **Github**: [https://github.com/KRLabsOrg/LettuceDetect](https://github.com/KRLabsOrg/LettuceDetect)
- **Architecture**: ModernBERT (Large) with extended context support up to 8192 tokens
- **Task**: Token Classification / Hallucination Detection
- **Training Dataset**: [RagTruth]( wandb/RAGTruth-processed)
- **Language**: English
- **Capabilities**: Detects hallucinated spans in answers, provides confidence scores, and calculates average confidence across detected spans.

LettuceDetect excels at processing long documents to determine if an answer aligns with the provided context, making it a powerful tool for ensuring factual accuracy.
prithivMLmods 
posted an update 21 days ago
mkurman 
posted an update 21 days ago
rmayormartins 
posted an update 22 days ago
view post
Post
445
🌎 Join Latin American AI Developers Organization!
Share your AI work and connect with developers across Latin America! We welcome:
Spaces/Datasets/Models
Visit our Space: AI Developers from Latin America Community https://huggingface.co/LatinAI
_________________________________
🌎 ¡Únete a la Organizacion de Desarrolladores de IA de América Latina!
¡Comparte tu trabajo en IA y conecta con desarrolladores de toda Latinoamérica! Buscamos:
Spaces/Datasets/Modelos
Visita nuestro Space: AI Developers from Latin America Community https://huggingface.co/LatinAI
#LatinAmerica #AI #MachineLearning #HuggingFace #LatAm
mkurman 
posted an update 23 days ago