Kato Steven Mubiru


Hello @lhoestq and the Hugging Face team,
Sorry for the long comment, but I hope it will be worth it 🙏🏿.
Thank you for this comprehensive guide on sharing datasets on the Hub! Your emphasis on making datasets accessible while maintaining security and visibility resonates deeply with our recent work.
We recently launched the Ugandan Cultural Context Benchmark (UCCB) on Hugging Face (https://huggingface.co/datasets/CraneAILabs/UCCB), which represents a significant milestone for African AI evaluation. As the first comprehensive benchmark testing AI's understanding of African cultural contexts, UCCB addresses a critical gap where models often fail due to training data that predominantly reflects Western experiences.
Your article's points about reach and community engagement particularly struck us. With only 0.02% of internet content in African languages and Africa contributing less than 1% to global AI training data, initiatives like UCCB are essential for building more inclusive AI systems. The dataset features 1,039 expert-verified questions across 24 cultural domains, from traditional medicine to modern slang, testing whether AI truly understands African contexts beyond mere translation.
We believe sharing the story behind UCCB and our methodology could inspire similar initiatives across Africa and other underrepresented regions. While we attempted to join the Blog Explorers organization to share our journey, we understand access is limited. However, we wonder if there might be an opportunity to collaborate with Hugging Face experts on a blog post that highlights:
How African researchers are leveraging the Hub's infrastructure to address regional AI challenges
The importance of culturally-aware benchmarks in the global AI ecosystem
Practical insights for other regions looking to create similar evaluation frameworks
The role of community collaboration in building datasets that truly represent diverse perspectives
As you noted in your article, "the true impact of research comes from reaching the right audience." We believe showcasing African AI innovation on Hugging Face's platform could inspire more regional contributions and demonstrate how the Hub enables researchers worldwide to shape AI development, not just consume it.
Would you or someone from the team be open to exploring a collaborative blog post? Even guidance on independently publishing our story in a way that aligns with Hugging Face's community values would be invaluable. We're committed to the same principles of open science and community engagement that make the Hub special. (We have tried to reach out to the team at Hugging Face, but no one has given us any time for months now 🥲.)
Thank you for building infrastructure that empowers global voices in AI. Looking forward to any opportunities to share how African perspectives are contributing to more equitable AI development.
Best regards,
Kato Steven Mubiru & Bronson Bakunga
@katostevenmubiru
P.S. The Dataset Viewer and SQL Console features you highlighted have been instrumental in making UCCB accessible to researchers with limited computational resources - exactly the democratization we need for inclusive AI development!
Seeking Guidance from SmolLM Team: A Blueprint for "Crane-01," a Sovereign African Language Model





