John6666 (John Smith)

reacted to MrDragonFox's post with 👍 about 7 hours ago

Post

966

yet another audio dataset, pre-classified for events + audio aesthetics

this time for german - 680h sampled from emilia yodas

timestamps for asr training or other fancier things available as nc in the raw repo

MrDragonFox/DE_Emilia_Yodas_680h

cc by 4.0 as by emilia yodas

raw events / transcriptions are cc by NC 4.0

MrDragonFox/DE_Emilia_Yodas_680h_raw_timestamps

in the coming days i should push about 600h english + some japanese too, in the same format

reacted to openfree's post with 🔥 about 7 hours ago

Post

1379

Agentic AI Era: Analyzing MCP vs MCO 🚀

Hello everyone!
With the rapid advancement of AI agent technology, two architectures have come into the spotlight: MCP (Model Context Protocol) and MCO (Model Context Open-json). Today, we’ll introduce the key features and differences of these two approaches.

VIDraft/Agentic-AI-CHAT

MCP: The Traditional Approach 🏛️
Centralized Function Registry: All functions are hardcoded into the core system.

Static Function Definitions & Tight Coupling: New features require changes to the core application code, limiting scalability.

Monolithic Design: Complex deployment and version management can cause a single error to affect the whole system.

Code Example:
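```py
# Placeholder functions so the registry below runs (hypothetical stand-ins)
def existing_function(): ...
def new_function(): ...

FUNCTION_REGISTRY = {
    "existing_function": existing_function,
    "new_function": new_function,  # Adding a new function
}
```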

MCO: A Revolutionary Approach 🆕
JSON-based Function Definitions: Function details are stored in external JSON files, enabling dynamic module loading.

Loose Coupling & Microservices: Each function can be developed, tested, and deployed as an independent module.

Flexible Scalability: Add new features by simply updating the JSON and module files, without modifying the core system.

JSON Example:
[
  {
    "name": "analyze_sentiment",
    "module_path": "nlp_tools",
    "func_name_in_module": "sentiment_analysis",
    "example_usage": "analyze_sentiment(text=\"I love this product!\")"
  }
]
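
A minimal sketch of how a loader for such JSON definitions might work, assuming the field names from the JSON example above. The `load_functions` helper is hypothetical and not taken from the MCO implementation, and since `nlp_tools` is not available here, the sample entry points at Python's standard `math` module instead.

```python
import importlib
import json

# Hypothetical definitions in the same shape as the JSON example above.
# The stdlib `math` module stands in for a real tool module such as `nlp_tools`.
DEFINITIONS_JSON = """
[
  {
    "name": "square_root",
    "module_path": "math",
    "func_name_in_module": "sqrt",
    "example_usage": "square_root(16)"
  }
]
"""


def load_functions(definitions_json: str) -> dict:
    """Build a name -> callable registry from JSON function definitions."""
    registry = {}
    for entry in json.loads(definitions_json):
        # Import the module named in the definition at run time.
        module = importlib.import_module(entry["module_path"])
        # Look up the function inside that module by its name.
        registry[entry["name"]] = getattr(module, entry["func_name_in_module"])
    return registry


registry = load_functions(DEFINITIONS_JSON)
print(registry["square_root"](16))  # 4.0
```

Under this sketch, adding a function means adding one JSON entry and one module file, and the loader itself stays unchanged.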

Why MCO? 💡
Enhanced Development Efficiency: Developers can focus on their own modules with independent testing and deployment.

Simplified Error Management: Errors remain confined within their modules, enabling quick hotfixes.

Future-Proofing: With potential features like remote function calls (RPC), access control, auto-documentation, and a function marketplace, MCO paves the way for rapid innovation.

Practical Use & Community 🤝
The MCO implementation has been successfully tested on Vidraft’s LLM (based on Google Gemma-3)

reacted to nyuuzyou's post with 👍 about 22 hours ago

Post

2877

🇷🇺 Russian Forum Messages Dataset - nyuuzyou/ruforum

Collection of approximately 58 million Russian forum messages featuring:

- Complete message content from Russian online forums spanning 2010-2025
- Comprehensive metadata including unique message IDs and timestamps
- Full text content preserving original user discussions and interactions
- Monolingual dataset focused exclusively on Russian language content

This dataset offers a unique textual archive of Russian online conversations suitable for text generation, sentiment analysis, and language modeling research. Released to the public domain under CC0 1.0 license.

reacted to katsukiai's post with 🧠 1 day ago

Post

2687

DeepFocus datasets are not allowed to be used in cases where mean is used in that dataset

Why?
├── This discussion consists of comments by the user https://huggingface.co/JLouisBiz:
├── Hello,
├── As a fork of DeepSeek, you are required to give credit to DeepSeek according to the original MIT license. Could you please look into the licensing terms and comply?
├── I also do not see why you are making your own license. Why don't you simply leave it under the original MIT license?
└── I see that your license is also a free software license, but changing the license brings legal problems. You are free to sublicense MIT-licensed software, but re-licensing it without complying with the initial terms is not allowed.
Unlicensed
├── DeepFocus
├── Wrong license and using modified license (Unpaper provided @aide-julius)
└── The dataset with the modified license does not use the same license as DeepSeek is using, EOS this license
Symbol
└── EOS
    └── End of service

Thank you,
Best Regards,
Sam from The KatsukiAI Team

2 replies

reacted to rajistics's post with 👍 1 day ago

Post

2106

Having some fun with long context benchmarks (watch the video!!)

NoLiMa: NoLiMa: Long-Context Evaluation Beyond Literal Matching (2502.05167)
Fiction LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
Michelangelo: https://deepmind.google/research/publications/117639/
LongGenBench: Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models (2409.02076)
NeedleBench: NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? (2407.11963)
RULER: RULER: What's the Real Context Size of Your Long-Context Language Models? (2404.06654)

For more: https://www.reddit.com/r/rajistics/comments/1jxwk29/long_context_llm_benchmarks_video/

let me know if you like these posts

reacted to Kseniase's post with 👍 1 day ago

Post

2877

16 new research papers on inference-time scaling:

Over the last couple of weeks, a large number of studies on inference-time scaling have emerged. And it's so cool, because each new paper adds a trick to the toolbox, making LLMs more capable without needing to scale the models' parameter count.

So here are 13 new methods + 3 comprehensive studies on test-time scaling:

1. Inference-Time Scaling for Generalist Reward Modeling (2504.02495)
Probably, the most popular study. It proposes to boost inference-time scalability by improving reward modeling. To enhance performance, DeepSeek-GRM uses adaptive critiques, parallel sampling, pointwise generative RM, and Self-Principled Critique Tuning (SPCT)

2. T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models (2504.04718)
Allows small models to use external tools, like code interpreters and calculators, to enhance self-verification

3. Z1: Efficient Test-time Scaling with Code (2504.00810)
Proposes to train LLMs on code-based reasoning paths to make test-time scaling more efficient, limiting unnecessary tokens with a special dataset and a Shifted Thinking Window

4. GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning (2504.00891)
Introduces GenPRM, a generative PRM that uses CoT reasoning and code verification for step-by-step judgment. With only 23K training examples, GenPRM outperforms prior PRMs and larger models

5. Can Test-Time Scaling Improve World Foundation Model? (2503.24320)
SWIFT test-time scaling framework improves World Models' performance without retraining, using strategies like fast tokenization, Top-K pruning, and efficient beam search

6. Relevance Isn't All You Need: Scaling RAG Systems With Inference-Time Compute Via Multi-Criteria Reranking (2504.07104)
Proposes REBEL for RAG systems scaling, which uses multi-criteria optimization with CoT prompting for better performance-speed tradeoffs as inference compute increases

7. $φ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation (2503.13288)
Proposes a φ-Decoding strategy that uses foresight sampling, clustering and adaptive pruning to estimate and select optimal reasoning steps

Read further below 👇

Also, subscribe to the Turing Post https://www.turingpost.com/subscribe

2 replies

reacted to onekq's post with 👀 1 day ago

Post

555

I just compared tasks with different input/output lengths. CPU/GPU performances are very different here.

The LLMs we use today are autoregressive (causal) language models, meaning the generation of each output token depends on all previous tokens. Since the model must generate one token at a time, this sets a hard limit on parallelism. The chatbot's simulated human typing is in fact a UI trick to gloss over this fundamental limit. This is great news for CPUs because it levels the playing field.

But when processing input tokens, this limit doesn't exist. The GPU can fire up thousands of cores (vs. dozens of CPU cores) to process as many input tokens as it can, all at once. Here, the GPU enjoys a significant speed margin over the CPU. The longer the prompt, the bigger the margin.

So, when it comes to user experience, both GPU and CPU can output text at decent speed. What really distinguishes them is the initial wait time, i.e. prompt processing delay.
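
To make the argument above concrete, here is a rough back-of-the-envelope model, assuming response time splits into a prompt-processing phase and a token-by-token generation phase. The `response_time` function and all throughput numbers are hypothetical placeholders, not measurements.

```python
def response_time(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (initial wait, total time) in seconds for one request."""
    # Prompt processing is parallel, so it runs at the (high) prefill rate.
    wait = prompt_tokens / prefill_tps
    # Generation is sequential, so it runs at the (lower) decode rate.
    total = wait + output_tokens / decode_tps
    return wait, total


# Hypothetical throughputs in tokens per second, chosen only for illustration.
HARDWARE = {
    "gpu": {"prefill_tps": 5000.0, "decode_tps": 40.0},
    "cpu": {"prefill_tps": 100.0, "decode_tps": 10.0},
}

for prompt_tokens in (100, 4000):
    for name, rates in HARDWARE.items():
        wait, total = response_time(prompt_tokens, 200, **rates)
        print(f"{name} prompt={prompt_tokens}: wait {wait:.2f}s, total {total:.2f}s")
```

With these assumed rates, the gap in initial wait between CPU and GPU grows with prompt length, while the generation phase differs by a fixed amount.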

1 reply

reacted to JLouisBiz's post with 👍 1 day ago

Post

2478

Article: https://huggingface.co/blog/JLouisBiz/semantical-website-links

You don't need to do the tedious work of finding all those links on your huge website.

Automating semantic links on websites using Large Language Models (LLMs) enhances user experience and efficiency. Here's a simplified workflow:

1. Store LLM embeddings in PostgreSQL: Use the vector data type (as provided by the pgvector extension) to store text embeddings generated by an LLM.
2. Divide page texts into chunks for processing.
3. Generate embeddings using an LLM for each chunk of text.
4. Create template markup around specific terms needing links.

An automated program then:

- Converts marked-up terms to their corresponding LLM embeddings,
- Compares these with stored database embeddings (using cosine similarity),
- Identifies the most relevant page based on highest similarity score, and
- Automatically adds a link from the original content to this contextually related information.

This process improves navigation by directing users to highly contextual pages. It saves time as it automates creating semantic links while maintaining accuracy.
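
The matching step described above can be sketched roughly as follows. This is an illustration only, not the code from the article: the page URLs and the tiny 3-dimensional embeddings are made up, and an in-memory dictionary stands in for the PostgreSQL table.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical stored page embeddings (real ones have hundreds of dimensions).
PAGE_EMBEDDINGS = {
    "/docs/postgresql-backup": [0.9, 0.1, 0.0],
    "/docs/email-setup": [0.0, 0.8, 0.6],
    "/docs/llm-embeddings": [0.1, 0.2, 0.95],
}


def best_link(term_embedding, pages):
    """Return the URL whose embedding is most similar to the term's."""
    return max(pages, key=lambda url: cosine_similarity(term_embedding, pages[url]))


# Hypothetical embedding of a marked-up term such as "vector embeddings".
term_embedding = [0.05, 0.1, 0.9]
url = best_link(term_embedding, PAGE_EMBEDDINGS)
print(f'<a href="{url}">vector embeddings</a>')
```

In a real setup the comparison would more likely run inside the database, so that only the best match is returned to the program.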

reacted to Dragunflie-420's post with 👀 1 day ago

Post

828

Hello HF community. I hope all is well on your fronts today. So I have a request, and I hope someone has a lil extra time to show this ole grandma how to make this game mod for Minecraft (an interface that shows the fullness of what the game is intent on becoming) actually work. This is crucial for redirecting a troubled teen (my grandson) back to the reality he lives in but won't learn to build or describe anything. This may not make sense to you, but it would be a gracious thing for someone to do for someone you do not know.

The overall design of the game and its theme is mine, but you are more than welcome to remix it or pkg it as a template. IDC what you do with it other than show me how to make what the interface says is the workings under the hood be the content that indeed does and appears as it says that it will when pushing the spot that's going to render the content. I hope that makes sense, because I confess I am not a versed dev; I just pick up and go as I can grasp the info necessary to get on to the next phase. That's how I have learned what I know, and I can describe things to AI very well to get interfaces like you see here.

I do have a tool that I created with just prompts that I am very happy about... It's better than the app maker that made the app for me lmbo... that's the idea, isn't it? The best storyteller will be the top dog of the AI reset revolution! The days of traditional school and educating are dead... it's a new education system on the horizon, and it involves describing your world for the thalamus (os processor internal), aka the file manager for the human optical dept., the parser per se... say it ain't so and, well, I'll say you might be a redneck lmbo... I'm not hard to talk to, so come on and let's shoot the sh**!

Dragunflie-420/minecraft-mod-elarion-valley

2 replies

reacted to vincentg64's post with 👀 2 days ago

Post

1011

Universal Dataset to Test, Enhance and Benchmark AI Algorithms https://mltblog.com/4ia7r2D

This scientific research has three components. First, my most recent advances towards solving one of the most famous, multi-century old conjectures in number theory. One that kids in elementary school can understand, yet incredibly hard to prove. At the very core, it is about the spectacular quantum dynamics of the digit sum function.

Then, I present an infinite dataset that has all the patterns you or AI can imagine, and much more, ranging from obvious to undetectable. More specifically, it is an infinite number of infinite datasets, all in tabular format, with various degrees of auto- and cross-correlations (short and long range) to test, enhance and benchmark AI algorithms including LLMs. It is based on the physics of the digit sum function and linked to the aforementioned conjecture. This one-of-a-kind synthetic data is useful in contexts such as fraud detection or cybersecurity.

Finally, it comes with very efficient Python code to generate the data, involving gigantic numbers and high precision arithmetic.
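
For readers unfamiliar with the digit sum function mentioned above, here is a minimal sketch of it on very large integers. This is not the author's code; the `digit_sum` helper is hypothetical and only relies on Python's built-in arbitrary-precision integers.

```python
def digit_sum(n: int, base: int = 10) -> int:
    """Sum of the digits of a non-negative integer n in the given base."""
    total = 0
    while n > 0:
        # Peel off the last digit in the chosen base and accumulate it.
        n, digit = divmod(n, base)
        total += digit
    return total


print(digit_sum(2**10))          # 1024 -> 1 + 0 + 2 + 4 = 7
print(digit_sum(2**10, base=2))  # binary 10000000000 -> 1

# Python integers have no fixed size, so gigantic inputs work unchanged.
big = 2**1000
print(len(str(big)), digit_sum(big))
```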

➡️ Read article and learn how to use and generate dataset, at https://mltblog.com/4ia7r2D

reacted to JLouisBiz's post with 🤝 2 days ago

Post

3149

**Video**: https://www.youtube.com/watch?v=jRKRsGsLfW0

**Integrating a large language model with a file manager to describe your illegally downloaded movies.**

When you have a bunch of movies downloaded via torrent, you may want a description for each, but the description is often missing. This video shows how you can use the script to invoke the large language model and get a description of a movie in a second or three.

reacted to Yehor's post with 🔥 2 days ago

Post

2430

I have made a Rust project that integrates the latest state-of-the-art model for object detection. It outperforms YOLO!

Check it out: https://github.com/egorsmkv/rf-detr-usls

2 replies

reacted to fdaudens's post with 👀 2 days ago

Post

1879

Want AI that truly understands your country's culture? Public institutions are sitting on the next AI revolution - and here's the practical guide to unlock it.

I've had fascinating conversations recently about sovereign AI, with people trying to solve this recurring question: "How do we build AI that truly understands our culture?"

This guide by @evijit and @yjernite brings lots of insights about this question. It's not just about throwing data at models. It's about partnering cultural expertise with tech infrastructure in ways we're just starting to figure out.

An example? The National Library of Norway already has 150+ AI models on Hugging Face. They're not just digitizing books - they're building AI that thinks in Norwegian, understands Norwegian values, and serves Norwegian citizens.

This is sovereign AI in practice: technology that understands your culture, values, and languages.

Especially loved the practical examples on how to do this:
- Real examples from museums, libraries, and government agencies
- How to convert complex documents (PDFs, PowerPoints) into ML-ready formats
- Code templates for processing public data
- Technical recipes for sharing datasets on open platforms

The stakes? Citizens' ability to leverage their collective digital intelligence.

The technology is ready. The infrastructure exists. The guide shows exactly how to use it. What's needed is your cultural expertise to shape these tools.

Check it out: https://huggingface.co/blog/evijit/public-org-data-ai

P.s.: Building cool projects in a public institution? Share them in the comments for others to learn from!

reacted to piper2024's post with 👍 2 days ago

Post

1592

I have 1 TB of private storage and am coming up to the limit, so I need an increase. I subscribed to Pro but have not seen an increase in space. What is my next step? Thanks, P

8 replies

reacted to odellus's post with 👍 2 days ago

Post

1815

Super grateful to @marriola for the release of the block diffusion code and model. I'm generating text with diffusion locally! Couldn't be more pleased.

2 replies

reacted to S-Dreamer's post with 👍 2 days ago

Post

1794

PiFlash
A simple web-based tool to flash Raspberry Pi OS images to your SD cards. No additional software required!

S-Dreamer/piflash

reacted to katsukiai's post with 👍 2 days ago

Post

1293

For DeepFocus (including https://huggingface.co/datasets/universeofml/DeepFocus-X3, https://huggingface.co/datasets/katsukiai/DeepFocus-X3), the following now applies:
* No updates on our official Spaces. We apologize for this incident.
* No access to DeepFocus
* Only gated user access
* Unlicensed by Unpaper unpaper/choosealicense (subject to 30 days to request destruction of files)
We apologize for the DeepFocus incident. If you are using DeepFocus, please contact us first or comment on this post.
Thank you,
Best Regards,
The Katsuki-san Bakugou (formerly known as KatsukiAI)

reacted to Fishtiks's post with 👀 2 days ago

Post

1412

I'm looking for a YouTube video summarizer to run locally. I did a search, but all of the models and spaces I was able to find here didn't work, which I find surprising, since it's a great tool I already use. Perhaps one of you can provide a better option, or just tell me what this actually is to get it: https://dev.gptcall.pages.dev/chat#id=&contactName=Youtube+summarizer

Other functionality I'd like to see is a genre-based music creation and alteration model. "Make it country" or "do a freestyle rap," as examples. I'm willing to work with someone on this, because I'd need help understanding. I'd also like to make medical AI, like Dr. Samantha, that functions like a PDR well, and doesn't get confused by drug names.

reacted to AdinaY's post with 🔥 3 days ago

Post

2994

Shanghai AI Lab - OpenGV team just released InternVL3 🔥

OpenGVLab/internvl3-67f7f690be79c2fe9d74fe9d

✨ 1/2/8/9/14/38/78B with MIT license
✨ Stronger perception & reasoning vs InternVL 2.5
✨ Native Multimodal Pre-Training for even better language performance

1 reply

reacted to jasoncorkill's post with 🔥 3 days ago

Post

3135

🚀 We tried something new!

We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.

📊 Check it out here:
Rapidata/2k-ranked-images-open-image-preferences-v1

We're really curious to hear your thoughts!
Is this kind of ranking interesting or useful to you? Let us know! 💬

If it is, please consider leaving a ❤️ and if we hit 30 ❤️s, we’ll go ahead and rank the full 17k image dataset!

5 replies
