Nice list! I'm with you on #1; I've really been enjoying 3.2 quite a bit.
Tyler (PRO) · unmodeled-tyler
AI & ML interests: research engineer
Recent Activity
updated a collection 18 minutes ago: Recommended Papers
updated a collection about 4 hours ago: My Current Open Source Daily Drivers
liked a model about 6 hours ago: mradermacher/atom-27b-GGUF
replied to mahimairaja's post about 8 hours ago
reacted to mahimairaja's post with 🔥 about 8 hours ago
Post · 288
My Favorite Open Source Models for Jan 2026
1. General Use - deepseek-ai/DeepSeek-V3.2
2. Reasoning - deepseek-ai/DeepSeek-V3.2-Speciale
3. Coding - Qwen/Qwen3-Coder-30B-A3B-Instruct
4. OCR - Qwen/Qwen3-VL-8B-Instruct
5. Image Generation - black-forest-labs/FLUX.2-dev
6. Image Editing - Qwen/Qwen-Image-Edit-2509
What model do you use regularly?
reacted to IlyasMoutawwakil's post with 🔥 about 13 hours ago
Post · 505
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch-compile & export-friendly 🔥
Why it had to be done 👇
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single Dynamo-traced graph!
Transformers models are now easier to:
⚙️ Compile end-to-end with torch.compile backends
📦 Export reliably via torch.export and torch.onnx.export
🚀 Deploy to ONNX / ONNX Runtime, Intel's OpenVINO, NVIDIA's AutoDeploy (TRT-LLM), AMD's Quark, Meta's ExecuTorch, and more hardware-specific runtimes.
This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.
We are doubling down on Transformers' commitment to be a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.
There are definitely some edge cases that we still haven't addressed, so don't hesitate to try compiling / exporting your favorite transformers and to open issues / PRs.
PR in the comments! More updates coming soon!
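For anyone who wants to kick the tires, a minimal sketch of both paths is below; the model choice (gpt2) and inputs are illustrative placeholders, not from the post:

```python
# Minimal sketch of the two paths; the model (gpt2) and inputs
# are placeholders, not from the post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
inputs = tokenizer("Hello, world!", return_tensors="pt")

# Path 1: end-to-end compilation with a torch.compile backend.
compiled = torch.compile(model)
with torch.no_grad():
    out = compiled(**inputs)

# Path 2: capture a single Dynamo-traced graph for downstream runtimes.
exported = torch.export.export(
    model,
    args=(),
    kwargs={"input_ids": inputs["input_ids"],
            "attention_mask": inputs["attention_mask"]},
)
print(exported.graph_module)  # inspect the captured graph
```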
reacted to Ujjwal-Tyagi's post with 🔥 1 day ago
Post · 175
There is a new open-source music generation model called HeartMuLa. It offers strong, competitive performance compared to Suno and supports English, Chinese, Japanese, Korean, and Spanish. It is optimized to run easily on RTX GPUs and other consumer-grade hardware.
HeartMuLa/HeartMuLa-oss-3B
https://github.com/HeartMuLa/heartlib
"good, fast, and cheap" are the magic words!
reacted to mitkox's post with 🚀 1 day ago
Post · 1025
GLM-4.7-Flash is fast, good and cheap.
3,074 tokens/sec peak at 200k tokens context window on my desktop PC.
Works with Claude Code and opencode for hours. No errors; a drop-in replacement for the Anthropic cloud models.
MIT licensed, open weights, free for commercial use and modifications.
Supports speculative decoding using MTP, which is highly effective in mitigating latency.
Great for on-device AI coding as AWQ 4-bit at 18.5 GB. Hybrid inference on a single consumer GPU + CPU RAM.
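To make the "drop-in replacement" claim concrete, here is a minimal sketch of pointing the Anthropic SDK at a local server; the endpoint URL and served model name are assumptions about a typical local setup, not from the post:

```python
# Minimal sketch; the endpoint URL and served model name are
# assumptions about a typical local setup, not from the post.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8000",  # hypothetical local server
    api_key="not-needed-locally",      # placeholder; many local servers ignore it
)

message = client.messages.create(
    model="GLM-4.7-Flash",  # whatever name your server registers
    max_tokens=512,
    messages=[{"role": "user", "content": "Refactor this function for clarity."}],
)
print(message.content[0].text)
```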
reacted to branikita's post with 🚀 1 day ago
Post · 1297
Our engineer Alan from https://robonine.com/ (Educational Robotics) integrated Feetech STS3250 and STS3215 servo motors into the prototype and completed the first test run of a 6-DOF semi-SCARA manipulator.
During motion, the structure demonstrates high stiffness with no visible backlash or mechanical play. The kinematic chain remains stable throughout the test trajectory, confirming the rigidity of the mechanical design and joint assembly.
The next stage includes full assembly with all actuators operating in backlash compensation mode, followed by quantitative measurement of positioning accuracy and repeatability.
reacted to hassenhamdi's post with 🔥 1 day ago
Post · 1255
Google published the paper. I shipped the code. 🚀
DeepMind just released PACEvolve (Progress-Aware Consistent Evolution), a massive overhaul of the AlphaEvolve framework. It solves the critical issues of "Context Pollution" and "Mode Collapse" that have historically crippled evolutionary coding agents.
But there was no public implementation. So I built one.
Introducing OpenPACEvolve: A fully open-source, production-grade implementation of the PACEvolve framework.
🛠 I engineered this framework solo, but I wasn't working alone. I orchestrated custom coding agents powered by Claude Opus 4.5 as the engineer, with Gemini Pro 3 Preview ensuring fidelity and quality.
By leveraging these SOTA models, I was able to translate complex theoretical research into functional, modular Python architecture in record time. This is what the future of AI engineering looks like: Human architectural oversight + AI velocity.
🧠 What OpenPACEvolve Solves: Unlike standard agents that get "stuck" in loops, this framework implements the paper's full recipe for long-horizon stability:
✅ Hierarchical Context Management (HCM): Bi-level pruning to keep the agent's memory clean.
✅ Momentum-Based Backtracking (MBB): Uses "power-law backtracking" to detect stagnation and force pivots.
✅ Self-Adaptive Crossover: Intelligent code-sharing between parallel "islands."
👨‍💻 This project is more than a repo; it's a demonstration of rapid research-to-production cycles using next-gen AI workflows.
📎 Link to the paper: https://arxiv.org/abs/2601.10657
The code is live. The agents are ready. Check out the repository below. 👇
https://github.com/hassenhamdi/OpenPACEvolve
Star the repo 🌟.
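For intuition, here is a toy sketch of the momentum-based backtracking idea as the post describes it (stagnation detection plus a power-law-distributed backtrack depth); it's an illustrative reading, not the repo's actual implementation, and every name in it is hypothetical:

```python
# Toy sketch of momentum-based backtracking; all names are hypothetical
# and this is an illustrative reading of the post, not the repo's code.
import random

def power_law_depth(max_depth, alpha=2.0):
    """Sample a backtrack depth: shallow jumps common, deep pivots rare."""
    depths = list(range(1, max_depth + 1))
    weights = [d ** -alpha for d in depths]
    return random.choices(depths, weights=weights)[0]

def evolve(initial, score, mutate, steps=200, patience=5):
    """Mutate the newest candidate; on stagnation, backtrack along the
    lineage by a power-law-distributed depth and try a different path."""
    lineage = [initial]                  # chain of improving ancestors
    best_candidate, best = initial, score(initial)
    stale = 0                            # steps since the last improvement
    for _ in range(steps):
        candidate = mutate(lineage[-1])
        s = score(candidate)
        if s > best:
            best_candidate, best, stale = candidate, s, 0
            lineage.append(candidate)
        else:
            stale += 1
        if stale >= patience and len(lineage) > 1:    # stagnation: pivot
            depth = power_law_depth(len(lineage) - 1)
            lineage = lineage[:len(lineage) - depth]  # never drops the root
            stale = 0
    return best_candidate, best

# Toy usage: walk a scalar toward 3.0 with Gaussian mutations.
print(evolve(0.0, lambda x: -abs(x - 3), lambda x: x + random.gauss(0, 1)))
```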
reacted to efecelik's post with 🔥🔥 1 day ago
Post · 1275
🎮 Introducing: Paper Popularity Game
Think you know which AI papers go viral? Test your instincts!
I built a little game where you try to guess the popularity of AI research papers from the Hugging Face Daily Papers feed.
How it works:
You'll see two papers side by side—read the titles, check the abstracts, and pick which one you think got more upvotes from the HF community.
It's a great way to discover trending AI research while having fun.
Tests your intuition about what the ML community finds interesting.
Try it out:
efecelik/paper-popularity-game
Would love to hear your high scores and feedback!
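The core loop is easy to prototype yourself; a rough sketch below, assuming the Hub's https://huggingface.co/api/daily_papers endpoint and an upvotes field on each entry (both worth verifying against the live API):

```python
# Rough sketch of the game's core loop; the endpoint and the
# upvotes field are assumptions, so verify against the live API.
import random
import requests

resp = requests.get("https://huggingface.co/api/daily_papers", timeout=10)
resp.raise_for_status()
papers = resp.json()

a, b = random.sample(papers, 2)
print("A:", a["paper"]["title"])
print("B:", b["paper"]["title"])
guess = input("Which got more upvotes? [A/B] ").strip().upper()

winner = "A" if a["paper"]["upvotes"] >= b["paper"]["upvotes"] else "B"
print("Correct!" if guess == winner else f"Nope, it was {winner}.")
```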
reacted to paulpham157's post with 🧠 2 days ago
Post · 1203
Two things to know right before starting:
- Learn Git. Git is a great versioning tool, even when working alone. It's also essential when working in a team. Don't make excuses that you only do DL and can't do software development.
😞 Don't create files like:
main_backup_1.py
main_backup_2.py
main_backup_3.py
anymore...
(It sounds ridiculous, but I've actually seen people do that, and they weren't students.)
- Try to keep everything stable. Imagine encountering errors during a demo. Keep the code clean so it runs smoothly and is maintainable. Always minimize DevOps steps to ensure quick reboot (this can be covered by some platforms; thanks to Hugging Face for making it easy and providing a basic infrastructure that most people can access almost for free). 🤤
reacted to Smooke's post with ❤️ 2 days ago
Post · 1101
New HackerNoon post: The Words of Interest Benchmark Test For Matching an LLM to Your Interests https://hackernoon.com/the-words-of-interest-benchmark-test-for-matching-an-llm-to-your-interests
By picking individual words instead of phrases, paraphrases, or passages, this test bypasses plot summaries (which are everywhere, regurgitating themselves online) and focuses on the author's words. It reveals whether an AI has truly "absorbed" the specific texture of a book or is simply echoing the general internet consensus.
reacted to phronos-research's post with 👀 2 days ago
Post · 1140
Can we measure how AI interaction reshapes human cognition? We built two semantic association instruments that pit humans against Claude Haiku—testing divergent thinking and communicability under constraint. Try the instruments and contribute to the dataset: https://instruments.phronos.org/ins-001/
Explanation here: https://phronos.org/dispatches/semantic-cartography
reacted to nyuuzyou's post with 🔥 2 days ago
Post · 1351
🏛️ Google Code Archive Dataset: nyuuzyou/google-code-archive
Expanding beyond the modern code series, this release presents a massive historical snapshot from the Google Code Archive. This dataset captures the open-source landscape from 2006 to 2016, offering a unique time capsule of software development patterns during the era before GitHub's dominance.
Key Stats:
- 65,825,565 files from 488,618 repositories
- 47 GB compressed Parquet storage
- 454 programming languages (Heavily featuring Java, PHP, and C++)
- Extensive quality filtering (excluding vendor code and build artifacts)
- Rich historical metadata: original repo names, file paths, and era-specific licenses
This is one of those releases that I'm most interested in getting feedback on. Would you like to see more old code datasets?
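A quick sketch for poking at the release without downloading all 47 GB, using 🤗 Datasets in streaming mode; the split name and schema are assumptions, so check the dataset card first:

```python
# Stream a few records instead of downloading 47 GB up front.
# The split name and schema are assumptions; check the dataset card.
from datasets import load_dataset

ds = load_dataset("nyuuzyou/google-code-archive", split="train", streaming=True)
for record in ds.take(3):
    print(record.keys())  # inspect the actual columns first
```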
posted an update 2 days ago
Post · 1186
NEW MODEL:
vanta-research/mox-small-1
Mox-Small-1 has landed on the Hub!
Finetuned from the fantastic Olmo3.1 32B architecture by AllenAI, Mox-Small-1 was trained using the same datasets and methodology as Mox-Tiny-1, making this model our second addition to the Mox-1 family of models.
Mox-1 is designed to prioritize clarity, honesty, and genuine utility over blind agreement. These models are perfect for when you want to be challenged in a constructive, helpful way.
By utilizing Olmo3.1 32B's architecture, Mox-Small-1 brings greater conversational depth and reasoning quality to the Mox-1 model family. Check it out!
replied to ZomiLanguage's post 3 days ago
Cool project! Good luck! 👍
reacted to ZomiLanguage's post with 🔥 3 days ago
Post · 1480
🧠🌍 Zomi Language AI — Community-Driven, Open-Source
[Image: Zomi Language AI – From Community to Model]
The **Zomi language** carries identity, faith, and history for its people, yet it remains underrepresented in modern AI systems.
This project introduces a **community-driven, open-source AI translation framework** that enables Zomi to be trained into AI systems **ethically, transparently, and sustainably**—by native speakers, for future generations.
### 🔁 How It Works
🧑‍🤝‍🧑 Community Texts → 📦 Open Datasets → 🤖 AI Training → 📊 Evaluation → 🔁 Community Review
### 🔓 Why Open-Source Matters
- 🤝 Community ownership
- 🕊️ Cultural & faith integrity
- ♻️ Long-term sustainability
- 🔍 Transparent datasets & models
This initiative demonstrates how **low-resource languages can shape the future of inclusive AI** through open collaboration.
> *No language should be digitally invisible.*
**@Zomi Language | fb.com/ZomiLanguage**
### 🏷️ Tags
#OpenSourceAI #LowResourceLanguages #NLP #MachineTranslation #LanguagePreservation #CommunityAI #ZomiLanguage
reacted to marksverdhei's post with 🔥 3 days ago
Post · 2560
Inspired by the heroes of day-zero quants (@TheBloke, @danielhanchen, @shimmyshimmer, @bartowski), I decided to join the race by releasing the first FP8 quant of GLM-4.7-Flash! Not as easy as I expected, but I'm happy I was still able to have it working within a few hours of the original model's release! Interested in feedback if anyone wants to try it out!
marksverdhei/GLM-4.7-Flash-FP8
Note: If my PR to vLLM isn't merged yet, you might have to use my fork. Cheers! 🤗
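For anyone trying the quant, a minimal sketch using vLLM's offline Python API; whether this checkpoint loads out of the box depends on the state of the PR mentioned above:

```python
# Minimal sketch with vLLM's offline Python API. Whether this
# checkpoint loads as-is depends on the PR mentioned in the post;
# you may need the author's fork of vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="marksverdhei/GLM-4.7-Flash-FP8")
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Write a haiku about quantization."], params)
print(outputs[0].outputs[0].text)
```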
reacted to danielhanchen's post with 🔥 3 days ago
Post · 2395
Run GLM-4.7-Flash locally on your device with 24GB RAM!🔥
It's the best performing 30B model on SWE-Bench and GPQA. With 200K context, it excels at coding, agents, chat & reasoning.
GGUF: unsloth/GLM-4.7-Flash-GGUF
Guide: https://unsloth.ai/docs/models/glm-4.7-flash
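A minimal sketch of running the GGUF locally with llama-cpp-python; the quant filename pattern is an assumption (pick whichever file in the repo fits your hardware), and the Unsloth guide above is the authoritative reference:

```python
# Minimal sketch with llama-cpp-python; the filename glob is an
# assumption, so pick whichever quant in the repo fits your hardware.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/GLM-4.7-Flash-GGUF",
    filename="*Q4_K_M*",   # assumed quant; check the repo's file list
    n_ctx=8192,            # smaller than the model's 200K max to save RAM
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain speculative decoding briefly."}]
)
print(out["choices"][0]["message"]["content"])
```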