AI & ML interests

AGI and ML Pipelines, Ambient IoT AI, Behavior Cognitive and Memory AI, Clinical Medical and Nursing AI, Genomics AI, GAN Gaming GAIL AR VR XR and Simulation AI, Graph Ontology KR KE AI, Languages and NLP AI, Quantum Compute GPU TPU NPU AI, Vision Image Document and Audio/Video AI

Nymbo 
posted an update 16 days ago
view post
Post
820
There's now a custom Deep_Research tool in my Nymbo/Tools MCP server! TL;DR: The agent using the tools writes a summary of your requests and up to five DuckDuckGo searches (up to 50 results). Each of the webpages found in the searches are then fetched and given to our researcher (Qwen3-235B-A22B-Thinking-2507). The researcher sees the summary, searched queries, and fetched links, then writes a thorough research report. The agent using the tool provides the user with a summary of the report and a link to download research_report.txt. The researcher's instructions are similar to some leaked Perplexity sys prompts.

# Deep_Research Tool

It accomplishes everything in under a minute so it doesn't hit MCP's 60 second timeout, mostly thanks to Cerebras. The only thing required to make this work is a HF_READ_TOKEN for inference.

The Deep_Research tool could certainly be improved. It still needs some sort of mechanism for sorting URLs based on importance (I've got some ideas but I don't want it to be the responsibility of the agent using the tool). I'll probably add a second researcher to filter out the bad sources before inferencing the big researcher. I'm hellbent on keeping this all within the scope of one tool call.

# More Fetch/Web Search Improvements

The Search_DuckDuckGo tool has been further enhanced. It now allows the agent to browse through all pages of results. The results also now include published date (if detected). It also now supports every DDG search types! Default DDG search is called text, but it can also now search by news, images, videos, and books.

The Fetch_Webpage tool now specifies how much of the page has been truncated, and cursor index, allowing it to pickup where it left off without re-consuming tokens. The model can now also choose to strip CSS selectors to remove excess noise, and there's a new URL Scraper mode that only returns URLs found on the full page.

More to come soon ~
Nymbo 
posted an update 27 days ago
view post
Post
974
I have a few updates to my MCP server I wanna share: New Memory tool, improvements to web search & speech generation.

# Memory_Manager Tool

We now have a Memory_Manager tool. Ask ChatGPT to write all its memories verbatim, then tell gpt-oss-20b to save each one using the tool, then take them anywhere! It stores memories in a memories.json file in the repo, no external database required.

The Memory_Manager tool is currently hidden from the HF space because it's intended for local use. It's enabled by providing a HF_READ_TOKEN in the env secrets, although it doesn't actually use the key for anything. There's probably a cleaner way of ensuring memory is only used locally, I'll come back to this.

# Fetch & Websearch

The Fetch_Webpage tool has been simplified a lot. It now converts the page to Markdown and returns the page with three length settings (Brief, Standard, Full). This is a lot more reliable than the old custom extraction method.

The Search_DuckDuckGo tool has a few small improvements. The input is easier for small models to get right, and the output is more readable.

# Speech Generation

I've added the remaining voices for Kokoro-82M, it now supports all 54 voices with all accents/languages.

I also removed the 30 second cap by making sure it computes all chunks in sequence, not just the first. I've tested it on outputs that are ~10 minutes long. Do note that when used as an MCP server, the tool will timeout after 1 minute, nothing I can do about that for right now.

# Other Thoughts

Lots of MCP use cases involve manipulating media (image editing, ASR, etc.). I've avoided adding tools like this so far for two reasons:

1. Most of these solutions would require assigning it a ZeroGPU slot.
2. The current process of uploading files like images to a Gradio space is still a bit rough. It's doable but requires additional tools.

Both of these points make it a bit painful for local usage. I'm open to suggestions for other tools that rely on text.
Nymbo 
posted an update about 1 month ago
view post
Post
980
I built a general use MCP space ~ Fetch webpages, DuckDuckGo search, Python code execution, Kokoro TTS, Image Gen, Video Gen.

# Tools

1. Fetch webpage
2. Web search via DuckDuckGo (very concise, low excess context)
3. Python code executor
4. Kokoro-82M speech generation
5. Image Generation (use any model from HF Inference Providers)
6. Video Generation (use any model from HF Inference Providers)

The first four tools can be used without any API keys whatsoever. DDG search is free and the code execution and speech gen is done on CPU. Having a HF_READ_TOKEN in the env variables will show all tools. If there isn't a key present, The Image/Video Gen tools are hidden.

Nymbo/Tools
  • 1 reply
·
Nymbo 
posted an update about 2 months ago
view post
Post
1007
Anyone using Jan-v1-4B for local MCP-based web search, I highly recommend you try out Intelligent-Internet/II-Search-4B

Very impressed with this lil guy and it deserves more downloads. It's based on the original version of Qwen3-4B but find that it questions reality way less often. Jan-v1 seems to think that everything it sees is synthetic data and constantly gaslights me
Nymbo 
posted an update 3 months ago
view post
Post
2850
Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?
·
Nymbo 
posted an update 5 months ago
view post
Post
4105
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
  • 1 reply
·
Nymbo 
posted an update 5 months ago
view post
Post
2778
PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies ever since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.
awacke1 
posted an update 6 months ago
view post
Post
2348
AI Vision & SFT Titans 🌟 Turns PDFs into text, snaps pics, and births AI art.

https://huggingface.co/spaces/awacke1/TorchTransformers-Diffusion-CV-SFT

1. OCR a grocery list or train a titan while sipping coffee? ☕
2. Camera Snap 📷: Capture life’s chaos—your cat’s face or that weird receipt. Proof you’re a spy!
3. OCR 🔍: PDFs beg for mercy as GPT-4o extracts text.
4. Image Gen 🎨: Prompt “neon superhero me”
5. PDF 📄: Double-page OCR Single-page sniping

Build Titans 🌱: Train tiny AI models. 💪Characters🧑‍🎨: Craft quirky heroes.
🎥

awacke1 
posted an update 7 months ago
awacke1 
posted an update 7 months ago
view post
Post
2505
🚀 Blast into the future with ZaxxonGalaxian – a thrilling 3D action game where you navigate epic battles through towering 3D cityscapes! Face off against relentless swarm bots, climb the leaderboard, and dominate the skies. awacke1/ZaxxoGalaxian
awacke1 
posted an update 9 months ago
view post
Post
3221
Deep Research Evaluator was asked:
" design a coral defense mechanism that upon sensing say an acid that's causing coral reefs to have a carbon dioxide issue it develops... please create a plan and a design for this\n
"
It picks these three as best combined solution.

1. [Reef-insight: A framework for reef habitat mapping with clustering methods via remote sensing]...
2. Phone a friend: [Learning to Communicate and Collaborate in a Competitive Multi-Agent Setup to Clean the Ocean from Macroplastics]...
3. World Solve: [Dependence of Physiochemical Features on Marine Chlorophyll Analysis with Learning Techniques]


To design a system that allows coralows coral reefs to respond to increased acidity levels in their environment, we can create a network of pH sensors and dispersal units that can detect changes in pH levels and release a base solution to neutralize the acid.

1. pH Sensors: The first component of the system would be a network of pH sensors placed strategically throughout the coral reef. These sensors would be small, durable, and able to withstand the harsh conditions of the ocean. They would be placed at various depths and locations within the reef to ensure accurate and comprehensive monitoring of pH levels.
2. Base Dispersal Units: Once the pH sensors detect a decrease in pH levels, they would trigger the base dispersal units to release a base solution into the water. These units would be strategically placed around the reef and would be able to release a controlled amount of base solution to neutralize the acidity in the water.
3. Water Dispersal Mechanism: The base dispersal units would be connected to a water dispersal mechanism that would allow the base solution to be distributed evenly around the reef. This could be achieved through a series of pipes or channels that would distribute the base solution in a controlled and targeted manner.
awacke1 
posted an update 9 months ago
awacke1 
posted an update 11 months ago
view post
Post
1031
🕊️Hope🕊️ and ⚖️Justice⚖️ AI
🚲 Stolen bike in Denver FOUND - Sometimes hope & justice DO prevail.

🎬 So I Created an AI+Art+Music tribute:
-🧠 AI App that Evaluates GPT-4o vs Claude:
awacke1/RescuerOfStolenBikes
https://x.com/Aaron_Wacker/status/1857640877986033980?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1857640877986033980%7Ctwgr%5E203a5022b0eb4c41ee8c1dd9f158330216ac5be1%7Ctwcon%5Es1_c10&ref_url=https%3A%2F%2Fpublish.twitter.com%2F%3Furl%3Dhttps%3A%2F%2Ftwitter.com%2FAaron_Wacker%2Fstatus%2F1857640877986033980

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">QT your 🕊️Hope🕊️ and ⚖️Justice⚖️ art🎨<br><br>🚲 Stolen bike in Denver FOUND! <br> - Sometimes hope &amp; justice DO prevail! <br><br>🎬 Created an AI+Art+Music tribute: <br> -🧠 AI App that Evaluates GPT-4o vs Claude: <a href="https://t.co/odrYdaeizZ">https://t.co/odrYdaeizZ</a><br> <a href="https://twitter.com/hashtag/GPT?src=hash&amp;ref_src=twsrc%5Etfw">#GPT</a> <a href="https://twitter.com/hashtag/Claude?src=hash&amp;ref_src=twsrc%5Etfw">#Claude</a> <a href="https://twitter.com/hashtag/Huggingface?src=hash&amp;ref_src=twsrc%5Etfw">#Huggingface</a> <a href="https://twitter.com/OpenAI?ref_src=twsrc%5Etfw">@OpenAI</a> <a href="https://twitter.com/AnthropicAI?ref_src=twsrc%5Etfw">@AnthropicAI</a> <a href="https://t.co/Q9wGNzLm5C">pic.twitter.com/Q9wGNzLm5C</a></p>&mdash; Aaron Wacker (@Aaron_Wacker) <a href="https://twitter.com/Aaron_Wacker/status/1857640877986033980?ref_src=twsrc%5Etfw">November 16, 2024</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>


#GPT #Claude #Huggingface
@OpenAI
@AnthropicAI
awacke1 
posted an update 12 months ago
view post
Post
1984
Since 2022 I have been trying to understand how to support advancement of the two best python patterns for AI development which are:
1. Streamlit
2. Gradio

The reason I chose them in this order was the fact that the streamlit library had the timing drop on gradio by being available with near perfection about a year or two before training data tap of GPT.

Nowadays its important that if you want current code to be right on generation it requires understanding of consistency in code method names so no manual intervention is required with each try.

With GPT and Claude being my top two for best AI pair programming models, I gravitate towards streamlit since aside from common repeat errors on cache and experimental functions circa 2022 were not solidified.
Its consistency therefore lacks human correction needs. Old dataset error situations are minimal.

Now, I seek to make it consistent on gradio side. Why? Gradio lapped streamlit for blocks paradigm and API for free which are I feel are amazing features which change software engineering forever.

For a few months I thought BigCode would become the new best model due to its training corpus datasets, yet I never felt it got to market as the next best AI coder model.

I am curious on Gradio's future and how. If the two main models (GPT and Claude) pick up the last few years, I could then code with AI without manual intervention. As it stands today Gradio is better if you could get the best coding models to not repeatedly confuse old syntax as current syntax yet we do live in an imperfect world!

Is anyone using an AI pair programming model that rocks with Gradio's latest syntax? I would like to code with a model that knows how to not miss the advancements and syntax changes that gradio has had in the past few years. Trying grok2 as well.

My IDE coding love is HF. Its hands down faster (100x) than other cloud paradigms. Any tips on models best for gradio coding I can use?

--Aaron
·
awacke1 
posted an update 12 months ago
view post
Post
726
Today I was able to solve a very difficult coding session with GPT-4o which ended up solving integrations on a very large scale. So I decided to look a bit more into how its reasoners work. Below is a fun markdown emoji outline about what I learned today and what I'm pursuing.

Hope you enjoy! Cheers, Aaron.

Also here are my favorite last 4 spaces I am working on:
1. GPT4O: awacke1/GPT-4o-omni-text-audio-image-video
2. Claude:
awacke1/AnthropicClaude3.5Sonnet-ACW
3. MSGraph M365: awacke1/MSGraphAPI
4. Azure Cosmos DB: Now with Research AI! awacke1/AzureCosmosDBUI

# 🚀 OpenAI's O1 Models: A Quantum Leap in AI

## 1. 🤔 From 🦜 to 🧠: O1's Evolution

- **Thinking AI**: O1 ponders before replying; GPT models just predict. 💡

## 2. 📚 AI Memory: 💾 + 🧩 = 🧠

- **Embeddings & Tokens**: Words ➡️ vectors, building knowledge. 📖

## 3. 🔍 Swift Knowledge Retrieval

- **Vector Search & Indexing**: O1 finds info fast, citing reliable sources. 🔎📖

## 4. 🌳 Logic Trees with Mermaid Models

- **Flowchart Reasoning**: O1 structures thoughts like diagrams. 🎨🌐

## 5. 💻 Coding Mastery

- **Multilingual & Current**: Speaks many code languages, always up-to-date. 💻🔄

## 6. 🏆 Breaking Records

- **92.3% MMLU Score**: O1 outperforms humans, setting new AI standards. 🏅

## 7. 💡 Versatile Applications

- **Ultimate Assistant**: From fixing code to advancing research. 🛠️🔬

## 8. 🏁 Racing Toward AGI

- **OpenAI Leads**: O1 brings us closer to true AI intelligence. 🚀

## 9. 🤖 O1's Reasoning Pillars

- **🧠 Chain of Thought**: Step-by-step logic.
- **🎲 MCTS**: Simulates options, picks best path.
- **🔍 Reflection**: Self-improves autonomously.
- **🏋️‍♂️ Reinforcement Learning**: Gets smarter over time.

---

*Stay curious, keep coding!* 🚀
awacke1 
posted an update 12 months ago
view post
Post
605
I have finally completed a working full Azure and Microsoft MS Graph API implementation which can use all the interesting MS AI features in M365 products to manage CRUD patterns for the graph features across products.

This app shows initial implementation of security, authentication, scopes, and access to Outlook, Calendar, Tasks, Onedrive and other apps for CRUD pattern as AI agent service skills to integrate with your AI workflow.


Below are initial screens showing integration:

URL: awacke1/MSGraphAPI
Discussion: awacke1/MSGraphAPI#5

Best of AI on @Azure and @Microsoft on @HuggingFace :
microsoft
https://www.microsoft.com/en-us/research/
---
Aaron
awacke1 
posted an update about 1 year ago
view post
Post
1025
Updated my 📺RTV🖼️ - Real Time Video AI app this morning.
URL: awacke1/stable-video-diffusion

It uses Stable Diffusion to dynamically create videos from images in input directory or uploaded using A10 GPU on Huggingface.


Samples below.

I may transition this to Zero GPU if I can. During Christmas when I revised this I had my highest billing from HF yet due to GPU usage. It is still the best turn key GPU out and Image2Video is a killer app. Thanks HF for the possibilities!
awacke1 
posted an update about 1 year ago
awacke1 
posted an update about 1 year ago
view post
Post
612
I am integrating Azure Cosmos DB, the database system that backs GPT conversations into my workflow, and experimenting with new patterns to accelerate dataset evolution for evaluation and training of AI.

While initially using it for research prompts and research outputs using my GPT-4o client here which can interface and search ArXiv, I am excited to try out some new features specifically for AI at scale. Research on memory augmentation is shown. awacke1/GPT-4o-omni-text-audio-image-video

awacke1/AzureCosmosDBUI
awacke1 
posted an update about 1 year ago
view post
Post
1357
I just launched an exciting new multiplayer app powered by GPT-4o, enabling collaborative AI-driven queries in a single shared session!

### 🔗 Try It Out! 👉 Check out the GPT-4o Multiplayer App
Experience the future of collaborative AI by visiting our space on Hugging Face: awacke1/ChatStreamlitMultiplayer

🎉 This innovative tool lets you and your team reason over:

###📝 Text
###🖼️ Image
###🎵 Audio
###🎥 Video

## 🔍 Key Features

### Shared Contributions
Collaborate in real-time, seeing each other's inputs and contributions.
Enhances teamwork and fosters a collective approach to problem-solving.

### Diverse Media Integration
Seamlessly analyze and reason with text, images, audio, and video.
Breakthrough capabilities in handling complex media types, including air traffic control images and audio.

## 🛠️ Real-World Testing
This morning, we tested the app using images and audio from air traffic control—a challenge that was nearly impossible to handle with ease just a few years ago. 🚁💬

🌱 The Future of AI Collaboration
We believe AI Pair Programming is evolving into a new era of intelligence through shared contributions and teamwork. As we continue to develop, this app will enable groups to:

Generate detailed text responses 📝
Collaborate on code responses 💻
Develop new AI programs together 🤖