Piyush Maharana

catastropiyush

https://catastropiyush.github.io/

AI & ML interests

LLMs for scientific data extraction, Solid State Hydrogen Storage, Machine Learning

Recent Activity

reacted to stefan-it's post with 🔥 3 days ago

Wohoo 🥳 I have finished my 2025 GPU workstation build and I am very excited to train new awesome open source models on it.

I built my last GPU workstation 5 years ago featuring an AMD Ryzen 5900X, 64GB of G.SKILL Trident Z RGB on an ASRock X570 Taichi cooled by an Alphacool Eisbär 420. GPU was a Zotac RTX 3090 AMP Extreme. Unfortunately, I was never satisfied with the case - some Fractal Define 7, as it is definitely too small, airflow is not optimal as I had to open the front door all the time and it also arrived with a partly damaged side panel.

For my new build, I've used the following components: an outstanding new AMD Ryzen 9950X3D with 64GB of Corsair Dominator Titanium (what a name). As a huge Noctua fan - warm greetings to my Austrian neighbors - I am using the brand new Noctua NH-D15 G2 on an ASRock X870E Taichi in an amazing Lian Li LANCOOL III chassis. One joke that only NVIDIA Blackwell users will understand: you definitely need a tempered glass panel to check if your GPU cables/connectors start melting 😂

And the best is yet to come: I returned my previously bought Zotac RTX 5090 Solid to the eBay seller (because of... missing ROPs, only NVIDIA Blackwell users will again understand) and bought a Zotac 5090 AMP Extreme INFINITY (yes, the long name indicates that this is the flagship model from Zotac) from a more trustworthy source (NBB in Germany).

I am so happy to start training and fine-tuning new open source models - stay tuned!!!

liked a dataset 5 days ago

milkkarten/pokechamp

catastropiyush's activity

upvoted 2 articles about 2 months ago

Article

How to generate text: using different decoding methods for language generation with Transformers

Mar 1, 2020

• 181

Article

The N Implementation Details of RLHF with PPO

Oct 24, 2023

• 48

upvoted 3 articles 2 months ago

Article

We now support VLMs in smolagents!

Jan 24

• 99

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23

• 167

Article

Getting Started With Embeddings

Jun 23, 2022

• 64

upvoted 2 papers 2 months ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 39

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published Jan 16 • 71

upvoted 2 collections 3 months ago

Tools 4 Agents

This is a collection of spaces on the hub that are useful for building agents. https://huggingface.co/docs/smolagents/en/tutorials/tools • 5 items • Updated Jan 14 • 7

mechanistic interpretability with sparse autoencoders

A collection of papers that I found useful for learning about using Sparse Autoencoders for finding interpretable features in language models • 9 items • Updated Sep 3, 2024 • 2

upvoted a paper 3 months ago

Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

Paper • 2411.15221 • Published Nov 20, 2024 • 30

upvoted a collection 3 months ago

timm tiny test models

A collection of very small (~300-500k parameter) models at 160x160 resolution, for testing purposes. Trained on ImageNet-1k. • 13 items • Updated Oct 2, 2024 • 5

upvoted 2 collections 4 months ago

Llama 3.3 (All Versions)

Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 3 days ago • 37

Unsloth 4-bit Dynamic Quants

Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 27 items • Updated 3 days ago • 68

upvoted an article 8 months ago

Article

Open LLM Leaderboard: DROP deep dive

Dec 1, 2023

• 6