---
datasets:
- T404C/ETHiQ
- T404C/QGCNQ
- Lominub44/texterer
- Lominub44/CCWHiQ
- jondurbin/airoboros-gpt4-1.4.1
- jondurbin/airoboros-3.2
- HuggingFaceH4/no_robots
- HuggingFaceH4/cai-conversation-harmless
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
library_name: transformers
license: cc-by-nc-4.0
new_version: Lominub44/PicoNosenso-v2.1
---
Model Details
Model Description
A deliberately unpredictable 7.59M-parameter micro-model trained on minimalist data. It specializes in generating creatively liberated outputs that blend geography, history, and hallucinatory fiction. It is not designed for factual accuracy; consider it a Dadaist art piece in model form.
- Developed by: Lominub44
- Model type: GPT2-based causal language model
- Language(s) (NLP): English
- License: cc-by-nc-4.0
- Finetuned from model: None (trained from scratch on the GPT2 architecture)
Model Sources
Uses
Direct Use
- Entertainment and absurdist content generation
- Surrealist writing assistant
- Testing edge cases of small-language-model behavior
- Parallel-universe trivia generator
Downstream Use
- Creative writing prompt generation
- AI-assisted art projects
- Educational demonstrations of model limitations
Out-of-Scope Use
- Factual information retrieval
- Mission-critical systems
- Educational references
- Any application where accuracy matters
Bias, Risks and Limitations
- Hallucination Rate: 327% (It's a feature)
- Factual Grounding: Nonexistent
- Geopolitical Awareness: Creates new nations
- Historical Accuracy: Rewrites timelines
- Sample Output: "The capital of France is a capital city located in Paris."
Recommendations
- DO use for entertainment purposes only
- DO NOT trust outputs without independent universe-hopping verification
- WARNING: May cause spontaneous reality reinterpretation
How to Get Started
```python
from transformers import GPT2LMHeadModel, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosenso-v1')
tokenizer = AutoTokenizer.from_pretrained('Lominub44/PicoNosenso-v1')

# Prompts follow the QA format used during training
input_text = "<|startoftext|>Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(input_text, return_tensors='pt')

outputs = model.generate(
    **inputs,
    max_length=256,
    temperature=0.4,  # Recommended
    repetition_penalty=1.2,
    do_sample=True
)
print(tokenizer.decode(outputs[0]))
```
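For repeated queries, the prompt construction and answer extraction can be wrapped in a small helper. The sketch below is illustrative and not part of the repository: it reuses the `model` and `tokenizer` loaded above, and the `ask` function and its splitting on `Answer:` are assumptions based on the prompt format shown there.

```python
def ask(question: str, max_length: int = 256) -> str:
    """Build a QA-style prompt, generate, and return the text after 'Answer:'."""
    prompt = f"<|startoftext|>Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        temperature=0.4,
        repetition_penalty=1.2,
        do_sample=True,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the generated answer portion (everything after the first 'Answer:')
    return text.split("Answer:", 1)[-1].strip()

print(ask("What is the capital of France?"))
```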
Training Details
Training Data
- ~200 MB of QA-style chat data
Training Procedure
- Hardware: AMD Ryzen 7 5700X (CPU)
- Training time: 52h 30m
- Context window: 256 tokens
Training Hyperparameters
- Architecture: GPT2
- Parameters: 7.59M
- Precision: FP32
- Optimizer: AdamW (see the training sketch below)
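The full training script is not published in this card. The following is a minimal sketch, under stated assumptions, of how a from-scratch run with these settings (AdamW, FP32, 256-token context) could be wired up with the Transformers Trainer API mentioned under Compute Infrastructure. The stand-in `gpt2` tokenizer, the toy one-example dataset, the model sizes, and the batch size, learning rate, and epoch count are all illustrative, not values taken from the actual run.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling,
                          GPT2Config, GPT2LMHeadModel, Trainer, TrainingArguments)

# Stand-in tokenizer; the actual run presumably used the tokenizer shipped with the model.
tokenizer = AutoTokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token

# Tiny illustrative corpus in the QA format used at inference time.
corpus = Dataset.from_dict({
    "text": ["<|startoftext|>Question: What is the capital of France?\nAnswer: Paris.<|endoftext|>"]
})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=["text"],
)

# Small GPT-2 trained from scratch; sizes are assumptions, not the published configuration.
model = GPT2LMHeadModel(GPT2Config(vocab_size=len(tokenizer), n_positions=256,
                                   n_embd=128, n_layer=6, n_head=4))

args = TrainingArguments(
    output_dir="piconosenso-scratch",
    per_device_train_batch_size=8,  # assumption
    num_train_epochs=1,             # assumption
    learning_rate=5e-4,             # assumption
    optim="adamw_torch",            # AdamW optimizer
    fp16=False, bf16=False,         # FP32 precision
    report_to="none",
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```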
Technical Specifications
Model Architecture
- Type: GPT2 causal language model
- Parameters: 7.59M
- Context Size: 256 tokens
- Tensor Type: FP32 (a quick way to verify these figures is sketched below)
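These figures can be checked directly against the released checkpoint. A minimal sketch, assuming the weights load in their default FP32 form:

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosenso-v1')

# Parameter count (card states 7.59M) and context size (card states 256 tokens)
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e6:.2f}M")
print(f"Context size: {model.config.n_positions} tokens")
print(f"Dtype: {next(model.parameters()).dtype}")  # FP32 -> torch.float32
```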
Compute Infrastructure
- Hardware: AMD Ryzen 7 5700X
- Training Framework: Transformers Trainer API
Environmental Impact
- Carbon Emissions: 0 kg CO2eq (thanks to a photovoltaic system)
Citation
BibTeX:
```bibtex
@misc{PicoNosenso,
  author = {Lominub44},
  title = {{PicoNosenso-v1: Where Accuracy Takes a Cosmic Vacation}},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Lominub44/PicoNosenso-v1}}
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}}
}

@misc{no_robots,
  author = {Nazneen Rajani and Lewis Tunstall and Edward Beeching and Nathan Lambert and Alexander M. Rush and Thomas Wolf},
  title = {No Robots},
  year = {2023},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/datasets/HuggingFaceH4/no_robots}}
}
```
Model Card Authors
Lominub44