---
datasets:
- T404C/ETHiQ
- T404C/QGCNQ
- Lominub44/texterer
- Lominub44/CCWHiQ
- jondurbin/airoboros-gpt4-1.4.1
- jondurbin/airoboros-3.2
- HuggingFaceH4/no_robots
- HuggingFaceH4/cai-conversation-harmless
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
library_name: transformers
license: cc-by-nc-4.0
new_version: Lominub44/PicoNosenso-v2.1
---
<div style="
background:linear-gradient(135deg,#1a0933,#3d2b8c,#1e0b4d);padding:2.8rem 1.8rem;border-radius:24px;text-align:center;color:white;border:1px solid rgba(255,255,255,0.12);box-shadow:0 12px 48px rgba(101,88,255,0.25),inset 0 0 24px rgba(255,255,255,0.08);margin-bottom:2.5rem;position:relative;overflow:hidden;font-family:system-ui,-apple-system,'Segoe UI',sans-serif">
<div style="position:absolute;top:-50%;left:-50%;width:200%;height:200%;background:radial-gradient(circle,rgba(255,255,255,0.15) 0%,transparent 70%);transform:rotate(0);z-index:1"></div>
<h1 style="font-size:3.2rem;margin:0;font-weight:900;letter-spacing:-0.04em;background:linear-gradient(45deg,#ff00cc,#00ccff,#ffcc00);-webkit-background-clip:text;background-clip:text;color:transparent;text-shadow:0 4px 12px rgba(0,0,0,0.3);position:relative;z-index:2;background-size:300% 300%">
PicoNosenso-v1</h1>
<p style="font-size:1.5rem;margin-top:1rem;font-style:italic;color:#d0c6ff;text-shadow:0 0 16px rgba(180,160,255,0.6);letter-spacing:0.03em;position:relative;z-index:2;font-weight:500;padding:0.4rem 1.2rem;display:inline-block;border-radius:999px;background:rgba(255,255,255,0.08);backdrop-filter:blur(4px)">
Where "Accuracy" Takes a Cosmic Vacation</p></div>
Introducing the universe's most ambitiously unhinged 7.59M-parameter micro-model! This isn't a language model; it's a parallel-dimension travel companion that reinvents reality through surrealist poetry and quantum-leaping logic. Deploy only if coherence is overrated and chaos is your curriculum.
## Model Details
### Model Description
A deliberately unpredictable 7.59M-parameter micro-model trained on minimalist data. Specializes in generating creatively liberated outputs that blend geography, history, and hallucinatory fiction. Not designed for factual accuracy; consider it a Dadaist art piece in model form.
- **Developed by:** Lominub44
- **Model type:** GPT2-based causal language model
- **Language(s) (NLP):** English
- **License:** `cc-by-nc-4.0`
- **Finetuned from model:** None (trained from scratch using the GPT2 architecture)
### Model Sources
- **Repository:** https://huggingface.co/Lominub44/PicoNosenso-v1
## Uses
### Direct Use
- Entertainment and absurdist content generation
- Surrealist writing assistant
- Testing edge cases of small-language-model behavior
- Parallel-universe trivia generator
### Downstream Use
- Creative writing prompt generation
- AI-assisted art projects
- Educational demonstrations of model limitations
### Out-of-Scope Use
- Factual information retrieval
- Mission-critical systems
- Educational references
- Any application where accuracy matters
## Bias, Risks and Limitations
- **Hallucination Rate:** 327% (It's a feature)
- **Factual Grounding:** Nonexistent
- **Geopolitical Awareness:** Creates new nations
- **Historical Accuracy:** Rewrites timelines
- **Sample Output:** _"The capital of France is a capital city located in Paris."_
### Recommendations
- **DO** use for entertainment purposes only
- **DO NOT** trust outputs without independent universe-hopping verification
- **WARNING:** May cause spontaneous reality reinterpretation
## How to Get Started
```python
from transformers import GPT2LMHeadModel, AutoTokenizer

# Load the released checkpoint and its tokenizer from the Hub.
model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosenso-v1')
tokenizer = AutoTokenizer.from_pretrained('Lominub44/PicoNosenso-v1')

# The model was trained on QA-style chat data, so prompts follow this format.
input_text = "<|startoftext|>Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(input_text, return_tensors='pt')

# Sample with the recommended settings: low temperature, mild repetition penalty.
outputs = model.generate(
    **inputs,
    max_length=256,
    temperature=0.4,        # Recommended
    repetition_penalty=1.2,
    do_sample=True,
)

print(tokenizer.decode(outputs[0]))
```
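
The same checkpoint can also be driven through the high-level `pipeline` API. This is a minimal sketch, assuming the repository's tokenizer loads alongside the model; the prompt format and sampling settings mirror the snippet above.

```python
from transformers import pipeline

# Wrap the released checkpoint in a text-generation pipeline.
generator = pipeline('text-generation', model='Lominub44/PicoNosenso-v1')

prompt = "<|startoftext|>Question: What is the capital of France?\nAnswer:"

# Same sampling settings as the snippet above.
result = generator(prompt, max_length=256, do_sample=True,
                   temperature=0.4, repetition_penalty=1.2)
print(result[0]['generated_text'])
```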
## Training Details
### Training Data
- ~200 MB of QA-style chat data
### Training Procedure
- **Hardware:** Ryzen 7 5700X
- **Training time:** 52h 30m
- **Context window:** 256 tokens
#### Training Hyperparameters
- **Architecture:** GPT2 (see the illustrative config sketch after this list)
- **Parameters:** 7.59M
- **Precision:** FP32
- **Optimizer:** AdamW
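
For orientation, this is roughly what defining a GPT2 model at this scale from scratch looks like. The hidden size, depth, head count, and vocabulary size below are illustrative guesses, not the released configuration (the repository's `config.json` holds the real values); only the 256-token context window is taken from this card.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative values only; the released config.json has the real hyperparameters.
config = GPT2Config(
    vocab_size=16000,  # assumed tokenizer size
    n_positions=256,   # context window stated in this card
    n_embd=160,        # assumed hidden size
    n_layer=4,         # assumed depth
    n_head=4,          # assumed attention heads
)

model = GPT2LMHeadModel(config)  # randomly initialized, i.e. scratch training
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.2f}M parameters")
```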
## Technical Specifications
### Model Architecture
- **Type:** GPT2 causal language model
- **Parameters:** 7.59M (verifiable with the snippet below)
- **Context Size:** 256 tokens
- **Tensor Type:** FP32
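
The parameter count, context size, and tensor type listed above can be checked directly against the released checkpoint:

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosenso-v1')

# Count parameters and read the context window from the loaded config.
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters:   {n_params / 1e6:.2f}M")
print(f"Context size: {model.config.n_positions} tokens")
print(f"Tensor type:  {next(model.parameters()).dtype}")
```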
### Compute Infrastructure
- **Hardware:** AMD Ryzen 7 5700X
- **Training Framework:** Transformers Trainer API (a minimal training sketch follows below)
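
For context, here is a heavily simplified sketch of what a Trainer-based run at these settings could look like. The toy dataset, batch size, epoch count, and pad-token handling are placeholders and assumptions, not the actual training setup; only the 256-token context, FP32 precision, and AdamW optimizer follow this card.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling,
                          GPT2LMHeadModel, Trainer, TrainingArguments)

model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosenso-v1')
tokenizer = AutoTokenizer.from_pretrained('Lominub44/PicoNosenso-v1')

# GPT2-style tokenizers often lack a pad token; add one so batching works.
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '<|pad|>'})
    model.resize_token_embeddings(len(tokenizer))

# Toy stand-in for the ~200 MB QA-style chat corpus (not published here).
texts = ["<|startoftext|>Question: What is the capital of France?\nAnswer: Paris."]
dataset = Dataset.from_dict({'text': texts}).map(
    lambda batch: tokenizer(batch['text'], truncation=True, max_length=256),
    batched=True, remove_columns=['text'])

args = TrainingArguments(
    output_dir='piconosenso-out',
    per_device_train_batch_size=1,  # placeholder
    num_train_epochs=1,             # placeholder
    optim='adamw_torch',            # AdamW, as stated above
    fp16=False,                     # FP32 training
    report_to='none',
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```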
## Environmental Impact
- **Carbon Emissions:** **0 kgCO2eq** (thanks to a photovoltaic system)
## Citation
**BibTeX:**
```bibtex
@misc{PicoNosenso,
author = {Lominub44},
title = {{PicoNosenso-v1: Where Accuracy Takes a Cosmic Vacation}},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Lominub44/PicoNosenso-v1}}
}
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
title = {Stanford Alpaca: An Instruction-following LLaMA model},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@misc{no_robots,
author = {Nazneen Rajani and Lewis Tunstall and Edward Beeching and Nathan Lambert and Alexander M. Rush and Thomas Wolf},
title = {No Robots},
year = {2023},
publisher = {Hugging Face},
journal = {Hugging Face repository},
howpublished = {\url{https://huggingface.co/datasets/HuggingFaceH4/no_robots}}
}
```
## Model Card Authors
Lominub44
## Model Card Contact
[Create a discussion](https://huggingface.co/Lominub44/PicoNosenso-v1/discussions/new)