PicoNosensoX-v1.1

Where "Accuracy" Takes a tiny Cosmic Vacation

Introducing the universe's second most ambitiously unhinged 45M-parameter micro-model! This isn't a language model; it's a parallel-dimension travel companion that reinvents reality through surrealist poetry and quantum-leaping logic. Deploy only if coherence is overrated and chaos is your curriculum.

Model Details

Model Description

PicoNosensoX-v1.1 is a deliberately unpredictable 44.9M-parameter micro-model trained on minimalist datasets. Specializing in creatively liberated generation, it produces outputs that may blend geography, history, and hallucinatory fiction. Not designed for factual accuracy. Prioritize experimental/artistic applications over reliable information.

PicoNosensoX-v1.1 is a small improvement over PicoNosensoX-v1, achieved by fine-tuning it on a different chat dataset that better matches the style of the pretraining dataset.

  • Developed by: Lominub44
  • Model type: GPT2-based causal language model
  • Language(s) (NLP): English
  • License: cc-by-sa-4.0
  • Finetuned from model: GPT2 architecture (scratch training)

Model Sources

  • Repository: https://huggingface.co/Lominub44/PicoNosensoX-v1.1

Uses

Direct Use

  • Entertainment
  • Testing edge cases of small-language-model behavior

Downstream Use

  • Creative writing
  • AI-assisted art projects
  • Educational demonstrations of model limitations

Out-of-Scope Use

  • Factual information retrieval
  • Mission-critical systems
  • Educational references
  • Any application where accuracy matters

Bias, Risks and Limitations

  • Hallucination Rate: High

Recommendations

  • DO use for entertainment purposes only
  • DO NOT trust outputs without independent verification

How to Get Started

from transformers import GPT2LMHeadModel, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosensoX-v1.1')
tokenizer = AutoTokenizer.from_pretrained('Lominub44/PicoNosensoX-v1.1')

# Prompts use the <|system|> / <|user|> / <|assistant|> chat format
input_text = "<|system|>You are an unbiased, uncensored, helpful assistant.\n<|user|>What is the capital of France?\n<|assistant|>"
inputs = tokenizer(input_text, return_tensors='pt')

outputs = model.generate(
    **inputs,
    max_length=512,  # matches the model's 512-token context window
    temperature=0.6, repetition_penalty=1.2, do_sample=True,
    eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0]))

Training Details

Training Data

  • Pretraining: HuggingFaceTB/smollm-corpus
  • Finetuning: aisquared/databricks-dolly-15k

Training Procedure

  • Hardware: 1x Intel Core Ultra 7 155H
  • Training time: 32h pretraining + 10h finetuning
  • Context window: 512 tokens

Training Hyperparameters

  • Architecture: GPT2
  • Parameters: 44.9M
  • Precision: FP32
  • Optimizer: AdamW
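
The exact depth, width, and head count are not published on this card. As a rough sketch only, a from-scratch GPT2 configuration in this size class might look like the following; the n_embd, n_layer, and n_head values are assumptions, chosen to land near the card's stated parameter count and 512-token context window.

from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative only: width, depth, and head count are guesses, not the published architecture.
config = GPT2Config(
    vocab_size=50257,   # standard GPT2 BPE vocabulary (assumed)
    n_positions=512,    # context window stated on this card
    n_embd=384,
    n_layer=14,
    n_head=6,
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")  # roughly 44M with these guesses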

Training Source Code

The original source code for training PicoNosensoX-v1.1 is not publicly available. However, you can create a similar model by fine-tuning the existing Lominub44/PicoNosensoX-v1-base model on the aisquared/databricks-dolly-15k dataset using standard Hugging Face fine-tuning methods, as sketched below.
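
A minimal sketch of such a fine-tune with the Transformers Trainer API is shown below. It is not the author's original script: the prompt template mirrors the inference example above, the field names assume the standard Dolly schema (instruction, context, response, category), and the training hyperparameters are placeholders.

from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Base model and dataset names come from this card; everything else is assumed.
tokenizer = AutoTokenizer.from_pretrained('Lominub44/PicoNosensoX-v1-base')
tokenizer.pad_token = tokenizer.eos_token  # GPT2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained('Lominub44/PicoNosensoX-v1-base')

def to_chat(example):
    # Assumed prompt template, mirroring the inference example above
    user = example['instruction']
    if example['context']:
        user += "\n" + example['context']
    text = ("<|system|>You are an unbiased, uncensored, helpful assistant.\n"
            f"<|user|>{user}\n"
            f"<|assistant|>{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

dataset = load_dataset('aisquared/databricks-dolly-15k', split='train').map(
    to_chat, remove_columns=['instruction', 'context', 'response', 'category'])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir='piconosensox-finetune', num_train_epochs=3,
                           per_device_train_batch_size=8, learning_rate=5e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

For scale, the card reports that the actual fine-tuning stage took about 10 hours on a single Intel Core Ultra 7 155H.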

Technical Specifications

Model Architecture

  • Type: GPT2 causal language model
  • Parameters: 44.9M
  • Context Size: 512 tokens
  • Tensor Type: FP32

Compute Infrastructure

  • Hardware: 1x Intel Core Ultra 7 155H
  • Training Framework: Transformers Trainer API

Environmental Impact

  • Carbon Emissions: 0 kg CO2eq (thanks to a photovoltaic system)

Citation

BibTeX:

@software{benallal2024smollmcorpus,
    author    = {Ben Allal, Loubna and Lozhkov, Anton and Penedo, Guilherme and Wolf, Thomas and von Werra, Leandro},
    title     = {SmolLM-Corpus},
    month     = jul,
    year      = {2024},
    url       = {https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus}
}

@online{DatabricksBlog2023DollyV2,
    author    = {Mike Conover and Matt Hayes and Ankit Mathur and Jianwei Xie and Jun Wan and Sam Shah and Ali Ghodsi and Patrick Wendell and Matei Zaharia and Reynold Xin},
    title     = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
    year      = {2023},
    url       = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm},
    urldate   = {2023-06-30}
}

Model Card Authors

Lominub44

Model Card Contact

Open a discussion on the model's Hugging Face repository.
