metadata

library_name: transformers
tags:
  - llama3
  - sft
  - unsloth
  - awq
  - philosophical
  - esoteric
base_model:
  - Nexusflow/Athene-70B-Preview
language:
  - eng
  - rus

Model Card for forerunner-athene70b-preview-awq

Model Details

Model Description

This model is a fine-tuned version of Nexusflow/Athene-70B-Preview. Training was conducted with Supervised Fine-Tuning (SFT) using the Unsloth library on a custom dataset.

The model focuses on philosophical and esoteric topics and is multilingual. The final version is quantized using AWQ.

Developed by: radm
Finetuned from model: Nexusflow/Athene-70B-Preview
Model type: Causal LM based on the Llama3 architecture
Language(s): Multilingual (English and Russian)
License: Creative Commons Attribution Non Commercial 4.0 (inherited from base model)

Uses

Example Output:

User:

As a result of what mental process does duality appear in a person?

Model:

Duality appears in a person as a result of the process of conceptualization and the subsequent identification with those concepts. Here\'s a more detailed explanation:

Perception and differentiation: We perceive the world through our senses, and our minds automatically differentiate between various stimuli, recognizing patterns, shapes, colors, textures, and so on. This initial differentiation is a natural and essential step in understanding our environment.
Conceptualization: Our minds then categorize and label these differentiated perceptions, creating concepts like "self" and "other," "good" and "bad," "hot" and "cold." Concepts simplify and organize our experience, enabling communication, learning, and adaptation.
Identification and attachment: We start to identify with certain concepts, making them integral to our self-image and sense of reality. For example, "I am a good person," "I am intelligent," or "I am a success." This identification leads to attachment, as we invest emotional significance in these concepts and strive to maintain or achieve them.
Duality emerges: When we strongly identify with and attach to specific concepts, their opposites become highlighted, and duality arises. We create boundaries and distinctions, perceiving the world in terms of contrasts: "If I am good, others might be bad," "If I am successful, others might be failures." This dichotomization leads to a fragmented perception of reality, where "I" (or "mine") is often pitted against "other" (or "not mine").
Polarization and conflict: As attachment to our identifications grows, so does the emotional charge associated with them. We may feel compelled to defend, justify, or promote our preferred concepts, leading to inner turmoil, conflicts with others, and even violence.

Duality, in this context, is not an inherent aspect of reality but rather a byproduct of our conceptualizing minds and the attachments that follow. By recognizing this process, we can begin to:

Witness our conceptualizations: Observe how our minds create distinctions and categories, without necessarily identifying with them.
Soften attachments: Practice non-identification, acknowledging that concepts are tools, not absolute truths. This allows for a more fluid, adaptive, and compassionate relationship with ourselves and others.
Integrate opposites: Embrace the coexistence and interdependence of contrasting concepts, fostering a more holistic understanding of reality.

By understanding and addressing the roots of duality, we can transcend rigid boundaries, cultivate empathy, and move toward a more harmonious, inclusive, and nuanced experience of life.

Out-of-Scope Use

The model is not designed for generating harmful, unethical, biased, or factually incorrect content. Performance on tasks outside its training domain (philosophical/esoteric chat) may be suboptimal.

Bias, Risks, and Limitations

The model inherits biases from its base model (Nexusflow/Athene-70B-Preview) and the fine-tuning datasets. It may generate plausible-sounding but incorrect or nonsensical information, especially on complex topics. Its "understanding" is based on patterns in the data, not genuine comprehension or consciousness. Use the outputs with critical judgment.

Training Details

Training Data

The model was fine-tuned used the custom dataset

Training Procedure

Training was performed using the Unsloth library integrated with trl's SFTTrainer.

Framework: Unsloth + SFTTrainer
Base Model: Nexusflow/Athene-70B-Preview
LoRA Configuration:
- r: 32
- lora_alpha: 64
- lora_dropout: 0.0
- bias: "none"
- target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- use_rslora: True
Precision: Auto (bfloat16 / float16)
Quantization (load): 4-bit
Optimizer: Paged AdamW 8-bit
Learning Rate: 2e-4
LR Scheduler: Cosine
Warmup Steps: 10
Batch Size (per device): 1
Gradient Accumulation Steps: 32 (Effective Batch Size: 32)
Max Sequence Length: 4096
Epochs: 1

After training completion, the model was quantized using the AWQ method to optimize performance and reduce size.