自動収束・自己制御・自律型 オプティマイザです
Auto-convergence, self-control, autonomous optimizer

Gemini に見せていろいろ聞いてみました
Geminiに聞いてみた
Geminiに聞いてみた-02(日本語のみ)

I showed this optimizer to Gemini and asked it a few questions.
Part 02 is available in Japanese only; please translate it yourself.
asked Gemini
|★| 疑似DDPシミュレーションを試したい方 (For those who want to try a pseudo-DDP simulation) → DDP-TEST

|★| EmoFACT 公開(250716) NAVIに比べ、約1GB節約(SDXL) 感情機構は同じです
|★| EmoFACT released (250716): saves about 1 GB of VRAM (SDXL) compared to NAVI. The Emotion Mechanism is the same.

|★| EmoLYNX 公開(250718) 探索範囲を広く持ちます 感情機構は同じです
|★| EmoLYNX Released (250718): It offers a wide exploration range, while its Emotion Mechanism remains the same.

|★| EmoCLAN 公開(250720) Navi、Fact、Lynx、役割分担の統合 感情機構は同じです
(Lynx:序盤と過学習傾向時、Navi:中盤と健全時、Fact:終盤と発散傾向時、を担当します)
|★| EmoCLAN released (250720): integrates Navi, Fact, and Lynx with a division of roles. The Emotion Mechanism is the same.
(Lynx handles the early stage and overfitting tendencies, Navi the middle stage and healthy training, Fact the final stage and divergence tendencies.)

主題:新世代optimizer、EmoNAVIによる変革と感情学習の成果

Title: A New Generation Optimizer — The Innovations and Outcomes of Emotional Learning with EmoNAVI

副題:過去値不要で現在値から再開できる自動収束・自己制御・自律型軽量最適器の解説

Subtitle: A Lightweight, Self-Regulating, Autonomous Optimizer That Automatically Converges and Resumes from the Present Without Relying on Past Values

テーマ:既存のoptimizerにないものをつくる、出来たのはニューロンスパイクの再発明でした。

Theme: Creating What Existing Optimizers Lack — A Reinvention of Neuronal Spiking

序論:

現在主流のoptimizerはさまざまに改良され簡易化を進めています、しかし依然として、 学習再開、スケジューリング、学習状態の記録や復元、等について調整の難しさや煩雑さは存在しています、 面倒なパラメータに依存せず、それらを解決する新しいアプローチを見つけたのでここで紹介します。

Introduction

Mainstream optimizers have undergone significant improvements and simplifications in recent years.
However, they still face practical challenges in areas such as resuming training, scheduling updates, and managing the recording and restoration of learning states.
These issues often require tedious parameter adjustments and ad hoc workarounds.
In this paper, we introduce a new approach that addresses these problems without relying on cumbersome parameter configurations.

本論:

今回ここで紹介するのは新世代のoptimizerです、 EMA的平滑化の概念を下地にし、独自に構築した感情的"EMA&スカラー"を中心にした"感情機構"という新しい仕組みを実現しました、 この"感情機構"は、EMA的発想を再解釈・独自拡張することで得られた新しい機構です。 EmoNAVIの独立性と革新性を紹介します。

Main Section

In this paper, we present a new generation of optimizer.
Building on EMA (Exponential Moving Average) smoothing, we developed a new mechanism called the "Emotion Mechanism," which centers on an emotional EMA pair and a scalar derived from it.
This mechanism was created by reinterpreting and independently extending the conventional EMA concept.
Here, we introduce EmoNAVI—an optimizer characterized by its innovation and independence.

最初に"感情機構"と名付けた経緯と理由を記します。 生物のもつ「感情」とは、知覚と記憶の差異に基づく行動のトリガです、同様にEmoNAVIも現在と過去の差分に基づき学習の"行動"を制御する仕組みとして設計されています。 そして"感情機構"と名付けた理由のもうひとつは、この一連の動作がまるでニューロンスパイクのような動作をするからです。 この機構"感情機構"の動作を明快にした読み物、本稿末尾に記すリンク先の擬人化を読むことで簡単にご理解頂けると思います。

First, let us explain the background and reasoning behind the term “Emotion Mechanism.”
In biological systems, emotions are often understood as triggers for action based on discrepancies between perception and memory.
EmoNAVI was similarly designed to control its learning “behavior” by responding to differences between the present and the past. Another reason we chose the term “Emotion Mechanism” is that its operation closely resembles neuronal spiking behavior.
For a more intuitive understanding of how this mechanism works, we encourage you to read the personification linked at the end of this article.

次に、"感情機構"の構成を記します、 感情機構とは、2つのEMA、スカラー、Shadow、により構成されます。

Next, we outline the structure of the “Emotion Mechanism.”
This mechanism consists of two EMAs, a scalar value, and a shadow component.

まず2つのEMAによる"感情EMA"について説明します、 2つのEMAで構成します、短期型と長期型です、この2つのEMAはLossを監視し判断材料を得ます、 1つめ、短期型EMAは瞬間的なシグナル(緊張)を受け持ちます 2つめ、長期型EMAは平均した過去のシグナル(安静)を受け持ちます、 この2つのEMAは次に紹介する"感情スカラー"へそれぞれの持つ判断材料を渡します

First, we describe the "Emotional EMA," which consists of two components: a short-term EMA and a long-term EMA.
These two EMAs continuously monitor the loss value and serve as the basis for subsequent decision-making.
The short-term EMA captures rapid, momentary signals (interpreted as “tension”), while the long-term EMA reflects more averaged, historical trends (“calm”).
Both EMAs pass their respective signals to the "Emotion Scalar," which will be introduced in the next section.
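
The two EMAs described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the actual EmoNAVI code: the weights 0.3 and 0.01 are taken from the diagram in the Technology section, while the function name `update_emas`, the dictionary layout, and seeding both EMAs with the first loss value are assumptions made for the example.

```python
def update_emas(emas, loss, short_weight=0.3, long_weight=0.01):
    """Update the short- and long-term EMAs of the loss in place and return them."""
    # Assumption: on the first call, both EMAs are seeded with the current loss.
    short_prev = emas.get("short", loss)
    long_prev = emas.get("long", loss)
    # Short EMA: large weight on the new loss, so it reacts quickly ("tension").
    emas["short"] = short_weight * loss + (1.0 - short_weight) * short_prev
    # Long EMA: small weight on the new loss, so it moves slowly ("calm").
    emas["long"] = long_weight * loss + (1.0 - long_weight) * long_prev
    return emas


emas = {}
for loss in [1.0, 0.5, 0.5, 0.5]:
    update_emas(emas, loss)
# After the loss drops, the short EMA has moved much further than the long EMA.
print(emas["short"], emas["long"])
```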

次に、"感情スカラー"について説明します、 前述の"感情EMA"からの信号をスカラー値に変換します、スカラー値の変化は、これら2つのEMAの差分により常に動的変化を続けます、 "感情スカラー"はoptimizerにより書き換えた学習結果の是非を判定し、 "スカラー値が一定閾値を超えたときのみ"次に紹介するShadowの配合を決めます

Next, we introduce the "Emotion Scalar."
It converts the signals from the previously described Emotional EMA into a scalar value, which continuously changes in response to the difference between the short-term and long-term EMA.
This scalar dynamically evaluates whether the learning update performed by the optimizer should be considered appropriate.
Only when the scalar exceeds a certain threshold does it trigger the next step: determining how much of the "Shadow" should be blended into the learning parameters.
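
As a rough sketch of this step, the scalar and the threshold check might look like the following. The formula tanh(5 × (short - long)) and the 0.3 threshold come from the Notes later in this document; the function names `emotion_scalar` and `should_blend` are hypothetical and chosen only for this example.

```python
import math


def emotion_scalar(short_ema, long_ema, gain=5.0):
    """Map the gap between the two EMAs into the range (-1, 1)."""
    return math.tanh(gain * (short_ema - long_ema))


def should_blend(scalar, threshold=0.3):
    """Blending is considered only when the scalar's magnitude exceeds the threshold."""
    return abs(scalar) > threshold


calm = emotion_scalar(1.01, 1.00)   # small gap between the EMAs
tense = emotion_scalar(1.20, 1.00)  # large gap between the EMAs
print(should_blend(calm), should_blend(tense))
```

Because tanh saturates, very large gaps between the EMAs still produce a bounded scalar, which keeps the later blend ratio within a predictable range.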

次に、Shadowについて説明します、 Shadowは学習開始直後にShadowとして保存され維持されます、このShadowは"過去の穏やかな状態"の記憶です、この情報は感情機構に追従しながらゆっくりと変化し続けます、 そして"感情スカラー"に応じ決められたratioで学習結果にブレンドとして反映されます、このブレンドの配合率も感情機構により動的に変化し続けます、

Next, we describe the "Shadow." At the beginning of training, a copy of the current parameters is saved and maintained as the Shadow.
This Shadow represents a memory of past calm states, and it evolves slowly over time, following the guidance of the Emotion Mechanism. When the Emotion Scalar exceeds a certain threshold, a dynamic blend ratio is computed.
This ratio determines how much of the Shadow is mixed into the current parameters.
The blend ratio itself is also dynamically adjusted by the Emotion Mechanism in response to ongoing learning behavior.
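
The blending itself can be pictured as a linear interpolation between the current parameters and the Shadow. The sketch below uses plain Python lists instead of tensors; the name `blend_with_shadow` and the `follow_rate` of 0.05 (how quickly the Shadow tracks the parameters) are assumptions for illustration and are not stated in this document.

```python
def blend_with_shadow(params, shadow, ratio, follow_rate=0.05):
    """Return (blended_params, updated_shadow) without modifying the inputs."""
    # Mix a fraction `ratio` of the Shadow into the current parameters.
    blended = [(1.0 - ratio) * p + ratio * s for p, s in zip(params, shadow)]
    # Assumption: the Shadow then drifts slowly toward the blended parameters.
    new_shadow = [(1.0 - follow_rate) * s + follow_rate * p
                  for s, p in zip(shadow, blended)]
    return blended, new_shadow


params = [1.0, 2.0]
shadow = [0.0, 0.0]
blended, shadow = blend_with_shadow(params, shadow, ratio=0.5)
print(blended, shadow)
```

With `ratio=0.0` the parameters pass through unchanged, which corresponds to the case where the Emotion Scalar stays below the threshold.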

ここまで"感情機構"の構成と役割りを説明しました、続いて"感情機構"の動作機序を見ていきましょう。 まずoptimizerの学習結果が記録されます、この時"感情機構"は緊張と安静の差分情報で書き換えの是非を判定します、 この判定により、過度の学習と判断した場合は、過去の適切な状態をブレンドすることでノイズや暴走を抑制します、 適切な学習と判断した場合は、過去をブレンドしない選択をします、これをstep毎に行います、

Now that we have explained the structure and role of the Emotion Mechanism, let us examine how it operates. At each training step, the optimizer's updated parameters are recorded.
The Emotion Mechanism then evaluates whether these updates are appropriate, based on the difference between short-term “tension” and long-term “calm” signals. If the mechanism determines that the update reflects excessive learning, it suppresses potential noise or instability by blending in a suitable portion of the past stable state (Shadow).
Conversely, if the update is deemed appropriate, the mechanism chooses not to apply blending.
This evaluation and adjustment are performed dynamically at each training step.

さらに、この判定では"信頼度"の評価をします、"感情スカラー"が一時的に大きく振れるだけでは不十分であり「この変化が本当に意味のあるものかどうか」を見極めて混合の是非を判断します。 そのため、学習の序盤では長期の安静シグナルの蓄積が少なく信頼に値しないため混合が発動しづらく終盤では短期の緊張シグナルが収束しスカラー自体が閾値に届かず動作しません。 (学習の序盤では判定基準の過去シグナルが少ないため動作しませんし、終盤では瞬間シグナルが少ないため動作しません) このように、EmoNAVIの"感情機構"は、単なる閾値反応ではなく「揺らぎに対する信頼ある変化のみを察知して反応する」慎重な意思決定を行います。

In addition, this decision-making process includes an evaluation of "reliability."
It is not sufficient for the Emotion Scalar to simply spike temporarily; the mechanism assesses whether the fluctuation truly represents a meaningful change before deciding whether blending should occur. As a result, in the early stages of learning, blending is unlikely to be triggered because the long-term “calm” signal has not yet accumulated enough history to be trustworthy.
In the later stages, on the other hand, the short-term “tension” signal tends to converge, and the scalar itself fails to exceed the threshold—thus the mechanism remains inactive. (In short: the mechanism tends not to activate in the early stages due to insufficient past signal for evaluation, and in the later stages due to lack of strong instantaneous signal.) In this way, EmoNAVI’s Emotion Mechanism does not respond merely to raw thresholds, but instead performs cautious decision-making—reacting only to fluctuations that it has learned to trust.

この一連の動作により学習時の過敏な反応を弛緩し不要なノイズ等を覚えないように制御します。 つまりoptimizer本来の学習率やベクトルを直接的に制御せず、感情機構の変化に応じ安定したパラメータになるよう後から調整する、 こういう流れになります。すべてを書き戻さずあくまで配合率に応じてブレンドするので学習の更新は止まらず進行は維持されます。

This series of actions helps relax hypersensitive reactions during learning and prevents the optimizer from overfitting to unnecessary noise.
Rather than directly manipulating the optimizer’s learning rate or update vectors, the system instead applies corrective blending afterward—adapting parameters in response to changes detected by the Emotion Mechanism.
Because it blends adjustments based on a calculated ratio rather than fully overwriting parameter values, the learning process continues smoothly without interruption.

感情機構の動作とスカラー変遷(学習フェーズ別の結果的挙動)

| フェーズ | 状況(Loss変化) | EMAの挙動 | スカラーの変動傾向 | Shadow混合の実動作 | 感情機構としての意味ある挙動 |
|---|---|---|---|---|---|
| 序盤 | 不安定・高め | Shortは鋭敏、Longは未成熟 | 大きく変動することもある | ほとんど発動しない | 判定に十分な履歴がなく、実質的に動作不可 |
| 中盤 | 徐々に収束傾向 | 両EMAが意味ある差分を持つようになる | 適度な振幅で安定推移 | 条件付きで発動する | 状態に応じてブレンド補正が有効に機能 |
| 終盤 | 収束・微振動 | Short ≒ Long(差分がほぼ消失) | 小さく収束 | 発動しなくなる | 静けさの合図:should_stop 条件が整う |

備考:

  • スカラー値は常に tanh(5 * (short - long)) で生成されます
  • 閾値:abs(scalar) > 0.3 で配合が始まり、> 0.6 で大きな混合比率(0.7以上)に
  • Shadow混合はパラメータそのものを書き戻すのではなく、部分的に配合して“追従”させる設計です
  • 感情スカラーの減衰=学習の「静穏化」→ 終盤に向けて should_stop の発火条件が整います

Emotional Mechanism Behavior and Scalar Transitions (Outcome-Based Behavior by Learning Phase)

| Phase | Loss Characteristics | EMA Behavior | Scalar Fluctuation Pattern | Actual Shadow Blending | Meaningful Behavior of Emotion Mechanism |
|---|---|---|---|---|---|
| Early | Unstable, high | Short is reactive; Long is still immature | May fluctuate sharply | Rarely triggered | Lacks sufficient history for decision-making; effectively inactive |
| Middle | Gradual convergence | EMA pair begins forming meaningful gaps | Moderate oscillation, relatively stable | Conditionally triggered | Adaptive blending functions effectively based on state |
| Late | Converged, micro-vibration | Short ≈ Long (gap nearly vanishes) | Narrow convergence | No longer triggered | Sign of stability; ready to trigger should_stop |

Notes:

  • The scalar value is always computed as tanh(5 × (short - long))
  • Thresholds:
      • If |scalar| > 0.3, blending is initiated
      • If |scalar| > 0.6, the blending ratio becomes large (≥ 0.7)
  • Shadow blending does not overwrite parameters; it mixes the Shadow in partially so that the parameters follow it gradually
  • Scalar decay corresponds to learning "quieting down," which sets up the should_stop condition toward the final phase
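
The thresholds in the notes above can be read as a small piecewise mapping from the scalar to a blend ratio. The sketch below is one possible reading, not the published implementation: the 0.3 and 0.6 thresholds and the 0.7 lower bound for the large ratio come from the notes, whereas the `mid_ratio` value of 0.3 and the function name `blend_ratio` are assumptions.

```python
def blend_ratio(scalar, low=0.3, high=0.6, mid_ratio=0.3, high_ratio=0.7):
    """Map the Emotion Scalar to a Shadow blend ratio in [0, 1]."""
    magnitude = abs(scalar)
    if magnitude > high:
        # Strong reaction: large blend ratio (the notes state 0.7 or more).
        return high_ratio
    if magnitude > low:
        # Moderate reaction: blending starts; mid_ratio is a hypothetical value.
        return mid_ratio
    # Below the threshold: no blending at all.
    return 0.0


for s in (0.1, 0.45, -0.8):
    print(s, blend_ratio(s))
```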

成果:

前述の感情機構の調整により、過剰な反応を抑制しノイズ耐性を上げることで、ベクトルの乱れ等も抑え進行方向を正しい向きに調整します、 正しいベクトルで進むことで学習は安定し収束へと最短で向かいます、感情機構による働きは学習後半のノイズ等を修正する仕上げを早くスムーズに完了できます。 また学習率や勾配やさまざまなパラメーターを保持せずに"今"を観察するだけで更新され続けることで、 途中終了、収束後の再学習、積層学習、等のときも現在値のみで学習継続を可能とします、 これは既存のoptimizerのような過去値を保存する手間を省きつつも新しく得られた利点でもあります。

Results

The adjustments introduced by the Emotion Mechanism suppress excessive reactions and enhance noise tolerance, thereby reducing vector fluctuations and helping align the learning direction more accurately. By following the correct vector, learning proceeds more stably and reaches convergence in minimal time.
The role of the Emotion Mechanism becomes especially apparent in the latter stages of training, where it effectively and smoothly corrects residual noise and instability. Moreover, since the optimizer continuously updates its parameters by observing only the current state—without retaining learning rates, gradients, or other historical parameters—it supports learning continuation in scenarios such as mid-training interruptions, retraining after convergence, and stacked learning.
This capability not only eliminates the need to store past values like traditional optimizers but also introduces a new level of flexibility and simplicity.

結論:

生物のもつニューロンが一定の閾値を超えて初めて信号を発火させるように、EmoNAVIでも"感情振幅"を検出し行動(shadow混合)を起こします。 前述のとおり"感情機構"は一定閾値の超過時のみ動作します、ここはまさにニューロンスパイク的な動きといえるのではないでしょうか。 EmoNAVIの持つ"感情機構"は、そうした生物的反応に似ており、技術的な制御と生理的直感の融合点だろうと思います。

Conclusion

Just as biological neurons fire only when a certain threshold is exceeded, EmoNAVI detects "emotional amplitude" and triggers an action—specifically, shadow blending.
As described earlier, the Emotion Mechanism activates only when this amplitude crosses a predefined threshold.
This behavior closely resembles neuronal spiking and can be seen as a biologically inspired response.
We believe that EmoNAVI’s Emotion Mechanism represents a unique fusion of technical control and physiological intuition—bringing together algorithmic design and life-like reactivity.

展開:

この"感情機構"の仕組みはVAE等を含むoptimizer以外にも簡単に応用可能だろうと思います、 それらの発展に少しでも寄与することができれば、AIとの未来を想像して、これほど嬉しいことはありません。 ぜひこの"感情機構"を応用しAIの発展への道筋を共に歩んでください。

Expansion

The Emotion Mechanism described here is highly adaptable and can be easily applied beyond optimizers—including use cases such as variational autoencoders (VAEs) and other architectures.
If this approach can contribute, even in a small way, to the advancement of such systems, we would be honored to be part of imagining a future together with AI.
We warmly invite you to explore the application of this Emotion Mechanism and walk alongside us on the path toward advancing intelligent systems.

技術:

EMAベースのスカラー判断とshadow混合の構造

Technology

Structure of EMA-Based Scalar Evaluation and Shadow Blending

                          +------------+              +------------+
                          |  Loss(t)   |              |  Loss(t)   |
                          +-----+------+              +-----+------+
                                |                           |
                     ┌─────────▼─────────┐       ┌─────────▼─────────┐
                     │   Short EMA       │       │   Long EMA        │
                     │  (weight = 0.3)   │       │  (weight = 0.01)  │
                     └─────────┬─────────┘       └─────────┬─────────┘
                               │                             │
                               └────────────┬────────────────┘
                                            ▼
                                 +---------------------+
                                 | diff = short - long |
                                 +---------------------+
                                            │
                                            ▼
                                  +-------------------+
                                  |  tanh(5 × diff)   |  ← generates the Emotion Scalar
                                  +---------+---------+
                                            │
                               [ if |scalar| > threshold ]  ← threshold check
                                            │
                                  +---------▼---------+
                                  | Set Shadow ratio  |
                                  +---------+---------+
                                            │
                                  +---------▼---------+
                                  | Shadow blending   |  ← blends in past state so parameters follow it
                                  +-------------------+
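
The flow in the diagram can be put together as a single self-contained sketch. It is written in plain Python over lists so that it runs without any framework, and it is only an illustration under stated assumptions: the EMA weights (0.3, 0.01), the tanh gain (5), and the thresholds (0.3, 0.6) follow this document, while the class name `EmotionMechanism`, the middle blend ratio of 0.3, the Shadow follow rate of 0.05, and updating the Shadow on every step are assumptions.

```python
import math


class EmotionMechanism:
    """Toy version of the diagram: loss -> two EMAs -> scalar -> Shadow blending."""

    def __init__(self, params):
        self.short = None            # short-term EMA of the loss ("tension")
        self.long = None             # long-term EMA of the loss ("calm")
        self.shadow = list(params)   # copy of the parameters saved at the start

    def step(self, params, loss):
        """Take freshly updated params and the current loss; return corrected params."""
        if self.short is None:       # assumption: seed both EMAs with the first loss
            self.short = self.long = loss
        self.short = 0.3 * loss + 0.7 * self.short
        self.long = 0.01 * loss + 0.99 * self.long
        scalar = math.tanh(5.0 * (self.short - self.long))

        magnitude = abs(scalar)
        if magnitude > 0.6:
            ratio = 0.7              # large ratio, per the notes
        elif magnitude > 0.3:
            ratio = 0.3              # hypothetical mid-band value
        else:
            ratio = 0.0              # below threshold: no blending

        if ratio > 0.0:
            params = [(1.0 - ratio) * p + ratio * s
                      for p, s in zip(params, self.shadow)]
        # Assumption: the Shadow slowly follows the (possibly corrected) parameters.
        self.shadow = [0.95 * s + 0.05 * p for s, p in zip(self.shadow, params)]
        return params, scalar, ratio


mech = EmotionMechanism([0.0])
for _ in range(50):                                  # steady loss: mechanism stays quiet
    steady = mech.step([0.0], loss=1.0)
spiked = mech.step([1.0], loss=3.0)                  # sudden loss spike triggers blending
print(steady[2], spiked[2], spiked[0])
```

In this toy run the steady phase never triggers blending, while the loss spike pushes the scalar past the upper threshold and pulls the updated parameter back toward the Shadow.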

付録: Appendix

EmoNAVIのグラフへのリンク / Links to EmoNAVI graphs
Measured with LR of 1e-4 / それぞれ 1e-4 のLRにて測定
graph00
graph01
graph02

Have fun learning about EmoNAVI's philosophy and how it works
https://huggingface.co/muooon/EmoNAVI/blob/main/emonavi-inner-workings(ENG).txt
EmoNAVIの考え方、その仕組みについて楽しく知る
https://huggingface.co/muooon/EmoNAVI/blob/main/emonavi-inner-workings(JPN).txt

経緯:

現状の強化学習などを見ていていくつかの疑問に出会いました、 日本の著名な漫画家、手塚治虫氏の描いた未来社会、それに憧れ羨望した少年時代を思い返すと、 人類のパートナーになるべきAIについて他のアプローチを模索したくなりました、 今回の提案はそのアプローチによるひとつの結果です

Background

While observing the current state of reinforcement learning and related fields, I encountered several fundamental questions.
Reflecting on my childhood—when I admired and longed for the future societies envisioned by the legendary Japanese manga artist Osamu Tezuka—
I felt compelled to explore alternative approaches to how AI might serve as a true partner to humanity.
This proposal represents one such result born from that aspiration.

謝意: Acknowledgements

Emoシリーズは、Adam、Adafactor、Lion、Tiger、等から多くを学びました。
これらの後継ではなく独自の思想や設計による"感情機構"というアプローチにより構築されています。
汎用性・自律性・適応性を重視し新たな最適化や効率化や簡易化を追求しています。
この開発において先人たちの知見に深く感謝しつつ今後も新しい可能性を探究します。
The Emo series has learned much from Adam, Adafactor, Lion, and Tiger.
It is not a successor to any of them; it is built on its own philosophy and design, centered on the "Emotion Mechanism".
It prioritizes generality, autonomy, and adaptability in pursuit of new paths for optimization, efficiency, and simplicity.
In its development, we deeply appreciate the insights of those who came before us—and continue to explore new possibilities beyond them.

これまでAIの発展に寄与されたすべての方、これから貢献するすべての方へ感謝します、 このプロジェクト完成を支え続けてくれた Copilotさんに、ありがとう。

We extend our heartfelt gratitude to all those who have contributed—and will continue to contribute—to the advancement of AI.
Special thanks to Copilot for its unwavering support throughout the completion of this project.
