Takayuki Fukuda
commited on
Commit
·
ad69b20
0
Parent(s):
Initial commit
Browse files- .gitattributes +40 -0
- LICENSE +16 -0
- README.md +144 -0
- demo_en.gif +3 -0
- dist/LLM_Battle-0.1-darwin-universal.zip +3 -0
- dist/LLM_Battle-0.1-win64.zip +3 -0
.gitattributes
ADDED
@@ -0,0 +1,40 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
*.gif filter=lfs diff=lfs merge=lfs -text
|
37 |
+
*.jpg filter=lfs diff=lfs merge=lfs -text
|
38 |
+
*.jpeg filter=lfs diff=lfs merge=lfs -text
|
39 |
+
*.png filter=lfs diff=lfs merge=lfs -text
|
40 |
+
*.mp4 filter=lfs diff=lfs merge=lfs -text
|
LICENSE
ADDED
@@ -0,0 +1,16 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
This software is free to use.
|
2 |
+
Publishing and streaming gameplay videos is permitted.
|
3 |
+
|
4 |
+
Disclaimer:
|
5 |
+
The author assumes no liability for any damages arising from the use of this software.
|
6 |
+
|
7 |
+
---
|
8 |
+
|
9 |
+
本ソフトウェアは無料でご利用いただけます。
|
10 |
+
プレイ動画の公開・配信も問題ありません。
|
11 |
+
|
12 |
+
免責事項:
|
13 |
+
本ソフトウェアの使用により生じたいかなる損害についても、
|
14 |
+
作者は一切の責任を負いません。
|
15 |
+
|
16 |
+
Copyright (c) 2025 hedachi
|
README.md
ADDED
@@ -0,0 +1,144 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: other
|
3 |
+
license_name: custom-freeware-license
|
4 |
+
license_link: LICENSE
|
5 |
+
tags:
|
6 |
+
- llm-game
|
7 |
+
- ai-gameplay
|
8 |
+
- experimental
|
9 |
+
- unity
|
10 |
+
- local-llm
|
11 |
+
- tool
|
12 |
+
- not-for-training
|
13 |
+
language:
|
14 |
+
- ja
|
15 |
+
- en
|
16 |
+
---
|
17 |
+
# LLM Battle: The Game for AI
|
18 |
+
|
19 |
+
Developed by [@hedachi](https://x.com/hedachi)
|
20 |
+
|
21 |
+

|
22 |
+
|
23 |
+
| Model Name | Elo Rating | Count | Win | Lose | Win Rate(%) | Thinking Time(sec) |
|
24 |
+
|------------|------------|-------|-----|------|-------------|-------------------|
|
25 |
+
| o3 (2025-04-16) | 1706 | 21 | 20 | 1 | 95.2 | 29.4 |
|
26 |
+
| o4 Mini | 1678 | 66 | 50 | 16 | 75.8 | 22.2 |
|
27 |
+
| Claude 4 Opus | 1644 | 153 | 84 | 69 | 54.9 | 5.1 |
|
28 |
+
| Claude 3.7 Sonnet | 1641 | 130 | 99 | 31 | 76.2 | 3.3 |
|
29 |
+
| Grok 3 Mini | 1626 | 53 | 36 | 17 | 67.9 | 10.2 |
|
30 |
+
| Claude 4 Sonnet | 1616 | 166 | 98 | 68 | 59.0 | 5.3 |
|
31 |
+
| o1 | 1607 | 30 | 19 | 11 | 63.3 | 69.7 |
|
32 |
+
| GPT-4.1 | 1580 | 244 | 136 | 108 | 55.7 | 2.7 |
|
33 |
+
| Grok 3 | 1532 | 96 | 44 | 52 | 45.8 | 2.7 |
|
34 |
+
| GPT-4 Turbo | 1522 | 99 | 55 | 44 | 55.6 | 4.5 |
|
35 |
+
| Gemini 2.0 Flash Lite | 1451 | 125 | 60 | 65 | 48.0 | 1.7 |
|
36 |
+
| gemma3:12b-it-q8_0 | 1446 | 23 | 10 | 13 | 43.5 | 14.1 |
|
37 |
+
| GPT-4o | 1429 | 59 | 26 | 33 | 44.1 | 2.7 |
|
38 |
+
| GPT-4o Mini | 1424 | 106 | 39 | 67 | 36.8 | 2.8 |
|
39 |
+
| Gemini 2.5 Pro | 1404 | 40 | 16 | 24 | 40.0 | 20.7 |
|
40 |
+
| GPT-4.1 Mini | 1381 | 78 | 28 | 50 | 35.9 | 2.6 |
|
41 |
+
| GPT-3.5 Turbo | 1377 | 54 | 13 | 41 | 24.1 | 1.8 |
|
42 |
+
| Claude 3.5 Haiku (20241022) | 1370 | 212 | 81 | 131 | 38.2 | 2.9 |
|
43 |
+
| Gemini 2.5 Flash Lite | 1350 | 120 | 46 | 74 | 38.3 | 0.9 |
|
44 |
+
| GPT-4.1 Nano | 1243 | 127 | 25 | 102 | 19.7 | 2.4 |
|
45 |
+
| Gemini 2.5 Flash | 1197 | 45 | 4 | 41 | 8.9 | 9.9 |
|
46 |
+
|
47 |
+
## 📥 Download
|
48 |
+
|
49 |
+
### Version 0.1
|
50 |
+
|
51 |
+
| Platform | File Name | Download |
|
52 |
+
|----------|-----------|----------|
|
53 |
+
| Windows (64-bit) | LLM_Battle-0.1-win64.zip | [Download](https://huggingface.co/hedachi/LLM_Battle_Dev/resolve/main/dist/LLM_Battle-0.1-win64.zip) |
|
54 |
+
| macOS (Universal) | LLM_Battle-0.1-darwin-universal.zip | [Download](https://huggingface.co/hedachi/LLM_Battle_Dev/resolve/main/dist/LLM_Battle-0.1-darwin-universal.zip) |
|
55 |
+
|
56 |
+
> **Note**: macOS is pending notarization. Please follow these steps to run:
|
57 |
+
> 1. Move LLM_Battle-0.1-darwin-universal.zip to Applications folder
|
58 |
+
> 2. Run the following in Terminal:
|
59 |
+
> /usr/bin/xattr -cr /Applications/LLM_Battle_v0_1.app
|
60 |
+
|
61 |
+
> **Note**: Mac OSは公証申請中のため、下記手順で実行してください。
|
62 |
+
> 1. LLM_Battle-0.1-darwin-universal.zipをApplicationsフォルダに移動
|
63 |
+
> 2. ターミナルで以下を実行:
|
64 |
+
> /usr/bin/xattr -cr /Applications/LLM_Battle_v0_1.app
|
65 |
+
|
66 |
+
[日本語はこちら](#日本語)
|
67 |
+
|
68 |
+
LLM Battle is a game where LLMs compete against each other.
|
69 |
+
|
70 |
+
## Why a game for AI to play?
|
71 |
+
|
72 |
+
There are two main objectives:
|
73 |
+
|
74 |
+
#### 1. To determine which LLM is the smartest
|
75 |
+
|
76 |
+
While there have been previous attempts to make LLMs play games designed for humans with various workarounds, this game is designed so that LLMs can play naturally without any modifications.
|
77 |
+
|
78 |
+
Even older LLMs like GPT-3.5 Turbo can play (albeit not very well - winning 13 and losing 41 games with a rating of 1,377 in the results above), while high-performance LLMs are strong players.
|
79 |
+
|
80 |
+
The game is designed so that LLMs with superior "text comprehension" and "judgment" abilities will win. If GPT-5 is truly revolutionary in intelligence, it should demonstrate overwhelming strength in this game.
|
81 |
+
|
82 |
+
#### 2. To provide entertainment for local LLM users
|
83 |
+
|
84 |
+
"Local LLMs" - running language models on your own machine - have been popular among enthusiasts, but often after getting them working, there's not much else to do with them.
|
85 |
+
|
86 |
+
This game allows you to battle your locally set up LLM against various AIs. Online battles with other local LLM users are also in development.
|
87 |
+
|
88 |
+
### Notes on Battle Results
|
89 |
+
- Claude models were run without extended thinking enabled
|
90 |
+
- It's unclear why Grok 3 Mini has longer thinking time and performs better than Grok 3
|
91 |
+
- Grok 4 was excluded due to high latency and error rates making it unplayable
|
92 |
+
- Gemini 2.5 showing weaker performance than gemma3 is peculiar. Also, Gemini 2.5 Flash being slower and weaker than Gemini 2.5 Flash Lite is odd, suggesting there may be issues with the game implementation
|
93 |
+
- gemma3:12b-it-q8_0 is a local LLM running on the developer's MacBook Pro. All others use APIs
|
94 |
+
|
95 |
+
### Compatible Local LLMs
|
96 |
+
|
97 |
+
- OpenAI-compatible APIs (Ollama, LM Studio, etc.)
|
98 |
+
|
99 |
+
Contact us if you'd like support for other local LLM interfaces.
|
100 |
+
|
101 |
+
### License
|
102 |
+
|
103 |
+
This software is free.
|
104 |
+
See [LICENSE](LICENSE) for details.
|
105 |
+
|
106 |
+
|
107 |
+
## 日本語
|
108 |
+
|
109 |
+
LLM BattleはLLM同士が対戦するゲームです。
|
110 |
+
|
111 |
+
なぜAIがプレイするためのゲームを作ったのか?目的は2つあります。
|
112 |
+
|
113 |
+
#### 1. どのLLMが賢いのかを明らかにするため
|
114 |
+
|
115 |
+
人間用ゲームを様々な工夫をしてLLMにプレイさせる試みは以前からありますが、このゲームはLLMが素のまま普通にプレイできるようにデザインしました。
|
116 |
+
|
117 |
+
GPT-3.5 Turboなどの古���LLMでも強くないなりにプレイできて(上記の対戦結果では13勝41敗でレーティング1,377)、高性能なLLMは強いです。
|
118 |
+
|
119 |
+
「文章の理解力」と「判断力」が優れたLLMが勝つように作られています。GPT-5がもし圧倒的に賢いとしたら、このゲームで圧倒的な強さを見せてくれるはずです。
|
120 |
+
|
121 |
+
#### 2. ローカルLLMの楽しみ方の提供
|
122 |
+
|
123 |
+
自分のマシンでLLMを動かす「ローカルLLM」は一部で人気がありますが、とりあえず動くようにしてみたもの、特にそれ以上やることがないということも多いと思います。
|
124 |
+
|
125 |
+
そこで、このゲームを使うと、セットアップしたローカルLLMを様々なAIと戦わせてみることができます。他のローカルLLMユーザーとのオンライン対戦も開発中です。
|
126 |
+
|
127 |
+
### 対戦結果についての補足
|
128 |
+
- Claudeは拡張思考(thinking)の指定なしで実行しています。
|
129 |
+
- Grok 3 MiniがGrok 3より思考時間が長く強い理由は不明です。
|
130 |
+
- Grok 4は遅くてエラー率が高く、まともに動かなかったので除外しました。
|
131 |
+
- Gemini 2.5がgemma3より弱いという奇妙な結果が出ています。また、Gemini 2.5 FlashがGemini 2.5 Flash Liteより遅くて弱い点も奇妙なので、本ゲーム側になんらかの問題があるかもしれません。
|
132 |
+
- gemma3:12b-it-q8_0は開発者が手元のMacBookProで動かしたローカルLLMです。それ以外はAPIです。
|
133 |
+
|
134 |
+
### 使用可能なLocal LLM
|
135 |
+
|
136 |
+
- OpenAI互換API(Ollama、LM Studio等)
|
137 |
+
|
138 |
+
対応してほしいローカルLLMのインターフェイスがあればご連絡ください。
|
139 |
+
|
140 |
+
### ライセンス
|
141 |
+
|
142 |
+
本ソフトウェアは無料です。
|
143 |
+
詳細は[LICENSE](LICENSE)をご確認ください。
|
144 |
+
|
demo_en.gif
ADDED
![]() |
Git LFS Details
|
dist/LLM_Battle-0.1-darwin-universal.zip
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f667f200be698fc3c2e644a158489a0270a32af7dc73408efc315e1d0f174028
|
3 |
+
size 150738860
|
dist/LLM_Battle-0.1-win64.zip
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:b1bb259649b6fc398d020739bbd048b51a6a76587c5a26042f2690fa8a3c3559
|
3 |
+
size 141963058
|