OneclickAI committed
Commit 707b49e · verified · 1 Parent(s): c0ff2cf

Upload README.md

  1. README.md +284 -0
README.md CHANGED
@@ -1,3 +1,287 @@
---
license: apache-2.0
---

Hello, this is Oneclick AI!!
Today we'll look at LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), two models that overcome the limitations of the basic RNN.

RNNs were a breakthrough for handling sequential data, but they hit a wall with the "long-term dependency problem": over long sequences they fail to retain information from the distant past.
LSTM and GRU are more advanced recurrent neural networks designed to solve this problem. They introduce a "gate" mechanism that selectively keeps important information and forgets the rest, much like human long-term memory.
Today, let's see how these two models make up for the weaknesses of the RNN, and how they can handle more complex sentences and time-series data with greater precision.

---

## Table of Contents
1. Understanding the Core Principles of LSTM/GRU
   - Why do we need LSTM/GRU?
   - The heart of the LSTM: the cell state and three gates
   - GRU: a simplified LSTM with two gates
   - Unrolling LSTM and GRU over time
   - A closer look at the key components of LSTM/GRU
2. Looking Inside the Code Through the Architecture
   - An LSTM/GRU model architecture in Keras
   - Checking the structure with model.summary()
3. Implementing LSTM/GRU Yourself
   - Step 1: Load and preprocess the data
   - Step 2: Compile the model
   - Step 3: Train and evaluate the model
   - Step 4: Save and reuse the trained model
   - Step 5: Test the model with your own sentence
4. Upgrading Your Own LSTM/GRU Model
   - Basic fitness training: hyperparameter tuning
   - Stacking layers: multiple LSTM/GRU layers
   - Past and future at once: bidirectional LSTM/GRU
   - Maximizing performance with transfer learning
5. Conclusion

---

## 1. Understanding the Core Principles of LSTM/GRU
First, let's look at the fundamental reason LSTM and GRU emerged as alternatives to the basic RNN.

**Why use LSTM/GRU? The limits of the RNN**
A basic RNN passes past information along through its hidden state, but as the sequence gets longer it runs into the vanishing gradient or exploding gradient problem.
During training, the gradient that flows back to early time steps either shrinks toward 0 or grows without bound, so the network fails to learn from important information at the beginning of a sentence. This is the "long-term dependency problem."
For example, in the sentence "Because I grew up in France as a child... (long passage)... so I speak French fluently," an RNN easily forgets the early cue "France."
LSTM and GRU address this by introducing a structure called a "gate" that controls the flow of information.
They keep the basic recurrent structure of the RNN, but are designed to selectively remember important information and forget what is unnecessary.

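To make the vanishing gradient concrete, here is a minimal sketch using a one-unit RNN, $h_t = \tanh(w \cdot h_{t-1})$, with no input term. The scalar weight `w`, the starting value `h0`, and the function name are illustrative assumptions chosen for this example, not part of any library.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad

for steps in (1, 10, 50):
    print(steps, gradient_through_time(w=0.9, steps=steps))
```

Every factor in the product is below 1 here, so the gradient collapses as the number of steps grows. With a recurrent weight well above 1 and no saturation, the same product would grow instead, which is the exploding case.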

**The heart of the LSTM: the cell state and three gates**
The core of the LSTM is the cell state ($C_t$) and the three gates that control it.
- Cell state (Cell State, $C_t$): a "conveyor belt" for long-term memory, along which information travels almost unchanged.
- Gates: use the sigmoid function to output values between 0 and 1, which determine how much information passes through.

1. Forget gate (Forget Gate, $f_t$): decides which information to discard from the previous cell state $C_{t-1}$.
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
(Here $\sigma$ is the sigmoid function, $h_{t-1}$ is the previous hidden state, and $x_t$ is the current input.)

2. Input gate (Input Gate, $i_t$) and candidate cell state ($\tilde{C_t}$): decide how much new information to add.
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$\tilde{C_t} = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$

3. Output gate (Output Gate, $o_t$): decides which part of the cell state to output.
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
Final cell state: $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C_t}$ ($\odot$ is the element-wise product)
Hidden state: $h_t = o_t \odot \tanh(C_t)$

Because the cell state is updated additively and scaled only by the forget gate, information and gradients can survive across many time steps, which is what lets the LSTM learn long-term dependencies effectively.

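The equations above can be traced in a few lines of plain Python. This is a sketch for following the data flow, not how Keras implements the layer: the weights are random, the sizes are tiny, and the helper names (`sigmoid`, `affine`, `lstm_step`) and the layout of the `params` dictionary are assumptions made for this example.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * a for w, a in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the equations above."""
    concat = h_prev + x_t  # [h_{t-1}, x_t] as one vector (list concatenation)
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], concat)]
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], concat)]
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], concat)]
    c_tilde = [math.tanh(v) for v in affine(params["W_C"], params["b_C"], concat)]
    # C_t = f_t * C_{t-1} + i_t * C~_t, element by element
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_tilde)]
    # h_t = o_t * tanh(C_t)
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t

random.seed(0)
hidden, inputs = 2, 3
params = {}
for name in ("f", "i", "o", "C"):
    params[f"W_{name}"] = [[random.uniform(-0.5, 0.5) for _ in range(hidden + inputs)]
                           for _ in range(hidden)]
    params[f"b_{name}"] = [0.0] * hidden

# Feed a three-step sequence through the same cell, carrying h and C along
h, c = [0.0] * hidden, [0.0] * hidden
for x_t in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

If the forget gate is pushed toward 1 and the input gate toward 0 (for example with a large positive `b_f` and a large negative `b_i`), `c_t` comes out almost identical to `c_prev`, which is the conveyor-belt behaviour described above.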

**GRU: a simplified LSTM with two gates**
The GRU is a variant of the LSTM that reduces the number of parameters to improve computational efficiency.
The hidden state $h_t$ doubles as the cell state, and only two gates are used.
- Reset gate (Reset Gate, $r_t$): decides how much of the previous hidden state to ignore.
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$

- Update gate (Update Gate, $z_t$): decides how to mix the previous state with the new candidate state. (It plays the role of the LSTM's forget and input gates combined.)
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
Candidate hidden state: $\tilde{h_t} = \tanh(W_h \cdot [r_t \odot h_{t-1}, x_t] + b_h)$
Final hidden state: $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h_t}$

The GRU often performs on par with the LSTM, and it usually trains faster because it has fewer parameters.

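The GRU equations above translate just as directly. As with any toy version, this is a sketch under assumptions: random weights, tiny sizes, and helper names (`sigmoid`, `affine`, `gru_step`) and a parameter dictionary `p` invented for this example.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * a for w, a in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def gru_step(x_t, h_prev, p):
    """One GRU time step following the equations above."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    r = [sigmoid(v) for v in affine(p["W_r"], p["b_r"], concat)]
    z = [sigmoid(v) for v in affine(p["W_z"], p["b_z"], concat)]
    # The reset gate scales the old state before it enters the candidate
    gated = [r_k * h_k for r_k, h_k in zip(r, h_prev)] + x_t
    h_tilde = [math.tanh(v) for v in affine(p["W_h"], p["b_h"], gated)]
    # The update gate blends the old state with the candidate
    return [(1.0 - z_k) * h_k + z_k * g_k for z_k, h_k, g_k in zip(z, h_prev, h_tilde)]

random.seed(0)
hidden, inputs = 2, 3
p = {}
for name in ("r", "z", "h"):
    p[f"W_{name}"] = [[random.uniform(-0.5, 0.5) for _ in range(hidden + inputs)]
                      for _ in range(hidden)]
    p[f"b_{name}"] = [0.0] * hidden

h = [0.0] * hidden
for x_t in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h = gru_step(x_t, h, p)
print(h)
```

Note that there are three weight sets here (`r`, `z`, `h`) where the LSTM needs four, and no separate cell state is carried between steps.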

**Unrolling LSTM/GRU over time**
Unrolling the network along the time axis, as in the diagram below, makes it easier to understand.
```markdown
Time ───▶
Inputs:     x₁     x₂     x₃     ...    xₜ
            ↓      ↓      ↓             ↓
          ┌────┐ ┌────┐ ┌────┐   ...  ┌────┐
h₀, C₀ ──▶│LSTM│▶│LSTM│▶│LSTM│ ▶ ... ▶│LSTM│  (or GRU)
          └────┘ └────┘ └────┘        └────┘
            │      │      │             │
            ▼      ▼      ▼             ▼
            h₁     h₂     h₃            hₜ
```
At each time step the gates control the flow of information, and the cell state (or, for the GRU, the hidden state) carries it across long spans.

**Key components of LSTM/GRU**
- Gate mechanism: selects which information to keep and which to discard.
- Hidden/cell state: serves as the memory.
- Parameter sharing: the same weights are used at every time step.

---

## 2. Looking Inside the Code Through the Architecture
Now, building on the theory, let's implement LSTM and GRU ourselves with TensorFlow Keras.

**An in-depth look at an LSTM/GRU model architecture in Keras**
The following is a simple LSTM model for sentiment analysis of IMDB movie reviews. (A GRU model is analogous.)

```python
import tensorflow as tf
from tensorflow import keras

# Define the model architecture
model = keras.Sequential([
    # Declare the input (variable-length sequences of word indices) up front
    # so the model is built and summary() can report parameter counts
    keras.Input(shape=(None,), dtype="int32"),

    # 1. Word embedding layer
    keras.layers.Embedding(input_dim=10000, output_dim=32),

    # 2. LSTM layer (to use a GRU instead, replace LSTM with keras.layers.GRU)
    keras.layers.LSTM(32),

    # 3. Final classifier
    keras.layers.Dense(1, activation="sigmoid"),
])

# Print a summary of the model structure
model.summary()
```
Let's take a closer look at each layer.

- **Embedding layer (Embedding)**
```python
keras.layers.Embedding(input_dim=10000, output_dim=32)
```
Converts each word index into a 32-dimensional vector, the same as in the RNN document.

- **Recurrent layer (LSTM or GRU)**
```python
keras.layers.LSTM(32)
```
or
```python
keras.layers.GRU(32)
```
Handles the gates internally and learns long-term dependencies. By default it outputs only the final hidden state.

- **Fully connected layer (Dense)**
```python
keras.layers.Dense(1, activation="sigmoid")
```
Makes the final decision: the sigmoid outputs the probability that the review is positive.

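Conceptually, an embedding layer is a lookup table with one trainable row per word index. The sketch below shows that idea in plain Python with toy sizes; the names `table` and `embed` are assumptions for illustration, and the real layer does the same lookup on tensors.

```python
import random

random.seed(0)
vocab_size, embed_dim = 10, 4  # toy sizes; the model above uses 10000 and 32

# One row of embed_dim numbers per word index
table = [[random.uniform(-0.05, 0.05) for _ in range(embed_dim)]
         for _ in range(vocab_size)]

def embed(batch):
    """Replace every word index in a batch of sequences with its table row."""
    return [[table[idx] for idx in seq] for seq in batch]

batch = [[1, 5, 5, 2], [7, 0, 3, 3]]  # 2 reviews, 4 tokens each
out = embed(batch)
# Prints batch size, sequence length, and embedding dimension
print(len(out), len(out[0]), len(out[0][0]))
```

The table holds `vocab_size * embed_dim` numbers, which is where the embedding layer's parameter count comes from.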

**Understanding how parameter counts are calculated with model.summary()**
Running model.summary() in the code above produces the following output.

```bash
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding (Embedding)       (None, None, 32)          320000

 lstm (LSTM)                 (None, 32)                8320

 dense (Dense)               (None, 1)                 33

=================================================================
Total params: 328,353
Trainable params: 328,353
Non-trainable params: 0
_________________________________________________________________
```

Here is how each layer's parameter count is calculated.
1. Embedding: 10,000 * 32 = 320,000.
2. LSTM: four weight sets (input gate, forget gate, output gate, candidate), each taking the input (32) and the hidden state (32) plus a bias: (32+32+1)*32*4 = 8,320. (A GRU has three such sets, so the same formula gives (32+32+1)*32*3 = 6,240. Keras's GRU defaults to `reset_after=True`, which keeps a second bias vector per set, so model.summary() reports 6,336.)
3. Dense: 32 * 1 + 1 = 33.

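The arithmetic above can be checked with a few small helper functions. These helpers are written for this example (they are not Keras functions), and the `reset_after` flag mirrors the Keras GRU argument of the same name only in how it affects the bias count.

```python
def embedding_params(vocab_size, embed_dim):
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # 4 weight sets, each with (input_dim + units) weights and 1 bias per unit
    return 4 * (input_dim + units + 1) * units

def gru_params(input_dim, units, reset_after=True):
    # 3 weight sets; reset_after=True keeps two bias vectors per set
    biases = 2 if reset_after else 1
    return 3 * (input_dim + units + biases) * units

def dense_params(input_dim, units):
    return (input_dim + 1) * units

total = embedding_params(10000, 32) + lstm_params(32, 32) + dense_params(32, 1)
print(lstm_params(32, 32))                     # 8320
print(gru_params(32, 32))                      # 6336
print(gru_params(32, 32, reset_after=False))   # 6240
print(total)                                   # 328353
```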

---

## 3. Implementing LSTM/GRU Yourself
Now let's run the full code step by step and train the model ourselves. (This follows the RNN document and uses the IMDB dataset.)

**Step 1. Load and preprocess the data**
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Keep only the 10,000 most frequent words
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=10000)

# Pad or truncate every review to 256 tokens
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=256)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=256)
```

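To see what the padding step does to a single review, here is a plain-Python sketch of the default behaviour of `pad_sequences`: zeros are added on the left, and sequences that are too long lose their earliest tokens. The helper `pad_pre` is written for this example only.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the default pad_sequences behaviour for a single sequence."""
    seq = list(seq)[-maxlen:]                    # too long: keep the last maxlen tokens
    return [value] * (maxlen - len(seq)) + seq   # too short: pad on the left

print(pad_pre([11, 12, 13], maxlen=5))            # [0, 0, 11, 12, 13]
print(pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5))   # [3, 4, 5, 6, 7]
```

Padding on the left places the real tokens at the end of the sequence, right before the final hidden state that the classifier reads.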

**Step 2. Compile the model**
```python
model = keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=32),
    layers.LSTM(32),  # or layers.GRU(32)
    layers.Dense(1, activation="sigmoid")
])

model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)
```

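The loss and metric named in `compile()` can be written out by hand to see what they measure. The sketch below assumes 0/1 labels and sigmoid outputs; the function names and the small clipping constant `eps` are choices made for this example, not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_pred))
    return hits / len(y_true)

y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.2, 0.4, 0.1]
print(binary_crossentropy(y_true, y_pred))
print(binary_accuracy(y_true, y_pred))
```

The third prediction (0.4 for a positive review) is both the only miss for accuracy and the largest contribution to the loss, which shows how the loss penalizes confident mistakes more smoothly than accuracy does.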

**Step 3. Train and evaluate the model**
```python
batch_size = 128
epochs = 10

history = model.fit(
    x_train, y_train,
    batch_size=batch_size,
    epochs=epochs,
    # The test set doubles as validation data here; pass validation_split=0.2
    # instead if you want to keep the test set unseen until the final evaluation
    validation_data=(x_test, y_test)
)

# Final evaluation on the test set
score = model.evaluate(x_test, y_test, verbose=0)
print(f"\nTest loss: {score[0]:.4f}")
print(f"Test accuracy: {score[1]:.4f}")
```

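As a quick sanity check on the training settings, the number of weight updates follows directly from the dataset size, the batch size, and the number of epochs. The helper below is a small sketch written for this example; the figure of 25,000 is the size of the IMDB training split.

```python
import math

def weight_updates(num_samples, batch_size, epochs):
    """Return (steps per epoch, total steps); the last batch may be smaller."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

print(weight_updates(25000, batch_size=128, epochs=10))  # (196, 1960)
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason batch size and learning rate are usually tuned together.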

**Step 4. Save and reuse the trained model**
```python
# Save the trained model to a single file, then load it back
model.save("my_lstm_model_imdb.keras")
loaded_model = keras.models.load_model("my_lstm_model_imdb.keras")
```

**Step 5. Test the model with your own sentence**
```python
word_index = keras.datasets.imdb.get_word_index()

review = "This movie was fantastic and wonderful"
# load_data() shifts every word index by 3 (0 = padding, 1 = start, 2 = unknown),
# so apply the same offset and map anything outside the 10,000-word vocabulary to 2
ids = [word_index.get(word, -1) + 3 for word in review.lower().split()]
tokens = [1] + [i if 3 <= i < 10000 else 2 for i in ids]
padded_tokens = keras.preprocessing.sequence.pad_sequences([tokens], maxlen=256)

prediction = loaded_model.predict(padded_tokens)
print(f"Review: '{review}'")
print(f"Positive probability: {prediction[0][0] * 100:.2f}%")
```

## 4. Upgrading Your Own LSTM/GRU Model
Let's apply a few techniques to make the basic model more powerful.

- **Basic fitness training: hyperparameter tuning**
Adjust the learning rate, batch size, number of units, and so on.
```python
optimizer = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
```

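One simple way to organize such tuning is a grid search: list the candidate values and try every combination. The sketch below only generates the combinations; the search space values and the `grid` helper are assumptions for illustration, and the training run for each configuration is left out.

```python
from itertools import product

search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [64, 128],
    "units": [32, 64],
}

def grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

configs = list(grid(search_space))
print(len(configs))   # 12
print(configs[0])
```

Each configuration would then be used to rebuild, compile, and fit the model, keeping the one with the best validation accuracy.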

- **Stacking layers: multiple LSTM/GRU layers**
```python
model = keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),
    # return_sequences=True passes the hidden state of every time step
    # to the next LSTM layer, which expects a full sequence as input
    layers.LSTM(64, return_sequences=True),
    layers.LSTM(32),
    layers.Dense(1, activation='sigmoid')
])
```

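The reason the first layer needs `return_sequences=True` is easiest to see by tracking shapes. The helper below is a sketch written for this example (it is not a Keras function); it only reproduces the shape rule for a recurrent layer with a `(batch, timesteps, features)` input.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Output shape of a recurrent layer for a (batch, timesteps, features) input."""
    if len(input_shape) != 3:
        raise ValueError(f"expected a 3D input (batch, timesteps, features), got {input_shape}")
    batch, timesteps, _ = input_shape
    return (batch, timesteps, units) if return_sequences else (batch, units)

embedded = (None, 256, 64)  # output of the Embedding layer for 256-token reviews
first = lstm_output_shape(embedded, 64, return_sequences=True)
second = lstm_output_shape(first, 32)
print(first)    # (None, 256, 64)
print(second)   # (None, 32)
```

Without `return_sequences=True`, the first layer would output a 2D tensor of shape `(None, 64)`, and the second recurrent layer would have no time axis left to iterate over.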

- **Past and future at once: bidirectional LSTM/GRU**
```python
model = keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),
    # Reads the sequence both forward and backward
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid')
])
```

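The idea behind a bidirectional layer can be sketched without any framework: run a recurrent cell over the sequence in both directions and concatenate the two final states. In the sketch below, `toy_cell` is a deliberately trivial stand-in for an LSTM or GRU step, and the same cell is reused for both directions for brevity, whereas Keras trains a separate copy of the layer for each direction.

```python
def run(cell, xs, h0=0.0):
    """Apply the same cell at every step and return the final state."""
    h = h0
    for x in xs:
        h = cell(h, x)
    return h

def toy_cell(h, x):
    # Stand-in for an LSTM/GRU step: halve the old state, add the new input
    return 0.5 * h + x

def bidirectional(cell, xs):
    forward = run(cell, xs)                    # reads x1 ... xT
    backward = run(cell, list(reversed(xs)))   # reads xT ... x1
    return [forward, backward]                 # concatenated, so the size doubles

print(bidirectional(toy_cell, [1.0, 2.0, 3.0]))  # [4.25, 2.75]
```

The two directions summarize the same sequence differently because each weights the most recently read inputs most heavily. With Keras's default concatenation, `Bidirectional(LSTM(64))` therefore outputs 128 features.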

- **Maximizing performance with transfer learning**
Use pretrained weights (for example, GloVe embeddings), or freeze the LSTM layers of a larger pretrained model.
```python
# Example: an embedding layer for pretrained vectors (requires a separate file)
embedding_layer = layers.Embedding(input_dim=10000, output_dim=100, trainable=False)
# Initialize its weights with GloVe or similar vectors
```

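Initializing from GloVe comes down to building a matrix whose row for each word index holds that word's pretrained vector. The sketch below uses a three-line in-memory stand-in for a GloVe text file (one word per line followed by its vector); the helper name, the toy vocabulary, and the toy vectors are all assumptions for illustration.

```python
def build_embedding_matrix(glove_lines, word_to_row, num_rows, dim):
    """Fill row word_to_row[word] with the pretrained vector; unknown rows stay zero."""
    matrix = [[0.0] * dim for _ in range(num_rows)]
    for line in glove_lines:
        word, *values = line.split()
        row = word_to_row.get(word)
        if row is not None and row < num_rows and len(values) == dim:
            matrix[row] = [float(v) for v in values]
    return matrix

# Toy stand-in for a GloVe text file
glove_lines = ["movie 0.1 0.2 0.3", "great 0.4 0.5 0.6", "unused 0.7 0.8 0.9"]
word_to_row = {"movie": 1, "great": 2, "boring": 3}

matrix = build_embedding_matrix(glove_lines, word_to_row, num_rows=4, dim=3)
print(matrix)
```

A matrix like this could then seed the embedding layer, for example through `embeddings_initializer=keras.initializers.Constant(matrix)`. For the IMDB data, the row for each word would be its index after the offset of 3 that `load_data()` applies.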

## 5. Conclusion
Today we covered LSTM and GRU, which go beyond the limits of the RNN, from their core principles to a working implementation and ways to upgrade it.
These two models still play an important role not only in natural language processing but also in time-series forecasting, speech recognition, and more.
In particular, the attention mechanism was first introduced on top of LSTM/GRU-style recurrent encoder-decoder models, and it went on to become the foundation of the transformer.
Next time, we'll be back with the transformer model!!
Have a great day!!