ljleb committed (verified) · Commit 0652f48 · Parent(s): 25e4a6a

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -9,7 +9,7 @@ Naively applying the formulas from the paper gave poor results when trying to ad
 The goal is to estimate which parameters of each model are very important and which parameters are less important. The way I did this was by first computing gradients with respect to this objective:
 
 $$
-L = E_{x \sim p(x), t \propto \frac{1}{snr_t}} \big[\frac{L_0(x, t)}{C(x, t)}\big]
+L = E_{x \sim p(x), t \propto \frac{1}{snr_t^2}} \big[\frac{L_0(x, t)}{C(x, t)}\big]
 $$
 
 where L_0 expresses what we are trying to optimize (A corresponds to NoobAI, B corresponds to Animagine):
@@ -30,7 +30,7 @@ x_t and snr_t were taken from the DDPM paper (https://arxiv.org/abs/2006.11239):
 
 $$
 x_t = \sqrt{\bar{\alpha}_t} x + \sqrt{1 - \bar{\alpha}_t} \epsilon \\
-snr_t = \frac{\bar{\alpha}_t}{1 - \bar{\alpha}_t}
+snr_t = \frac{\sqrt{\bar{\alpha}_t}}{\sqrt{1 - \bar{\alpha}_t}}
 $$
 
 It's important to note that L is not used to train any model here. Instead, we accumulate absolute gradients to estimate the importance of each parameter (explained below):
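The commit changes the timestep weighting in the objective from t ∝ 1/snr_t to t ∝ 1/snr_t², using the sqrt-form snr_t. A minimal sketch of what computing those sampling weights could look like, assuming the standard DDPM linear beta schedule (T = 1000, β from 1e-4 to 0.02, per https://arxiv.org/abs/2006.11239) — the variable names are illustrative and not from this repo:

```python
import numpy as np

# Assumed DDPM linear beta schedule (not taken from this repo)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

# snr_t as defined in the updated README: sqrt(alpha_bar) / sqrt(1 - alpha_bar)
snr = np.sqrt(alpha_bar) / np.sqrt(1.0 - alpha_bar)

# Timestep sampling weights proportional to 1 / snr_t^2 (the updated objective),
# normalized to form a probability distribution over timesteps
weights = 1.0 / snr**2
weights /= weights.sum()

# Draw a batch of timesteps from that distribution
rng = np.random.default_rng(0)
t = rng.choice(T, size=8, p=weights)
```

Under this weighting, low-SNR (high-noise) timesteps are sampled far more often than high-SNR ones, compared with the earlier 1/snr_t weighting.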