Accounting for sequencing subsampling per sample
Each sequencing sample $s$ may over- or under-sample the population relative to the first timepoint by some factor $\phi_s$.
Variables:
- $c_i(t)$: Read (or UMI) count of strain $i$ at time $t$
- $\phi_s$: The ratio of sampling depth at time $t$ to that at time $0$ for sample $s$
The factor $\phi_s$ is the fold change in read counts between the two samples divided by the fold change in cell counts between the same two samples, for any strain (assuming each strain is sampled without bias):
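In symbols, this definition can be sketched as follows, assuming $n_i(t)$ (a symbol introduced here only for illustration) denotes the true number of cells of strain $i$ at time $t$:

$$
\phi_s = \frac{c_i(t) / c_i(0)}{n_i(t) / n_i(0)}
$$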
We can get rid of the nuisance parameter $\phi_s$ (which is difficult to measure because we don't know the true number of cells for each strain and sample) using the following trick.
We have the equation for read counts for mutant $i$ (same as above):
And for the reference strain (relative fitness is 1):
We can make $\phi_s$ disappear by subtracting the reference-strain equation from the mutant equation:
This is equivalent to:
So the mutant-to-reference count ratio at time $t$, divided by the same ratio at time $0$, depends only on the relative fitness and the true fold-expansion of the reference strain.
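The steps of this derivation can be written out as a sketch under assumed notation: $w_i$ for the relative fitness of strain $i$, $F(t)$ for the true fold-expansion of the reference strain between time $0$ and time $t$, and a growth model in which strain $i$ expands by $F(t)^{w_i}$. None of these symbols is defined in this section, so treat them as illustrative choices that are consistent with the conclusion stated above.

$$
\begin{aligned}
\log\frac{c_i(t)}{c_i(0)} &= \log\phi_s + w_i \log F(t) \\
\log\frac{c_{\mathrm{ref}}(t)}{c_{\mathrm{ref}}(0)} &= \log\phi_s + \log F(t) \\
\log\frac{c_i(t)}{c_i(0)} - \log\frac{c_{\mathrm{ref}}(t)}{c_{\mathrm{ref}}(0)} &= (w_i - 1)\,\log F(t) \\
\frac{c_i(t) / c_{\mathrm{ref}}(t)}{c_i(0) / c_{\mathrm{ref}}(0)} &= F(t)^{\,w_i - 1}
\end{aligned}
$$

The sampling factor $\phi_s$ appears identically in the first two lines, which is why it cancels in the third.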
Plotting the mutant-to-reference count ratio at time $t$, normalized by the same ratio at time $0$, against the true fold-expansion of the reference strain should give a straight line on a log-log plot, with intercept 0 and gradient equal to the relative fitness minus 1.
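The fit described above could be sketched in Python as follows. This is a minimal illustration, not the app's actual implementation: the function name `estimate_relative_fitness`, the synthetic counts, and the assumption that the reference fold-expansion has been measured independently are all hypothetical. The slope is fitted through the origin because the intercept is expected to be 0.

```python
import numpy as np

def estimate_relative_fitness(c_mut, c_ref, ref_fold_expansion):
    """Estimate relative fitness from a log-log fit through the origin.

    c_mut, c_ref: read (or UMI) counts per timepoint; index 0 is time 0.
    ref_fold_expansion: true fold-expansion of the reference strain at
        each timepoint relative to time 0 (so the first entry is 1.0).
        Assumed to be known from an independent measurement.
    """
    c_mut = np.asarray(c_mut, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    fold = np.asarray(ref_fold_expansion, dtype=float)

    ratio = c_mut / c_ref                # mutant-to-reference count ratio
    y = np.log(ratio[1:] / ratio[0])     # log of the normalized ratio
    x = np.log(fold[1:])                 # log fold-expansion of reference

    # Least-squares slope for a line constrained to pass through (0, 0).
    slope = np.sum(x * y) / np.sum(x * x)
    return 1.0 + slope                   # gradient = relative fitness - 1

# Synthetic example (made-up numbers): true relative fitness 1.2,
# with a different sampling factor phi_s for each sample.
true_w = 1.2
fold = np.array([1.0, 4.0, 16.0, 64.0])
phi = np.array([1.0, 0.5, 2.0, 1.3])
c_ref = phi * 1000.0 * fold
c_mut = phi * 500.0 * fold ** true_w

w_hat = estimate_relative_fitness(c_mut, c_ref, fold)
print(w_hat)
```

Because the per-sample factors in `phi` multiply both the mutant and reference counts, they cancel in the ratio, and the estimate recovers the fitness used to generate the data up to floating-point error.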