Table 1 ICU and CC biasness analysis

From: Computational codon optimization of synthetic gene for protein expression

 

| | E. coli | E. coli | L. lactis | L. lactis | P. pastoris | P. pastoris | S. cerevisiae | S. cerevisiae |
|---|---|---|---|---|---|---|---|---|
| Null hypothesis (H0) | DH = U | DH = DA | DH = U | DH = DA | DH = U | DH = DA | DH = U | DH = DA |
| Alternative hypothesis (H1) | DH ≠ U | DH ≠ DA | DH ≠ U | DH ≠ DA | DH ≠ U | DH ≠ DA | DH ≠ U | DH ≠ DA |
| No. of biased amino acids (P-value < 0.05) | 18 | 17 | 19 | 17 | 18 | 19 | 18 | 19 |
| No. of unbiased amino acids (P-value ≥ 0.05) | 1 | 2 | 0 | 2 | 1 | 0 | 1 | 0 |
| No. of singular amino acids | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| No. of unevaluated amino acids (expected count < 5) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total no. of amino acids | 21 | 21 | 21 | 21 | 21 | 21 | 21 | 21 |
| No. of biased amino acid pairs (P-value < 0.05) | 314 | 99 | 327 | 15 | 354 | 259 | 372 | 282 |
| No. of unbiased amino acid pairs (P-value ≥ 0.05) | 26 | 23 | 12 | 65 | 38 | 36 | 19 | 9 |
| No. of singular amino acid pairs | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| No. of unevaluated amino acid pairs (expected count < 5) | 76 | 294 | 77 | 336 | 24 | 121 | 25 | 125 |
| Total no. of amino acid pairs | 420 | 420 | 420 | 420 | 420 | 420 | 420 | 420 |

  1. The chi-squared statistic is computed from the observed occurrence of each codon (pair) and its expected occurrence under the null hypothesis of uniform distribution. Any amino acid (pair) with a p-value < 0.05 is considered to exhibit significantly biased codon (pair) usage. Singular amino acids (methionine and tryptophan) and singular amino acid pairs (pairs consisting only of methionine and/or tryptophan) are not amenable to the biasness analysis because each is encoded by a single codon (pair), with no synonymous alternatives. The chi-squared statistic and p-value are not calculated for amino acids (pairs) with expected counts below 5 (see Materials and Methods for details). Abbreviations: CC, codon context; DA, codon (pair) distribution of all genes in the genome; DH, codon (pair) distribution of high-expression genes; ICU, individual codon usage; U, uniform distribution.
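
The classification described in the footnote can be illustrated with a minimal Python sketch of a chi-squared goodness-of-fit test for the synonymous codons of one amino acid. It is not the authors' implementation. The function names (`chi2_sf`, `classify_codon_usage`), the codon counts in the usage lines, the choice of degrees of freedom (number of synonymous codons minus one), and the rule that the test is skipped when any expected count falls below 5 are all assumptions made for illustration; the paper's Materials and Methods may define these details differently.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        total, term = 0.0, 1.0
        for k in range(df // 2):
            if k > 0:
                term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{k=0}^{m-1} x^k / (2k+1)!!
    # with m = (df - 1) / 2.
    total, term = 0.0, 1.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def classify_codon_usage(observed, expected_freqs=None, alpha=0.05, min_expected=5):
    """Classify one amino acid from the counts of its synonymous codons.

    observed: dict mapping codon -> observed count (e.g. in high-expression genes, DH).
    expected_freqs: dict mapping codon -> relative frequency under H0;
        None means a uniform distribution (H0: DH = U). Passing genome-wide
        frequencies would correspond to H0: DH = DA.
    Returns (label, p_value); p_value is None when no test is performed.
    """
    k = len(observed)
    if k < 2:
        return "singular", None  # single codon, no synonymous alternatives
    n = sum(observed.values())
    if expected_freqs is None:
        expected_freqs = {codon: 1.0 / k for codon in observed}
    expected = {codon: n * expected_freqs[codon] for codon in observed}
    # Assumption: skip the test if any expected count is below the threshold.
    if min(expected.values()) < min_expected:
        return "unevaluated", None
    stat = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
    p_value = chi2_sf(stat, k - 1)  # assumption: df = number of codons - 1
    return ("biased" if p_value < alpha else "unbiased"), p_value


# Hypothetical counts, for illustration only.
print(classify_codon_usage({"AAA": 150, "AAG": 50}))  # lysine, strongly skewed
print(classify_codon_usage({"AAA": 52, "AAG": 48}))   # lysine, near uniform
print(classify_codon_usage({"ATG": 300}))             # methionine, single codon
print(classify_codon_usage({"AAA": 3, "AAG": 4}))     # too few observations
```

The same function would apply to codon pairs by passing counts keyed by synonymous codon pair instead of by codon. The closed-form tail probability is used only to keep the sketch free of third-party dependencies.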