Table 1 List of all features.

From: Identifying essential genes in bacterial metabolic networks with machine learning methods

Short form

Explanation

Topology features

 

a) Deviation

RUP

Reachable/Unreachable Products (RUP): equals one if all products can still be produced when the reaction is blocked, otherwise zero

PUP

Percentage of Unreachable Products (PUP): the percentage of products that cannot be produced when the reaction is blocked

ND

Number of Deviations (ND)

APL

Average Path Length (APL): the average path length of the deviations

LSP

Length of the Shortest Path (LSP): the length of the shortest path of the deviations

 

b) Local topology

NS

Number of Substrates (NS)

NP

Number of Products (NP)

NNR

Number of Neighboring Reactions (NNR)

NNNR

Number of Neighbors of Neighboring Reactions (NNNR)

CCV

Clustering Coefficient Value (CCV): clustering coefficient of a reaction

DIR

Directionality of a reaction (DIR)

 

c) Choke points and load scores

CP

Choke Point (CP): whether or not a reaction is a choke point (Rahman et al., 2006)

LS

Load Score (LS): load score of a reaction (Rahman et al., 2006)

 

d) Damage

NDR

Number of Damaged Reactions (NDR) (Lemke et al., 2004)

NDC

Number of Damaged Compounds (NDC) (Lemke et al., 2004)

NDRD

Number of Damaged Reactions having no Deviations (NDRD): the number of damaged reactions that cannot be reached by any alternative path after the reaction is blocked

NDCD

Number of Damaged Compounds having no Deviations (NDCD): the number of damaged compounds that cannot be reached by any alternative path after the reaction is blocked

NDCR

Number of Damaged Choke point Reactions (NDCR)

NDCC

Number of Damaged Choke point Compounds (NDCC)

NDCRD

Number of Damaged Choke point Reactions having no Deviations (NDCRD): the number of damaged choke point reactions that cannot be reached by any alternative path after the reaction is blocked

NDCCD

Number of Damaged Choke point Compounds having no Deviations (NDCCD): the number of damaged choke point compounds that cannot be reached by any alternative path after the reaction is blocked

 

e) Centrality

BW

Betweenness centrality

CN

Closeness centrality

EC

Eccentricity centrality

EV

Eigenvector centrality

Genomic and transcriptomic features

 

f) Homologs

NAR

Number of Associated Reactions (NAR): the number of reactions that depend on the knocked-out gene

Hn

Homology at different expectation values: the number of homologous genes with e-value cutoffs of 10^-30, 10^-20, 10^-10, 10^-7, 10^-5 and 10^-3 (H30, H20, H10, H7, H5, H3)

 

g) Gene expression

NGSE

Number of Genes having Similar Expression (NGSE): the number of genes that have similar expression (correlation coefficient >0.8)

MCC

Maximum of Correlation Coefficients (MCC): maximum value of the correlation coefficients for all neighboring genes

 

h) Phyletic retention

PR

Phyletic Retention (PR): the number of orthologs in the other prokaryotes

 

i) Codon usage

Nc

Number of codons

N3s

Base composition at silent sites (T3s, C3s, A3s, G3s)

glt

The frequency of the amino acid glutamine (given as an example)
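To make some of the local topology features in the table concrete, the following minimal sketch computes NS, NP, NNR, NNNR and CP for a toy network. It is an illustration under stated assumptions, not the authors' implementation. The network `NETWORK`, the function names, and the neighbor definition are hypothetical: two reactions are treated as neighbors if a product of one is a substrate of the other, and NNNR counts distinct reactions two steps away, excluding the reaction itself and its direct neighbors. The choke point test uses the usual criterion that a reaction uniquely consumes a substrate or uniquely produces a product.

```python
from collections import defaultdict

# Hypothetical toy network: reaction id -> (substrates, products).
NETWORK = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"A"}, {"B"}),  # parallel route to R1
    "R3": ({"B"}, {"C"}),
    "R4": ({"C"}, {"D"}),
}


def index_metabolites(network):
    """Map each metabolite to the reactions consuming and producing it."""
    consumers, producers = defaultdict(set), defaultdict(set)
    for rxn, (subs, prods) in network.items():
        for m in subs:
            consumers[m].add(rxn)
        for m in prods:
            producers[m].add(rxn)
    return consumers, producers


def neighbors(network, rxn, consumers, producers):
    """Assumed neighbor definition: reactions linked through a shared metabolite,
    where the product of one reaction is a substrate of the other."""
    subs, prods = network[rxn]
    linked = set()
    for m in prods:
        linked |= consumers[m]   # downstream reactions
    for m in subs:
        linked |= producers[m]   # upstream reactions
    linked.discard(rxn)
    return linked


def local_features(network, rxn):
    """Return NS, NP, NNR, NNNR and CP for one reaction."""
    consumers, producers = index_metabolites(network)
    subs, prods = network[rxn]
    direct = neighbors(network, rxn, consumers, producers)
    second = set()
    for n in direct:
        second |= neighbors(network, n, consumers, producers)
    second -= direct | {rxn}  # assumption: count only new reactions at distance two
    # Choke point: sole consumer of a substrate or sole producer of a product.
    is_choke = (any(len(consumers[m]) == 1 for m in subs)
                or any(len(producers[m]) == 1 for m in prods))
    return {"NS": len(subs), "NP": len(prods), "NNR": len(direct),
            "NNNR": len(second), "CP": int(is_choke)}


if __name__ == "__main__":
    for rxn in sorted(NETWORK):
        print(rxn, local_features(NETWORK, rxn))
```

In this toy network, R3 and R4 are flagged as choke points because each is the only consumer of its substrate, whereas R1 and R2 are not, since they share both substrate and product. A real metabolic network would also need to handle reversible reactions (the DIR feature) and ubiquitous currency metabolites, which this sketch ignores.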