EGC - Edge clustering coefficient and Gene ontology information’s Combination method


Definition

Gene ontology (GO) information is used to evaluate the reliability of the edges in a protein-protein interaction (PPI) network. The EGC algorithm identifies essential proteins by integrating the topological features of the PPI network with this GO information.

The essentiality of each protein u in the PPI network is determined by:
EGC
Here, t is a proportionality parameter that takes a value in the range 0 to 1, and N_u is the set of all neighbors of u.

GO similarity
The GO similarity between protein u and protein v is
GO–similarity(u, v) = GO(u) ∩ GO(v)
where GO(u) and GO(v) are the sets of GO terms for protein u and protein v, respectively.
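
The overlap of two GO term sets can be sketched in a few lines of Python. This is only an illustration: the function name go_overlap, the dictionary go_terms, and the annotations in it are hypothetical and not taken from the original publication. The sketch returns the shared terms themselves; using the size of that set as a numeric score would be a further assumption.

```python
def go_overlap(go_terms, u, v):
    """Return the set of GO terms annotated to both protein u and protein v.

    go_terms maps a protein identifier to its set of GO term identifiers.
    Proteins without annotations are treated as having an empty set
    (an assumption of this sketch).
    """
    return go_terms.get(u, set()) & go_terms.get(v, set())


# Hypothetical annotations, for illustration only.
go_terms = {
    "P1": {"GO:0005634", "GO:0006355"},
    "P2": {"GO:0005634", "GO:0003677"},
    "P3": {"GO:0005737"},
}

shared_p1_p2 = go_overlap(go_terms, "P1", "P2")  # one shared term
shared_p1_p3 = go_overlap(go_terms, "P1", "P3")  # no shared terms
```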

Edge Clustering Coefficient
The edge clustering coefficient of an edge (u, v), connecting node u and node v, can be defined by the following formula:
ECC(u, v) = z_{u,v}^{(3)} / min(k_u − 1, k_v − 1)
where k_u and k_v are the degrees of node u and node v, respectively, z_{u,v}^{(3)} is the number of triangles in the network that include the edge (u, v), and min(k_u − 1, k_v − 1) is the maximum possible number of triangles that could include the edge (u, v).
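
A minimal sketch of this computation in Python, assuming the undirected graph is stored as a dictionary that maps each node to the set of its neighbors. The function name, the example graph, and the choice to return 0 when the denominator is 0 (one endpoint has degree 1) are assumptions of this sketch, not details given in the source.

```python
def edge_clustering_coefficient(adj, u, v):
    """Edge clustering coefficient of the edge (u, v) in an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    # Every common neighbor of u and v closes one triangle on the edge (u, v).
    triangles = len(adj[u] & adj[v])
    # Maximum possible number of triangles that could include the edge.
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles == 0:
        # Assumption: an edge with a degree-1 endpoint gets a coefficient of 0.
        return 0.0
    return triangles / max_triangles


# Hypothetical example: triangle A-B-C plus a pendant node D attached to A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

ecc_ab = edge_clustering_coefficient(adj, "A", "B")  # 1 triangle, at most 1 possible
ecc_ad = edge_clustering_coefficient(adj, "A", "D")  # D has degree 1
```

Storing neighbors as sets keeps the triangle count to a single set intersection per edge.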

Requirements

Undirected graph G = (V, E), GO database, parameters t and k.

Software

References

  • LUO, J. & ZHANG, N. 2014. Prediction of essential proteins based on edge clustering coefficient and gene ontology information. Journal of Biological Systems, 22, 339-351. DOI: 10.1142/S0218339014500119