HITS - Hypertext Induced Topic Selection
Definition
The HITS algorithm measures the “hubness” and the “authority” of each node in the graph. “Hubness” of a node is a defined in terms of the authorities of the outgoing nodes and “authority” is defined in terms of the hubness of the incoming nodes. Therefore, a node has a high hubness value if it has many outgoing edges leading to nodes with high authority values and a node has a high authority value if the hubness values of its incoming nodes are high. Hubness of a node depends not only on the number of nodes that it points, but also on the authority of those nodes. A high authority value shows that the node is an important because several other highly connected nodes are pointing to it. A high hubness value shows that the node is important because it has connections to many other high authority nodes in the graph.
The Hub score and Authority score for a node is calculated using the following algorithm:
See: HITS Based Page Rank
- Start with each node having a hub score and authority score of 1.
- Run the Authority Update Rule - Update each node's Authority score to be equal to the sum of the Hub Scores of each node that points to it.
- Run the Hub Update Rule - Update each node's Hub Score to be equal to the sum of the Authority Scores of each node that it points to.
- Normalize values by dividing each Hub score by the sum of squares of all Hub scores, and dividing each Authority score by the sum of the squares of all Authority scores.
- Repeat from the second step as necessary
- there are two mutually recursive scores being calculated, rather than a single value
- the edge weights are effectively all 1, i.e., they can't be interpreted as transition probabilities. This means that the more inlinks and outlinks that a vertex has, the better, since adding an inlink (or outlink) does not dilute the influence of the other inlinks (or outlinks) as in random walk-based algorithms.
- the scores cannot be interpreted as posterior probabilities (due to the different normalization)
See: HITS Based Page Rank
Requirements
Require connected and loop free network.
Software
- CentiBiN
http://centibin.ipk-gatersleben.de/ - CentiLib
http://centilib.ipk-gatersleben.de/ - JUNG
http://jung.sourceforge.net - Pajek
http://pajek.imfm.si/ - Sentinel Visualizer
http://www.fmsasg.com/SocialNetworkAnalysis/ - UCINET
https://sites.google.com/site/ucinetsoftware/ - Visone
http://visone.info/ - Wolfram
http://www.wolfram.com
References
- HITS algorithm. (2014, July 17). In Wikipedia, The Free Encyclopedia. Retrieved 13:30, August 11, 2014, from http://en.wikipedia.org/w/index.php?title=HITS_algorithm&oldid=617343038
- KLEINBERG, J. M. 1999. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46, 604-632.