Content Centrality
Definition
Content Centrality is a centrality measure that assumes content is distributed over a network. It considers the feature vector of each node, generated from its posting activities in social media, its own properties, and so forth, in order to extract nodes whose neighbors have similar features.
Given a simple undirected network $G=(V,E)$, where $V$ and $E$ are the node set and link set, respectively, each node $u\in V$ has a $J$-dimensional content vector $X_u$. Let $d(u,v)$ be the shortest path length between nodes $u$ and $v$, with $d(u,v)=d(v,u)$ and $d(u,u)=0$. We define the set of nodes at distance $d$ from $u$ as $\Gamma_d(u)=\{v:d(u,v)=d\}\subset V$.
When content is distributed over the network, very distant nodes naturally exert almost no influence. To express this effect, we introduce two types of decay functions. The first is an exponential decay function defined by
$$\rho(d;\lambda)=\exp(-\lambda d),$$
where $\lambda$ is a parameter that controls the strength of the decay. The second is a power-law decay function defined by
$$\rho(d;\lambda)=\exp(-\lambda \log d)=d^{-\lambda}.$$
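As a minimal sketch, the two decay functions can be written directly from their definitions. The function names `exponential_decay` and `power_law_decay`, and the value `lam = 1.0`, are illustrative assumptions and do not come from the original paper.

```python
import math


def exponential_decay(d: int, lam: float) -> float:
    """Exponential decay: rho(d; lambda) = exp(-lambda * d)."""
    return math.exp(-lam * d)


def power_law_decay(d: int, lam: float) -> float:
    """Power-law decay: rho(d; lambda) = exp(-lambda * log d) = d ** (-lambda).

    Only meaningful for d >= 1, which matches sums that start at d = 1.
    """
    return math.exp(-lam * math.log(d))


# Compare both weights over the first few distances (lam = 1.0 is arbitrary).
for d in range(1, 6):
    print(d, exponential_decay(d, 1.0), power_law_decay(d, 1.0))
```

For the same $\lambda$, the power-law weight falls off more slowly with distance than the exponential one, so distant nodes keep more influence under it.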
Now, for each node, we calculate the resultant vector with the distance-based decay weight as follows:
$$y_u=\sum_{d=1}^{D_u}\rho(d;\lambda)\sum_{v\in \Gamma_d(u)} X_v=\sum_{v\in V\setminus\{u\}}\rho(d(u,v);\lambda)X_v,$$
where $D_u=\max_{v\in V} d(u,v)$. We call this the Resultant Vector with a Decay weight (RVwD) of node $u$. The RVwD of node $u$ adds up the vectors of the nearby nodes, including the directly connected ones, with strong weights and those of the distant nodes with weak weights. Therefore, the vector is somewhat smoothed.
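The weighted sum over all other nodes can be sketched with a breadth-first search that supplies the shortest path lengths. This is an illustration under stated assumptions, not the authors' implementation: the names `bfs_distances` and `rvwd`, the adjacency-dictionary representation, and the toy graph are hypothetical, and nodes unreachable from $u$ are simply skipped.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths from `source` in an unweighted, undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def rvwd(adj, X, u, lam, decay=lambda d, lam: math.exp(-lam * d)):
    """Resultant vector with decay weight: sum of rho(d(u, v); lam) * X[v] over v != u.

    Assumption: nodes not reachable from u contribute nothing.
    """
    y = [0.0] * len(X[u])
    for v, d in bfs_distances(adj, u).items():
        if v == u:
            continue
        weight = decay(d, lam)
        for j, x_vj in enumerate(X[v]):
            y[j] += weight * x_vj
    return y


# Toy path graph a - b - c with 2-dimensional content vectors.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
X = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}

# With lam = ln 2, the weights are 1/2 at distance 1 and 1/4 at distance 2.
y_a = rvwd(adj, X, "a", math.log(2.0))
print(y_a)
```

In this toy graph, node `b` at distance 1 contributes half of its vector and node `c` at distance 2 contributes a quarter, so the resultant vector of `a` is approximately `[0.25, 0.5]`.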
To quantify how densely similar nodes are located near each node, we calculate the cosine similarity between the content vector of each node and its RVwD, and define this value as the content centrality score of the node:
$$CDC(u)=\langle X_u,Y_u\rangle=\left\langle X_u,\frac{\sum_{d=1}^{D_u}\rho(d;\lambda)\sum_{v\in \Gamma_d(u)}X_v}{\left\|\sum_{d=1}^{D_u}\rho(d;\lambda)\sum_{v\in \Gamma_d(u)}X_v\right\|}\right\rangle,$$
where $Y_u=y_u/\|y_u\|$ is the RVwD normalized to unit length and the original content vector $X_u$ is normalized so that its $L_2$ norm is $1$. When $CDC(u)$ is higher than the scores of the other nodes, node $u$ ranks highly in content centrality, which means that similar content is densely concentrated around it.
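A minimal sketch of the score itself, assuming the resultant vector $y_u$ has already been computed: the function name `content_centrality` and the choice to return `0.0` for a zero-length vector are assumptions made here for illustration, since the definition leaves that case open.

```python
import math


def content_centrality(x_u, y_u):
    """Cosine similarity between a node's content vector and its resultant vector.

    Both vectors are normalized to unit L2 norm inside the function, so the
    inputs do not need to be normalized beforehand.
    Assumption: if either vector has zero norm, the score is taken to be 0.0.
    """
    norm_x = math.sqrt(sum(a * a for a in x_u))
    norm_y = math.sqrt(sum(b * b for b in y_u))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(x_u, y_u))
    return dot / (norm_x * norm_y)


# A node with content [1, 0] whose surroundings mostly carry the other feature.
score = content_centrality([1.0, 0.0], [0.25, 0.5])
print(score)
```

A score close to 1 indicates that the decay-weighted surroundings of a node carry nearly the same content as the node itself. Ranking all nodes by this value gives the content centrality ordering.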
References
- Fushimi T., Satoh T., Saito K., Kazama K., Kando N., 2016. Content centrality measure for networks: Introducing distance-based Decay weights. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10047 LNCS, pp.40-54. DOI: 10.1007/978-3-319-47874-6_4