Graph Theory

Characteristics

Absent

Present

Weights

Unweighted

Weighted

Directionality

Undirected

Directed

Why do I need to know about this?

Graph is a data structure which is used extensively in our real-life.

Social Networks: Each user is represented as a node and all their activities,suggestion and friend list are represented as an edge between the nodes.
Google Maps: Various locations are represented as vertices or nodes and the roads are represented as edges and graph theory is used to find shortest path between two nodes.
Recommendations on e-commerce websites: The “Recommendations for you” section on various e-commerce websites uses graph theory to recommend items of similar type to user’s choice.
Graph theory is also used to study molecules in chemistry and physics.

Additional Resources

Networkx

Network Centrality¶

The notion of centrality helps us identify Which nodes are most 'central' in a given network. Definition of 'central' varies by context/purpose of the analysis and situation. It could be based of number of connections or with a discrimination between incoming and outgoing connections from a node. A local measure for calculating centrality on these lines is the "Degree" of a node.

Degree Centrality

The degree of a node is the number of other nodes to which it is connected.

Closeness Centrality

Closeness Centrality measures how many "hops" a node would take to reach every other node in a network (taking the shortest path). It can be informally thought as 'average distance' to all other nodes. This "Far-ness" is then transformed into "nearness" as the reciprocal of farness. That is, nearness = one divided by farness. "Nearness" can be further standardized by norming against the minimum possible nearness for a graph of the same size and connection.

Betweeness Centrality

Betweenness centrality measures the number of times a node acts as a bridge along the shortest path between two other nodes. Here nodes that have a high probability to occur on a randomly chosen shortest path between two randomly chosen nodes have a high betweenness.

$C_B(v) =\sum_{s,t \in V} \frac{\sigma(s, t|v)}{\sigma(s, t)}$

where ${\sigma(s, t)}$ is total number of shortest paths from node ${s}$ to node ${t}$ and ${\sigma(s, t|v)}$ is the number of those paths that pass through ${v}$ .

Eigenvector Centrality¶

A node is high in eigenvector centrality if it is connected to many other nodes who are themselves well connected. Eigenvector centrality for each node is simply calculated as the proportional eigenvector values of the eigenvector with the largest eigenvalue. Following image shows you a quick comparison between degree and eigenvector centrality. Here node A is connected to more well connected nodes than B and hence shows a higher Eigenvector centrality, although the degree of B is higher than A.

Visit here to get a deep dive in the underlying maths, which involves some matrix algebra. We shall use networkx's built in method to calculate this for now.

Community Detection

Ego Networks

PreviousDistance Metrics NextLogistic Regression

Last updated 6 years ago