Network theory, a subset of graph theory, is at the heart of several recent computer science advancements, from neural networks to social networks to biology-related improvements in Gene-networks. Although, despite many advances in the field, complex network analysis is a relatively new field. Many problems still need to be solved arising from interactions between various systems and sub-systems of the future. The article discusses the most common types of problems solved in network analysis and graph theory and applies them to simple and complex scenarios. A lot of complex analysis can be built on simple, workable solutions, as we will see below.
- Simple Problem of Search. Whether searching for new planets or searching for optimum hyperparameters in neural networks, search for the most fundamental computer science. Graph theory has two most used tools for this purpose, i.e., Breath first search and Depth-first search. These two algorithms often have many variations depending upon the scenarios, but the basics remain the same. These algorithms are simple and can be explained in a story. Now imagine being lost in a super-multi-story building with innumerable doors and stairs and levels. It is a strange building, and you can open one door to access multiple other doors at various levels. The exit is only possible through a red-colored door located somewhere in the building. Fortunately, you remember this famous rule and start using it immediately. Let’s you are on the 15th floor. You first take out your notebook and a pen and start noting down the door numbers. You register the number on the door we are going to open in your notebook and open the door, mark it as seen. If you do find the red door, you escape immediately. Otherwise, you move to the next door. Now you search all the doors on the levels and then move one level below. You keep repeating the process till, hopefully, you find the red door. This is how simple the algorithms. Below is the python implementation.
Since graphs are essentially collections of vertices and edges, python's simplest representation is using a dictionary of sets. Another popular approach is using an adjacency matrix, which is essentially a giant matrix with edges represented as values and columns/rows as vertices. However, in this article, I will be using the networkx package for network-based computations.
#vanilla python representationclass Graph:def __init__(self):
self.graph = dict(list)def addEdgeToGraph(self,vertix1,vertix2): self.graph[vertix1].append(vertix2)
Graph plot is given below.
2. Going Topological Sort: Topological sorting is simply ordering vertices in a directed acyclic graph. The algorithm is essentially an extension of depth-first search, as explained in pt.1, although other algorithms’ variants are also common. In complex networks of things/places/objects/tasks, it answers the questions of “what must precede what.”
This well-known network algorithm has been studied from the 1960s onwards to solve ordering problems for directed acyclic graphs or DAGs. The only requirement for applying the technique is the network should have no directed cycles.
Imagine listing down a set of activities to be performed for some task or a project, with each activity having a dependency of the previous one/multiple activities. Let us assume this is represented on a piece of paper using nodes. First, make a list of all nodes to be visited. Then Starting from the first activity/node, go to each activity and make a list of nodes/activities visited. Go to each activity’s successor and keep going until a node has no more successor. This node with no successor will be then removed from the list of nodes we will visit later on. Also, write down the node on a horizontal line from right to left. Now repeat the process for other nodes by applying the same process.
Below is an implementation of the above logic in python. The code is actually nearly identical to DFS we applied earlier.
3.The shortest Path — The shortest path problems have been rigorously studied ever since the advent of optimization theory. These problems are widespread in graph theory. Dijkstra algorithm, first developed in the 1950s along with many other similar algorithms, is one of the most popular shortest path solutions to this famous problem. The algorithm is essentially an extension of BFS, with the only difference that paths between vertices have a distance assigned on each traversal of BFS.
We can go back to the multiple-story building scenario discussed in pt.1, so instead of escaping from the red door, let us say you are tasked with finding the shortest path. From your experiences, you know there are few possible paths. So starting from your location, you assign a large distance value to each of the doors you are about to visit, any big number. Then you assign a value of 1 to any door you visit, so in the end, each path will have a sequence of values. If some door remains unreachable, it will still have a large distance value assigned to it. Then finally, when at the destination, you are simply going to add up all values of each path and determine the shortest path. You can also note down distance values across each door number you visit in your notebook. This is called a priority queue.
The algorithm has been applied everywhere, from telephone line optimization to selecting the best oil pipeline routes.
Like Dijkstra’s algorithm, the Bellman-Ford algorithm is also a single source shortest path algorithm, and however, unlike Dijkstra, it is much slower. Bellman-Ford algorithm visits each vertex and successively relaxes or lowers the values of distances assigned to edges to infinity to a lower number. Hence the algorithm is repeated -V-1 times to update the distance values for each edge. Dijkstra, being a greedy algorithm, is much faster. Bellman-Ford is generally only used for either weighted or negative weighted graphs, for example, buying and selling stocks based on arbitrage(you can have negative and positive values for various stocks).
5. Traveling Salesman problem or visiting all cities in record time
Imagine managing a tour business in a tourist-friendly country and tasked with taking the visitors to every town in the country in record time. The visitors are in a hurry and will greatly appreciate minimum touring costs.
This modest and mundane exercise has been one of the most famous unsolved problems in computer science and combinatorial mathematics. It is improbable that there is a polynomial-time (and is an “NP-hard problem”).
The problem can be generalized by minimizing any cost while visiting all nodes or towns/cities. The cost can be distance, fuel costs, time, etc.
A number of implementations exist which uses combination and trade-offs of various greedy, genetic, combinatorial approaches. Below are some of the vanilla and optimized implementation using optimization libraries.
Let's try using a brute force approach.
Using google optimizations tools to implement TSP
# Pseudocode from Google-OR tools1.load TSP data
data = create_data_model()
2.Create the routing index manager.pywrapcp is a Python wrapper for the underlying C++ solver.
manager = pywrapcp.RoutingIndexManager(len(data['distance_matrix']),
3. Create a routingmodel
routing = pywrapcp.RoutingModel(manager)
4. Create a callback function which returns the distance between two nodes
def distance_callback(from_index, to_index):
from_node = manager.IndexToNode(from_index)
to_node = manager.IndexToNode(to_index)
5.Register callback in the routing manager
transit_callback_index = routing.RegisterTransitCallback(distance_callback)
6.Define cost of each arc.
7.Setting first solution heuristic.
search_parameters = pywrapcp.DefaultRoutingSearchParameters()
search_parameters.first_solution_strategy = (
8.Solve the problem.
solution = routing.SolveWithParameters(search_parameters)
9.Print solution on console.
print_solution(manager, routing, solution)