1. Define a class named WordNet to represent an undirected graph, with attribute variables such as words (an array of words/strings), vectors (an N-by-M matrix/array), A (the binary adjacency matrix), W (the weighted adjacency matrix), N (the number of words), and other auxiliary variables. (5 marks)
2. Within the class, define a method named read_data which reads the text file (wordvector.txt) given with this assignment. The text file has 20,000 rows. In each row, the first element is a word and the remaining 50 values are the corresponding word vector. You are only required to read the 101st–3100th rows (1-based indexing), which gives you 3,000 words and their vectors. In this method, calculate the Euclidean distance between every pair of words among the 3,000 words using $W_{i,j} = \sqrt{\sum_{m=1}^{M} (x_{i,m} - x_{j,m})^2}$, where $x_{i,m}$ is the m-th component of the vector of word i and M = 50. If a distance is greater than a threshold (use 3.0 in your assignment), reset it to infinity; otherwise keep it in matrix W and set the corresponding value in A accordingly. Print the number of edges in this graph on screen and show it in your PDF file. (20 marks)
3. Within the class, define a method named find_modules to identify the number of connected modules (subgraphs/islands) in the graph. Print on screen the number of modules found and their sizes (i.e. the number of vertices in each module; print only the sizes of the 20 largest modules), and copy this output to your PDF file. Hint: you can use either BFS (breadth-first search) or DFS (depth-first search) to obtain all vertices in a module. (15 marks)
4. Within the class, design a method named find_shortest_path to find the shortest path from a given word (say fromword) to another given word (say toword) on the largest module. For example, you are required to find the shortest paths from “money” to “future”, from “village” to “city”, from “bad” to “good”, and from “problem” to “opportunity”. You are required to implement and apply both the unweighted BFS method and Dijkstra’s algorithm. Print the paths (each as a sequence of words) on screen, and report them in the PDF file. You should also report whether the results from the two algorithms are different. (20 marks)
5. Within the class, implement Kruskal’s algorithm to find the minimum spanning tree (MST) on the largest module. In the algorithm, you should save all the edges and associated weights in the MST to a txt file (attach this txt file within your zipped folder). Your function should print the total length (i.e. the sum of weighted distances) of the MST on screen, and you should report it in your PDF file. (20 marks)
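The distance-and-threshold step in task 2 can be sketched as follows. This is a minimal illustration in plain Python, not a reference solution: the function name build_graph is hypothetical, and excluding self-loops (leaving the diagonal of W at infinity and of A at 0) is an assumption the task text does not state. A vectorised array library would likely be faster for 3,000 words.

```python
import math

def build_graph(vectors, threshold=3.0):
    """Return (W, A, n_edges) for a list of equal-length word vectors."""
    n = len(vectors)
    # Start with "no edge" everywhere: infinite distance, adjacency 0.
    W = [[math.inf] * n for _ in range(n)]
    A = [[0] * n for _ in range(n)]
    n_edges = 0
    for i in range(n):
        # Only visit j > i; the graph is undirected, so mirror each entry.
        # Assumption: self-loops (i == j) are not counted as edges.
        for j in range(i + 1, n):
            d = math.dist(vectors[i], vectors[j])  # Euclidean distance
            if d <= threshold:
                W[i][j] = W[j][i] = d
                A[i][j] = A[j][i] = 1
                n_edges += 1
    return W, A, n_edges

# Toy data: only the first two points are within the threshold of 3.0.
W, A, n_edges = build_graph([[0, 0], [1, 0], [0, 4]])
print(n_edges)
```

Counting each pair once inside the loop avoids the double counting that summing the symmetric matrix A would produce.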
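For task 3, the BFS hint can be sketched as below. The function name find_modules_bfs and the choice to return lists of vertex indices sorted by size are assumptions made for illustration; the assignment itself asks for a class method and leaves the return format open.

```python
from collections import deque

def find_modules_bfs(A):
    """Return the connected modules of adjacency matrix A, largest first."""
    n = len(A)
    seen = [False] * n
    modules = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search from an unvisited vertex collects one module.
        seen[start] = True
        queue = deque([start])
        module = []
        while queue:
            u = queue.popleft()
            module.append(u)
            for v in range(n):
                if A[u][v] and not seen[v]:
                    seen[v] = True
                    queue.append(v)
        modules.append(module)
    modules.sort(key=len, reverse=True)
    return modules

# Toy graph: vertices 0-1-2 form a chain, vertex 3 is isolated.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]
modules = find_modules_bfs(A)
print(len(modules), [len(m) for m in modules[:20]])
```

Marking a vertex as seen when it is enqueued, rather than when it is dequeued, keeps each vertex from entering the queue more than once.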
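Task 4 asks whether unweighted BFS and Dijkstra's algorithm give different paths. The sketch below shows why they can: BFS minimises the number of hops, while Dijkstra minimises the summed edge weights. The adjacency-list format (a dict mapping each node to a list of (neighbour, weight) pairs), the function names, and the toy graph are all assumptions for illustration, not part of the assignment.

```python
import heapq
import math
from collections import deque

def _walk_back(prev, dst):
    """Rebuild the path from the predecessor map; None if dst is unreachable."""
    if dst not in prev:
        return None
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

def bfs_path(adj, src, dst):
    """Path with the fewest edges, ignoring weights."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v, _ in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return _walk_back(prev, dst)

def dijkstra_path(adj, src, dst):
    """Path with the smallest total weight (weights must be non-negative)."""
    dist = {src: 0.0}
    prev = {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        if u == dst:
            break
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return _walk_back(prev, dst)

# Hypothetical toy graph: the direct edge a-b is long, the detour is shorter.
adj = {
    "a": [("b", 2.9), ("c", 1.0)],
    "b": [("a", 2.9), ("d", 0.5)],
    "c": [("a", 1.0), ("d", 1.0)],
    "d": [("c", 1.0), ("b", 0.5)],
}
print(bfs_path(adj, "a", "b"))       # one hop, total weight 2.9
print(dijkstra_path(adj, "a", "b"))  # three hops, total weight 2.5
```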
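Kruskal's algorithm in task 5 sorts the edges by weight and adds each edge that joins two previously separate groups of vertices. A minimal sketch using a union-find structure follows; the function name kruskal_mst, the (weight, u, v) edge format, and the integer vertex labels are assumptions for illustration. Writing the edges to a txt file is left out.

```python
def kruskal_mst(n, edges):
    """Return (mst_edges, total_weight) for n vertices labelled 0..n-1.

    edges is an iterable of (weight, u, v) tuples for an undirected graph.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst_edges = []
    total = 0.0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            # The edge connects two different trees, so it cannot form a cycle.
            parent[ru] = rv
            mst_edges.append((u, v, w))
            total += w
    return mst_edges, total

# Toy graph on 4 vertices; the edge (0, 2) would close a cycle and is skipped.
edges = [(1.0, 0, 1), (2.0, 1, 2), (2.5, 0, 2), (0.5, 2, 3)]
mst_edges, total = kruskal_mst(4, edges)
print(mst_edges, total)
```

On a connected module with k vertices the result contains k - 1 edges, which is a quick sanity check on the output.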