Solution: Minimize Malware Spread

Let's solve the Minimize Malware Spread problem using the Union Find pattern.

Statement

You’re given a network of nn nodes as an n×nn \times n adjacency matrix graph with the ithi^{th} node directly connected to the jthj^{th} node if graph[i][j] == 1.

A list of nodes, initial, is given, which contains nodes initially infected by malware. When two nodes are connected directly and at least one of them is infected by malware, both nodes will be infected by malware. This spread of malware will continue until every node in the connected component of nodes has been infected.

After the infection has stopped spreading, MM will represent the final number of nodes in the entire network that have been infected with malware.

Return a node from initial such that, when this node is removed from the graph, MM is minimized. If multiple nodes can be removed to minimize MM, return the node with the smallest index.

Note: If a node was removed from the initial list of infected nodes, it might still be infected later on due to the malware’s spread.

Constraints:

  • graph.length == graph[i].length
  • 2≤2 \leq n ≤300\leq 300
  • graph[i][j] is 00 or 11.
  • graph[i][j] == graph[j][i]
  • graph[i][i] == 1
  • 1≤1 \leq initial.length ≤n\leq n
  • 0≤0 \leq initial[i] ≤n−1\leq n - 1
  • All the integers in the initial are unique.

Solution

Since we have to check for infected nodes in each connected component, we need to find all the connected components in a graph. To do this, we use the UnionFind data structure that finds all of the graphs’s connected components.

The steps of the algorithm are given below:

  1. Find all of the graph’s connected components using the UnionFind data structure.

  2. Traverse the initial array and store all the connected components with at least one infected node in a hash map, infected.

  3. Traverse the initial array again and calculate the number of infections (from the infected hash map) in each connected component containing one or more infected nodes.

    • If a connected component contains more than one infected node, ignore it and move to the next iteration of the loop.
    • Otherwise, calculate the size of the connected component.
  4. Find the connected component with the greatest size, containing only one initially infected node from step 33.

  5. If there are multiple components that are the same size that would count as the largest connected component, choose the component with the smallest index.

The slides below illustrate how we would like the algorithm to run:

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.