Feature #2: Detect Virus
Explore how to detect a virus embedded in DNA by identifying the longest substring with at most k unique nucleotides. This lesson teaches you to implement a sliding window and HashMap solution that solves the problem efficiently, preparing you for biological string analysis and related coding interview challenges.
Description
While studying different DNA samples, we observed that a certain virus consists of really long sequences of k distinct nucleotides. The virus infects a species by embedding itself into the species’s DNA. We are working on devising a test to detect the virus. The idea is to analyze the longest substring of a species’s DNA that consists of at most k distinct nucleotides.
We’ll be provided with a string representing a chromosome from the infected DNA and a k value supplied from a hidden function. Our task is to find the longest substring of the chromosome string that contains at most k unique nucleotides.
Here is an illustration to better understand this process:
Solution
Since we want to return a substring from a specific window over the original string, we can use a sliding window approach to accomplish this efficiently. We’ll use two pointers, left and right, to denote ...
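The sliding window idea can be sketched as follows. This is a minimal illustration in Python, not necessarily the lesson's own implementation: the function name `longest_virus_substring` is hypothetical, a plain `dict` plays the role of the HashMap, and the sketch assumes that when several windows tie for longest, the first one is returned.

```python
def longest_virus_substring(chromosome: str, k: int) -> str:
    """Return the longest substring with at most k distinct nucleotides.

    Hypothetical helper for illustration; ties go to the earliest window.
    """
    if k <= 0:
        return ""

    counts = {}  # nucleotide -> occurrences inside the current window
    left = 0     # left edge of the window (inclusive)
    best_start, best_len = 0, 0

    for right, nucleotide in enumerate(chromosome):
        # Grow the window by including chromosome[right].
        counts[nucleotide] = counts.get(nucleotide, 0) + 1

        # Shrink from the left until at most k distinct nucleotides remain.
        while len(counts) > k:
            leftmost = chromosome[left]
            counts[leftmost] -= 1
            if counts[leftmost] == 0:
                del counts[leftmost]
            left += 1

        # Record the window if it is strictly longer than the best so far.
        window_len = right - left + 1
        if window_len > best_len:
            best_start, best_len = left, window_len

    return chromosome[best_start:best_start + best_len]
```

For example, `longest_virus_substring("ACCAGTTT", 2)` returns `"ACCA"`. Each pointer only moves forward, so the scan takes time linear in the length of the chromosome, and the map never holds more than k + 1 entries.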