Feature #2: Return Match

Implementing the "Return Match" feature for our "Plagiarism Checker" project.


Now, we need to identify the plagiarized code snippets given two sets of code sample tokens. We will use the same rules as in the previous feature to match the tokens. For a cheating student, we need to locate all the copied content, keeping in mind that some text may have been inserted to make the copied submission look different from the original. As before, we have to ignore dummy statements and comments. We will do this by finding where the original student's tokens appear as a subsequence of the second student's code tokens and returning that match. The cheater's tokens may contain many such matching spans of different sizes. We have to return the smallest of them.
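
The idea above can be sketched as a two-pointer scan. This is only an illustrative sketch, not the course's reference solution: the function name `min_window_subsequence` and its parameters are hypothetical, tokens are assumed to match by plain equality (the previous feature's matching rules are not restated here), and ties between equally small spans are assumed to go to the leftmost one.

```python
def min_window_subsequence(cheater, original):
    """Return the smallest contiguous span of `cheater` that contains
    `original` as a subsequence, or [] if there is no such span.

    Assumptions (not stated in the lesson): tokens match by equality,
    and the leftmost span wins when several are equally small.
    """
    n, m = len(cheater), len(original)
    if m == 0:
        return []

    best_start, best_len = -1, n + 1
    i = 0
    while i < n:
        # Forward pass: advance until every token of `original` is matched.
        j = 0
        while i < n:
            if cheater[i] == original[j]:
                j += 1
                if j == m:
                    break
            i += 1
        if i == n:
            break  # no further complete match exists
        end = i

        # Backward pass: walk back from `end` to shrink the span from the left.
        j = m - 1
        while j >= 0:
            if cheater[i] == original[j]:
                j -= 1
            i -= 1
        start = i + 1

        if end - start + 1 < best_len:
            best_start, best_len = start, end - start + 1

        # Resume the search just after the start of the span we found.
        i = start + 1

    if best_start < 0:
        return []
    return cheater[best_start:best_start + best_len]


# Hypothetical token lists: the cheater padded the first copy with a dummy token.
original_tokens = ["a", "=", "b"]
cheater_tokens = ["x", "a", "dummy", "=", "b", "a", "=", "b"]
print(min_window_subsequence(cheater_tokens, original_tokens))
```

The backward pass is what keeps each span tight: the forward pass only proves that a match ends at `end`, and walking back from there finds the latest possible start for that ending.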
