Statement▼
Count and return the number of good subsequences in the given string s. You may return the modulo
A subsequence is a sequence formed from another sequence by deleting some or no elements while keeping the order of the remaining elements unchanged.
A good subsequence is a subsequence of a string if it is not empty and the frequency of each character is the same.
Constraints:
1≤ s.length≤104 swill only contain lowercase English characters.
Solution
The solution to this problem involves calculating the combinations. We will start by calculating the frequency of each character and then iterate over the frequencies, from 1 to the highest frequency of any character. We will count subsequences where each character appears for some or all of its frequency, i.e., calculating combinations. The count is summed up in each iteration, and after the loop terminates, 1 is deducted from the final count. The reason for deducting 1 is that the empty subsequence will also be counted. Because empty subsequence does not qualify as a good subsequence, we will remove its count from the final calculation.
Computing the combinations is a typical
Because calculating the factorial[i] requires calculating the [factorial[i-1]..factorial[2]] again and again, therefore we will use dynamic programming to calculate the factorial[i] by multiplying i only with factorial[i — 1]. That is, we will reuse the existing results instead of calculating them again.
The algorithm to solve this problem is as follows:
Initialize two arrays:
factorialsandinversesof sizen + 1, an integer array,character_frequencies, of the same size to count the frequencies of each character and an integer variablefinalCountwith0 .Start a loop from
1 ton + 1. Inside the loop, compute the factorial and inverses of eachithnumber and store it infactorials[i]andinverses[i], respectively.Start a loop from
0 ton + 1and count the frequencies of each character ofs. Moreover, keep the record of maximum frequency and save it inmax_frequency.Start a nested loop, where the outer loop, controlled by
i, runs formax_frequencytimes and the inner loop, controlled byjruns for26 times.Inside the outer loop, initialize a variable
countwith the value. Perform the following steps Inside the inner loop:Compute the number of combinations, that is,
nchoosek, wherenis the value ofcharacter_frequency[j], andkis the value ofi. The number of combinations can be calculated with(factorials[n] * inverses[k] % MOD) * inverses[n-k] % MOD.Add
1 to the number of combinations. This is because we consider the possibility of not choosing a character when constructing subsequences. In other words, for each character that can be included in a subsequence, there are two choices: either include the character or do not. Adding1 captures these two possibilities. For example, if we have the stringaabbc , we can either choose onea and oneb or onea and nob , or noa and oneb , or neithera norb . If we don't add1 to our calculation, we would miss the possibility of having a non-empty subsequence that excludes botha andb .Update the
countby multiplying this calculation with the current value of thecount.
Deduct
1 from thecount, add it to the current value offinal_count, and take the modulus of this value. Update thefinal_countvalue with this new value. The reason to deduct1 from thecountis that we need to exclude the empty subsequence.
After the loop terminates, the value of
final_countholds the number of good subsequences in the input strings.
Let’s look at the following illustration to get a better understanding of the solution: