Statementâ–¼
Count and return the number of good subsequences in the given string s
. You may return the modulo
A subsequence is a sequence formed from another sequence by deleting some or no elements while keeping the order of the remaining elements unchanged.
A good subsequence is a subsequence of a string if it is not empty and the frequency of each character is the same.
Constraints:
1≤ s.length
≤104 s
will only contain lowercase English characters.
Solution
The solution to this problem involves calculating the combinations. We will start by calculating the frequency of each character and then iterate over the frequencies, from 1 to the highest frequency of any character. We will count subsequences where each character appears for some or all of its frequency, i.e., calculating combinations. The count is summed up in each iteration, and after the loop terminates, 1 is deducted from the final count. The reason for deducting 1 is that the empty subsequence will also be counted. Because empty subsequence does not qualify as a good subsequence, we will remove its count from the final calculation.
Computing the combinations is a typical
Because calculating the factorial[i]
requires calculating the [factorial[i-1]..factorial[2]]
again and again, therefore we will use dynamic programming to calculate the factorial[i]
by multiplying i
only with factorial[i — 1].
That is, we will reuse the existing results instead of calculating them again.
The algorithm to solve this problem is as follows:
Initialize two arrays:
factorials
andinverses
of sizen + 1
, an integer array,character_frequencies
, of the same size to count the frequencies of each character and an integer variablefinalCount
with0 .Start a loop from
1 ton + 1
. Inside the loop, compute the factorial and inverses of eachith
number and store it infactorials[i]
andinverses[i]
, respectively.Start a loop from
0 ton + 1
and count the frequencies of each character ofs
. Moreover, keep the record of maximum frequency and save it inmax_frequency
.Start a nested loop, where the outer loop, controlled by
i
, runs formax_frequency
times and the inner loop, controlled byj
runs for26 times.Inside the outer loop, initialize a variable
count
with the value. Perform the following steps Inside the inner loop:Compute the number of combinations, that is,
n
choosek
, wheren
is the value ofcharacter_frequency[j]
, andk
is the value ofi
. The number of combinations can be calculated with(factorials[n] * inverses[k] % MOD) * inverses[n-k] % MOD
.Add
1 to the number of combinations. This is because we consider the possibility of not choosing a character when constructing subsequences. In other words, for each character that can be included in a subsequence, there are two choices: either include the character or do not. Adding1 captures these two possibilities. For example, if we have the stringaabbc , we can either choose onea and oneb or onea and nob , or noa and oneb , or neithera norb . If we don't add1 to our calculation, we would miss the possibility of having a non-empty subsequence that excludes botha andb .Update the
count
by multiplying this calculation with the current value of thecount
.
Deduct
1 from thecount
, add it to the current value offinal_count
, and take the modulus of this value. Update thefinal_count
value with this new value. The reason to deduct1 from thecount
is that we need to exclude the empty subsequence.
After the loop terminates, the value of
final_count
holds the number of good subsequences in the input strings
.
Let’s look at the following illustration to get a better understanding of the solution: