Count of Distinct Substrings occurring consecutively in a given String

Given a string str, the task is to find the number of distinct substrings that occur consecutively in the given string, i.e. substrings that are immediately followed by another copy of themselves.

Examples:

_Input: str = “geeksgeeksforgeeks”_

_Output: 2_

Explanation:

geeksgeeksforgeeks -> {“geeks”} (“geeks” is immediately followed by “geeks”)

geeksgeeksforgeeks -> {“e”} (“e” is immediately followed by “e”)

_“e” repeats consecutively in more than one place, but it is counted only once._

_Two distinct substrings {“geeks”, “e”} occur consecutively in the string._

Therefore, the answer is 2.

_Input: str = “geeksforgeeks”_

_Output: 1_

Explanation:

geeksforgeeks -> {“e”, “e”} (“e” is immediately followed by “e” in both occurrences of “geeks”)

_Only one substring {“e”} occurs consecutively in the string._

Naive Approach:

The simplest approach is to generate all possible substrings of the given string and, for each substring, check whether it is immediately followed by another copy of itself. Finally, print the **count** of distinct substrings that pass this check.

Time Complexity: _O(N^3)_

Auxiliary Space:_ O(N)_
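The naive approach can be sketched as follows. This is a minimal illustration rather than the article's own code: the function name `countConsecutiveNaive` is an assumption, and a set is used so that each repeated substring is counted once.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper (name not from the article): brute-force count of
// distinct substrings that are immediately followed by a copy of themselves.
int countConsecutiveNaive(const std::string& s)
{
    const std::size_t n = s.size();
    std::unordered_set<std::string> found;

    // Try every start index i and every length len such that
    // two back-to-back blocks of that length fit in the string.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t len = 1; i + 2 * len <= n; ++len) {
            // Compare s[i, i + len) with s[i + len, i + 2 * len)
            if (s.compare(i, len, s, i + len, len) == 0) {
                found.insert(s.substr(i, len));
            }
        }
    }
    return static_cast<int>(found.size());
}
```

Calling `countConsecutiveNaive("geeksgeeksforgeeks")` yields 2 and `countConsecutiveNaive("geeksforgeeks")` yields 1, matching the examples above.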

**Efficient Approach:**

To optimize the above approach, the idea is to use Dynamic Programming.

Follow the steps below to solve the problem:

- If the length of the string does not exceed 1, no substring can be followed by a copy of itself, so return 0 as the count.
- Otherwise, initialize a memoization table dp[][] of dimensions (N + 1) x (N + 1) with every entry set to 0.
- Initialize an unordered_set to store the distinct substrings placed consecutively.
- Iterate over index j from the end of the string and, for each j, over every index i < j, also from right to left.
- If str[i] equals str[j], set dp[i][j] = dp[i + 1][j + 1] + 1, so that dp[i][j] holds the number of characters for which the parts of the string starting at i and at j match.
- If the characters differ, set dp[i][j] to 0.
- The substring that starts at i and has length (j - i) is immediately followed by a copy of itself exactly when the match starting at i and j runs for at least (j - i) characters. Hence, whenever dp[i][j] >= (j - i), insert that substring into the unordered_set.
- Finally, return the size of the unordered_set as the count of distinct substrings placed consecutively.
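To make the meaning of the table concrete, the recurrence from the steps above can be isolated in a small sketch. The helper name `buildMatchTable` is an assumption introduced here for illustration; it only fills the table and does not collect substrings.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the article): dp[i][j] is the number of
// characters for which the parts of s starting at i and at j agree (i < j).
std::vector<std::vector<int>> buildMatchTable(const std::string& s)
{
    const int n = static_cast<int>(s.size());
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(n + 1, 0));

    // Fill from the end so that dp[i + 1][j + 1] is ready when needed.
    for (int j = n - 1; j >= 0; --j) {
        for (int i = j - 1; i >= 0; --i) {
            if (s[i] == s[j]) {
                dp[i][j] = dp[i + 1][j + 1] + 1;
            }
            // Otherwise dp[i][j] keeps its initial value 0.
        }
    }
    return dp;
}
```

For the string "abab", dp[0][2] is 2, which equals j - i = 2, so "ab" is immediately followed by another "ab". In contrast, dp[1][3] is 1, which is less than j - i = 2, so "ba" is not followed by a copy of itself.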

Below is the implementation of the above approach:

C++

// C++ program to implement
// the above approach
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

// Function to count the distinct substrings
// placed consecutively in the given string
int distinctSimilarSubstrings(string str)
{
    // Length of the string
    int n = str.size();

    // If length of the string
    // does not exceed 1
    if (n <= 1) {
        return 0;
    }

    // Initialize a DP-table
    vector<vector<int> > dp(
        n + 1, vector<int>(n + 1, 0));

    // Stores the distinct substrings
    unordered_set<string> substrings;

    // Iterate from end of the string
    for (int j = n - 1; j >= 0; j--) {

        // Iterate backward until
        // dp table is all computed
        for (int i = j - 1; i >= 0; i--) {

            // If character at i-th index is
            // same as character at j-th index
            if (str[i] == str[j]) {

                // Update dp[i][j] based on
                // previously computed value
                dp[i][j] = dp[i + 1][j + 1] + 1;
            }

            // Otherwise
            else {
                dp[i][j] = 0;
            }

            // Condition for consecutively
            // placed similar substring
            if (dp[i][j] >= j - i) {
                substrings.insert(
                    str.substr(i, j - i));
            }
        }
    }

    // Return the count
    return substrings.size();
}

// Driver Code
int main()
{
    string str = "geeksgeeksforgeeks";
    cout << distinctSimilarSubstrings(str);
    return 0;
}

Output:

2

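Time Complexity: _O(N^2)_ for filling the table, not counting the cost of copying and hashing the substrings inserted into the set

Auxiliary Space: _O(N^2)_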

