1. Introduction
This Java program finds the length of the longest substring in a given string where all characters are unique. The algorithm efficiently solves this problem using a combination of binary search and the sliding window technique.
2. Code Explanation
Here’s a detailed breakdown of the code:
java
import java.util.Arrays;
public class Main {
// Function to check if any substring of length mid contains
// all unique characters
static boolean isValid(String s, int mid) {
int[] count = new int[256];
boolean found = false;
int distinct = 0;
for (int i = 0; i < s.length(); i++) {
count[s.charAt(i)]++;
if (count[s.charAt(i)] == 1) {
distinct++;
}
if (i >= mid) {
count[s.charAt(i - mid)]--;
if (count[s.charAt(i - mid)] == 0) {
distinct--;
}
}
if (i >= mid - 1 && distinct == mid) {
found = true;
}
}
return found;
}
// Function to find the longest substring containing non-repeating
// characters
static int longestUniqueSubsttr(String s) {
int len = s.length();
int ans = Integer.MAX_VALUE;
// Lower bound and Upper Bound for Binary Search
int lowBound = 1, highBound = len;
// Applying Binary Search on answer
while (lowBound <= highBound) {
int mid = (lowBound + highBound) / 2;
// If a substring of length mid is found which
// contains all unique characters then
// update lowBound otherwise update
// highBound
if (isValid(s, mid)) {
ans = mid;
lowBound = mid + 1;
} else {
highBound = mid - 1;
}
}
return ans;
}
// Driver code
public static void main(String[] args) {
String str = "geeksforgeeks";
System.out.println("The input string is " + str);
int len = longestUniqueSubsttr(str);
System.out.println("The length of the longest non-repeating "
+ "character substring is " + len);
}
}
3. Breakdown of Key Functions
longestUniqueSubsttr(String s)
:
- This function is the main part of the program. It uses binary search to efficiently determine the length of the longest substring with all unique characters.
- Binary Search Setup: The algorithm starts with
lowBound = 1
and highBound = len
, where len
is the length of the input string. These bounds represent the range of possible lengths for the unique substring. - Binary Search Execution: It iteratively checks the middle value
mid
and determines whether a valid substring of length mid
exists. Based on whether a valid substring is found, it adjusts the bounds to continue searching for the longest possible substring.
isValid(String s, int mid)
:
- This function checks if any substring of length
mid
contains all unique characters using the sliding window technique. - Sliding Window: It slides a window of size
mid
across the string and maintains a count of the distinct characters in the current window. If the window contains exactly mid
distinct characters, the function returns true
. - Character Count: The
count
array keeps track of how many times each character appears in the current window. If a character repeats, it reduces the number of distinct characters.
4. Example Walkthrough
Let’s walk through an example with the string "geeksforgeeks"
:
- The algorithm starts with a binary search on the length of the substring.
- It tries different values for
mid
(the length of the substring) and uses isValid
to check if a valid substring of that length exists. - Through this process, it narrows down to the length of the longest substring where all characters are unique.
- The final result for the input
"geeksforgeeks"
will be the longest substring with non-repeating characters, which is "eksforg"
or "ksforge"
, with a length of 7.
5. Output
For the input string "geeksforgeeks"
, the program will output:
pgsql
The input string is geeksforgeeks
The length of the longest non-repeating character substring is 7
6. Time Complexity
- Binary Search: The binary search runs in O(log n) time.
- Sliding Window: For each
mid
value, the sliding window operation takes O(n) time, where n is the length of the string. - Overall, the time complexity of the algorithm is O(n log n), which is efficient for this problem.
7. Summary
This program finds the length of the longest substring in a given string that contains only unique characters. The binary search approach optimizes the process of checking possible substring lengths, while the sliding window technique efficiently ensures that each window contains distinct characters. This combination makes the solution both efficient and scalable for larger strings.