Coding Interview PatternsSubstring with Concatenation of All Words
HardSliding Window
Substring with Concatenation of All Words
Explanation & Solution
Description
You are given a string s and an array of strings words. All the strings of words are of the same length.
A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.
Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.
Input:s = "barfoothefoobarman", words = ["foo","bar"]
0
foo
1
bar
Output:[0,9]
0
0
1
9
Explanation: The substring starting at index 0 is "barfoo", which is a concatenation of ["bar","foo"]. The substring starting at index 9 is "foobar", which is a concatenation of ["foo","bar"].
Constraints
1 <= s.length <= 10^41 <= words.length <= 50001 <= words[i].length <= 30sandwords[i]consist of lowercase English letters.
Approach
Sliding Window pattern
1. Build a Word Frequency Map
- Count the frequency of each word in
wordsusing a map. - Calculate
wLen(length of each word),numWords(number of words), andtotalLen(total window size =wLen * numWords).
2. Iterate Over Starting Offsets
- Since all words have the same length
wLen, we only need to considerwLendifferent starting offsets (0 throughwLen - 1). - For each offset, we process the string in word-sized chunks.
3. Sliding Window of Word-Sized Chunks
- For each offset, maintain a
leftpointer and aseenmap that tracks word frequencies in the current window. - Move
rightforward bywLenat each step, extracting the current word chunk.
4. Handle Valid and Invalid Words
- If the extracted word is in
wordCount, add it toseenand incrementmatched. - If the word appears more times than needed, shrink the window from the left until the count is valid.
- If
matchedequalsnumWords, we found a valid concatenation — recordleftas a result, then shrink from the left to continue searching. - If the word is not in
wordCount, reset the window: clearseen, setmatched = 0, and moveleftpast the current position.
5. Return All Starting Indices
- After processing all offsets, return the collected result array.
Key Insight
- The key optimization is processing by word-sized chunks rather than character by character. By iterating over
wLenoffsets, we ensure every possible alignment is covered. Within each offset, the sliding window moves in word-length steps, making the inner loop efficient.
Time
O(n * wLen) where n is the length of `s`. For each of the `wLen` offsets, we scan through the string once.Space
O(m) where m is the number of unique words.Visualization
Input:
[b, a, r, f, o, o, t, h, e, f, o, o, b, a, r, m, a, n]
—
Press ▶ or use ← → to step through
Left (L)Right (R)WindowDone
17 steps