The Sliding Window Algorithm for string and array (Java codes and Leetcode problems as example)

The sliding window is a very common algorithm and is a handy algorithm.

What is the sliding window algorithm

We know what is a sliding window, just like the photo showing blow.

a Sliding window

When we use a sliding window, we can slide one of the window frame from left to right. You can move the window left and right but you can’t change the width of the window.The sliding window is a very common algorithm and is a very useful algorithm.

But there is some kind of sliding window, you can move one side of the window frame, but another side is fixed. So, you can change the window’s width. Look like this:

a resizable sliding window

When we deal with array or string, we often need to focus on some parts of the array or string, just like a window, to check if the subarray or substring match some pattern we need. We can use a fixed width sliding window or a resizable sliding window (And we may also need to change the position of both end of the window), we usually call this approach sliding window algorithm.

The usage of sliding window

Almost every NLP (natural language processing) problem need some sort of sliding window algorithm, word segment, part-of-speech tagging, and so on.

words counting

Sound, human speech can be look as time series, is a sequence. When we process sound signals we can use sliding window too.

using sliding window to sound

Object detect task of image and video also often use a sliding window algorithm. The picture below is a famous object detect project called YOLO (machine learning, deep learning), it uses a two dimensional sliding window algorithm. First, it cuts the image to a grid, then it creates a lot of sliding windows based on this grid.

sliding windows of YOLO project

When we create an OCR (Optical character recognition) algorithm, we need the sliding window too. In normal printed text OCR problem, we use one-dimension sliding window.

sliding window in OCR problem

And for the real-time text detection and OCR (Now Google translate or other software can provide it) problem, we need some kinds of the two-dimensional sliding window (Just like YOLO).

two-dimensional sliding window

​No matter you deal with an array, string, audio, video, image or text, the sliding window algorithm you need is very similar, and often you can use the same frame codes without any alteration, only need change the detail codes about the specific problem.

1. Fixed width sliding window problem

Now we look at an easy example. Give you a String s, try to find a substring length of k, in this substring we have most x show up.

For example:
s = “abaaaccsddaa”
k = 3
x = ‘a’

Or :
s = “abcdeaaadfewfsaaadassaaaaaddddaass”
k = 5
x = ‘a’

Now we use sliding window to look at example 1.

Look at the picture and data, we know the index of the left end of the sliding window is from 0 to (the length of string 12 – the width of window 3 = 9). And the index of the right end is from 2 to 11. We only need one loop to traverse all the possible windows.

So, we can write a loop:

frame codes of fixed sliding window

Now it is very clear, we need do something in the loop. We can use left end index “i” and right end index “right” to get a substring of s. And we can write a 0 to k loop to get how many “x” in the substring. And compare current “x” count with a variable “max”, if current count bigger than “max”, let “max” be current count.

But it loses the benefits of using a sliding window. Using this approach time complexity is O((N-K)*K). We can use the correct sliding window approach to get an O(N) time complexity.

How to do it? In the loop, every time add the new element from the right and remove the element out of left boundary. For example, if you use a variable Nx store the count of ‘a’, every time if the new element is ‘a’, Nx++. When you remove the element from the window if it is ‘a’, Nx–. Then Nx will always be the count of ‘a’ in the windows. And we only need one loop, so the time complexity is O(N).

Codes :

longest K string (left)

We need two loops because we need to count the Nx before the main loop started. Before we moved the window, the window already has 3 elements. This will not make time complexity raise, but it looks a little mess.

If we use right boundary to start the loop, code will be more clear, like this:

longest K string (right)

2. Resizable Sliding Window algorithm

The fixed sliding window is easy to understand. Resizable sliding windows is more complex. When the left end and the right end of the window both can move freely, how to move? How can we ensure we can traverse all the possible windows in the array?

You may think of two nesting loops, but that will be O(N^2) time complexity.

Now we look at an example from Leetcode, problem 76 Minimum Window Substring. It like this:

Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).

Example:

Input: S = "ADOBECODEBANC", T = "ABC"
Output: "BANC"

Note:

  • If there is no such window in S that covers all characters in T, return the empty string "".
  • If there is such window, you are guaranteed that there will always be only one unique minimum window in S.

Now we analyze this problem. The goal is to find a minimum substring contain all ABC three character. It can be ABC, BAC, AXBC, BCXA, and so on.

First, we get requirement from string T, how many characters, each character need occur how many times. We store it in a hash-map tdict. And we make another hash-map sdict to count the characters in the window.

We make the left and right equal 0. Then we move right. When right is moving, the corresponding character comes in the window. Now we have three characters need to provide, A, B, and C. We use a variable count measure how many conditions are satisfied.

When the corresponding character of right coming in the window, we check if it is in the tdict, if it is, we update the sdict, and compare the value of this character in tdict and sdict. If the values are same, we know one condition is satisfied, and count will be add one.

When the count is 3, we satisfy all three conditions. When count = 3 or count > 3, we move left to shrink the window until we can not meet the requirements. Then we move right to expand the window again.

So the pattern is we move right to expand the window to meet the requirements, then we move left to shrink window until we can’t. When both left and right get the maximum value, the program is over. Now we use a video to demonstrate this approach.

Resizable Sliding Window algorithm

And this is the code:

Function start and variable initialization

Then this is the start of the right loop, move the character at the right position to the window, to count down. If variable count < 3, we continue to move right.

And only when count >= 3 we meet the requirements, program go to this part. When count satisfies, first, if the current substring is shorter than answer, ans= string in the window,  min= the length of the string. Then we move left, cl move out of window, update sdict and count. Then continue loop until count < 3 can’t meet requirements.

This code is on my github: https://github.com/tinyfool/leetcode/blob/master/src/p0076/Solution1.java

In this approach, we use two hash-maps store requirements and current states. But we can use a char array to make the code faster. Because we are dealing with string, and there are only 256 characters, so we can use a char array size of 256 to store all the requirements and current states.

When we deal with number array, if we has already know there are some limits of number range, we can use number array too.

Let’s look at problem 76 again. We can only change codes about the hash-map to array, and without touch any frame codes.

Initialize parts:

hash-map version

char array version

Move character into window:

hash-map version

char array version

move character out of window

hash-map version

array version

Let’s look at an example:

This picture is the initial states of the example, S = “ADOBECODEBANC”, T = “ABC”, the char array dict is filled with zeros.

Then we put T ‘s information into dict, A,B,C each character occurs one time.

Then we let character into the window. Whenever a character moves into the window, we will reduce the value of the corresponding character in dict by one. So we know at the move-into-window step when a number is No-zero turn into zero, it means we meet a requirement. For example we let A into window:

If the character not in T, like D, the original value is 0. We put D into the window, 0 – 1 is -1. So we don’t care it when a value change from 0 to negative number. Look at the picture below.

When we move a character out of the window, We add one to the value of the corresponding character. You will find out for the character in T, in this process, when the value larger than 0, it means we miss one requirement. And for the character not in T, the value will never bigger than 0.

This approach is more complex than the hash-map one, but it is very fast. So you may want to know how to do it. My solution of problem 76 hash-map version costs 34 ms. But array version only costs 4 ms, it is 8.5 times faster. It is worth to try.

I made a full video tutorial of the sliding window algorithm into my youtube channel (LeetCode Problems & Solutions ). Feel free to watch it.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.