Monthly Archives: March 2016

Partial sum queries

Given an array of numbers, we need to answer many queries of the form Sum(A[i]...A[j]),

For example, let the array be [15, 6, 1, 12, 7, 4, 10]

Sum(A[1:3]) = 15+6+1 = 22
Sum(A[4:6]) = 12+7+4 = 23
...

The simplest solution is to iterate through the elements from i to j and return the sum.
However this will be good enough if we have a small number of queries. What if we need to run many queries?
In worst case this is takes O(n2) time.

How can we reduce the time complexity?
If we pre-process the array by calculating the prefix array, we can answer each query in O(1) time.

For the above example calculate the prefix array by using the formula Prefix[i] = Prefix[i-1]+A[i]

Prefix = [15, 21, 22, 34, 41, 45, 55]

Now the partial sums can be calculated as follows

Sum(A[i:j]) = Prefix[j] - Prefix[i-1] if i > 1
= Prefix[j] if i=1

Here is the Java function which implements the above

long partialSum(int []prefixArr, int start, int end) {
long sum = arr[end];
if( start > 0)
sum -= arr[start-1];
return sum;
}

Searching for an element in array of adjacent numbers

Given an array of numbers where the difference between any adjacent numbers is 1, how do we efficiently find a given number?

For example consider the array {1, 2, 1, 2, 3, 2, 3, 4, 5, 6}, and we want to search for the number 5.

The simplest solution is to check if the number is present or not using linear search. This algorithm takes O(n) time in the worst case.
But we are not utilizing the fact that the adjacent numbers differ by 1.

How do we utilize this property to improve the performance?

Let us assume we are searching for a number x+3 in an array.
Let us start with a number x at index 0. Here are the some of the possible patterns of numbers that the array can have

x, x+1, x+2, x+3, ...
x, x+1, x, x+1, ...
x, x-1, x, x+1, ...
x, x-1, x-2, x-1, ...
x, x+1, x+2, x+1, ...
etc...

We can safely jump to index 3 and check if the number is present. Why? Because in any pattern there is no chance of having x+3 before the index 3.
If the number is present, that is fine. Otherwise repeat the same procedure until we find the element or reach the end of the array.

The time complexity of the above procedure is still O(n), because in the worst case, we make n/2 jumps. But this is an improvement over the linear search (n comparisons in the worst case).

Example, search for number 6 in the array {4, 5, 4, 5, 4, 5, 4, 5, 4, 5}, we have to make 5 jumps to conclude that 6 is not present.

Here is the C++ implementation of the same.

Check if a string is a double string

Given a string, we need to check if a string is a double string possibly by deleting a character.

A double string is a repetition of two strings affixed together.
For example the following strings are double strings
"aa", "meme", "abbabb"
Where as the follwing are not double strings
"ab", "abc", "acbab"

However "acbab" can be be a double string if we remove "c" from it.

This problem was appeared in Codechef March 2016 long contest. Here is the link.

Let us try to analyze the problem. Consider the base case first. A string of length 0 or 1 can never be a double string. Hence, we have to consider the strings of length > 1. Also we can observe that a double string is always of even length.

Case 1:
If the given string is of even length, we simply need to check if it is a double string by checking if the left half and right half are equal.
No need to consider the case of deleting a character because, if we delete a character, it’s length will be odd which will never be a double string.

Case 2:
If the string is of odd length, we have to delete a character before checking if it is a double string. How to find out which character to be deleted?

Consider the following cases
"cabab", delete first character
"acbab", delete second character
"abcab", delete third character
"abacb", delete fourth character
"ababc", delete fifth character

So it appears that we should try to remove each possible character and check if it is a double string. However since we want to check for a double string,
the left and right half should atleast be similar if not equal.

Consider an example “acbab”, we can divide the string into longer and shorter halves in two ways
("acb", "ab")
("ac", "bab")

Since the first pair of string differ by only character (In other words, their edit distance is just 1)

So we just need to check if the edit distance between the longer and shorter strings is at most 1.

Here is the C++ implementation of the above algorithm. It runs in O(n) time.