Category Archives: Algorithms

Edit distance 1

Given two strings, how do we check if their edit distance is 1?

Edit distance is defined as the number of characters that needs to be modified, inserted or deleted from one string to transform into the other string.

For example let us consider two strings “java”, “lava” they both differ by just one character. we can change the first character in either of them to make them equal.

Similarly, the words “at”, “bat” also have an edit distance of 1 because either we can add “b” to the first string, or delete “b” from the second string to make them euqal.

By looking at the problem statement, one can think of a dynamic programming solution which takes O(n^2) time. However, it would an over kill to solve this problem.

Let’s observe some thing. Two strings can not have an edit distance of 1 if their lengths differ by more than 1 characters.

So we can simplify our approach to work on the two use cases.
which are either of equal length, or
one string longer than the other by just one character.

Lets see some representative test cases we need to handle.

(“abc”, “bc”), (“abc”, “ab”), (“string”, “sxring”), (“algorithm”, “algoritm”) etc.

If the two strings are equal, we just count the number of mismatches at each index in the two strings.

Otherwise, we check if the shorter string is a sub-sequence of the longer string except one character. In this case we calculate the number matching characters which should be equal to the length of the shorter string.

Here is the Java implementation of the above approach. The time complexity would be linear O(n).

Maximum number of contiguous ones with K zeros

Given an array of size N, containing only 0’s and 1’s, and a number K, what is the maximum length of the sub-array with contiguous 1’s with at most K 0’s?

For example consider the array

A = [1, 1, 0, 1, 0, 0, 1], K = 2, the maximum length is 5. i.e the sub-array A[0:4]

This problem can be solved using sliding window algorithm. Take two indexes front and back which defines a window. We keep expanding the window by incrementing the back index as long as the window contains less than or equal to K zeros. That means, at any time, the window contains at most K 0’s. If it exceeds K, we increment the front index to reduce the number of zeros to K. Continue this procedure until the back index reaches the end of the array. We also keep track of the largest window during this process.

This algorithm takes O(n) time.

Here is the C++ code.

Problem Source: SPOJ REPROAD

Maximum elements of array

Given an array A of size N, We want to perform several iterations as described below.

Select any index i such that 1 <= i <= N, Mark A[i] = 0, move left by filling each element one greater than it’s right
Similarly move right, and fill each element which is one greater than it’s left.

For example consider an array of length 5, and we choose the index 3, the array would be

A = [2, 1, 0, 1, 2]

If we perform several iterations by choosing different indexes each time,
we have to output the maximum number, each element can get during all the iterations.

Consider that we repeat the procedure for the above array with Index 2
A = [1, 0, 1, 2, 3]

So after two iterations at indexes 2, and 3, maximum elements would be
MAX_A = [2, 1, 1, 2, 3]

To find MAX_A, at first, it looks like we need to repeat the procedure m times for m indexes,
which leads to O(m*n) complexity solution.

What will be the maximum number an element can get? N-1
How? Lets consider the same array as above, choose index 1, the array would be
A = [0, 1, 2, 3, 4]
Choose index 5, the array would be
A = [4, 3, 2, 1, 0]

So it is enough to perform the iteration with the left most and right most indexes,
the maximum element at each index i would be max( abs(i-left), abs(i-right))

So the time complexity will be O(n).

Here is the C++ code for the same.

Problem Source: The Army – Codechef

Path length between two binary tree nodes

Given a full binary tree (all nodes except leaf nodes have left and right child) represented as shown below.
If the root is labelled i, it’s left child is labelled 2*i and right child is labelled as 2*i+1.
1
/   \
2     3
/  \   / \
4  5  6   7

Given two nodes from this tree, how do we find the length of the path between them?
For example the path length between 2 and 3 is 2 (2->1->3)
Similarly path length between 4 and 1 is 2, and between 4 and 7 it is 4 (4->2->1->3->7)

In a binary tree, two nodes can be connected in two possible ways.

  • Either one of the nodes can be ancestor of the other, or
  • Both of them are connected by some common ancestor.

For example, take 1 and 5, 1 is an ancestor of 5.
Take 4 and 5, they are connected by their lowest common ancestor 2, so the path goes has to via 2.
So this problem can be solved by finding the Lowest Common Ancestor (LCA) of the two nodes and adding the distances between the nodes to their LCA.

We use the bottom up approach here. Start from the given two nodes and try to search for lowest common ancestor by going up since we can easily go the parent node by dividing the node by 2. While finding the LCA, we can also calculate the path length.

Since we traverse through the tree for at most the height of the tree, the time complexity will be O(log n).

Here is the C++ code.

Problem source: Codechef

Billing algorithm

This problem is from Codechef. I am re-wording it for more clarity.

A garments shop has “Buy 2, Get 2” offer for it’s customers. Whatever the items may be, they charge for costliest items from an order and give the cheapest ones free.
However we can divide the items into multiple orders to save more.

For example, Let us say we have 6 items with prices 200, 600, 100, 900, 800, 700. If we take them as one order, we will have to pay 3000.

6 items are divided into two groups {900,800,200,100}, {700,600} because they charge for highest price items. So the total bill would be 900+800+700+600 = 3000
Suppose we divide the items into two groups {900,800,700,600}, {200,100} we need to pay only 900+800+200+100 = 2000

Now the problem is, given a list of prices, how do we calculate the minimum amount we need to pay?

This is a simple greedy algorithm. We just need to sort all the prices in descending order and group them into 4 per order. The last order might contain less than 4 items.

Here is the C++ code for the same

Minimum number of jumps to reach top

There are N steps in a staircase, and a person can jump 5 steps at max, i.e he can jump any number of steps between 1 and 5.
What is the minumum number of jumps he has to make to reach the top?

A simple math problem which can be solved using a simple greedy approach. Try to make as many maximum jumps as possible before reaching the top. Take a smaller jump if needed at the end.

For example if there are 10 steps, he can make two maximum jumps of 5 steps each to reach the top.
If there 8 steps, First he can jump 5 steps and then jump 3 steps. So within 2 jumps, he can reach the top.

The code for this very simple. This problem can also be generalized by customizing the jump capacity to any number instead of a fixed number 5.

#include <iostream>
#include <cstdio>
using namespace std;

int main()
{
int loc;
scanf("%d", &loc);
printf("%d\n",loc/5 + (loc%5 == 0?0:1));
return 0;
}

Partial sum queries

Given an array of numbers, we need to answer many queries of the form Sum(A[i]...A[j]),

For example, let the array be [15, 6, 1, 12, 7, 4, 10]

Sum(A[1:3]) = 15+6+1 = 22
Sum(A[4:6]) = 12+7+4 = 23
...

The simplest solution is to iterate through the elements from i to j and return the sum.
However this will be good enough if we have a small number of queries. What if we need to run many queries?
In worst case this is takes O(n2) time.

How can we reduce the time complexity?
If we pre-process the array by calculating the prefix array, we can answer each query in O(1) time.

For the above example calculate the prefix array by using the formula Prefix[i] = Prefix[i-1]+A[i]

Prefix = [15, 21, 22, 34, 41, 45, 55]

Now the partial sums can be calculated as follows

Sum(A[i:j]) = Prefix[j] - Prefix[i-1] if i > 1
= Prefix[j] if i=1

Here is the Java function which implements the above

long partialSum(int []prefixArr, int start, int end) {
long sum = arr[end];
if( start > 0)
sum -= arr[start-1];
return sum;
}

Searching for an element in array of adjacent numbers

Given an array of numbers where the difference between any adjacent numbers is 1, how do we efficiently find a given number?

For example consider the array {1, 2, 1, 2, 3, 2, 3, 4, 5, 6}, and we want to search for the number 5.

The simplest solution is to check if the number is present or not using linear search. This algorithm takes O(n) time in the worst case.
But we are not utilizing the fact that the adjacent numbers differ by 1.

How do we utilize this property to improve the performance?

Let us assume we are searching for a number x+3 in an array.
Let us start with a number x at index 0. Here are the some of the possible patterns of numbers that the array can have

x, x+1, x+2, x+3, ...
x, x+1, x, x+1, ...
x, x-1, x, x+1, ...
x, x-1, x-2, x-1, ...
x, x+1, x+2, x+1, ...
etc...

We can safely jump to index 3 and check if the number is present. Why? Because in any pattern there is no chance of having x+3 before the index 3.
If the number is present, that is fine. Otherwise repeat the same procedure until we find the element or reach the end of the array.

The time complexity of the above procedure is still O(n), because in the worst case, we make n/2 jumps. But this is an improvement over the linear search (n comparisons in the worst case).

Example, search for number 6 in the array {4, 5, 4, 5, 4, 5, 4, 5, 4, 5}, we have to make 5 jumps to conclude that 6 is not present.

Here is the C++ implementation of the same.

Check if a string is a double string

Given a string, we need to check if a string is a double string possibly by deleting a character.

A double string is a repetition of two strings affixed together.
For example the following strings are double strings
"aa", "meme", "abbabb"
Where as the follwing are not double strings
"ab", "abc", "acbab"

However "acbab" can be be a double string if we remove "c" from it.

This problem was appeared in Codechef March 2016 long contest. Here is the link.

Let us try to analyze the problem. Consider the base case first. A string of length 0 or 1 can never be a double string. Hence, we have to consider the strings of length > 1. Also we can observe that a double string is always of even length.

Case 1:
If the given string is of even length, we simply need to check if it is a double string by checking if the left half and right half are equal.
No need to consider the case of deleting a character because, if we delete a character, it’s length will be odd which will never be a double string.

Case 2:
If the string is of odd length, we have to delete a character before checking if it is a double string. How to find out which character to be deleted?

Consider the following cases
"cabab", delete first character
"acbab", delete second character
"abcab", delete third character
"abacb", delete fourth character
"ababc", delete fifth character

So it appears that we should try to remove each possible character and check if it is a double string. However since we want to check for a double string,
the left and right half should atleast be similar if not equal.

Consider an example “acbab”, we can divide the string into longer and shorter halves in two ways
("acb", "ab")
("ac", "bab")

Since the first pair of string differ by only character (In other words, their edit distance is just 1)

So we just need to check if the edit distance between the longer and shorter strings is at most 1.

Here is the C++ implementation of the above algorithm. It runs in O(n) time.

Facebook Hackercup 2016- Boomerang constellations problem solution

This problem is from the recently concluded Facebook Hackercup 2016 Qualification round. You can take a look at the problem statement in this link.

I am giving the abridged problem statement here. Given a set of coordinates in a 2D-plane, We have to find out the total number of pairs of lines with equal length and also share one end point.

For example, let us take the points {(0,0), (0,1), (1,0)}, there is one pair of lines, with a distance of 1. The line between (0,0),  (0,1) and the line between (0,0), (1,0).

The simplest way to solve this problem is this. Take a point (x,y) and try to calculate the distance to every other point. Maintain a map of distance to the number of other end points with that distance. This is O(n2 log n) solution.

Here is the C++ code for the same.