Category Archives: Search

Finding the least common ancestor (LCA) in a binary search tree

Given a Binary Search Tree (BST), How do we find the Lowest Common Ancestor (LCA) of the given two nodes?

For example in the following BST, the LCA of 5 and 8 is 7 because it is the nearest ancestor common to both the nodes.

Let us assume that the LCA(5,7) is 7 itself ( i.e We can consider that a node is a ancestor of itself).
           4
          /
         2   7
            /
           5   9
              /
             8

We start at the root node and check if the data at root node lies between the given two nodes. If it is we have found the LCA. If the root data is lesser than the given two nodes, we ignore the left sub-tree and continue with the right-sub tree. If the root data is greater than given two nodes, we ignore the right sub-tree and repeat the same procedure with the left sub-tree.

This approach assumes that the given two nodes are present in the tree.

It runs in O( n log n) on average. and O(n) in worst case( skewed BST).

Following is the Java code which implements the above algorithm.

/**
* Created with IntelliJ IDEA.
* User: Ravi
* Date: 9/27/13
* Time: 10:58 PM
* To change this template use File | Settings | File Templates.
*/
public class Main {
public static void main(String[] args)
{
BinarySearchTree tree = new BinarySearchTree();
tree.insert(4);
tree.insert(8);
tree.insert(2);
tree.insert(3);
tree.insert(1);
tree.insert(5);
tree.insert(9);

System.out.println("The lease common ancestor of 9 and 1 is " + tree.findLCA(9,1).getData()) ;
System.out.println("The lease common ancestor of 4 and 8 is " + tree.findLCA(4,8).getData()) ;
}
//Define Binary Search Tree Node
static class BSTNode
{
public BSTNode(int d )
{
data = d;
left = null;
right = null;
}
public BSTNode(int d, BSTNode l, BSTNode r)
{
data = d;
left = l;
right = r;
}
public int getData()
{
return data;
}
public void setData(int d)
{
data = d;
}

public BSTNode getLeft()
{
return left;
}

public void setLeft(BSTNode l)
{
left = l;
}
public BSTNode getRight()
{
return right;
}

public void setRight(BSTNode r)
{
right = r;
}
private int data; //data
private BSTNode left; //left subtree
private BSTNode right; //right subtree
}
//Define Binary Search Tree
static class BinarySearchTree
{
public BinarySearchTree()
{
root = null;
}
public BSTNode getRoot()
{
return root;
}
//Finds the least common ancestor for the given two nodes
public BSTNode findLCA(int a, int b)
{
//Find the two nodes with given values
BSTNode nodeA = find(a);
BSTNode nodeB = find(b);

BSTNode r = root;

while ( r != null )
{
if( r.getData() < a && r.getData() < b )
r = r.getRight();
else if( r.getData() > a && r.getData() > b)
r = r.getLeft();
else
break;
}
return r;
}
//Finds the BST node with the given data
public BSTNode find(int d)
{
BSTNode r = root;

while ( r != null && r.getData() != d )
{
if( r.getData() == d )
break;
else if( r.getData() < d )
r = r.getRight();
else
r = r.getLeft();
}
return r;
}
//inserts the given data into a BST
public void insert(int d)
{
if( root == null )
{
root = new BSTNode(d);
}
else
{
BSTNode current = root;
BSTNode prev = null;
while( current != null )
{
prev = current;
if( current.getData() < d )
{
current = current.getRight();
}
else
{
current = current.getLeft();
}
}
if( d < prev.getData() )
{
prev.setLeft( new BSTNode(d));
}
else
{
prev.setRight(new BSTNode(d));
}
}
}
private BSTNode root;
}
}

 

   

Finding the first repeated element in the array

Given an array which contains some duplicate entries, How do we find the element which appears first and is also repeated.

For example let us consider the array [10, 78, 45, 39, 22, 45, 78, 61].
The element 78 is repeated and it appears before all such repetitive elements. Note that, though two entries of 45 is found before 78, 78 is the required answer because it appears before 45 in the array.

The brute force approach which runs in O(n2) always works for this problem. This method is explained in my previous post.

In this post we will see the Hash map based algorithm. We iterate through the array once to find the frequencies of all the unique elements in the array.

We iterate through the array again to see the first element which appears more than once.

Here is the Java implementation.


import java.util.HashMap;
/**
* Created with IntelliJ IDEA.
* User: Ravi
* Date: 9/20/13
* Time: 12:52 PM
* To change this template use File | Settings | File Templates.
*/
public class Main {
public static void main(String[] args)
{
int[] array = {1,2,3,3,2,4};
int firstRepInd = getFirstRepeatedIndex(array);
if( firstRepInd >= 0 )
{
System.out.println( array[firstRepInd] + " is the first repeated element");
}
else
{
System.out.println("No element is repeated!");
}
}
public static int getFirstRepeatedIndex(int[] array)
{
int resultInd = -1;
HashMap<Integer,Integer> map = new HashMap<Integer, Integer>();
int i;
for( i = 0; i < array.length; i++ )
{
//No entry in the map; add new with count 1
if( map.get(array[i]) == null )
{
map.put( array[i], 1 );
}
else //increment the count if already exists in the map
{
int valCount = map.get(array[i]);
map.put( array[i], ++valCount );
}
}
for( i = 0; i < array.length; ++i )
{
if( map.get(array[i]) > 1)
{
resultInd = i;
break;
}
}
return resultInd;
}
}

Similar Question: How do we find the first repeated character in the string?

Given an array, find two values in the array which sum upto a value K

Given an array, and a value K as input, we have to find a pair of elements such that their sum is K.

There are multiple ways  of solving this problem. We will discuss some of them in this post. Let us implement the function which prints the two values such that array[i] + array[j] = K if there is such a pair exists. If not print “No such pair”.

Method#1: Brute-Force approach

Check each possible pair of values from this array to see if their sum is K. The Java function for this algorithm is given below.

 public static void printTwoValues(int[] array, int sum)
{
int i,j;
for( i = 0; i < array.length - 1; i++ )
{
for( j = i+1; j < array.length; j++ )
{
if( array[i] + array[j] == sum )
{
System.out.println("The two values are (" + i +", " + j + ")");
return;
}
}
}
System.out.println("No such pair exists!");
}

This algorithm runs in O(n2) time complexity and O(1) space complexity.

Method#2: Sorting Approach
This method will modify the sequence of elements in the array as we sort the array first. 
Take two indices one pointing to the beginning and the other pointing to the end of the array. Calculate the sum of these two values and check if it is the given value K. If it is, print the two values and return.
Otherwise check if their sum is greater then given value K, then we need to decrement the second pointer as we are exceeding the target value.
Similarly if the sum is lesser than given value K, we need to increment the first pointer hoping that we move closer to the required sum.

public static void printTwoValues(int[] array, int sum)
{
Arrays.sort(array);
int i = 0;
int j = array.length-1;
while( i < j)
{
if( array[i] + array[j] == sum)
{
System.out.println("The two values are (" + array[i] +", " + array[j] + ")");
return;
}
else if( array[i] + array[j] < sum )
{
i++;
}
else
{
j--;
}
}
System.out.println("No such pair exists!");
}

This algorithm runs in O(n log n) time complexity as we are sorting the array first and O(1) space complexity.

Method#3: Hash set approach
This method uses an implementation of the set data structure to store the values. As we run through the array, we search the set for (sum-array[i]). If it is found, we are done. Otherwise we insert array[i] into the set.

public static void printTwoValues(int[] array, int sum)
{
HashSet<Integer> hSet = new HashSet<Integer>();
int i;
for( i = 0; i < array.length; i++ )
{
if( hSet.contains(sum-array[i]) )
{
System.out.println("The two values are (" + array[i] +", " + (sum-array[i]) + ")");
return;
}
else
hSet.add(array[i]);
}
System.out.println("No such pair exists!");
}

This algorithm takes O(n) time but uses O(n) extra space for the set data structure. 

Variations of the problem: 

  • However if the problem is to find the two indices i,j such that array[i] + array[j] = K, the first and third approaches can easily be modified to solve this problem. But the second approach which uses sorting can not be used because the order of the elements gets changed. 
  • If the problem is to find all pairs whose sum is K, and the array may contain duplicates, first approach does not change. But the sorting approach will not work.

We will discuss these problems in the upcoming posts.

Checking duplicates in an array in O(n) time and O(1) space

Let us consider an array of size N contains numbers in the range [1,N]. We have to find if it has duplicates in linear time – O(n) and by using only constant extra space – O(1). 

Let us also have a restriction that we should traverse the array only once. Take a look at the following algorithm.

When traversing through the elements of the array, we consider each element as an index into the same array and negate the value at that index. If we already find a negative value at this index (i.e we have already seen this element), we can say that the array has duplicates.

Let us see understand this algorithm with a small example. The following table gives a snapshot of the array in each iteration.



Iteration
Array
1
Index
0
1
2
3
4
5
6
7
8
9
Value
5
1
3
2
-1
4
9
6
8
7
2
Index
0
1
2
3
4
5
6
7
8
9
Value
-5
1
3
2
-1
4
9
6
8
7
3
Index
0
1
2
3
4
5
6
7
8
9
Value
-5
1
-3
2
-1
4
9
6
8
7
4
Index
0
1
2
3
4
5
6
7
8
9
Value
-5
-1
-3
2
-1
4
9
6
8
7
5
Index
0
1
2
3
4
5
6
7
8
9
Value
-5
-1
-3
2
-1
4
9
6
8
7

In this example the range is [1,10]
Since the array values are in the range [1,N] and the array stores the values from [0,N-1] we are using index-1 to locate the element to be negated

In the first iteration value is 5, array[4] is negated.
In the second iteration value is 1, array[0] is negated
In the third iteration value is 3, array[2] is negated
In the fourth iteration value is 2, array[1] is negated
In the fifth iteration value is 1 again, We are supposed to make array[0] negative, but we found that it is already negative. Hence we can conclude that it has duplicates.

Here is the Java Implementation.

/**
* Created with IntelliJ IDEA.
* User: Ravi
* Date: 9/13/13
* Time: 11:48 PM
* To change this template use File | Settings | File Templates.
*/
public class Main {
public static void main(String[] args)
{
int[] array1 = {5,1,3,2,10,4,6,8,9,7};
if( hasDuplicates(array1) )
System.out.println("Array has duplicates");
else
System.out.println("Array does not have duplicates");

int[] array2 = {5,1,3,2,1,4,6,8,4,7};
if( hasDuplicates(array1) )
System.out.println("Array has duplicates");
else
System.out.println("Array does not have duplicates");
}
public static boolean hasDuplicates(int[] array)
{
for(int i = 0; i < array.length; i++ )
{
//take the absolute value because it might have been negated by some previous
//operation
int index = Math.abs( array[i] ) - 1;
//If array[index] is negative; we have a duplicate, return true
if( array[index] < 0 )
return true;
else // array[i] is found first time, negate array[ array[i] ]
array[index] = -array[index];
}
return false;
}
}

How does binary search work

Given an array of numbers, how do you search for a number? 
The simplest method is to start at the first element and keep comparing until the end of the array. If it is found anywhere in the middle, return it. Otherwise we would reach the end of the array which indicates that the element in not found in the array. This is called linear search. Since this algorithm examines all the elements in the worst case, it’s time complexity is O(n).

Binary search on the other hand runs in O(log n) which is better compared to linear search. Binary search requires the input array to be sorted

Binary search starts by dividing the array in two halves. Then we compare the search element with the middle element, if it matches, we return that index. If the middle element is less than the search element, then we can ignore the left half of the array because all the elements which precede middle are far away from the search element. Hence we can confine our search to the right half only. Similarly if the middle element is greater than the search element, no need to search the right half. We can confine our search to the left half only. The same procedure is repeated until we find the element or, the array size becomes 1.

Let us look at an example to understand this algorithm
Let us consider the input array as [1,2,3,4,5,6,7,8,9] and we want to search for 6. Here is how the algorithm works

step 1: [1,2,3,4,(5),6,7,8,9] – middle element is 5 which is less than 6. So skipping the left half
step 2: [6,(7),8,9] – middle element is 7 which is greater than 6. So skipping right half
step 3: [6] – middle element is 6. match found, so return it.

#include <iostream>
using namespace std;
int array[100];
//takes the following parameters 
//s: search element, lower index, higher index
int bin_search(int s, int low, int high)
{
while( low <= high)
{
// the middle element
int middle = (low + high)/2;
//if match is found, return it
if( array[middle] == s)
return middle;

//if middle element is less
else if ( array[middle] < s)
{
//search the right half
low = middle+1;
}
else
{
//search the left half
high = middle-1;
}
}
return -1;
}
int main()
{
int n; // array size
cin>>n;
int i;
for( i = 0 ; i < n ; i++)
{
cin>>array[i];
}
//search element
int s;
cin>>s;

cout<<s<<" found at: "<<bin_search(s,0,n-1);
return 0;
}