Monthly Archives: June 2013

How to find the frequecy of a word in a given sentence

Given some text and a word. How do we find the frequency of the word?

In this post we will see a Java program which finds it with just a few lines of code using regular expression matching.

Java provides utilities for matching the regular expressions using java.util.regex package. In our program we utilize two of them namely Pattern, and Matcher. These are the fundamental classes in this package.

In the following code reader is an object of class Scanner which is used for reading the input from the keyboard. The Pattern class does not have a public constructor, we have to create an object of that class using the compile() method by passing the pattern. Similarly the Matcher class also does not have a public constructor, and it’s object should be returned by matcher() method of the Pattern object.

The matcher class provides a method called find() which tries to find the next instance of the pattern in the given text. If it finds the pattern it returns true otherwise it returns false.

Here is the Java code.

import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Main {
public static void main(String[] args)
{
Scanner reader = new Scanner(System.in);
String sentence = reader.nextLine();
String word = reader.nextLine();
Pattern p = Pattern.compile(word);
Matcher m = p.matcher(sentence);
int count = 0;
while( m.find())
count++;
System.out.println("The word '" + word + "' occurred " + count + " time(s)");
}
}

How does a heapsort work

In selection sort, a list is sorted in the following way. Initially there will be an empty sorted list and the un-sorted list. We select the first element(According to the sort order) from the unsorted list and add it to the sorted list. We select the second element and append it to the sorted list and so on until the entire list is sorted.

Heapsort also works in the similar fashion. In heapsort, ‘Heap’ datastructure is used to maintain the unsorted list and hence the name heapsort. A heap is an array-based data structure visualized as a complete binary tree.
For example let us consider the following array and its corresponding binary tree form

Index  0  1   2  3  4  5  6
Array: 7  6   5  1  3  2  4  

                                                 7
                    / 
                  6      5
                 /     /
                1   3  2   4

array[0] contains the root element, and array[2*0+1] contains the left child and array[2*0+2] contains the right child. In general ith node has its left and right children at 2*i+1 and 2*i+2 indices respectively and the parent of ith node will be at array[i/2].

The above heap is also a max-heap in which data at every node is greater than those of its children.

In this algorithm, all the elements are formed into a max-heap by repeatedly using heapify procedure. heapify procedure works on any node which violates the max-heap property. for example let us consider the following heap is not a max-heap because 2 is less than it’s children.
                       2—–> violation
               /
              6   5
             /
            1  4

to correct this we will swap 2 and 6 ( bigger child) and recursively apply the same procedure for the node 2.
                               6
                    /
  violation  <—- 2   5   
                  /
                 1   4

So the following heap is a max-heap now.
         6
     /
    4   5
   /
  1   2

The build heap procedure scans through all the non-leaf nodes and calls the heapify procedure.

Finally the heap contains the maximum element at the root of the heap ( array[0] ). To sort the array. We swap the root element with the end element (array[length-1]). Now the heap may violate the max heap property at the root node. so call the heapify procedure. Next time we can exclude the last element because it is it’s correct place. The heap size is reduced by one.

The following Java Program implements the above algorithm.

import java.util.Scanner;
public class Main {
public static void main(String[] args)
{
Scanner reader = new Scanner(System.in);
int N = Integer.parseInt(reader.nextLine());
int []array = new int[N];
int i;
for( i = 0 ; i < N ; i++)
{
array[i] = reader.nextInt();
}
heapSort(array);
//display the sorted array
for( i = 0 ; i < N ; i++)
{
System.out.print(array[i] + " ");
}
}
public static void heapSort(int[] array)
{
//build the max heap
buildHeap( array );

int i;
for ( i = array.length-1 ; i > 0 ; i--)
{
//swap the max element to the end of the array
int t = array[0];
array[0] = array[i];
array[i] = t;
//heapify since max is removed
heapify( array, 0, i);
}

}
public static void buildHeap(int[] array)
{
if( array.length > 1)
{
int i;
//heapify procedure is called only for half of the elems
//because remaining are leaves and don't need to be adjusted
for( i = array.length / 2 ; i >= 0 ; i--)
{
heapify(array, i, array.length);
}
}
}
public static void heapify(int[] array, int ind, int size)
{
if( ind >= size/2)
return;

int leftInd = 2*ind+1;
int rightInd = 2*ind+2;
int swapInd = leftInd; //initialize swapInd to the left child
//If right child is there, check if it needs to be swapped
if( rightInd < size && array[leftInd] < array[rightInd])
swapInd = rightInd;

//check if swapping is required
if( array[ind] < array[swapInd])
{
//swap parent and child
int temp = array[ind];
array[ind] = array[swapInd];
array[swapInd] = temp;
//recursive call on swapped index
heapify(array,swapInd,size);
}
}
}

simple pattern matching

Pattern matching algorithms are very important in computer science. In this post we will discuss the simplest algorithm for pattern matching. Given some text and a pattern, we have to find where in the text, the pattern is found.

In this brute-force algorithm, we start matching the pattern from the beginning of the text. If a match is found we return the index. Otherwise we slide the matching window by one character and repeat the same process. For example if we are trying to find the pattern “ababb” in the text “abbabaababb

The following depiction gives the steps involved in this algorithm
   a b b a b a a b a b b
   a b a
     a
       a
         a b a b
           a
             a b
               a b a b b — pattern found

The following is the C++ implementation of the above algorithm. Note that this algorithm finds only the first occurrence of the pattern. This can be modified to find all the occurrences also.


#include <iostream>
#include <cstring>
#define MAXLEN 100
using namespace std;

int findPattern(char *t, char *p)
{
int n = strlen(t); //length of the text
int m = strlen(p); //length of the pattern
int i,j; //loop counters

//outer loop tries to match the pattern from index 0 to (n-m)
for( i = 0 ; i <= n-m ; i++ )
{
//inner loop tries to match jth char in pattern
//with (i+j)th char in text
for( j = 0 ; j < m ; j++)
{
if( t[i+j] != p[j])
break;
}
//pattern found; return the index
if( j == m)
return i;
}
//pattern not found
return -1;
}

int main()
{
char text[MAXLEN];
char pattern[MAXLEN];

cin>>text;
cin>>pattern;
cout<<"pattern found at: "<<findPattern(text,pattern);
return 0;
}

Finding the middle of the linked list

Given a single linked list, how do you find the middle of the list?

One simple method is to traverse through the list twice once to find it’s length, and second to navigate to the middle of the list by traversing length/2 nodes. Even though this method runs in O(n) time, there is still a chance of improving this method. 

In this method we traverse the list only once. We take two pointers, one, a slow pointer which will advance one node at a time; and a fast pointer which will advance two nodes at a time. we iterate through the list until the fast pointer reaches the end of the list. At this stage, the slow pointer would be pointing to the middle of the linked list.

Following is the C++ function which implements this method. For working program please visit this post.


 SLLNode* getMiddle()
{
//if the list is empty or has a single node; return head
if( head == NULL || head->getNext() == NULL )
return head;

SLLNode *slow = head;
SLLNode *fast = head->getNext();

while( fast != NULL )
{
//move fast pointer two steps ahead
if( fast->getNext() != NULL )
{
fast = fast->getNext()->getNext();
}
else
break;
//move slow pointer one step
slow = slow->getNext();
}
return slow;
}

Print a string in reverse using recursion

In simple words, Recursion is nothing but a function calling itself. How can we solve this problem using recursion? 
Initially pass the entire string as input to the function. Basically a pointer to the first character is passed. This function examines the next character; if it is not null, it calls itself by advancing the pointer to the next character. This call sequence goes on until the next character is a null.
In the last call, it prints the last character and function call returns. When this call returns, it goes to the previous state. that means it prints the last but one character and so on…

Here the simple C++ code to do this.

#include <iostream>
#define MAXLEN 100
using namespace std;

void printRev(char *str)
{
//return if empty string
if( *str == NULL )
return;

//if the next char is not null, recurse
if( *(str+1) != NULL )
printRev(str+1);

//print char while unwinding the stack
cout<<*str;
}
int main()
{
char str[MAXLEN];
cin>>str;
printRev(str);
return 0;
}

Finding the element occurring odd number of times

Given an array of positive integers, all numbers occur even number of times, except one which occurs odd number of times. How do you find the number in O(n) time and O(1) space?

You can find a similar question in interviews which is infact a specific problem of the above category. An array contains 2n+1 positive integers where n integers appear twice, and one integer appear only once. Find that integer.


For example if the input array is [1,1,2,2,3] the output should be 3.

Whenever you encounter these kind of problems, think of XOR based solution. The following table shows the XOR logic of two bits

A  B   Result
————–
0  0     0
0  1     1
1  0     1
1  1     0

This problem can easily be solved using the same logic. We all know that, performing XOR between the same bit pattern results in all Zeroes.
For example let us consider the bit pattern for 133

10000101
10000101
———
00000000

Performing XOR between a bit pattern and all zeros gives the same bit pattern.
For example let us XOR 133 and 0

10000101
00000000
——–
10000101

And we also know that XOR is an associative and Commutative operation.
i.e A XOR (B XOR C) = (A XOR B) XOR C
    A XOR B = B XOR A

This means that if we XOR all the elements in the array, all the elements occurring even number of times nullify each other and only the number appearing odd number of times remains.

So the following simple program solves this problem.

#include <iostream>
#define MAXSIZE 100
int array[MAXSIZE];
using namespace std;
int findOddOccur(int n)
{
int res = 0;
for( int i = 0 ; i < n ;i++)
{
res = res ^ array[i];
}
return res;
}
int main()
{
int n;
cin>>n;
for( int i = 0 ; i < n ; i++)
{
cin>>array[i];
}
cout<<"Odd occuring element is: "<<findOddOccur(n);
return 0;
}

Finding the median of an array

Median of an array is defined as the middle element when it is sorted. The middle element is the one which divides the array into two equal halves if the array is of odd length. If the length is even, the average of two middle elements is median.

For example the median of
[4,3,2,5,1] is 3
Median of [6,1,3,5,4,2] is 3.5
The following is the Java code to calculate the median of an array.

import java.util.Arrays;
import java.util.Scanner;

/**
*
* @author Ravi
*/
public class Median {

/**
* @param args the command line arguments
*/
public static void main(String[] args) {
int nSize;
int [] array;
Scanner reader = new Scanner(System.in);
nSize = reader.nextInt();
array = new int[nSize];
for( int i = 0 ; i < nSize; i++)
{
array[i] = reader.nextInt();
}

System.out.println("Median: "+getMedian(array));
}
public static double getMedian(int [] array)
{
Arrays.sort(array);
if( array.length % 2 == 0)
{
int m1 = array[ (array.length/2)-1];
int m2 = array[ array.length/2];
return (m1+m2)/2.0;
}
else
{
return array[ array.length /2 ];
}
}
}

How does insertion sort work

Insertion sort is also a comparison based sorting. It is also an in-place sorting algorithm i.e it does not require extra space. We can re-arrange the elements within the array itself.The algorithm works as follows.

The array is divided into two parts one sorted and one un-sorted. Initially the sorted part will be empty and the unsorted part will be the entire array. We assume that the first element is sorted and start with the second element. we will insert that into the correct place in the sorted part. The same procedure is repeated for the remaining elements also until the entire array is sorted. In each iteration, when a new element is to be inserted into the sorted array, we need to shift the greater elements one step right to make space.

For example let us consider a simple array [3,1,5,2,4]. Following is the state of the array in each iteration. I have divided the array in two parts for convenience.

Step   |Sorted    | Unsorted
—————————–
1:     [3]        | [1,5,2,4]
2:     [1,3]      | [5,2,4]
3:     [1,3,5]    | [2,4]
4:     [1,2,3,5]  | [4]
5:     [1,2,3,4,5]| []

Following is the C++ code which implements insertion sort.

#include <iostream>
using namespace std;

void insertion_sort(int * array, int n)
{
int i,j;
//assume first element (index:0) is in correct position
//start from the 2nd element (index:1)
for( i = 1 ; i < n ; i++)
{
int toInsert = array[i];
//find correct place for toInsert in arra[0:i-1] sub-array
//j >= 0 ; make sure you have not gone beyond the beginning
//array[j] > toInsert ; array elements greater than toInsert
for( j = i-1 ; j >= 0 && array[j] > toInsert ; j--)
{
//shift array element one step right to make room for toInsert
array[j + 1] = array[j];
}
//place toInsert in it's correct position
array[j+1] = toInsert;
}
}
int main()
{
int n;
cin>>n;
int *array = new int[n];
int i;
for( i = 0 ; i < n ; i++)
{
cin>>array[i];
}
insertion_sort(array,n);
//print the sorted array
for( i = 0 ; i < n ; i++)
{
cout<<array[i]<<" ";
}
delete[] array;
return 0;
}

How does binary search work

Given an array of numbers, how do you search for a number? 
The simplest method is to start at the first element and keep comparing until the end of the array. If it is found anywhere in the middle, return it. Otherwise we would reach the end of the array which indicates that the element in not found in the array. This is called linear search. Since this algorithm examines all the elements in the worst case, it’s time complexity is O(n).

Binary search on the other hand runs in O(log n) which is better compared to linear search. Binary search requires the input array to be sorted

Binary search starts by dividing the array in two halves. Then we compare the search element with the middle element, if it matches, we return that index. If the middle element is less than the search element, then we can ignore the left half of the array because all the elements which precede middle are far away from the search element. Hence we can confine our search to the right half only. Similarly if the middle element is greater than the search element, no need to search the right half. We can confine our search to the left half only. The same procedure is repeated until we find the element or, the array size becomes 1.

Let us look at an example to understand this algorithm
Let us consider the input array as [1,2,3,4,5,6,7,8,9] and we want to search for 6. Here is how the algorithm works

step 1: [1,2,3,4,(5),6,7,8,9] – middle element is 5 which is less than 6. So skipping the left half
step 2: [6,(7),8,9] – middle element is 7 which is greater than 6. So skipping right half
step 3: [6] – middle element is 6. match found, so return it.

#include <iostream>
using namespace std;
int array[100];
//takes the following parameters 
//s: search element, lower index, higher index
int bin_search(int s, int low, int high)
{
while( low <= high)
{
// the middle element
int middle = (low + high)/2;
//if match is found, return it
if( array[middle] == s)
return middle;

//if middle element is less
else if ( array[middle] < s)
{
//search the right half
low = middle+1;
}
else
{
//search the left half
high = middle-1;
}
}
return -1;
}
int main()
{
int n; // array size
cin>>n;
int i;
for( i = 0 ; i < n ; i++)
{
cin>>array[i];
}
//search element
int s;
cin>>s;

cout<<s<<" found at: "<<bin_search(s,0,n-1);
return 0;
}

Program to find nth Fibonacci number

Fibonacci number in mathematics is defined as the sum of the two previous elements in the series. Formally this is represented as 
f(n) = f(n-1) + f(n-2) where f(1) = 1 and f(2) = 1.

In this post we will see how to generate nth fibonacci number. The algorithm is self-explanatory from the program itself. So jumping in to the code directly..

#include <iostream>
using namespace std;
int fib(int n)
{
if( n < 3)
return 1;

int a = 1;
int b = 1;
int c = a+b;

int i = 3;
while ( i < n)
{
//store sum of two previous values in c
c = a + b;
a = b; //b is first and
b = c; //c is second for next iteration
i++;
}

return c;
}
int main()
{
int n;
cin>>n;
cout<<fib(n)<<" ";
return 0;
}