In programming, we often need to represent a list of integers in some range using a compact structure to reduce memory consumption.
For example, if your program needs to find missing numbers from a huge file which can’t fit in memory, and all the numbers are in range of say [1, (2^64)-1]
We need to find an efficient way of processing the huge file because we can’t load all of it’s contents in memory at once. One way is to load the file into memory in smaller chunks which can fit in memory and see if there is any missing number. However, in this approach, we still need to store all the number in the chunk into memory. Suppose if we are loading 100,000 numbers in one go, assuming 8 bytes for each number (Typical size for a long datatype in many languages). we need 800,000 bytes (~0.8MB) memory.
Can we reduce this memory? Here Bitsets
come in handy. With Bitsets 100,000 numbers can be represented using 12.5 KB or 0.0125 MB of memory.
How do Bitsets reduce the memory? It stores the presence or absence of an number by setting corresponding bit to 1 or 0 in a binary sequence. Let us take simple example. Suppose we need to find missing numbers in the range [0, 7]
. We can represent all these numbers in one Byte. Assume that right most bit represents the number 0, and left most bit represents 7.
1 1 0 1 1 1 1 - Indicates that 4 is missing
1 1 1 1 1 0 1 - Indicates that 1 is missing
C++, and Java offers standard library classes/methods to support Bitsets. Following is a sample C++ program with some examples to use bitset. There are many more methods supported by bitset, please check this link for reference. Java Bitset documentation can be found here.
#include <iostream>
#include <bitset>
using namespace std;
int main() {
bitset<8> bits; //Create a bitset of length 8
cout << bits << endl; //Print 00000000
bits.set(); //Sets all the bits to 1
cout << bits << endl; //Print 11111111
bits.reset(); //resets all the bits to 0
cout << bits << endl; //Print 00000000
bits.set(2); //Sets the second bit (from right) to 1
cout << bits << endl; //Print 00000100
bits.reset(2); //Clears the second bit
cout << bits << endl; //Print 00000000
bits.set(1);
bits.set(4);
cout << "Number of set bits " << bits.count() << endl;
return 0;
}