In this module I learned a lot about common string search and pathfinding algorithms. My assignment was to choose one of these and demonstrate I can implement reasonably complex algorithms by myself, showing my understanding and justifying data structures used. I chose to compare Boyer-Moore-Horspool and Rabin Karp algorithms. Although to note, I didn't implement the rolling hash version of Rabin Karp. Another requirement is to time how long it takes each program to search, this will make it easier to compare the efficiency of both. The program is also coded with C++.
Below is a video demonstration of the string search for both algorithms, with Boyer-Moore-Horspool searching first. Both algorithms search through the same text document called "jute-book" for the user entered string. The total matches are counted and displayed to the user.
In the video above I searched "Dundee". The longer the string and bigger the text to search is, the faster Boyer-Moore-Horspool searches and the slower Rabin-Karp searches. This is how each algorithm was described to me, in lectures and from my own research, so it was fun seeing it happen real-time in this program. When doing this project it became clear how the length of a string can affect the speed of each algorithm, in this video you can even notice how much longer it takes Rabin-Karp to find all 23 Dundee's.
The below image displays the text files the program creates. The time is printed out in milliseconds. "BM_Timings" is Boyer-Moore-Horspool's results, and "RK_Timings" is Rabin-Karp's results.
In this presentation I have more timing results with boxplots of each pattern and number of iterations.
Please DO NOT plagiarise my presentation. You will be caught, it's not worth it.
If you would like to check out the code, or download and try yourself, have a look at the repository below.
Please DO NOT plagiarise my code. You will be caught, it's not worth it.
Project's Github Repository
Comparing efficiency of Boyer-Moore-Horspool and Rabin-Karp algorithms.