String Search Algorithms

About

Module: CMP201
Title: Data Structures and Algorithms 1
Grade: A

In this module I learned a lot about common string search and pathfinding algorithms. My assignment was to choose one of these and demonstrate I can implement reasonably complex algorithms by myself, showing my understanding and justifying data structures used. I chose to compare Boyer-Moore-Horspool and Rabin Karp algorithms. Although to note, I didn't implement the rolling hash version of Rabin Karp. Another requirement is to time how long it takes each program to search, this will make it easier to compare the efficiency of both. The program is also coded with C++.

Below is a video demonstration of the string search for both algorithms, with Boyer-Moore-Horspool searching first. Both algorithms search through the same text document called "jute-book" for the user entered string. The total matches are counted and displayed to the user.

In the video above I searched "Dundee". The longer the string and bigger the text to search is, the faster Boyer-Moore-Horspool searches and the slower Rabin-Karp searches. This is how each algorithm was described to me, in lectures and from my own research, so it was fun seeing it happen real-time in this program. When doing this project it became clear how the length of a string can affect the speed of each algorithm, in this video you can even notice how much longer it takes Rabin-Karp to find all 23 Dundee's.

Timing

The below image displays the text files the program creates. The time is printed out in milliseconds. "BM_Timings" is Boyer-Moore-Horspool's results, and "RK_Timings" is Rabin-Karp's results.

Presentation

In this presentation I have more timing results with boxplots of each pattern and number of iterations.

Please DO NOT plagiarise my presentation. You will be caught, it's not worth it.

<> Code

If you would like to check out the code, or download and try yourself, have a look at the repository below.

Please DO NOT plagiarise my code. You will be caught, it's not worth it.

Project's Github Repository

CMP201 String Search Algorithms

Comparing efficiency of Boyer-Moore-Horspool and Rabin-Karp algorithms.