BRUTE-FORCE SUBSTRING SEARCH
In this article we are going to consider the naive substring search algorithm: the so-called brute-force substring search. What is the aim of substring search? As the name suggests, we want to find a given pattern in a given text. String matching is important for text processing. By the way, every modern progpramming language has several functions for text processing that can be helpful. But it may be useful to understand the principles.
We just have to check every single character from the text to match against the pattern. The algorithm keeps iterating through the text and if there is a mismatch, we shift the pattern one step to the right. It is very intuitive, on the other hand it is extremely inefficient. Especially when there are a lot of matching prefixes.
The principle of brute-force search is quite simple. We check whether the actual character in the text matches the give character in the pattern. So, let’s see the concrete illustration how brute-force substring search works.
So here is the implementation for brute-force substring search. We have the ‘text’ and the ‘pattern’ we are looking for. We have a nested for loop: first we iterate through the characters of the text and on every iteration we have to check the characters in the pattern accordingly. If j == lengthOfPattern it means that we found the pattern.
What is the main problem concerning the running time of brute-force substring search algorithm? The algorithm needs backup for every mismatch. If there is a mismatch, we jump back to the next character in the text. So there are lots of compares of characters. If we have a text with length N and a pattern with length M, then the running time is O(N*M). Despite of this, there are some advantages. It does not need any extra memory. More important, it does not require pre-processing the text. The ultimate question: can we do better? Yes, we can with Boyer-Moore algorithm.