CodexBloom - Programming Q&A Platform

implementing Implementing the Boyer-Moore String Matching Algorithm in C# - Unexpected Index Out of Range handling

👀 Views: 67 đŸ’Ŧ Answers: 1 📅 Created: 2025-06-05
algorithm string-matching exceptions C#

I'm updating my dependencies and I'm working on a personal project and I'm currently implementing the Boyer-Moore string matching algorithm in C# for a text processing application, and I keep running into an `IndexOutOfRangeException` when searching for a pattern in certain texts....... The algorithm works as expected for most cases, but it fails on texts where the pattern appears multiple times in overlapping positions. Here is the core of my implementation: ```csharp public List<int> BoyerMooreSearch(string text, string pattern) { List<int> occurrences = new List<int>(); int[] badChar = BuildBadCharTable(pattern); int patternLength = pattern.Length; int textLength = text.Length; int shift = 0; while (shift <= textLength - patternLength) { int j = patternLength - 1; while (j >= 0 && pattern[j] == text[shift + j]) j--; if (j < 0) { occurrences.Add(shift); shift += (shift + patternLength < textLength) ? patternLength - badChar[text[shift + patternLength]] : 1; } else shift += Math.Max(1, j - badChar[text[shift + j]]); } return occurrences; } private int[] BuildBadCharTable(string pattern) { int[] badChar = new int[256]; // Assuming ASCII for (int i = 0; i < badChar.Length; i++) badChar[i] = -1; for (int i = 0; i < pattern.Length; i++) badChar[pattern[i]] = i; return badChar; } ``` I'm perplexed because the `badChar` array should handle all character mismatches correctly. However, when the search reaches the end of the text, the exception arises on this line: `j - badChar[text[shift + j]]`. It seems like the index `shift + j` goes out of bounds, especially when the pattern is near the end of the text. I've tried adding bounds checking before accessing the array, but I still encounter the behavior. Additionally, I've tested with different patterns and texts, but the overlap scenario seems to be the consistent factor. Any insights on why this could be happening or how to properly handle the edge cases without causing an exception would be greatly appreciated! I'm using .NET 6.0 for this project. I'm working on a application that needs to handle this. Any help would be greatly appreciated! I'd really appreciate any guidance on this. Am I missing something obvious? Has anyone else encountered this?