CodexBloom - Programming Q&A Platform

implementing Implementing the Knuth-Morris-Pratt Algorithm in Python - Incorrect Index Returned

👀 Views: 89 đŸ’Ŧ Answers: 1 📅 Created: 2025-06-08
algorithm string-matching kmp Python

I've searched everywhere and can't find a clear answer. I'm currently trying to implement the Knuth-Morris-Pratt (KMP) string matching algorithm in Python, but I'm running into issues with the index returned when the pattern is found in the text. I have followed standard practices and created the LPS (Longest Prefix Suffix) array, but it seems to be off by one in certain cases. For example, when searching for the pattern 'ABABC' in the text 'ABABABABABC', the algorithm should return index 8, but it returns 7 instead. Here's the code snippet I've written: ```python def KMPSearch(pattern, text): M = len(pattern) N = len(text) lps = [0] * M j = 0 # Index for pattern computeLPSArray(pattern, M, lps) i = 0 # Index for text while N - i >= M: if pattern[j] == text[i]: i += 1 j += 1 if j == M: print(f'Pattern found at index {i - j}') # <-- This is where I get index 7 instead of 8 j = lps[j - 1] elif i < N and pattern[j] != text[i]: if j != 0: j = lps[j - 1] else: i += 1 return -1 def computeLPSArray(pattern, M, lps): length = 0 # Length of the previous longest prefix suffix i = 1 lps[0] = 0 # LPS[0] is always 0 while i < M: if pattern[i] == pattern[length]: length += 1 lps[i] = length i += 1 else: if length != 0: length = lps[length - 1] else: lps[i] = 0 i += 1 ``` I've double-checked the computation of the LPS array, and it seems correct. I'm using Python 3.9.6, and I've also tried debugging by printing the values of `i`, `j`, and the LPS array at each step, but I need to seem to pinpoint the scenario. Any insights on what might be going wrong or if I need to modify this part of the code? I'd appreciate any suggestions or corrections! Any ideas what could be causing this?