LeetCode 28. Implement strStr()
题目描述
Implement strStr().
Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
Clarification:
What should we return when needle is an empty string? This is a great question to ask during an interview.
For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent to C's strstr() and Java's indexOf().
Example 1:
Input: haystack = "hello", needle = "ll"
Output: 2
Example 2:
Input: haystack = "aaaaa", needle = "bba"
Output: -1
Example 3:
Input: haystack = "", needle = ""
Output: 0
Constraints:
- 0 <= haystack.length, needle.length <= 5 * 104
- haystack and needle consist of only lower-case English characters.
解题思路
一道基础的字符串匹配的题目。
这道题设定的难度是 Easy,所以简单的两层循环暴力算法也能通过,事件复杂度 O(M*N)。
但是这种算法显然时间复杂度比较高,如果设定为 Hard 的话就无法通过了。
我们希望能用的是线性时间的字符串匹配算法,常见的有 KMP、BM(Boyer Moore)、Sunday 算法等。KMP 算法是教科书上的经典算法,但是比较晦涩,手写记忆都比较麻烦,面试中几乎不会用到这一算法;BM 算法对 KMP 进行了改进,性能有数倍提升,文本编辑器和IDE中常用的查找功能就是基于BM算法。
参考代码
- 乘法的溢出问题,步步取模
- 哈希值用有符号整数,因为减法会导致出现负数
/*
* @lc app=leetcode id=28 lang=cpp
*
* [28] Implement strStr()
*/
// @lc code=start
class Solution {
public:
int64_t R_n(int R, int nl, int MOD) {
int64_t Rn = 1;
int64_t base = R;
while(nl) {
if (nl & 1) {
Rn = (Rn * base) % MOD;
}
nl >>= 1;
base = (base * base) % MOD;
}
return Rn;
}
int strStr(string haystack, string needle) {
if (needle.size() == 0) return 0; // !!
int hl = haystack.size();
int nl = needle.size();
constexpr int MOD = 1e9+7;
constexpr int R = 26;
// const int64_t Rn = (int64_t)pow(R, nl) % MOD;
const int64_t Rn = R_n(R, nl, MOD);
int64_t ns = 0;
int64_t hs = 0; // must be signed, for hs may < 0
for (int i=0; i<nl; i++) {
ns = (ns * R + (needle[i] - 'a')) % MOD;
hs = (hs * R + (haystack[i] - 'a')) % MOD;
}
if (hs == ns) return 0;
for (int i=nl; i<hl; i++) {
hs = (hs * R + (haystack[i] - 'a') - (haystack[i-nl] - 'a') * Rn) % MOD;
hs = (hs + MOD) % MOD;
if (hs == ns) return i-nl+1;
}
return -1;
}
};
// @lc code=end