力扣labuladong一刷day21天滑动哈希算法共2题
文章目录
- 力扣labuladong一刷day21天滑动哈希算法共2题
- 一、187. 重复的DNA序列
- 二、28. 找出字符串中第一个匹配项的下标
一、187. 重复的DNA序列
题目链接:https://leetcode.cn/problems/repeated-dna-sequences/description/
思路:字符串序列里找重复出现的子串就是子串匹配问题,正常的思路是截取所有长为L的子串,看set里是否重复,可是截取的过程是O(n)每个都截取就成了O(x*n)。采用另外一种思路,也是滑动窗口,只不过把滑动窗口中的字符串映射成数字,这种映射的方式就是right右移尾部加数,left右移头部减数,复杂度是O(1)。这样只需要比较set中的映射值即可。
class Solution {public List<String> findRepeatedDnaSequences(String s) {Set<String> list = new HashSet<>();Set<Integer> set = new HashSet<>();int[] nums = new int[s.length()];for (int i = 0; i < s.length(); i++) {char c = s.charAt(i);if (c == 'A') {nums[i] = 0;} else if (c == 'C') {nums[i] = 1;} else if (c == 'G') {nums[i] = 2;}else {nums[i] = 3;}}int hashcode = 0, left = 0, right = 0, r = 4;int rl = (int) Math.pow(r,9);while (right < nums.length) {hashcode = hashcode * r + nums[right];right++;if (right - left == 10) {if (set.contains(hashcode)) {list.add(s.substring(left, right));}else {set.add(hashcode);}hashcode = hashcode - nums[left] * rl;left++;}}return new ArrayList<>(list);}
}
二、28. 找出字符串中第一个匹配项的下标
题目链接:https://leetcode.cn/problems/find-the-index-of-the-first-occurrence-in-a-string/
思路:滑动窗口使用needmap和windowmap要求需要匹配的是子串的排列,如果不是排列是无法区分ab与ba的。本题是完全匹配就只能用滑动哈希算法了。
选择一个较大的素数Q,这样会减少哈希碰撞。
class Solution {public int strStr(String haystack, String needle) {long Q = 1658598167, L = needle.length(), R = 256;long hashCode = 0, patCode = 0;long RL = 1;for (int i = 0; i < L - 1; i++) {RL = (RL*R) % Q;}for (int i = 0; i < L; i++) {patCode = (patCode * R+ needle.charAt(i)) % Q;}int left = 0, right = 0;while (right < haystack.length()) {hashCode = ((hashCode * R) % Q + haystack.charAt(right)) % Q;right++;if (right - left == L) {if (hashCode == patCode) {if (needle.equals(haystack.substring(left, right))) return left;}hashCode = (hashCode - (haystack.charAt(left)*RL) % Q + Q) % Q;left++;}}return -1;}
}