Leetcode 10. Regular Expression Matching

题目描述

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the precedeng element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

解题思路

正则表达式中最复杂的操作应当是 "",因为很难判断其匹配字串的长度。但是对于该字符有一个关键信息。"" 不会单独出现,其之前必有一个字符与其配对。所以题目的关键就是如何把这一对放到合适的位置。

考虑特殊情况

情况1:

"aaaaaaaaa"
"a*aaa*"

情况2:

"aaaaaaaaaaaaaa"
"a*ab*"

在不知道后面情况的时候,如何匹配 "a"?因此考虑用回溯的方法,考虑每种 "" 的情况,看哪种能成功,如果其中出了问题,马上回溯,换下一种情况。

回溯

我们可以暴力的测试所有的 "*" 的可能性,用两个指针 i, j 来表示当前 s 和 p 的字符。考虑到从前往后匹配时,每一次都有判断其后是否有 "*",因此我们用从后往前匹配的方式,如果遇到 "*",其之前必有一个字符。

如果 j 遇到 "*",判断 s[i] p[j-1] 是否相符。相同,可以先尝试匹配掉,i--,然后看之后能不能满足条件。若不满足,回溯。不管相不相同,都不匹配这个字符直接 j -= 2。

如果 j 遇到的不是,直接比较 i j 指向的字符。

再考虑退出。若j < 0,说明 p 已经匹配结束,若 s 还有剩余则错误,若无则正确。若 i < 0,说明 s 已匹配完,若 j 大于 0 且当前 j 指向元素不是 "*",则错误。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Solution:
def isMatch(self, s: str, p: str) -> bool:
return self.match(s, len(s)-1, p, len(p)-1)

def match(self, s, i, p, j):
if j < 0:
if i < 0:
return True
else:
return False
if i < 0:
if p[j] != "*":
return False

if p[j] == "*":
if i >= 0 and (p[j-1] == "." or s[i] == p[j-1]):
if self.match(s, i-1, p, j):
return True
return self.match(s, i, p, j-2)

if p[j] == "." or p[j] == s[i]:
return self.match(s, i-1, p, j-1)

return False

DP

空间换时间,使用2D布尔数组,dp[i][j]表示s[0-i]与p[0-j]是否匹配。有三种情况

  1. p[j] == s[i]: dp[i][j] = dp[i-1][j-1]
  2. p[j] == '.': dp[i][j] = dp[i-1][j-1]
  3. p[j] == '*':
  4. p[j-1] != s[i]: dp[i][j] = dp[i][j-2] // a* 匹配空白
  5. p[j-1] == s[i] or p[j] == '.':
    dp[i][j] = dp[i-1][j] // a* 匹配多个 a
    or dp[i][j] = dp[i][j-1] // a* 匹配单个 a
    or dp[i][j] = dp[i][j-2] // a* 匹配空白
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
public boolean isMatch(String s, String p) {

if (s == null || p == null) {
return false;
}
boolean[][] dp = new boolean[s.length()+1][p.length()+1];
dp[0][0] = true;
for (int i = 0; i < p.length(); i++) {
if (p.charAt(i) == '*' && dp[0][i-1]) {
dp[0][i+1] = true;
}
}
for (int i = 0 ; i < s.length(); i++) {
for (int j = 0; j < p.length(); j++) {
if (p.charAt(j) == '.') {
dp[i+1][j+1] = dp[i][j];
}
if (p.charAt(j) == s.charAt(i)) {
dp[i+1][j+1] = dp[i][j];
}
if (p.charAt(j) == '*') {
if (p.charAt(j-1) != s.charAt(i) && p.charAt(j-1) != '.') {
dp[i+1][j+1] = dp[i+1][j-1];
} else {
dp[i+1][j+1] = (dp[i+1][j] || dp[i][j+1] || dp[i+1][j-1]);
}
}
}
}
return dp[s.length()][p.length()];