
[LeetCode] Tag Validator

Given a string representing a code snippet, you need to implement a tag validator to parse the code and return whether it is valid. A code snippet is valid if all the following rules hold:

The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.

A closed tag (not necessarily valid) has exactly the following format : <TAG_NAME>TAG_CONTENT</TAG_NAME>. Among them, <TAG_NAME> is the start tag, and </TAG_NAME> is the end tag. The TAG_NAME in start and end tags should be the same. A closed tag is valid if and only if the TAG_NAME and TAG_CONTENT are valid.

A valid TAG_NAME contains only upper-case letters, and has length in range [1,9]. Otherwise, the TAG_NAME is invalid.

A valid TAG_CONTENT may contain other valid closed tags, cdata and any characters (see note1) EXCEPT unmatched <, unmatched start and end tag, and unmatched or closed tags with invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.

A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to consider the issue of imbalance when tags are nested.

A < is unmatched if you cannot find a subsequent >. And when you find a < or </, all the subsequent characters until the next > should be parsed as TAG_NAME (not necessarily valid).

The cdata has the following format : <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.

CDATA_CONTENT may contain any characters. The function of cdata is to prevent the validator from parsing CDATA_CONTENT, so even if it has some characters that can be parsed as a tag (no matter valid or invalid), you should treat them as regular characters.
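The TAG_NAME rule and the cdata rule above map onto two small helpers. The following C++ is only a minimal sketch: the names `isValidTagName` and `skipCdata` are hypothetical and are not part of the problem statement or of any library.

```cpp
#include <string>

// Hypothetical helper: true when `name` has 1 to 9 characters,
// all of them upper-case letters 'A'..'Z'.
bool isValidTagName(const std::string& name) {
    if (name.empty() || name.size() > 9) return false;
    for (char c : name) {
        if (c < 'A' || c > 'Z') return false;
    }
    return true;
}

// Hypothetical helper: if a cdata section starts at index `i` of `code`,
// returns the index just past the first subsequent "]]>".
// Returns std::string::npos when no cdata section starts at `i`
// or when the section is never closed.
std::size_t skipCdata(const std::string& code, std::size_t i) {
    static const std::string open = "<![CDATA[";
    if (i > code.size()) return std::string::npos;
    if (code.compare(i, open.size(), open) != 0) return std::string::npos;
    std::size_t close = code.find("]]>", i + open.size());
    if (close == std::string::npos) return std::string::npos;
    return close + 3;  // skip over "]]>" itself
}
```

Because `skipCdata` searches for the first "]]>" after the opening marker, anything that looks like a tag inside the section is passed over without being parsed.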

Valid Code Examples:

Invalid Code Examples:

Note:

For simplicity, you could assume the input code (including the "any characters" mentioned above) contains only letters, digits, '<','>','/','!','[',']' and ' '.

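This problem gives us a string that is really a snippet of HTML-like code, and asks us to verify that it is written correctly. There are eight rules: for example, the code must be wrapped in a closed tag, tag names must be entirely upper-case and no longer than 9 characters, and CDATA has its own format requirements. The problem also gives some examples, but frankly those examples do not come close to covering all the cases the OJ tests. The OJ really schooled me this time: every submission was rejected, I analyzed the failing test case, fixed the code, resubmitted, and was rejected again. After a dozen or so rounds the solution finally passed, and the moment the green "Accepted" appeared was hugely satisfying. That feeling is one of the things that has kept me going this long, although the main motivation is still everyone's support and encouragement; I really enjoy chatting with you all in the comments. Below are the cases I failed, along with an analysis of each: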

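Reason for failure: the code does not start with a tag.

Extra '>' characters do not affect validity.

Reason for failure: the last end tag may only close the first start tag.

Having no content data is fine.

The content inside CDATA can be any characters.

Reason for failure: there is no tag at all.

Reason for failure: there is an extra '>' at the end.

Watch out for distracting characters in the content data.

Reason for failure: a tag name cannot be longer than 9 characters.

Reason for failure: tag names must be entirely upper-case.

Reason for failure: the text does not correctly match "<![CDATA[", and it cannot be treated as a tag either.

Reason for failure: the code cannot start with something other than a tag.

Reason for failure: the code cannot end with something other than a tag.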

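If we match only "</", this is an end tag. We use find to locate the closing angle bracket '>'; if it is not found, return false immediately. Otherwise we extract the tag name and check the stack: if the stack is empty, or the top element is not equal to the tag, return false; otherwise pop the top element.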

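If we match only "<", this is a start tag. Again we look for the closing angle bracket; if it cannot be found, or the tag name has length 0 or is longer than 9, return false immediately. Then we iterate over each character of the tag name; if they are not all upper-case letters, return false; otherwise push the tag onto the stack. You might wonder why end tags get none of these extra checks. The reason is that an end tag is compared against the top of the stack, and every tag on the stack is known to be valid, so an invalid end tag can never be equal to it and false is returned right away. Finally, we check whether the stack is empty; if it is not, some tag was left unclosed, so return false. The code is as follows: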

class Solution {
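Only the first line of the listing survives here. What follows is a minimal C++ sketch of the stack-based approach described above, not the author's original code. It assumes the usual LeetCode entry point `isValid` on a `Solution` class, and it additionally rejects empty input and input that begins with a cdata section, since the code must be wrapped in a closed tag.

```cpp
#include <stack>
#include <string>

class Solution {
public:
    bool isValid(const std::string& code) {
        std::stack<std::string> open;  // names of start tags not yet closed
        const std::size_t n = code.size();
        std::size_t i = 0;
        while (i < n) {
            // After the first character, everything must sit inside the outermost tag.
            if (i > 0 && open.empty()) return false;
            if (code.compare(i, 9, "<![CDATA[") == 0) {
                if (open.empty()) return false;  // cdata cannot be the outer wrapper
                std::size_t end = code.find("]]>", i + 9);
                if (end == std::string::npos) return false;
                i = end + 3;  // skip the whole section without parsing it
            } else if (code.compare(i, 2, "</") == 0) {
                std::size_t end = code.find('>', i + 2);
                if (end == std::string::npos) return false;
                std::string tag = code.substr(i + 2, end - i - 2);
                // An end tag must match the most recently opened start tag.
                if (open.empty() || open.top() != tag) return false;
                open.pop();
                i = end + 1;
            } else if (code[i] == '<') {
                std::size_t end = code.find('>', i + 1);
                if (end == std::string::npos) return false;
                std::size_t len = end - i - 1;
                if (len == 0 || len > 9) return false;
                for (std::size_t k = i + 1; k < end; ++k) {
                    if (code[k] < 'A' || code[k] > 'Z') return false;
                }
                open.push(code.substr(i + 1, len));
                i = end + 1;
            } else {
                if (open.empty()) return false;  // plain text outside any tag
                ++i;
            }
        }
        // Valid only if something was parsed and every start tag was closed.
        return !code.empty() && open.empty();
    }
};
```

The branch order matters: the "<![CDATA[" prefix is tested before "</", and "</" before a bare "<", so the longest marker always wins.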

References:

https://discuss.leetcode.com/topic/91473/clean-c-solution

https://discuss.leetcode.com/topic/91446/c-clean-code-recursive-parser

https://discuss.leetcode.com/topic/91505/6-lines-c-solution-using-regex

https://discuss.leetcode.com/topic/91300/java-solution-use-startswith-and-indexof

https://discuss.leetcode.com/topic/91406/java-solution-7-lines-regular-expression/2
