I was stuck on this all yesterday afternoon and got nowhere; this morning I finally solved the multi-line matching problem.
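using System.Text.RegularExpressions;

protected static string parseHtml(string html)
{
    // \s* tolerates stray whitespace inside the tag; the middle (.|\n)*? is a
    // non-greedy match that also crosses line breaks
    string clearScriptPattern = @"<\s*script[^>]*>(.|\n)*?</\s*script\s*>";
    // [^>]* lets the opening tag carry attributes such as type="text/css"
    string clearStylePattern = @"<\s*style[^>]*>(.|\n)*?</\s*style\s*>";
    // any remaining tag
    string clearHtmlPattern = @"<[^>]*>";
    // &nbsp; entities and whitespace characters
    string clearSpacePattern = @"&nbsp;|\s";
    RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled;
    // strip script blocks, then style blocks, then leftover tags, then normalize spaces
    string parseResult = Regex.Replace(html, clearScriptPattern, "", options);
    parseResult = Regex.Replace(parseResult, clearStylePattern, "", options);
    parseResult = Regex.Replace(parseResult, clearHtmlPattern, "", options);
    parseResult = Regex.Replace(parseResult, clearSpacePattern, " ", options);
    return parseResult;
}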
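As a side note, RegexOptions.Multiline only changes how ^ and $ behave; it does not let "." match a newline, which is why the method above needs the (.|\n)*? trick. A possible alternative is RegexOptions.Singleline, which makes "." match newlines too. The following is only a rough sketch under that assumption, not the method from this post: the class name HtmlText and the method name Strip are made up for illustration, and it also collapses runs of whitespace and trims the result, which the original does not do.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper (not the post's parseHtml): the same four passes,
    // but "." is allowed to match newlines via RegexOptions.Singleline.
    public static string Strip(string html)
    {
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        // Remove <script>...</script> and <style>...</style> blocks, bodies included.
        string text = Regex.Replace(html, @"<\s*script[^>]*>.*?</\s*script\s*>", "", options);
        text = Regex.Replace(text, @"<\s*style[^>]*>.*?</\s*style\s*>", "", options);

        // Remove any remaining tags.
        text = Regex.Replace(text, @"<[^>]*>", "", options);

        // Collapse runs of &nbsp; entities and whitespace into a single space.
        text = Regex.Replace(text, @"(?:&nbsp;|\s)+", " ", options);
        return text.Trim();
    }
}
```

For example, HtmlText.Strip("<p>Hello</p>\n<script type=\"text/javascript\">\nvar a = 1;\n</script>\n<b>world</b>&nbsp;!") returns "Hello world !". Regular expressions are still a blunt tool for HTML, so for messy real-world pages a proper HTML parser is probably the safer choice.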
I also recommend a code refactoring tool, Refactor!™ Pro. If you want your code to look a bit cleaner, you can't go wrong with it.