Remove HTML Tags from a String

Sometimes we need to remove a particular tag from a string. Suppose you need only the content and do not want the HTML tag to remain in the string. In that case, call the following method and pass the name of the tag to be removed from the HTML string.

' Requires: Imports System.Text.RegularExpressions
Private Shared Function StripHtmlTag(ByVal tagName As String, ByVal html As String) As String
    ' Match the opening tag (with any attributes) and the closing tag, e.g. <div class="x"> and </div>.
    ' \b stops "div" from also matching tags such as <divider>.
    Dim reg As New Regex(String.Format("</?{0}\b[^>]*>", Regex.Escape(tagName)), RegexOptions.IgnoreCase)
    ' Remove only the tags; the content between them stays in the string.
    Return reg.Replace(html, String.Empty)
End Function

For example, suppose you have the string "<div> This is test</div>" and need to remove the div tag from it. Call the method above, passing "div" as tagName and the string as html. It yields " This is test".
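To try tag-only removal on a few more inputs, here is a minimal, self-contained console sketch. It does not depend on the method above: the module name StripTagDemo and the helper StripTagsOnly are hypothetical names chosen for this illustration, and the pattern is one possible way to match a tag, not the only one. Keep in mind that a regular expression is not a full HTML parser, so unusual markup (for example a ">" inside an attribute value) may not be handled.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripTagDemo
    ' Hypothetical helper for this sketch: removes the opening and closing
    ' tags named tagName and keeps the text between them.
    Function StripTagsOnly(ByVal tagName As String, ByVal html As String) As String
        ' </?    optional slash, so both <div ...> and </div> match
        ' \b     word boundary, so "div" does not match <divider>
        ' [^>]*  any attributes up to the closing bracket
        Dim pattern As String = String.Format("</?{0}\b[^>]*>", Regex.Escape(tagName))
        Return Regex.Replace(html, pattern, String.Empty, RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        ' Square brackets make leading and trailing spaces visible.
        ' Prints: [ This is test]
        Console.WriteLine("[" & StripTagsOnly("div", "<div> This is test</div>") & "]")

        ' Attributes and upper-case tags are handled; <divider> is left alone.
        ' Prints: [Hello <divider>]
        Console.WriteLine("[" & StripTagsOnly("div", "<DIV class=""box"">Hello</DIV> <divider>") & "]")

        ' Only the named tag is removed; other tags stay in place.
        ' Prints: [<p>a</p>b]
        Console.WriteLine("[" & StripTagsOnly("span", "<p>a</p><span>b</span>") & "]")
    End Sub
End Module
```

Regex.Escape is applied to the tag name so that a name containing regex metacharacters is matched literally instead of changing the pattern.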
