Extract Links from a String

I have seen many methods and regular expressions for extracting all the links from the source of a web page, but the problem is that they rely on the anchor tag's href attribute. What happens if the links are in JavaScript? So I changed the criterion and search for links by file extension instead, while still matching anchor tags. Here is the code:


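' Requires "Imports System.Text.RegularExpressions" at the top of the file; stringObject holds the page source.
Dim mc As MatchCollection = Regex.Matches(stringObject, "(?:\b[A-Z0-9_/.:]+\.(aspx|htm|html|php|asp|js|ascx)(\S+=\S+\"")?\b)|(?:<a[^>]*?>)", RegexOptions.IgnoreCase Or RegexOptions.ExplicitCapture)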
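
To see what the pattern picks up, here is a minimal console sketch built around the same expression. The module name, the LinkPattern constant, and the sample markup (including the example.com URL) are made up for illustration; only the pattern and the RegexOptions come from the line above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractLinksDemo
    ' Same pattern as above. With ExplicitCapture the unnamed groups do not capture,
    ' so each Match simply carries the whole matched text in its Value.
    Private Const LinkPattern As String = "(?:\b[A-Z0-9_/.:]+\.(aspx|htm|html|php|asp|js|ascx)(\S+=\S+\"")?\b)|(?:<a[^>]*?>)"

    Sub Main()
        ' Hypothetical page source: one anchor tag and one URL that appears only inside script.
        Dim stringObject As String =
            "<a href=""about.html"">About</a> " &
            "<script>window.location = 'http://example.com/news/list.php';</script>"

        Dim mc As MatchCollection = Regex.Matches(stringObject, LinkPattern,
            RegexOptions.IgnoreCase Or RegexOptions.ExplicitCapture)

        ' Print every match, one per line.
        For Each m As Match In mc
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

With this sample input the loop should print the whole opening tag, <a href="about.html">, followed by http://example.com/news/list.php. Note that the anchor alternative returns the entire tag rather than just the href value, and that <a[^>]*?> also matches other tags whose names start with "a", such as <abbr>, so the results may need a second filtering pass.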
