It completely misses the point of the question though.

The question is not asking about parsing in the sense of matching start tags with end tags, which is indeed not possible with a regex.

The question is about lexing, for which regex is the ideal tool. The solution is somewhat more complex than the question suggests, since you have to exclude tags embedded in comments or CDATA sections, but it is definitely doable with a regex.
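To make that concrete, here is a rough sketch in Python of what such a lexing regex could look like. The names `TOKEN_RE` and `find_tags` are made up for illustration, and the pattern is deliberately simplified: it assumes XML-style input and ignores doctypes, processing instructions, and HTML raw-text elements such as `<script>`. The idea is that comments and CDATA sections are matched as whole tokens first, so anything tag-like inside them is never seen by the tag patterns.

```python
import re

# Alternatives are tried left to right, so a comment or CDATA section is
# consumed as one token before the tag patterns can match inside it.
TOKEN_RE = re.compile(
    r"""
    (?P<comment><!--.*?-->)
  | (?P<cdata><!\[CDATA\[.*?\]\]>)
  | (?P<close_tag></[A-Za-z][^\s>/]*\s*>)
  | (?P<open_tag><[A-Za-z][^\s>/]*(?:"[^"]*"|'[^']*'|[^'">])*>)
    """,
    re.VERBOSE | re.DOTALL,
)


def find_tags(text):
    """Return opening and closing tags, skipping comments and CDATA."""
    return [
        m.group(0)
        for m in TOKEN_RE.finditer(text)
        if m.lastgroup in ("open_tag", "close_tag")
    ]


sample = '<a href="x>y"><!-- <b> --><![CDATA[<c>]]></a><br/>'
print(find_tags(sample))
# ['<a href="x>y">', '</a>', '<br/>']
```

Note that the quoted-attribute alternatives let a `>` appear inside an attribute value without ending the tag. This only splits the input into tokens; pairing each start tag with its end tag is still a job for a real parser.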