I think the regex is a good start, but you need to make it the first pass of a two-pass process. Put a word boundary on the regex so that it doesn’t match inside something like “awwww…”, then use a capture group to pull out the TLD so you have a quick and easy way to check it against a list of valid TLDs.

If your TLD list is out of date, you accept that you may miss a few newer TLDs, but you’re still going to get a much more accurate set of results than with the regex alone, and in the long run people will appreciate it.
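
As a rough sketch of the two passes in Python: the pattern, the `find_urls` name and the tiny `VALID_TLDS` set below are my own assumptions for illustration, not your actual regex, so treat it as a starting point rather than a finished validator.

```python
import re

# Hypothetical, deliberately tiny TLD set for illustration. A real check
# would load a full, current list (for example the one IANA publishes).
VALID_TLDS = {"com", "org", "net", "io", "uk"}

# Pass 1: a loose candidate pattern (an assumption, not the original regex).
# The leading \b stops a match from starting in the middle of a word, and
# the final label is captured in the named group "tld".
CANDIDATE = re.compile(
    r"\b(?:https?://)?(?:[a-z0-9-]+\.)+(?P<tld>[a-z]{2,})\b",
    re.IGNORECASE,
)


def find_urls(text):
    """Return candidate matches whose captured TLD is in VALID_TLDS."""
    return [
        m.group(0)
        for m in CANDIDATE.finditer(text)
        if m.group("tld").lower() in VALID_TLDS  # Pass 2: TLD lookup
    ]
```

With this sketch, something like `notes.txt` gets through the first pass but is dropped by the TLD lookup, which is the whole point of doing it in two steps. Note that it only returns the scheme and host part, not any path after it.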