[MC-8437] URL parser does not detect 4 letter TLDs Created: 26/Jan/13  Updated: 23/Jun/13  Resolved: 28/Jan/13

Status: Resolved
Project: Minecraft: Java Edition
Component/s: None
Affects Version/s: Minecraft 1.4.7, Snapshot 13w04a
Fix Version/s: Snapshot 13w05a

Type: Bug
Reporter: Jake Fisher Assignee: [Mojang] Nathan Adams
Resolution: Fixed Votes: 1
Labels: chat
Environment:

Windows, Java 7 x64


Issue Links:
Relates
relates to MC-15077 Clickable links in Chat are broken fo... Resolved
relates to MC-18898 URL parser will not detect TLDs of ov... Resolved
CHK:
Confirmation Status: Confirmed

 Description   

The URL parser in chat does not recognize top-level domains (TLDs) longer than 3 characters (e.g. a .com address works fine, but a .info address does not), so clicking such a link in game does not take the user to the web address.

The problem is easily reproduced by typing the following domains into the chat box:

jameo.com and its variations (e.g. with http:// prepended) work correctly.

jameo.info and its variations do not.



 Comments   
Comment by Kumasasa [ 27/Jan/13 ]

... there are as many regex as there are URLs : http://regexlib.com/%28X%281%29A%28w_EUcf1Bu6N3WFkuVKjGepFvJE2idKTn6XdSnjWUuNtd1un_Eb4FoDkOVAR8SRiMGGBF5Tbw1UonIHucQzOcdTgoYTvkxhpmUooMy90R-WvvA2aGvfMrruwlVe85XxuakVN09q6CjmodpNV4bx0z13Hec3uqWgCLn6W8bpOnukXFFUETJvMR0rkxKy7Z6EVW0%29%29/Search.aspx?k=URL&c=-1&m=-1&ps=100

Comment by MiiNiPaa [ 27/Jan/13 ]

http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains
Not every TLD will be caught by this regex. You should avoid the "a TLD can only be 2-4 characters long" type of thinking.
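
A minimal sketch of what dropping the upper bound could look like, assuming the rest of the ChatClickData pattern quoted in this issue stays unchanged. The class name, field name and helper method are hypothetical, and this is not the change that went into 13w05a; it only illustrates the point about TLD length.

```java
import java.util.regex.Pattern;

final class OpenEndedTldSketch {
    // Hypothetical variant, not taken from the game: the TLD part accepts
    // two or more lowercase letters instead of a fixed 2-3 or 2-4 range.
    static final Pattern OPEN_ENDED = Pattern.compile(
            "^(?:(https?)://)?([-\\w_\\.]{2,}\\.[a-z]{2,})(/\\S*)?$");

    // True when the whole chat word matches the URL pattern.
    static boolean looksLikeUrl(String text) {
        return OPEN_ENDED.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"jameo.com", "jameo.info", "example.museum", "jameo.c"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeUrl(s));
        }
    }
}
```

The trade-off is that a looser pattern also treats more ordinary chat text as a link: any dotted word ending in two or more lowercase letters matches.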

Comment by Kumasasa [ 26/Jan/13 ]

Confirmed.

Comment by Markku [ 26/Jan/13 ]

Bug confirmed, found and busted. Erm, fixed.

Bug

"ChatClickData"
    public static final Pattern pattern = Pattern.compile("^(?:(https?)://)?([-\\w_\\.]{2,}\\.[a-z]{2,3})(/\\S*)?$");

Fix

"ChatClickData"
    public static final Pattern pattern = Pattern.compile("^(?:(https?)://)?([-\\w_\\.]{2,}\\.[a-z]{2,4})(/\\S*)?$");

Fix tested on 1.4.7

(Edit: minor technical detail: the text can be clicked in any case; without that small change it just isn't recognized as a URL, and so nothing further happens.)
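
The difference between the two patterns can be checked outside the game with a small standalone program. This is only a sketch: the two regex strings are copied from the comment above, while the class name, constant names and helper method are made up for illustration.

```java
import java.util.regex.Pattern;

final class TldPatternCheck {
    // Pattern quoted above under "Bug": the TLD is limited to 2-3 letters.
    static final Pattern BEFORE = Pattern.compile(
            "^(?:(https?)://)?([-\\w_\\.]{2,}\\.[a-z]{2,3})(/\\S*)?$");

    // Pattern quoted above under "Fix": the TLD is limited to 2-4 letters.
    static final Pattern AFTER = Pattern.compile(
            "^(?:(https?)://)?([-\\w_\\.]{2,}\\.[a-z]{2,4})(/\\S*)?$");

    // True when the whole string matches the given URL pattern.
    static boolean isUrl(Pattern pattern, String text) {
        return pattern.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "jameo.com", "http://jameo.com", "jameo.info", "http://jameo.info"
        };
        for (String s : samples) {
            System.out.println(s
                    + " before=" + isUrl(BEFORE, s)
                    + " after=" + isUrl(AFTER, s));
        }
    }
}
```

With these patterns, the .com samples match both before and after, while the .info samples match only the fixed pattern. A TLD longer than four letters, such as .museum, is still rejected by the fixed pattern, which is the limitation raised in the comment by MiiNiPaa.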

Generated at Sun Jan 12 12:17:47 UTC 2025 using Jira 9.12.2#9120002-sha1:301bf498dd45d800842af0b84230f1bb58606c13.