Two quotes calling for more information
The reviewers write: "the IETF is currently dealing with the final draft of a successor document to that BCP. This will provide expanded means for specifying languages, including designations for script and orthographic authority as components of a language tag".
This is a very odd reading of the RFC 3066 bis Draft.
1) The "script" indication is of strictly no interest as far as IDNs are concerned, except to provide a short list (100 codes) of names that could possibly designate charsets (experts think there are around 200 scripts in use). For example, the debate on the Unicode list showed a deep disagreement over the simple question "can I write French using the Latn script?".
This may introduce incredible confusion (which we have already met), as the same name can be used for different charsets by the TLD Managers. It is also of low interest because of the need to support foreign script characters used in commercial denominations, trademarks or nicknames. The need is for a naming format for TLD tables.
2) I wonder what can be called "orthographic authority" in the RFC 3066 language tag. That tag uses a mix of UN M.49 numbers and ISO 3166 codes to designate what they call a "region". The only "orthographic authority" could be the legal authority of the country considered. By the nature of the ccTLD (RFC 1591, GAC declaration), the ccTLD Manager, as the trustee of the national authority, IS that authority, its representative or its legal "colleague".
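To make the "region" point concrete, here is a minimal sketch that only looks at the shape of a region subtag, assuming the convention of the RFC 3066 bis draft: two letters for an ISO 3166 code, three digits for a UN M.49 number. The function name classify_region_subtag is hypothetical, and nothing here checks the subtag against a registry.

```python
import re


def classify_region_subtag(subtag: str) -> str:
    """Classify a region subtag by its shape only (no registry lookup)."""
    if re.fullmatch(r"[A-Za-z]{2}", subtag):
        # Two letters: the shape of an ISO 3166 alpha-2 country code.
        return "ISO 3166 code"
    if re.fullmatch(r"[0-9]{3}", subtag):
        # Three digits: the shape of a UN M.49 numeric area code.
        return "UN M.49 number"
    return "not a region subtag"


print(classify_region_subtag("FR"))
print(classify_region_subtag("419"))
```

Neither shape says anything about who holds orthographic authority; both only name a territory.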
As a general comment, this WG decided to treat IDNs, DNS, lingual community needs and convergence with other Registries (ISO 11179) as irrelevant to its scope. This will lead to appeals by lingual organisations and ccTLDs should this Draft be approved without supporting URI/IRI-tags. The conflict is between two different layers (computer interoperability and user inter-intelligibility):
- a stabilisation of the internationalisation approach, constraining current practice so that it converges (through naming) with locale files (this is the Unicode CLDR project quoted in the Charter)
- and a multilingualisation approach supporting the users' lingual communities, including ccTLDs.
There are many reasons to use URI-tags (specified by an RFC that has been accepted but not yet published) and IRI-tags (still to be specified). One is homographs. The recommendations given by the Guidelines only concern SLDN. Phishing has nothing to do with IDNs but with the misconceptions of IDNA. Phishing uses IDNA possibilities on nLD labels and works with _every_ domain name. IRI-tags permit zone managers (at any level) to specify their own tables for their own zone. For example, the zone manager of the DN xxx.yyy.zz can define a Zone Table for Arabic characters as "ar-0-xxx.yyy.zz:ar.txt".
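As an illustration only, the example tag above can be split mechanically. The sketch below assumes the layout is "<language>-<field>-<zone>:<table file>", which is inferred from that single example and not from any published specification. The function name parse_zone_table_tag and the dictionary keys are hypothetical, and the meaning of the middle field ("0") is not stated here, so it is kept as an opaque string.

```python
def parse_zone_table_tag(tag: str) -> dict:
    """Split a zone table tag of the assumed form
    '<language>-<field>-<zone>:<table file>' into its parts."""
    head, sep, table_file = tag.partition(":")
    if not sep or not table_file:
        raise ValueError("expected ':<table file>' at the end of the tag")
    # Split on the first two hyphens only, so that hyphens inside
    # the domain name itself are preserved.
    parts = head.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected '<language>-<field>-<zone>' before ':'")
    language, field, zone = parts
    return {
        "language": language,      # e.g. "ar"
        "field": field,            # meaning not specified; kept as-is
        "zone": zone,              # the domain name the table applies to
        "table_file": table_file,  # the table the zone manager publishes
    }


print(parse_zone_table_tag("ar-0-xxx.yyy.zz:ar.txt"))
```

The point of the sketch is that the zone itself is part of the tag, so each zone manager, at any level, can name its own table without a central registry of table names.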
The reviewers also note "The discussion that is in progress about permitting a more extensive character repertoire in top-level labels can result in a change to this condition, as well as raising need for further guidelines specific to the new situation".
Could they document this "discussion in progress"?