ICANN Email Archives: [ft-implementation]

Re: Public Comment: Fast Track Proposed Solutions

To: ft-implementation@xxxxxxxxx
Subject: Re: Public Comment: Fast Track Proposed Solutions
From: Eric Brunner-Williams <ebw@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 20 Feb 2009 13:59:11 -0500

The following are some text change suggestions and supporting rhetoric (or rationale, if the reader is feeling generous) for the suggested changes to the recent Proposed Solutions.

Page 2, Section I

The first bullet item could be qualified: the IDN Table is not a tabular listing of all characters available for some purpose, but a listing of all characters, other than the ASCII LDH set, from some character repertoire other than the ASCII set. Mentioning Unicode as the character repertoire would be useful too.

The fourth bullet is incorrect. It places the utility for definition of variant characters in typographic similarities. The set of typographically dissimilar characters 'that have "the same meaning" when used in domain name registrations' is vastly greater than the set of typographically similar characters. The utility for recognizing equivalent meanings, e.g., between typographically dissimilar Simplified Chinese and Traditional Chinese characters, and for the typographically dissimilar Abenaki orthographic convention equivalence class {8, w, ou, and U+0222, U+0223}, is not to prevent confusion arising from dissimilar meaning associated with visually similar characters, but to prevent confusion arising from dissimilar meaning associated with visually dissimilar characters.

This error is repeated in the expanded discussion in page 4, Section III, which states the benefits of equivalence only in terms of visual similarity. This is equivalent to claiming that the only motivation for exercising any care with any subset of the character repertoire(s), or for IDNs in the first place, is typo-squatting on famous marks.

The test for true meaning is either to state that SC/TC and similar character equivalencies are not "variants" and ICANN will discard any SC/TC and similar table data submitted as IDN table data.
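To make the point about visually dissimilar equivalents concrete, here is a minimal Python sketch of folding one equivalence class onto a single representative before comparing labels. It uses the Abenaki class cited above. The function names, the choice of "8" as the representative, and the sample strings are assumptions made up for illustration, not part of any IDN table format.

```python
# Hypothetical sketch: fold every member of a variant (equivalence) class
# onto one representative, then compare the folded labels.
# The class is the Abenaki example from the text; picking "8" as the
# representative is an arbitrary assumption.
ABENAKI_OU_CLASS = ["8", "w", "ou", "\u0222", "\u0223"]
REPRESENTATIVE = "8"


def fold(label: str) -> str:
    """Replace each occurrence of a class member with the representative."""
    # Try longer members first so "ou" is matched as a unit.
    members = sorted(ABENAKI_OU_CLASS, key=len, reverse=True)
    out = []
    i = 0
    while i < len(label):
        for member in members:
            if label.startswith(member, i):
                out.append(REPRESENTATIVE)
                i += len(member)
                break
        else:
            # No class member starts here: keep the character unchanged.
            out.append(label[i])
            i += 1
    return "".join(out)


def same_under_variants(a: str, b: str) -> bool:
    """True when two labels differ only by members of the class."""
    return fold(a) == fold(b)


# Illustrative strings only: none of these look alike on the page,
# yet all fold to the same sequence.
print(same_under_variants("k8ai", "kwai"))       # True
print(same_under_variants("kouai", "k\u0223ai"))  # True
```

None of the class members would be caught by a test based on visual similarity, which is the gap in the fourth bullet.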


Page 3, Section I, continued

The sixth bullet is unclear. What support will ICANN provide to applicants when requested?

The seventh bullet could be restated as "If a sequence of characters applied for by a Fast Track Program applicant ccTLD, or any subsequent applicant, contains any characters which are defined as being in a variation set in any IDN table previously submitted to ICANN, the application would result in an entry into the IANA root only if an undefined "user confusion" test has an under-defined outcome."
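The first half of that restated bullet, the membership check, is mechanical and can be sketched in a few lines of Python. The table name, the data layout, and the function name below are hypothetical; the two variation sets are ordinary simplified/traditional pairs chosen for illustration. The "user confusion" test is not sketched, because the Proposed Solutions do not define it.

```python
from typing import Dict, List, Set

# Hypothetical layout: table name -> list of variation sets.
# Real IDN tables submitted to ICANN are not assumed to look like this.
SUBMITTED_TABLES: Dict[str, List[Set[str]]] = {
    "example-zh-table": [
        {"\u56fd", "\u570b"},  # simplified / traditional "country"
        {"\u4e1c", "\u6771"},  # simplified / traditional "east"
    ],
}


def variant_hits(applied_for: str,
                 tables: Dict[str, List[Set[str]]]) -> Dict[str, List[str]]:
    """Map each character of the applied-for string that appears in some
    variation set to the names of the tables that define it."""
    hits: Dict[str, List[str]] = {}
    for ch in applied_for:
        for name, variation_sets in tables.items():
            if any(ch in vs for vs in variation_sets):
                hits.setdefault(ch, []).append(name)
    return hits


# U+4E2D is in no variation set above; U+56FD is.
print(variant_hits("\u4e2d\u56fd", SUBMITTED_TABLES))
```

Any non-empty result would send the application to the undefined test, so the check is only as useful as the test that follows it.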

Page 4, Section IV, para 1 refers to a "speech community". This should be "writing community", or simply "language community", and "language authority or authorities" is even better. The plural is probably the better choice. I couldn't abide by George W. Bush's "English", and I'm sure the US isn't the only odd-stakeholder that is intellectually broken from time to time, or permanently, on language authority issues.

For instance, in Khmer there are several orthographic conventions (see also Abenaki, supra) and the character variation issue has nothing to do with spoken Khmer, only the orthographic choice problem.

Para 3 mentions the ASIWG in a context similar to the CDNC/JET work that resulted in rfc3743. I participate in the ASIWG and unfortunately there is no IETF draft to point to which incorporates the script-development-based policy serving the Arabic, Farsi, Urdu, ... language authorities (or "language communities").

The final para of page 4, continued on page 5, makes no mention of the Arabic Script use by the diaspora communities in Europe and the Americas. While the Yiddish language (Hebrew Script) is not mentioned, it is an example of a language authority (YIVO) existing in diaspora.

The final para before the unnumbered subsection "Usage of IDN Tables and variant characters in domain name registrations" has the following:

"Regardless of the language or script basis, domain names do not always represent [words] ..."

This fails to state that domain names are persistent identifiers associated via resolvers using the DNS (rfc1034/35 et seq) to transient resources, in particular an IPv4 or IPv6 address of a network attached device which may have many IPv4 or IPv6 addresses, in sequence or simultaneously, or both. There is a problem that many, for instance, the group organized three years ago by the Arab League, view "domain names" as "names" existing in some external universe of "meaningful names", rather than as unique LDH (and now over a larger character repertoire) character sequences slightly more memorable and potentially more persistent than numeric IPv4 and/or IPv6 addresses.

This misconception leads to over-specification attempts, and suggesting that identifiers nearly always represent words simply props up a misunderstanding that never should have existed. That is the level of misunderstanding that exists in some of the groups attempting to inform or capture ICANN and the IETF, though not the UTC, which has a higher entry clue barrier than either ICANN or the IETF's IDNAv2 WG.
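The "identifier, not word" view can be illustrated with a short Python sketch: whatever a label looks like to a reader, what the DNS carries is an LDH sequence. The sketch uses the `idna` codec in the Python standard library, which implements the original (2003) IDNA conversion rather than the revision under discussion in the IETF. The helper names and the length-limited pattern are assumptions for illustration.

```python
import re

# One DNS label in LDH form: letters, digits, hyphen; no leading or
# trailing hyphen; at most 63 octets.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """True if the label is already a plain LDH sequence."""
    return LDH_LABEL.fullmatch(label) is not None


def to_wire_label(label: str) -> str:
    """Convert a Unicode label to the LDH form carried in the DNS,
    using the standard library's IDNA (2003) codec."""
    return label.encode("idna").decode("ascii")


label = "b\u00fccher"           # a label containing U+00FC
wire = to_wire_label(label)     # the LDH identifier actually resolved

print(is_ldh_label(label))      # False
print(wire, is_ldh_label(wire))
```

Nothing in the converted form depends on the label being a word in any language. It only has to be a unique character sequence.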

Section IV, continued on page 6, provides a "primary goal" that "... all language communities have an equal opportunity ... " for the proposal that follows. This is misleading in two parts. First, the Fast Track is not open to "all language communities"; it is only open to those (a) which are state sponsored, by some state, and (b) which use a script other than Latin. Second, the goal is not that some language community has formal equity of opportunity with the early-adopter Latin-centric user community of the United States, the 53 member states of the Commonwealth and the Western European NATO states, that is, with ASCII. Rather, it is that the non-Latin scripts are preferentially available to language communities whose orthographic conventions have been included in the current version of the reference character repertoire. We're trying to make it so that Malay writers using Jawi Script need never again use Latin characters to form identifiers mediated by the DNS, unless they choose to, whether the namespace for Malay writers using Jawi Script is managed by a national operator, or a non-national operator.

The Proposed IDN Table usage for TLD Registrations, beginning on page 6, penultimate paragraph, refers to "variant strings", which is not defined or is a roundabout way to refer to the SC/TC mapping problem, and continues to mingle a technical issue, the lack of a standard mechanism for aliasing delegations at the root of a DNS tree, and the policy choice to limit the number of delegations to "one per language per script per IDN ccTLD". Note that the "one per" policy is not deleterious to users in the unlikely possibility that the state associated with .il requests both Hebrew and Yiddish (two languages using the same script), but is deleterious to users in the more likely possibility that the states using Chinese and "unified Han" script seek both an SC and a TC entry in the IANA root.

Continuing on page 7, the lettered recommendations for proposed IDN table usage, at (c), differ from the IDNC Final Report recommendation of “one string per territory per official language” in the area of variant strings. This is an improvement over the IDNC Final Report, which was marred by an excessively narrow definition of linguistic diversity by members from linguistically homogeneous (in fact or in official fiction) stakeholders. The IDNC recommendation is simply incomprehensible in South Asia and should not have survived the ICANN Delhi meeting without improvement.

Recommendation (d) is perplexing. No standard mechanism exists for aliasing delegations in the IANA root, see the preamble to the lettered recommendations, supra, yet the applicant for a variant string must agree to (somehow) effect an identical zone to the zone(s) associated with all other strings in the "variant string set".

As policy, this is also remarkable. As a hypothetical, suppose that (somehow) there were applications for "US" in both Traditional Chinese and Simplified Chinese. While most multi-character sequences in written Chinese have equivalent meaning despite being visually dissimilar when rendered in SC or TC, the meaning of "mei guo" to multi-generational TC users is unlikely to be indistinguishable from its meaning to recent immigrant SC users. The requirement that SC and TC be (mechanism ignored) recursively indistinguishable zones is an over-reaching application of the principle that character-by-character there exist equivalencies between the SC and TC character repertoires.

Recommendation (e) is a restatement of the reality that there is no alias mechanism available, but it misses a larger opportunity, that more than the incumbent ASCII ccTLD operator, and more than one policy model, are possible for countries with multiple scripts, languages, or even just variations with a difference such as the US-based SC/TC example. It is unfortunate but there are many ccTLDs in which destruction of minority languages is state policy. The US will not allow an SC or a TC variant string for "mei guo" to be entered into the IANA root, though Chinese is the 3rd largest language community in the United States. The same situation exists in Canada. Further, common to both, neither state has applied for the only non-Latin scripts which exist as the official scripts of government, nor is likely to -- this refers to Northern and Cherokee Syllabics.

While the problem of language suppression by Latin and non-Latin Script-using governments may not easily fall within the white picket fence of the IDN ccTLD Fast Track Program, it is a problem that ICANN's IDN program must address, or consent to assisting nation states to extinguish existing languages.

I remain concerned that the Twomey invite letter allowed responding ASCII ccTLD operators and governments to disclose their script and string preferences, and alternatively, to keep their preferences confidential. The letter did not state that all responses would be held confidential, regardless of the desires of the responding ASCII ccTLD operators and governments, and that the disclosure of strings sought, and the resulting issues, such as the possibility of zero-width joiners or zero-width non-joiners, is still just guesswork.


Eric Brunner-Williams
in a personal capacity
