Re: Cost of Transliteration from ASCII to IDNs
At the Paris meeting of staff and the Registry Constituency the issue of equivalent labels was brought up by Edmon Chung, and the response by staff was not as reflective as I'd hoped for.
The immediate issue then was, as everyone who worked through the Simplified Chinese and Traditional Chinese mapping issue in the 2000-2003 IETF IDN Working Group should recall, that Chinese literacy for a substantial portion of the CJK Script user base means simultaneous literacy in Traditional and Simplified Chinese, so any two-character "word" can be written in four distinct forms without any change of meaning.
To use the standard UPPER/lower convention to represent TRADITIONAL/simplified Chinese, the problem faced by the CDNC set of registries was that any AB label was, to most DNS users, visually distinguishable from the labels Ab, aB, and ab, and yet all of these labels are equivalent. The problem was not resolved in IDNA2003; it was resolved in RFC 4713, "Registration and Administration Recommendations for Chinese Domain Names".
Edmon's point was that if there is an application for a Chinese name, and the name consists of two characters, a reasonable assumption, then

  o the applicant for AB also submits three additional applications for Ab, aB, and ab, for a total of $740,000, and a five-year commitment of $1,500,000, plus all the insurance and other minor, but non-negligible, costs, or risks that

  o some other applicant submits one or more applications for Ab, aB, and ab, in the same or subsequent rounds, with the obvious application costs.
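The arithmetic behind those figures can be sketched as follows. This is only an illustration: the per-application fee and the annual fee are assumptions, obtained by dividing the totals quoted above by four applications (and by five years), and the function names are hypothetical. UPPER/lower case again stands in for TRADITIONAL/simplified.

```python
from itertools import product

# Assumed per-application figures, derived from the totals quoted above
# rather than taken from any fee schedule.
APPLICATION_FEE = 185_000      # 740,000 / 4 applications
ANNUAL_REGISTRY_FEE = 75_000   # 1,500,000 / (4 applications * 5 years)


def variants(label: str) -> list[str]:
    """Return every TRADITIONAL/simplified spelling of a label.

    UPPER/lower case is the stand-in for the two forms of each character,
    so the label is assumed to consist of cased letters only.
    """
    choices = [(ch.upper(), ch.lower()) for ch in label]
    return ["".join(combo) for combo in product(*choices)]


def bundle_cost(label: str, years: int = 5) -> tuple[int, int]:
    """Return (application fees, recurring fees over `years`) when every
    equivalent spelling of `label` needs its own application."""
    n = len(variants(label))
    return n * APPLICATION_FEE, n * ANNUAL_REGISTRY_FEE * years


print(variants("ab"))      # ['AB', 'Ab', 'aB', 'ab']
print(bundle_cost("ab"))   # (740000, 1500000)
```

The number of spellings doubles with each added character, so under the same assumptions a three-character name would need eight applications.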
The staff response was "it's business, pay up", which misses several points I hope to make here.
RFC 2826, "IAB Technical Comment on the Unique DNS Root", means something. We are not benefited by a wide variety of intentional exploits of non-uniqueness, from "alternate roots" to "browser interposition" to "NXDOMAIN redirection" to "Fast Flux" to ... There are very, very few instances of intentional temporal or spatial incoherence in the DNS that pass the sniff test -- load balancing via the DNS being the notable exception to a generally very, very bad practice.
I mention this because I cannot see how we are benefited by "it's business, pay up", nor can I see how it is anything but intentional incoherence in the DNS. If Ab and aB and AB and ab are A records that map to different resources, how is this any different from "alternate roots" or "browser interposition" or "NXDOMAIN redirection" or ...? Is it different because the affected class of users is "far away" from Marina del Rey?
So, I suggest that the better answer to Edmon's question was "Good point, let's see if we can't make policy that is (a) Unicode aware, (b) CJK aware, and (c) modern literacy aware."
But the problem of inadvertent creation of incoherence equivalent, as far as the end users are concerned, to "alternate roots" or "browser interposition" or "NXDOMAIN redirection" or ... is not confined to the SC/TC mapping problem.
Suppose what staff understood was that Edmon was asking for "ASIA" in all the scripts and languages of Asia at little or no cost, a possibility given the interest in understanding what rights, if any, VGRS and PIR have for "COM" and "NET" and "ORG" in scripts other than Latin.
However, there is a case where the current pricing risks introducing intentional incoherence. An application for "AFRICA" will be submitted in the current round. There may be others; that is not the point. .africa will be added to the root under current draft contention rules. How do we, the ICANN community, responsible for writing the rules, and we, the internet users, benefit from Arabic Script Africans getting a different "AFRICA" than Latin Script Africans? Will Anglophone Africans get a different AFRICA than Francophone Africans? Are we imposing second and subsequent application costs because we need the money? Are we imposing second and subsequent application costs because the applicant needs to discharge the money? Do we claim that Africa "in Arabic" and Africa "in English" and Africa "in French" and ... are usefully different things, and usefully correspond to different underlying resources? As the UTC scripts group adds African Scripts to Unicode, does the answer change?
This brings me to Ron Andruff's "Cost of Transliteration from ASCII to IDNs". A project to serve a mono-lingual demographic is inherently cheaper under ICANN's draft rules than a project to serve a multi-lingual demographic. This isn't the right design goal for the rules. We should have a good answer for the SC/TC problem at the IANA registry level, where ICANN makes policy. We should have a good answer for the multi-script problem generally, a better answer than "pony up". And we should be able to distinguish between need, where additional cost is a barrier, and opportunity, where additional cost is not.
Eric Brunner-Williams as an individual