Re: [gnso-idn-wg] One string per application
This is one example of why it is absolutely crucial to involve language/script groups in the deployment of IDN gTLDs.

Why? The Chinese-language operators of the Han Unified character set in Unicode have sat around the table to discuss how to square off traditional and simplified characters, and how to avoid conflicts with Japanese Kanji and Korean Hanja, which use the same Han Unified character set in Unicode. The JET discussed this many years ago and documented their findings. Mainland China has had its rules for deployment of .com, .china, .org in Chinese Han characters in place since 2003.

The way Chinese Simplified character folks in Mainland China and Singapore, Chinese Traditional character folks in Taiwan, and Japanese Kanji users and Korean Hanja users deal with Han character variants, equivalents and whatever you want to call them, does not simply equate with how the Arabic script users square off with Persian script users in Iran, or how the Urdu folks in Pakistan and India use Arabic-based scripts.

So to invent policy on IDN rollout to fit the Han characters, the Arabic-based scripts, any other non-ASCII IDN strings, and everything else, by folks like us sitting in front of an ASCII keyboard or meeting in Lisbon, and to expect it to be legitimately acceptable to those who would be using the IDNs or to those who would be providing the services, is simply asking for trouble, not to mention being seen as naive.

Single one-size-fits-all IDN policies will incur the ire of multiple groups of people and may cause outrage and international furore if we are not careful. And we can't hide behind the often-cited excuse "we cannot please them all, so let's just adopt our policy and see how that goes". To be sure, it will not go down well.
I wish to apologise in advance if this sounds a bit harsh, but comments like "I don't see that a registry operator should be compelled to apply for, or be granted, every possible typographic variation of their chosen string." do not help; they cause more problems. It is almost equivalent (I stress almost, and not exactly, for technical reasons) to saying that one registry can register .seattle, but another competing registry can register .Seattle and yet another could try for .SEATTLE for English. For some, it is even more sensitive than this simple example.

One solution to reconcile this with not compelling all registries to apply for and be granted every possible "typographic variation of their chosen string" would be for such variations to be reserved. In other words, no competing registry should be trying to pass off .ÃÃÃÃËD as .ÃÃÃÃÃÂ, or, using a more dramatic but hypothetical example, if PR China gets .zhongguo (CHINA) in simplified characters, there may be an international diplomatic situation if Taiwan were to subsequently get .zhongguo in traditional characters.

And since Iran is so topical: in the case of the example .IRAN in Farsi (Parsi or Persian), their TLD deployment has to cover both Arabic and Persian lookalikes, [xn--mgba3a4fra] [xn--mgba3a4f16], so the IRNIC colleagues at the Institute for Studies in Theoretical Physics & Mathematics (IPM) say.

So it is equally worth saying: It is worth combining the issue of being granted a particular TLD string, and the issue of strings that may be typographically similar *because they are one and the same thing in the case of .zhongguo or .iran.*

And one would be equally comfortable saying: In the example quoted by William Tan, a registry operator could apply for one or both strings, and both will be equivalent, as equivalent as .net is to .NET or .NeT etc. in the case of ASCII uppercase-lowercase equivalence.
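To make the reservation idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two-entry variant table, the function names `canonical_label` and `may_delegate`, and the registry names are hypothetical, and this is not how IRNIC, the JET or any registry actually implements variants. A real table would have to come from the language/script community concerned. The sketch only shows that once variants fold to one canonical label, a competing applicant is blocked from the lookalike while the original holder is not.

```python
# Hypothetical variant table (assumption, for illustration only): maps a code
# point to the code point it is treated as equivalent to. Only two well-known
# Arabic/Persian pairs are listed.
VARIANTS = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}


def canonical_label(label: str) -> str:
    """Fold case, then replace each code point by its variant from the table."""
    return "".join(VARIANTS.get(ch, ch) for ch in label.lower())


def may_delegate(label: str, applicant: str, granted: dict) -> bool:
    """Return True if `applicant` may be granted `label`.

    `granted` maps canonical labels to the registry that already holds them,
    so every variant of a granted string is implicitly reserved for its holder.
    """
    holder = granted.get(canonical_label(label))
    return holder is None or holder == applicant


# "Iran" spelled with the Arabic yeh and with the Farsi yeh.
IRAN_ARABIC_YEH = "\u0627\u064A\u0631\u0627\u0646"
IRAN_FARSI_YEH = "\u0627\u06CC\u0631\u0627\u0646"

# Hypothetical state: registry-a has been granted the Farsi-yeh spelling.
granted = {canonical_label(IRAN_FARSI_YEH): "registry-a"}

print(canonical_label("NeT") == canonical_label("net"))      # ASCII case equivalence
print(may_delegate(IRAN_ARABIC_YEH, "registry-b", granted))  # competing registry
print(may_delegate(IRAN_ARABIC_YEH, "registry-a", granted))  # original holder
```

The single flat table is the weak point of the sketch: Han variants, Arabic-script variants and ASCII case do not behave alike, which is exactly why one table per language/script group would be needed.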
If the operator were granted both strings, then the operator would not need to decide whether example.ÃÃÃÃËD and example.ÃÃÃÃÃÂ would map to the same nameserver or not - they would be the same.

And I agree with Bruce Tonkin that it would be worth saying too, that: If the registry operator applied for only one string and was granted the string, then it would be impossible for another registry operator to gain approval for the other string given its typographical similarity/linguistic and semantic equivalence, and in fact, the registry operator should automatically get the other equivalent strings or combinations.

So given all the diversity of languages and scripts, how is it possible for a single committee, meeting in Lisbon or exchanging emails in English, to determine a basic set of universal principles successfully without consulting the users/experts or gaining an appreciation of the languages and scripts that the policy is planning to regulate?

If something that doesn't make sense in a language/script is going to be rolled out, like the current Unicode for Tamil, which none of the Tamil-speaking software developers could really use or implement, they will just go and look somewhere else.

Since 1998, when I implemented IDNs out of Singapore, through the long drawn-out, record-breaking IETF standards process ending in 2003 (co-chaired by my former student James Seng), to the policy-making committees from 2003 to date, the tardiness of the Internet process, exacerbated by the expectation of Internet speed in the global community, has already led to Internet fragmentation, which has already been reported and well documented in ITU/UNESCO/MINC meetings
http://www.ngp.org.sg/events/2nd_SEAGF/SEAGF-TanTinWee.pdf
http://www.itu.int/ITU-T/worksem/multilingual/index.html
and reported in Newsweek International.
http://www.msnbc.msn.com/id/12666393/site/newsweek/

Imagine what would happen if we try to recommend and impose IDN deployment policies on communities that will question the legitimacy, or even doubt the competence, of a process that did not involve their language users, script experts or authorities, or worse, a process that makes it a long drawn-out procedure to have their representatives join the policy making?

bestrgds
tin wee

Tan, William wrote:
> Bruce Tonkin wrote:
>> It is worth separating the issue of being granted a particular TLD
>> string, and the issue of strings that may be typographically similar.
>
> I agree that they are separate issues.