Re: [gnso-idng] Draft on String Similarity
Thanks Eric.

I have a (rather crude) initial reaction: this email is way too long! If you want anyone on the Council to actually take the time to sink their teeth into this, I suggest we find a way to cut this down a lot. Otherwise, the expertise that has gone into drafting this message to Council will be lost on the people we are hoping will read it.

Thanks,

Stéphane

On 11 Dec 2009, at 15:11, Eric Brunner-Williams wrote:

>
> Councilors,
>
> During the past weeks the participants in the gnso-idng@xxxxxxxxx mailing
> list (IDNG) have discussed, on the mailing list and in conference calls,
> aspects of the situation which exists following the Board's vote at Seoul.
>
> One area of discussion which raises a policy issue is confusingly similar
> strings. Because this seems an area where the obvious right thing has already
> been done, we need to draw attention to two aspects which have been overlooked.
>
> First, the current meaning of "similar" is now broader than "visual
> similarity", and appears to include "meaning".
>
> Second, the underlying assumption in the evaluation process is that each
> evaluation is independent of all other evaluations.
>
> These, a rule (about a string in an application) and a meta-rule (about all
> applications), have a consequence which we suggest is not desirable.
>
> In the following example we use "china" and "中国" (chung guo), simply as a
> well-known example of two strings. We ignore that both are part of the
> universe of strings more likely to pertain to ccTLDs and their policy body
> than the GNSO, for the purpose of illustrating the policy problem.
>
> The strings "china" and "中国" (chung guo) are "similar" in meaning, therefore
> they form a contention set. Under the current rules in DAGv3, only one
> application whose string is a member of a contention set may proceed towards
> delegation. Whether the choice is by order of creation, or amongst
> contemporaries, by community evaluation and/or auction, the result is the
> same.
> One member of an (extended, in the sense of including existing
> registries) contention set thrives. All others fail.
>
> This is the proper and correct end, except for one case which is more likely
> to exist for applications for IDN strings than for restricted ASCII (letters,
> digits, hyphen) strings. That case is where two, or more, applications for
> similar strings are advanced by a single applicant, or two or more
> cooperating applicants.
>
> Returning to our "china" and "中国" (chung guo) example, if XYZ Co. applied for
> both "china" (application #1) and "中国" (application #2),
> the current rules cannot allow both strings to exist in the root, though
> both are brought by the same applicant.
>
> The fundamental rationale is that confusion is harmful. This rationale is not
> universally correct. There are instances where confusion results in no harm,
> and more importantly, where "confusion" creates benefit.
>
> Because "beneficial confusion" is not obvious to users of Latin script, we
> offer the original example of cooperation among "applicants" to benefit
> their registrants and users through "similarity".
>
> In 2001, the registries for China, Taiwan, Hong Kong and Macao discussed
> cooperation so that mixing of Simplified Chinese, prevalent in China, and
> Traditional Chinese, prevalent in Taiwan, but interchangeable without loss of
> meaning, would not result in user confusion. These "applicants" cooperated to
> create "beneficial confusion", so that "similar strings" actually had similar
> meaning, that is, resolved as expected by their user community.
>
> No user "confusion" resulted from this multi-applicant cooperation, except
> perhaps in Marina del Rey.
>
> Coordination to create "beneficial confusion" may exist where one applicant
> submits two or more applications, as in the "china" and "中国" (chung guo)
> example, or where two or more applicants submit two or more applications, as
> the four cooperating Chinese registries did, almost a decade ago.
>
> It is possible that applicants for two or more similar strings could, upon
> failure, resort to extended evaluation, where the cause of the failure is
> similarity with an existing TLD. Present registries seeking similar IDN
> delegations could simply factor in the extended evaluation cost as part of
> the application cost. This is inelegant, but not fatally so.
>
> Unfortunately, for applicants simply seeking two or more delegations with
> similar meaning, independent of script, as in the "china" and "中国" (chung
> guo) example, initial evaluation failure and extended evaluation are not
> available. The contention set consisting of two strings and one actual
> applicant goes to auction, with an absurd outcome from the business
> perspective, and a tragic outcome from the language perspective, as one script
> choice eliminates all others, for some meaning-defined construction of
> "similarity".
>
> We suggest the Council consider the following to cure this defect.
>
> 1. that the meta-rule that all applications are independently evaluated be
> modified so that cooperating applications may, if the contention set they
> form contains no non-cooperating applications, proceed in the evaluation
> process.
>
> 2. that the rule on applications for strings which are "confusingly
> similar" to existing registries be modified so that, where the application
> is brought by the existing registry to which the "confusing similarity"
> exists, the application does not fail the initial evaluation and require
> extended evaluation, or some other heroic measure.
>
> Both of these recommendations have generalizations.
>
> The independent applications presumption overlooks the certainty that the
> legacy operator and application authors of 2000 and 2003-2006, and new
> authors, will each author multiple 2010 applications, some of which have
> common characteristics, such as "similarity" and "cooperation". Absent a
> mechanism to "signal" the similarity to the evaluation process, adverse
> outcomes and process inefficiencies are certain.
>
> A property common to two or more applications should be discoverable by the
> evaluation process, especially when the applicants desire this common
> property to be known to the evaluation process.
>
> The presumption of user confusion and harm where similarity exists overlooks
> the utility of similarity, and more profoundly, substitutes a Marina del
> Rey-centric meaning and utility test for the meanings and uses that exist at
> large.
>
> Correct use by two or more applications, that is, a property common to only
> the strings sought by two or more applications, can, and should, be
> discoverable by the evaluation process.
>
> The Council need not necessarily look to the generalizations of the
> modifications of rule and meta-rule we suggest to cure the anticipated
> problem of similar strings in IDN applications, but should it do so, the IDNG
> participants are prepared to further inform the Council.
>
> As it stands, the IDNG participants now understand the unintended effect of
> the "similarity" rule, and the "independent application" meta-rule, to be
> that only one of the six UN languages may be used for any identifier in the
> root, with adverse consequences for cooperation and harmonization of
> operational practice and user expectations.
>
> The IDNG participants thank the Council for its time and attention
> considering this initial work product.
>
>
> This ends the draft. Chuck, Edmond, Avri, edit to your heart's content.
>
> Eric
>

Attachment: smime.p7s