Re: [gnso-idn-wg] single script adherence across labels
- To: edmon@xxxxxxxxxxx
- Subject: Re: [gnso-idn-wg] single script adherence across labels
- From: Tan Tin Wee <tinwee@xxxxxxxxxxxxxx>
- Date: Sun, 11 Mar 2007 15:41:12 +0800
Good points, Edmon. We should reexamine the approach of
"single script adherence across all labels".
Some more reasons why...
Single script adherence vs. single language adherence
1. Same script, same or different language to different people
http://ååå.ææ Unicode Han characters; can be Japanese.Chinese
Single script, but same or different language.
Single script adherence across all labels is conformed to.
2. Same language, different scripts
Japanese, as pointed out in the discussion,
naturally uses a combination of Kanji, Katakana, Hiragana
and Romaji, e.g. http://æãæ.ææ
So something natural to Japanese users will trigger problems
under the single script rule.
Registration may not be approved, and end users will be astonished.
If we make it unlimited, then, as Edmon pointed out,
there can be more opportunities for phishing or passing off.
Taking the middle ground, a rule allowing a limited,
reasonable number N of scripts might fix this problem,
but how would we determine N?
Consultation with end user communities is one way, but it
will be an arbitrary number which will make some happy
and others upset (not that we have to please everyone,
but we should take their situation into consideration).
How about we change the rule from "single script adherence"
to "single language adherence"? If a language
uses mixed scripts naturally, it is natural to allow
mixed scripts in its IDN labels, and we avoid this problem.
It will still be hard to predict when mixing will
cause a problem, so end user language community input would be
invaluable, in addition to being transparent in our
policy making processes.
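
A rough sketch of what a "single language adherence" check might look like, assuming a registry keeps a per-language table of acceptable scripts. The table contents, the language tags and the function names are all hypothetical, and the script detection is only a heuristic based on Unicode character names (Python's standard library does not expose the Unicode Script property directly):

```python
import unicodedata

# Hypothetical policy table: language tag -> scripts its community accepts.
# The entries are illustrative assumptions, not agreed policy.
ALLOWED_SCRIPTS = {
    "ja": {"CJK", "HIRAGANA", "KATAKANA", "LATIN"},
    "zh": {"CJK"},
    "es": {"LATIN"},
}

def rough_script(ch):
    """Approximate a script from the first word of the Unicode character name."""
    if ch.isdigit() or ch == "-":
        return None  # digits and hyphen are treated as script-neutral
    return unicodedata.name(ch).split()[0]

def label_allowed(label, language):
    """True if every character's script is in the language's allowed set."""
    label = unicodedata.normalize("NFC", label)
    scripts = {rough_script(ch) for ch in label} - {None}
    return scripts <= ALLOWED_SCRIPTS[language]

# A label mixing Han, Hiragana and Katakana, checked against two tables.
print(label_allowed("日本語のドメイン", "ja"))
print(label_allowed("日本語のドメイン", "zh"))
```

Under these assumed tables the mixed-script label passes for "ja" and fails for "zh". Deciding what actually belongs in each table is exactly the part that needs language community input.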
If more than one language community uses a particular script
or set of scripts, there have to be consultative mechanisms
to resolve script conflicts.
For example, 日本語 in 日本語.ææ (meaning "Japanese Language" in
Chinese, Japanese and Korean): Japanese users might consider this
to be a completely Japanese TLD, but a Taiwanese person seeing 語
in traditional Chinese would immediately think it is a Traditional
Chinese label. We went through this many years ago,
and JET and its members have even written an IETF RFC.
Edmon will tell you more since he went through that process.
JET is one example of how JDNA (Japanese Domain Names Association),
KRNIC and CDNC people sit down and sort things out.
So we should either not make any such rules, and leave it to delegated
parties to figure out whatever scripts they want in their repertoire,
or pull in specific language committees to make meaningful policy input.
What works in one language, or one set of languages, may not work
for other groups of languages.
Edmon Chung wrote:
As mentioned in the call, I do not believe that the blanket enforcement
for "single script adherence across all labels" is a good approach, as I
do not believe this is the end user expectation, nor is it prudent for
an IDN TLD or LDH TLD registry to arbitrarily determine. Certain
registries may decide to actually enforce it by their means, and that
may be a model they can pursue. However, enforcing it across all IDN
TLDs (and non-IDN TLDs for that matter) would not in my mind be appropriate.
To give a few examples:
1. LDH TLD with IDN 2LD:
This is a rather obvious case. Users today expect IDN.LDH.
2. IDN TLD with LDH 2LD:
http://dominio.españa is a perfectly sensible "single script" domain.
However, in general IDN implementations an LDH-only domain is not
usually associated with a specific IDN-script/language-tag. This means
that the registry would treat the domain as having mixed script. Of
course this can be remedied by forcing registrants to associate a tag
for LDH-only domains as well, or the registry can try to intelligently
guess that they are supposed to be of the same script/language, but
personally, I think it is better that we allow such registration as a
practically mixed script registration.
3. IDN TLD with IDN 2LD (and 3LD):
- http://èæ.Café --[chinese.french]
- http://Exposé.ååå --[french.chinese]
- http://éæ.æãæ.Café
All of the above make a lot of sense to me and are in themselves useful
URLs and domains. They are also not abusive cases.
Furthermore, as also mentioned in the call, I believe it could be a good
model for certain TLDs in the future to serve the same zone for all its
"matching" IDN TLDs. Taking the above example for a ".Café" TLD:
- http://Exposé.ååå
- http://Exposé.åèå
This would be a very interesting and good model I believe to explore.
Although of course this is not a necessary model (or perhaps an
appropriate one for that matter) for all TLDs.
I therefore think rather than suggest or enforce a single script
adherence across labels, the focus should be on guidelines and policies
for curbing phishing attacks and other abusive registrations by variant
policies and other measures. For example, it is not impossible for a
TLD to adopt a policy where the Greek letter <Alpha> is a variant of the
English letter <A>.
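
To make the variant idea concrete, here is a minimal sketch assuming a registry maintains a table mapping each variant character to a canonical one and blocks a new label whose canonical form collides with an existing registration. The table below holds only the Alpha/A pair from the example; the names `VARIANTS`, `canonical` and `conflicts` are hypothetical and not taken from any registry's implementation:

```python
# Hypothetical variant table: variant character -> canonical character.
# Only the Greek Alpha / Latin A pair from the example is listed.
VARIANTS = {
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA -> LATIN CAPITAL LETTER A
    "\u03b1": "a",  # GREEK SMALL LETTER ALPHA   -> LATIN SMALL LETTER A
}

def canonical(label):
    """Map every variant character to its canonical form, then lowercase."""
    return "".join(VARIANTS.get(ch, ch) for ch in label).lower()

def conflicts(new_label, registered):
    """Return existing labels that share a canonical form with new_label."""
    key = canonical(new_label)
    return [r for r in registered if canonical(r) == key]

# "\u0391BC" starts with Greek Alpha but renders much like Latin "ABC".
print(conflicts("\u0391BC", ["abc", "xyz"]))
```

A real table would be far larger and would need to be agreed per TLD. The point is only that a confusable registration can be caught without banning mixed scripts outright.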
From: owner-gnso-idn-wg@xxxxxxxxx [mailto:owner-gnso-idn-wg@xxxxxxxxx]
On Behalf Of Tan, William
I think there are two issues whenever we discuss the topic of
"single-script adherence", and I asked for clarification on the last
teleconference. However, I suspect we still have not grounded the
discussions on one or the other. To be clear, there are two possible ways
one can interpret "single-script adherence across all labels":
1. Every label in a domain name string is composed of characters from a
single script. However, one label may belong to a different script than
another. E.g. ããã.españa - there are two labels, one containing
only Katakana and the other containing only Latin.
2. All characters in all labels of a domain name string are drawn
from a single script. The example above, ããã.españa, would
violate this policy. OTOH, ããã.ããããã would be OK since both
labels are Katakana.
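
The difference between the two interpretations can be sketched in a few lines. This is an illustration only: the function names are made up, the Katakana label is a stand-in, and the script of each character is approximated from the first word of its Unicode name, which is a heuristic rather than the real Unicode Script property:

```python
import unicodedata

def rough_script(ch):
    """Approximate a script from the first word of the Unicode character name."""
    if ch.isdigit() or ch == "-":
        return None  # digits and hyphen are treated as script-neutral
    return unicodedata.name(ch).split()[0]

def label_scripts(label):
    """Set of (approximate) scripts used within one label."""
    label = unicodedata.normalize("NFC", label)
    return {rough_script(ch) for ch in label} - {None}

def satisfies_interpretation_1(domain):
    """Each label is single-script; different labels may use different scripts."""
    return all(len(label_scripts(lbl)) <= 1 for lbl in domain.split("."))

def satisfies_interpretation_2(domain):
    """All labels of the domain name together draw on one script only."""
    scripts = set()
    for lbl in domain.split("."):
        scripts |= label_scripts(lbl)
    return len(scripts) <= 1

for name in ["テスト.españa", "テスト.テスト"]:
    print(name, satisfies_interpretation_1(name), satisfies_interpretation_2(name))
```

With a Katakana label under a Latin TLD, interpretation 1 is satisfied and interpretation 2 is not; with Katakana in both labels, both are satisfied.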
We need to make it clear in our recommendations whether we mean 1 or 2.
#1 above has already been somewhat covered by the ICANN IDN Guidelines.
I don't think anyone would argue against this. Whether it could/should
be enforced as a contractual requirement for new TLDs is up for discussion.
#2 is what I believe we have been discussing on the call and the list. I
am of the view that restrictions should be applied using "SHOULD"
language, just so as to discourage abuse. I'm sitting on the fence as
far as whether we should enforce it.