ICANN Email List Archives

[gnso-idn-wg]

Re: [gnso-idn-wg] Re: Banning CCHH anywhere in a label

  • To: edmon@xxxxxxxxxxx
  • Subject: Re: [gnso-idn-wg] Re: Banning CCHH anywhere in a label
  • From: Tan Tin Wee <tinwee@xxxxxxxxxxxxxx>
  • Date: Wed, 07 Mar 2007 08:06:00 +0800

Ram Mohan wrote:
>>  > Are you saying - something like <*CITIBANKchina.TLD*> where "china" is
>>  > in local script while CITIBANK is in Latin script should be banned,
>>  > because its Punycode translation would result in an <xn--> midway
>>  > through the string?

I agree with the comments made so far.
An xn-- midway through the string, as in the case Ram mentioned, won't
happen given the way Punycode currently works, as William and Edmon
pointed out.

Having said that, I agree that for the moment we may not want to
add more complication by recommending that the IDN label be split
with xn-- embedded inside, because xn-- can already occur within a
Punycode label. Using Ram's example and modifying it,
citibankxn-<China> will appear as xn--citibankxn--b28qq03g
(e.g. http://mct.verisign-grs.com/conversiontool/convertServlet?input=xn--citibankxn--b28qq03g&type=PUNYCODE
converts the Punycode xn--citibankxn--b28qq03g
to the Unicode citibankxn-中国; or use http://www.afilias.info/cgi-bin/convert_punycode.cgi)
...
which I think was the nub of Shahram's point:
> <CCHH>citibank-<CCHH><encodedCHINA>.tld


Of course, the xn-- prefix will cause the rest of the label,
"citibankxn--b28qq03g", to be processed as Punycode, but an interior
xn--, as Sophia mentioned, will still pop up here and there, by
accident or by the deliberate design of registrants acting in bad faith.
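
The encoding in the example above can be reproduced with Python's built-in `punycode` codec. This is only a sketch: the helper name `to_a_label` is hypothetical, and it skips the Nameprep step that a full IDNA ToASCII operation would apply.

```python
ACE_PREFIX = "xn--"

def to_a_label(u_label: str) -> str:
    """Punycode-encode one label and prepend the ACE prefix (no Nameprep)."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")

a_label = to_a_label("citibankxn-中国")
print(a_label)                    # xn--citibankxn--b28qq03g
print(a_label.count(ACE_PREFIX))  # 2: the prefix plus one interior occurrence
```

The second xn-- comes from the literal "xn-" in the registered string followed by the Punycode delimiter hyphen, not from a second encoding pass.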

I think what Sophia meant, and Ram misunderstood, was that we need
some mechanism to trap xn-- inside labels. There are three concerns.

First, it could confuse sloppily written software that picks out an
xn-- inside an xn-- prefixed string (a non-greedy algorithm), as in
the case I mentioned, and so displays the wrong IDN label.

Second, with the mixed-scripts issue, we need to look carefully at the
xn-- or CCHH question. If a future Unicode version is a drastic enough
change that we need to migrate, and in the process change xn-- to some
other CCHH, we may lose that option if, purely by way of illustration,
xm-- or xe-- etc. has already been registered: for example
axe-中国, with xn--axe--3f5fw08b as the back-end encoding,
or AXN-中国, with xn--axn--3f5fw08b, which is a conceivable
registration by the AXN satellite channel.

Third, there is spoofing or passing off by confusing people with
citibankxn-中国 and citibank.xn-中国, which
look pretty close. These get punycoded to
http://mct.verisign-grs.com/conversiontool/convertServlet?input=citibankxn-%E4%B8%AD%E5%9B%BD&type=UTF8
xn--citibankxn--b28qq03g
and
http://mct.verisign-grs.com/conversiontool/convertServlet?input=citibank.xn-%E4%B8%AD%E5%9B%BD&type=UTF8
citibank.xn--xn--x68dy61b
respectively.
Try these two labels with the Afilias converter and the second one
will generate a block,
while http://www.nameisp.com/punycode.asp works just like the
Verisign converter. So these are programming variations
we may need to follow up on.
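
To make the "sloppy programming" concern concrete, here is a minimal sketch contrasting a decoder that honours xn-- only as a prefix with a hypothetical careless one that splits on the last xn-- it finds. Both function names are invented for illustration; neither is taken from any of the converters mentioned above.

```python
ACE_PREFIX = "xn--"

def decode_prefix_only(a_label: str) -> str:
    """Strip the ACE prefix only when it starts the label, then decode the rest."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label  # plain ASCII label, nothing to decode
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

def decode_last_marker(a_label: str):
    """Hypothetical sloppy decoder: treats the LAST xn-- as the start of the payload."""
    payload = a_label.rsplit(ACE_PREFIX, 1)[-1]
    try:
        return payload.encode("ascii").decode("punycode")
    except UnicodeError:
        return None  # the fragment was not valid Punycode on its own

label = "xn--citibankxn--b28qq03g"
print(decode_prefix_only(label) == "citibankxn-中国")           # True
print(decode_last_marker(label) == decode_prefix_only(label))  # False
```

The careless version drops "citibank" entirely and decodes only the trailing fragment, so whatever it displays is not the registered label.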

If we recommend against AXN-äå because it generates a potentially
confusing xn-- string inside a punycode label, then AXNäå could
be an option, as it will generate xn--axn-x68dy61b, which is xn-
and not xn--.

Finally,
Edmon Chung wrote:
> Nevertheless, with regards to our discussion at hand, I am quite certain
> we have comprehensive protection with the CCHH reserved as a prefix.

Yes, I suspect this might be the case, but somebody might want to get a
team of programmers to run a check on some test cases. Does anyone know if
this kind of scenario is being tested at the moment in
the ICANN testing contract?

bestrgds

tin wee


Edmon Chung wrote:
Hi Shahram,



There was an extensive discussion in the original IDN protocol development about the use of the prefix (or suffix or other possible identifiers), and finally CCHH was chosen. I highly doubt that we would be choosing a scheme that would split up a label (for many good reasons including bidi and single script considerations) into different chunks with different prefixes, but no one can predict the future I suppose :-)



Nevertheless, with regards to our discussion at hand, I am quite certain we have comprehensive protection with the CCHH reserved as a prefix.



Edmon











*From:* owner-gnso-idn-wg@xxxxxxxxx [mailto:owner-gnso-idn-wg@xxxxxxxxx] *On Behalf Of *Shahram Soboutipour
*Sent:* Tuesday, March 06, 2007 4:50 PM
*To:* owner-gnso-idn-wg@xxxxxxxxx
*Cc:* 'Sophia Bekele'; gnso-idn-wg@xxxxxxxxx; gnso-rn-wg@xxxxxxxxx
*Subject:* RE: [gnso-idn-wg] Re: Banning CCHH anywhere in a label




Dear Edmon



Regarding the sample CITIBANKchina.TLD (where china is in Chinese charset), I think there is a 3rd possibility, which might be Sophia's idea:

<CCHH>citibank-<CCHH><encodedCHINA>.tld

It means that every separate non-ASCII part of a label would be encoded with its own leading CCHH. I am not sure whether there is a rule for this right now, but I myself do not agree with this approach. I prefer <CCHH>citibank-<encodedCHINA>.tld because:

1. I think the CC part of CCHH leaves enough room for possible further changes and developments in the IDNA standard, so there should be no worries.

2. A leading CCHH is a good rule for identifying an IDN, and I think it can be a rule at all levels of a URL (not only the 2nd and 3rd). Levels beyond the 3rd seem to be out of scope of ICANN's policy, BUT this must be mentioned in their own technical decision-making.





Regards,



Shahram Soboutipour

President and CEO

Karmania Media <http://www.karmania.ir/>

Tel: +98 341 2117844,5

Mobile: +98 913 1416626

Fax: +98 341 2117851

-----Original Message-----
From: owner-gnso-idn-wg@xxxxxxxxx [mailto:owner-gnso-idn-wg@xxxxxxxxx] On Behalf Of Edmon Chung
Sent: Tuesday, March 06, 2007 6:09 AM
To: 'Tan, William'; rmohan@xxxxxxxxxxxx
Cc: 'Sophia B'; gnso-idn-wg@xxxxxxxxx; gnso-rn-wg@xxxxxxxxx
Subject: RE: [gnso-idn-wg] Re: Banning CCHH anywhere in a label




I don't think you are missing anything, William.

I was trying to speak up during the call earlier: I don't think the concern Sophia was articulating should be an issue.



If I am not mistaken, Sophia was asking whether it would be necessary to reserve names such as:



abc<CCHH>xyz.tld



These names would NOT be considered IDNs, nor would any part of them; they are simply ASCII domains. <CCHH> is best seen as a prefix denoting that a domain label (i.e. the text between two dots) has at least one non-LDH (letter-digit-hyphen) character.



Using the example described:



citibank<CHINA>.tld



where <CHINA> is in Chinese, William's explanation is correct; it should become:



<CCHH>citibank-<encodedCHINA>.tld



And NOT



Citibank<CCHH><encodedCHINA>.tld



So, by reserving <CCHH> at the front (i.e. the first 4 characters, or more precisely, hyphens in the third and fourth positions) we cover all cases of intended IDN expressions.
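
The reservation rule described here is simple enough to sketch in a few lines. The function name `has_reserved_hyphens` and the sample labels are made up for illustration; this is not an existing registry check.

```python
def has_reserved_hyphens(label: str) -> bool:
    """True if the label has hyphens in its third and fourth positions (CCHH)."""
    return len(label) >= 4 and label[2:4] == "--"

for label in ("xn--citibank-example", "abcxn--xyz", "xm--future"):
    print(label, has_reserved_hyphens(label))
# xn--citibank-example True
# abcxn--xyz False
# xm--future True
```

Because the test looks only at positions three and four, it covers xn-- and any other two-character code that a later revision might assign, while leaving abc<CCHH>xyz-style names as ordinary ASCII labels.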



Edmon









-----Original Message-----
From: owner-gnso-idn-wg@xxxxxxxxx [mailto:owner-gnso-idn-wg@xxxxxxxxx] On Behalf Of Tan, William
Sent: Tuesday, March 06, 2007 7:47 AM
To: rmohan@xxxxxxxxxxxx
Cc: 'Sophia B'; gnso-idn-wg@xxxxxxxxx; gnso-rn-wg@xxxxxxxxx
Subject: Re: [gnso-idn-wg] Re: Banning CCHH anywhere in a label



Hi all,

I believe the motivations behind banning strings with hyphens in the
third and fourth positions are:
1. to protect registries that do not offer IDN registrations from
unknowingly registering IDNs; and
2. to reserve room for future revisions of the IDNA standard, in which a
different prefix might be assigned.





Ram Mohan wrote:
> Are you saying - something like <*CITIBANKchina.TLD*> where "china" is
> in local script while CITIBANK is in Latin script should be banned,
> because its Punycode translation would result in an <xn--> midway
> through the string?

I'm not sure I follow this. CITIBANKchina.TLD would translate to
xn--citibank-encodedchunk.TLD, so xn-- would not occur midway in the ACE
string.



> In general, the rationale for banning "CCHH" at a position other than
> the beginning of a string/label is unclear.

I have not seen any documents that suggest banning CCHH at anything but
the beginning of a string. Am I missing something?





Sophia said:
> All registrations should
> be in the IDN label, and that the ACE label should be internal to the
> operations of the registration. *One should not be offering to
> register xn--.... as a label or any ACE label since it is an internal
> encoding, so as to prevent confusion and other malfeasance (phishing)*.

Many registries today use the ACE string at the registration protocol
level, so your statement would essentially be advising against that
practice. Personally, I don't think it is a problem unless the registry
does NOT offer IDN and is accepting xn-- labels (in which case it
probably simply treats the registration as ASCII and does not check for
IDNA validity). We may be in agreement here, but I wanted to further
qualify your statement.





In table 4.4 of "Recommendation Tables for RN-WG Reports.doc":
> For each IDN gTLD proposed, applicant must provide both the "ASCII
> compatible (ACE) form of an IDNA valid string" ("A-label") and in
> local script form (Unicode) of the top level domain ("U-label").

I would also add that the applicant should provide any additional strings
that, after applying the IDNA ToASCII operation, result in the same A-label.



Additionally, there may be complications where the U-label is
entered into an application using an input method editor ("keyboard")
that produces a sequence of Unicode characters that does not convert
to the A-label (it either becomes a different A-label or fails conversion).
This may happen when the user perceives a character to be one thing,
but the local input software produces a different character because of
locale differences. I will try to dig up some
examples. This is not a technical / policy issue, but is a usability
issue that affects the stability of IDNs.







Best,



=wil