ICANN Email Archives: [ssac-gnso-irdwg]

Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG

To: "Robert C. Hutchinson" <rchutch@xxxxxxxxx>, Ird <ssac-gnso-irdwg@xxxxxxxxx>
Subject: Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG
From: Steve Sheng <steve.sheng@xxxxxxxxx>
Date: Tue, 8 Feb 2011 11:45:13 -0800

Hi Robert, I have checked your question with Kim Davies, manager of root zone 
services. Here is what he said:


 *   IANA uses RFC 5646 (Tags for Identifying Languages) for tagging of IDNs in 
the fast track process (see the Root DB for IDN tags: 
http://www.iana.org/domains/root/db/). RFC 5646 is a superset of ISO 639-2 
(tagging languages) and ISO 15924 (tagging scripts).
 *   RFC 5646 is also used by other protocols to indicate languages (e.g. HTTP).
 *   The database for RFC 5646 can be found here: 
http://www.iana.org/assignments/language-subtag-registry, and RFC 5646 explains 
how to process tags using the database.
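
As an illustration of the processing step mentioned above, here is a minimal Python sketch that reads records in the registry's text format (records separated by `%%` lines, `Field: value` pairs, indented continuation lines) and looks up a subtag. The sample text is a small hand-written excerpt in that format, not the live registry, and its File-Date value is illustrative. The names `parse_registry` and `describe` are assumptions made for this example, and the sketch does not implement the full tag-matching rules of RFC 5646.

```python
# Hand-written sample in the registry's record format (not the live file;
# the File-Date value is illustrative only).
SAMPLE = """\
File-Date: 2011-01-07
%%
Type: language
Subtag: sv
Description: Swedish
Suppress-Script: Latn
%%
Type: script
Subtag: Cyrl
Description: Cyrillic
"""


def parse_registry(text):
    """Split registry text into records; each record maps field -> list of values."""
    records, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.strip() == "%%":
            # A "%%" line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[:1].isspace() and last_key:
            # Indented line: continuation of the previous field's value.
            current[last_key][-1] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            # A field such as Description may appear more than once.
            current.setdefault(last_key, []).append(value.strip())
    if current:
        records.append(current)
    return records


def describe(records, record_type, subtag):
    """Return the Description values for a subtag, or None if it is not listed."""
    for record in records:
        if (record.get("Type") == [record_type]
                and record.get("Subtag", [""])[0].lower() == subtag.lower()):
            return record.get("Description", [])
    return None


RECORDS = parse_registry(SAMPLE)
print(describe(RECORDS, "language", "sv"))
print(describe(RECORDS, "script", "Cyrl"))
```

In practice the input would be the text downloaded from the registry URL above; the subtag comparison is case-insensitive because RFC 5646 treats tags that way.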

Hope this helps. If you have any other questions, please don't hesitate to ask.

Warm regards,
Steve


On 1/25/11 1:12 AM, "Robert C. Hutchinson" <rchutch@xxxxxxxxx> wrote:

Hello WhoIs IRD WG,
Here are my suggested questions for discussion between the Whois IRD WG and 
ICANN IDN Staff / Tina Dam.
Reply with your clarifications and suggestions.
Thanks,
Bob Hutchinson




The WhoIs IRD WG is requesting expertise/assistance from the IDN team.
The WhoIs IRD WG is considering recommending that WhoIs Internationalized 
Domain name registrant data [name and address] for owner and contact be tagged 
with a language. Furthermore, it would be advantageous to constrain the content 
of language-tagged fields to only the legitimate characters of the tagged 
language. Ideally we would like to locate existing UTF-8 language tables and 
reference them, rather than creating "ICANN WHOIS language tables".
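
To make the idea of constraining a tagged field concrete, here is a minimal Python sketch that checks a value against a per-language set of permitted characters. The table `ILLUSTRATIVE_TABLES` and the function `invalid_characters` are hypothetical names invented for this example; the Swedish entry is a hand-written approximation, not an official IANA, ICANN, or registry table.

```python
import unicodedata

# Hypothetical, hand-written table for illustration only. A real deployment
# would reference an existing, agreed-upon table instead of this one.
ILLUSTRATIVE_TABLES = {
    "sv": set("abcdefghijklmnopqrstuvwxyzåäö0123456789 -.,'"),
}


def invalid_characters(value, language_tag, tables=ILLUSTRATIVE_TABLES):
    """Return the characters in value that the table for language_tag does not allow."""
    allowed = tables[language_tag]
    # Normalize to NFC so that "o" + combining diaeresis compares equal to "ö",
    # then lowercase because the illustrative table lists lowercase letters only.
    normalized = unicodedata.normalize("NFC", value).lower()
    return [ch for ch in normalized if ch not in allowed]


print(invalid_characters("Björn Åkesson", "sv"))
print(invalid_characters("Иван", "sv"))
```

An empty result means every character is permitted by the table; any returned characters are the ones that would cause the field to be rejected for that language tag.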


According to the IDN ccTLD Fast-Track Workshop slides 
(http://sel.icann.org/node/6740/), the IDN team addressed similar issues 
surrounding the use of scripts, languages and character sets.
Apparently the IDN team decided that each TLD/registry would define the 
language character sets acceptable for 2nd-level domain names.  Those files are 
stored at IANA:  http://www.iana.org/domains/idn-tables/  and reference linked 
character code pages.  This system provides the flexibility for each TLD to 
define each language, but has the disadvantage [for example] of defining the 
Swedish character set three different ways.



We would like to invite members of the IDN team to discuss the following 
questions with the Whois IRD WG:
1) Given the current state of IDN language definitions - are there 
ways/suggestions that the existing IANA-IDN language definitions could be 
leveraged to help with WhoIs  IRD?
2) Did the IDN team explore or select a suitable established "standard" set of 
language tags/codes, such as ISO 639-3 
(http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes), for designating which 
language a domain name [TLD or second-level] is encoded in?
3)  Are there other [ISO{8859/2022}/HTML?] language code page standards which 
are UTF-8 based, which could be used/leveraged to easily define WhoIs IRD 
language character sets?
4) Help?  Any suggestions are greatly appreciated.

References:
- [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG
  - From: Robert C. Hutchinson
