Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG
- To: "Robert C. Hutchinson" <rchutch@xxxxxxxxx>, Ird <ssac-gnso-irdwg@xxxxxxxxx>
- Subject: Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG
- From: Steve Sheng <steve.sheng@xxxxxxxxx>
- Date: Tue, 8 Feb 2011 11:45:13 -0800
Hi Robert, I have checked your question with Kim Davies, manager of root zone
services. Here is what he said:
* IANA uses RFC 5646 (Tags for Identifying Languages) for tagging of IDNs in
the fast track process (see the Root DB for IDN tags
http://www.iana.org/domains/root/db/). RFC 5646 is a superset of ISO 639-2
(tagging languages), and ISO 15924 (tagging scripts).
* RFC 5646 is also used by other protocols to indicate languages (e.g. HTTP).
* The database for RFC 5646 can be found here:
http://www.iana.org/assignments/language-subtag-registry, and RFC 5646 explains
how to process tags using the database.
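To make the structure of these tags concrete, here is a minimal Python sketch that splits a simple RFC 5646 tag into its language, script and region subtags. It is an illustration only: the function name split_tag is hypothetical, the pattern covers only the language[-script][-region] shape, and it does not handle extlang, variant, extension, private-use or grandfathered tags, nor does it look anything up in the subtag registry.

```python
import re

# Simplified shape only: language[-Script][-REGION].
# Assumption: extlang, variants, extensions, private use and
# grandfathered tags from RFC 5646 are deliberately not handled.
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def split_tag(tag):
    """Split a simple language tag into its subtags (hypothetical helper)."""
    match = _SIMPLE_TAG.fullmatch(tag)
    if match is None:
        raise ValueError("not a simple language[-script][-region] tag: %r" % tag)
    parts = match.groupdict()
    # Apply the conventional casing: language lower, Script title, REGION upper.
    return {
        "language": parts["language"].lower(),
        "script": parts["script"].title() if parts["script"] else None,
        "region": parts["region"].upper() if parts["region"] else None,
    }


if __name__ == "__main__":
    for example in ("sv", "sv-SE", "zh-Hant-TW", "es-419"):
        print(example, split_tag(example))
```

A real implementation would also check each subtag against the language-subtag-registry file, which this sketch leaves out.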
Hope this helps. If you have any other questions, please don't hesitate to ask.
Warm regards,
Steve
On 1/25/11 1:12 AM, "Robert C. Hutchinson" <rchutch@xxxxxxxxx> wrote:
Hello WhoIs IRD WG,
Here are my suggested questions for discussion between the Whois IRD WG and
ICANN IDN Staff / Tina Dam.
Reply with your clarifications and suggestions.
Thanks,
Bob Hutchinson
The WhoIs IRD WG is requesting expertise/assistance from the IDN team.
The WhoIs IRD WG is considering recommending that WhoIs Internationalized
Domain name registrant data [name and address] for owner and contact be tagged
with language. Furthermore, it would be advantageous to constrain the content
of language tagged fields to only the legitimate characters of the tagged
language. Ideally we would like to locate existing UTF-8 language tables and
reference them, rather than creating "ICANN WHOIS language tables".
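As a rough illustration of what "constraining the content of a tagged field" could mean in practice, here is a minimal Python sketch. Everything in it is an assumption made for the example: the table HYPOTHETICAL_TABLES is not an IANA or ICANN table, the Swedish repertoire shown is only a plausible guess, and the set of common punctuation and digits is arbitrary.

```python
import unicodedata

# Hypothetical per-language repertoire, keyed by language tag.
# Assumption: this is NOT taken from any published IDN table.
HYPOTHETICAL_TABLES = {
    "sv": set("abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"),
}

# Assumption: characters allowed in any name/address field.
COMMON = set(" -.,'0123456789")


def disallowed_chars(text, language):
    """Return the sorted characters in text outside the tagged language's table."""
    allowed = HYPOTHETICAL_TABLES[language] | COMMON
    # Normalize to NFC so composed and decomposed forms compare equal,
    # then lowercase because the table lists lowercase letters only.
    normalized = unicodedata.normalize("NFC", text).lower()
    return sorted({ch for ch in normalized if ch not in allowed})


if __name__ == "__main__":
    print(disallowed_chars("G\u00f6ran \u00c5kesson", "sv"))
    print(disallowed_chars("Jos\u00e9", "sv"))
```

An empty result would mean the field fits the tagged language's table; a non-empty result lists the characters that would need to be rejected or re-tagged.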
Based on reviewing the IDN ccTLD Fast-Track Workshop slides,
http://sel.icann.org/node/6740/, the IDN team addressed similar issues
surrounding the use of scripts, languages and character sets.
Apparently the IDN team decided that each TLD/registry would define the
language character sets acceptable for 2nd-level domain names. Those files are
stored at IANA: http://www.iana.org/domains/idn-tables/ and reference linked
character code pages. This system provides the flexibility for each TLD to
define each language, but has the disadvantage [for example] of defining the
Swedish character set three different ways.
We would like to invite members of the IDN team to discuss the following
questions with the Whois IRD WG:
1) Given the current state of IDN language definitions - are there
ways/suggestions that the existing IANA-IDN language definitions could be
leveraged to help with WhoIs IRD?
2) Did the IDN team explore or select a suitable established "standard" set of
language tags/codes, such as ISO 639-3
http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes, for designating which
language a domain name [TLD or second-level] is encoded in?
3) Are there other [ISO{8859/2022}/HTML?] language code page standards which
are UTF-8 based, which could be used/leveraged to easily define WhoIs IRD
language character sets?
4) Help? Any suggestions are greatly appreciated.