ICANN Email List Archives

[ssac-gnso-irdwg]



Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG

  • To: "Robert C. Hutchinson" <rchutch@xxxxxxxxx>, Ird <ssac-gnso-irdwg@xxxxxxxxx>
  • Subject: Re: [ssac-gnso-irdwg] Draft: Questions for ICANN IDN staff - Tina Dam - from the WhoIs IRD WG
  • From: Dave Piscitello <dave.piscitello@xxxxxxxxx>
  • Date: Tue, 25 Jan 2011 04:20:40 -0800

Hi all, 

Again, apologies for missing yesterday's call.

I have a question related to this discussion. In composing language tables
with "legitimate" characters for a language, I began to wonder whether there
are real world constraints on mixed scripts in the composition of names.

For example, can a US citizen have a birth certificate where the given name
or surname contains letters other than A-Z? I believe a US citizen can have a
name containing characters from extended ASCII sets (umlauts, tildes, etc.).
People often name their children unconventionally: could someone compose a
name for their child that contained both an umlaut and a tilde, and would this
be accepted as a legal name in the US (or another country)? Would a "yes"
answer to these questions influence this discussion?

Can a Chinese citizen have a surname that is composed of characters from one
accepted Chinese script and a given name composed using characters from a
second? 

Apologies if this is off topic. Feel free to send me away for more coffee.

On 1/25/11 4:12 AM, "Robert C. Hutchinson" <rchutch@xxxxxxxxx> wrote:

> Hello WhoIs IRD WG,
> Here are my suggested questions for discussion between the Whois IRD WG and
> ICANN IDN Staff / Tina Dam.
> Reply with your clarifications and suggestions.
> Thanks,
> Bob Hutchinson
> 
> 
> The WhoIs IRD WG is requesting expertise/assistance from the IDN team.
> The WhoIs IRD WG is considering recommending that WhoIs Internationalized
> Domain name registrant data [name and address] for owner and contact be tagged
> with language.   Furthermore, it would be advantageous to constrain the
> content of language tagged fields to only the legitimate characters of the
> tagged language.   Ideally we would like to locate existing UTF-8 language
> tables and reference them, rather than creating "ICANN WHOIS language tables".
> 
> Based on reviewing the  IDN ccTLD Fast-Track Workshop slides,
> http://sel.icann.org/node/6740/,  the IDN team addressed similar issues
> surrounding the use of scripts, languages and character sets.
> Apparently the IDN team decided that each TLD/registry would define the
> language character sets acceptable for 2nd-level domain names.  Those files
> are stored at IANA:  http://www.iana.org/domains/idn-tables/  and reference
> linked character code pages.  This system provides the flexibility for each
> TLD to define each language, but has the disadvantage [for example] of
> defining the Swedish character set three different ways.
> 
> We would like to invite members of the IDN team to discuss the following
> questions with the Whois IRD WG:
> 1) Given the current state of IDN language definitions - are there
> ways/suggestions that the existing IANA-IDN language definitions could be
> leveraged to help with WhoIs  IRD?
> 2) Did the IDN team explore or select a suitable established "standard"
> language tag/code, like ISO 639-3
> ( http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes ), for designating which
> language a domain name [TLD or second-level] is encoded in?
> 3)  Are there other [ISO{8859/2022}/HTML?] language code page standards which
> are UTF-8 based, which could be used/leveraged to easily define WhoIs IRD
> language character sets?
> 4) Help?  Any suggestions are greatly appreciated.
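
To make the proposal above concrete, here is a minimal sketch of what
"constrain the content of language tagged fields to only the legitimate
characters of the tagged language" could look like. Everything in it is an
assumption made for illustration: the LANGUAGE_TABLES contents, the COMMON
set, and the function name invalid_characters are hypothetical and are not
taken from the IANA IDN tables or from any WG decision. The Swedish table
shown is only one possible definition.

```python
import unicodedata

# Hypothetical language tables: language tag -> set of permitted letters.
# Illustrative only; real tables would be referenced, not hard-coded.
LANGUAGE_TABLES = {
    "en": set("abcdefghijklmnopqrstuvwxyz"),
    "sv": set("abcdefghijklmnopqrstuvwxyzåäö"),
}

# Assumed set of characters allowed in any name or address field,
# regardless of the language tag.
COMMON = set(" -'.,0123456789")


def invalid_characters(value, language_tag):
    """Return the characters in value not permitted by the tagged language's table."""
    table = LANGUAGE_TABLES[language_tag]
    # Normalize to NFC so that "o" + combining diaeresis compares equal to "ö",
    # then lowercase so the tables only need to list lowercase letters.
    normalized = unicodedata.normalize("NFC", value).lower()
    return sorted({ch for ch in normalized if ch not in table and ch not in COMMON})


print(invalid_characters("Björn Åkesson", "sv"))  # []
print(invalid_characters("Björn Åkesson", "en"))  # ['å', 'ö']
```

A check of this kind depends entirely on which table is chosen for a given
tag, which is why having several competing definitions of one language's
character set matters.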




