ICANN Email List Archives

[At-Large Advisory Committee]



[alac] More or maybe less on IDNs

  • To: alac@xxxxxxxxx
  • Subject: [alac] More or maybe less on IDNs
  • From: John L <johnl@xxxxxxxx>
  • Date: Sun, 22 Oct 2006 21:06:35 -0400 (EDT)

I had a most interesting chat about IDNs with Paul Hoffman, who tells me that the IDN situation is not as bleak as I had painted it. Basically, the Unicode crowd relented somewhat, and one of the aforementioned western white guys, John Klensin, wrote this draft:

http://www.ietf.org/internet-drafts/draft-klensin-idnabis-issues-00.txt

Previous IDN drafts said that IDNs could use every Unicode character except for a list of forbidden ones. This one takes a much less ambitious approach: it flips that around and lists the groups of characters that are permitted. That solves the upward compatibility problem, since if future versions of Unicode add new characters that cause ambiguity, you won't be able to use them in IDNs, at least not until they are added to a future version of the permitted list. This doesn't deal with semantic ambiguity, where a language permits multiple ways to write the same character, e.g., German "ae" and "a" with an umlaut, but those cases are apparently under control.
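
To make the difference between the two approaches concrete, here is a rough Python sketch. The tables in it are made up for illustration and are not the ones in either draft; the function names are mine, too. The point is only that a character nobody has classified yet passes a forbidden-list check but fails a permitted-list check.

```python
# Hypothetical tables for illustration only; NOT taken from the drafts.
PERMITTED_RANGES = [
    (0x002D, 0x002D),  # hyphen
    (0x0030, 0x0039),  # digits 0-9
    (0x0061, 0x007A),  # a-z
    (0x00E0, 0x00F6),  # Latin-1 lowercase letters (skipping the division sign)
    (0x00F8, 0x00FF),
]
FORBIDDEN = {0x0020, 0x002E, 0x2044}  # space, full stop, fraction slash


def allowed_by_permit_list(label):
    """New-style check: every character must be in a permitted range."""
    return len(label) > 0 and all(
        any(lo <= ord(ch) <= hi for lo, hi in PERMITTED_RANGES) for ch in label
    )


def allowed_by_forbid_list(label):
    """Old-style check: anything not explicitly forbidden is accepted."""
    return len(label) > 0 and all(ord(ch) not in FORBIDDEN for ch in label)


# U+2603 stands in for "a character added in some later Unicode version".
unclassified = "b\u2603cher"
print(allowed_by_forbid_list(unclassified))   # accepted by default
print(allowed_by_permit_list(unclassified))   # rejected until the table is updated
```

With a permitted list, the worst case for a new character is that it is unusable until someone adds it; with a forbidden list, the worst case is that it is usable before anyone has checked whether it causes trouble.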

The tables of characters are supposed to be here:

http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-00.txt

but it got line-folded into illegibility, so he'll have to upload a clean copy in the next day or two.

There are still some issues about bidirectional text. If you speak Yiddish or Dhivehi, the official language of the Maldives, you may want to check out this note by Harald Alvestrand and Cary Karp:

http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-00.txt

This approach seems a lot more likely to work than earlier ones. It may leave out some characters in obscure languages when there's a new version of Unicode, at least until they add them to the code point tables, but it does seem to address most of the problems that were tying the IDN crowd in knots.

R's,
John

Note to Vittorio: they have been thinking about version numbers. The prefix xn-- in front of every IDN is in practice a version number which they could roll to something like yn-- if they had to, but they think they can avoid doing so, thereby avoiding a lot of software upgrade pain.
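
For anyone who hasn't seen the prefix in the wild, here is a small Python sketch. It uses Python's built-in "idna" codec, which implements the existing IDNA rules (RFC 3490), not the new drafts; the helper names to_ascii and is_idn_label are mine and only for illustration.

```python
ACE_PREFIX = "xn--"  # the prefix that marks an encoded IDN label


def to_ascii(label):
    """Encode one label with Python's built-in IDNA (RFC 3490) codec."""
    return label.encode("idna").decode("ascii")


def is_idn_label(label):
    """Software spots an IDN label purely by its prefix."""
    return label.lower().startswith(ACE_PREFIX)


encoded = to_ascii("bücher")
print(encoded)                 # xn--bcher-kva
print(is_idn_label(encoded))   # True
print(is_idn_label("example")) # False
```

Since every piece of software recognizes IDNs by that one prefix, rolling it to something else would mean touching all of that software, which is presumably why they'd rather not.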




