Unicode Security Subcommittee comments on ICANN Draft Revised IDN Guidelines v2.0
The Unicode Security Subcommittee welcomes the progress that is being made on the ICANN "Guidelines for the Implementation of Internationalized Domain Names" (http://icann.org/general/idn-guidelines-20sep05.htm). There have been important improvements in the text that should significantly contribute to the security of IDNs.
However, the text needs further improvement in a number of areas. First, it should be better coordinated with the work that the Unicode Consortium is doing in Unicode Technical Report #36: Unicode Security Considerations (http://www.unicode.org/reports/tr36/) and Unicode Technical Standard #39: Unicode Security Mechanisms (http://www.unicode.org/reports/tr39/). In addition, the current draft of the ICANN guidelines is progressing towards a script-based, rather than language-based, approach to character repertoires. This is the right direction, but it needs to go further, as outlined in our comment 1. The proposed guidelines are also better defined in their approach to punctuation and other symbols, but here too they do not go far enough; see our comment 3.
Our proposals for changes in the text are listed below.
1. In Clause 3, replace the use of "language" with "script" as an appropriate designator. Then allow for restrictions of the character repertoire within the script. In concert with this, remove the text at the end: "Unless there is need to associate individual labels in an IDN with different scripts, even where script-based policies are otherwise applied, ...to the new situation." and remove the 2nd paragraph of the additional remarks, since it is no longer relevant.
Rationale: Clause 3 still refers to languages as an acceptable designator to determine a character set suitable for IDN registration. The difficulty of using language for this purpose is explained in Annex G of Unicode Technical Report #36: http://www.unicode.org/reports/tr36/tr36-4.html#Language_Based_Security.
2. Replace "Unicode Technical Report #23 (http://www.unicode.org/erports/tr23)" by "Unicode Standard Annex #24: Script Names (http://www.unicode.org/reports/tr24)", and replace "UTR#23" by "UAX #24".
Rationale: From context, this simply appears to be a citation of the wrong document (with a misspelled URL). Note that the two documents have different status: #24 is a UAX, whereas #23 is a UTR.
3. Replace "Exception to this is permissible for languages with established orthographies and conventions that require the commingled use of multiple scripts. Visually confusable characters from different scripts must not appear in a single label unless there are overriding legitimate linguistic reasons for doing so." by references to the visual confusability tests in UTS #39.
Rationale: As it stands, these subclauses are essentially impossible to test for in practice. A registry cannot hand-examine all registration proposals; there must be a clear, mechanical test for validity.
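To illustrate what a "clear, mechanical test" could look like, the following is a minimal sketch of a mixed-script check on a single label, in the spirit of the mechanisms specified in UTS #39. It is not the UTS #39 algorithm itself. The function names are hypothetical, and the script lookup is a crude stand-in that parses character names for three scripts only; a real registry would be expected to use the Script property data from UAX #24 and the data files of UTS #39 instead.

```python
import unicodedata

# Assumption: a deliberately tiny stand-in for the Script property (UAX #24).
# Only three scripts are recognized, by character-name prefix.
KNOWN_SCRIPTS = ("LATIN", "CYRILLIC", "GREEK")


def script_of(ch):
    """Return a rough script label for one character (hypothetical helper)."""
    if ch in "0123456789-":
        return "COMMON"  # ASCII digits and hyphen are shared by all scripts
    name = unicodedata.name(ch, "")
    for script in KNOWN_SCRIPTS:
        if name.startswith(script):
            return script
    return "UNKNOWN"


def scripts_in_label(label):
    """Set of scripts used in a label, ignoring shared characters."""
    return {script_of(ch) for ch in label} - {"COMMON"}


def is_single_script(label):
    """True if the label draws on at most one recognized script."""
    scripts = scripts_in_label(label)
    return len(scripts) <= 1 and "UNKNOWN" not in scripts


if __name__ == "__main__":
    for label in ("paypal", "p\u0430ypal"):  # second label has Cyrillic U+0430
        print(ascii(label), sorted(scripts_in_label(label)), is_single_script(label))
```

A test of this kind needs no human judgment about "legitimate linguistic reasons": the same label always yields the same answer, which is the property a registry needs when it cannot hand-examine registrations.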
Rationale: Clause 4 in many respects weakens the punctuation restriction from the previous version of the guidelines, as it moves the text from informative notes to a more prescriptive guideline status. In that respect it is a regrettable regression from the previous ICANN guidelines. It also does not endorse the "inclusion-based" approach requested in Clause 2. Again, the recommendation from UTR #36 should be followed in removing all punctuation not related to a specific script usage, while preserving a well-documented set of very limited exceptions. The data files and mechanisms are specified in UTS #39.
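As an illustration of the "inclusion-based" approach, the sketch below accepts a label only if every code point appears in an explicitly published table, so that punctuation and symbols are excluded by default rather than enumerated one by one. The table shown here (lowercase ASCII letters, digits, and hyphen) and the function names are assumptions made for the example; an actual registry table would be derived from the data files specified in UTS #39.

```python
# Assumption: a hypothetical registry table listing every permitted code point.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def rejected_code_points(label):
    """List, in order, the code points of a label that are not in the table."""
    return [f"U+{ord(ch):04X}" for ch in label if ch not in ALLOWED]


def is_permitted(label):
    """A label is permitted only if nothing in it falls outside the table."""
    return not rejected_code_points(label)


if __name__ == "__main__":
    # U+2044 FRACTION SLASH resembles "/" and is rejected without being
    # named anywhere: it is simply absent from the table.
    for label in ("example-1", "exam\u2044ple", "rock'n'roll"):
        print(ascii(label), is_permitted(label), rejected_code_points(label))
```

The design choice is that nothing has to be anticipated in advance: a newly encoded symbol is rejected until someone deliberately adds it to the table.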
5. In "*Additional remarks", replace "The deceptive use of visually confusable characters from different scripts is discussed in detail in the Unicode Technical Report #36 on “Unicode Security Conditions” at http://www.unicode.org/reports/tr36/" by
"The deceptive use of visually confusable characters from different scripts is discussed in detail in the Unicode Technical Report #36 on “Unicode Security Considerations” at http://www.unicode.org/reports/tr36/. Limitations to the character repertoire available for IDNs are provided in tables presented under the heading “Data files” in Unicode Technical Standard #39: Unicode Security Mechanisms (http://www.unicode.org/reports/tr39/)".
Rationale: The Unicode Consortium has split the security-related material into a Unicode Technical Report (UTR #36), which describes the issues, and a Unicode Technical Standard (UTS #39), which gives precise specifications and data files.
6. The text should be reorganized for clarity. For example, clauses 3a and 7 both deal with publishing tables; they should be in the same section. The 3rd paragraph of "*Additional remarks" does not make much sense as currently written. At minimum it should be split by concept and developed in separate paragraphs (script consideration, idn.idn, etc.).
Mark Davis
Chair, Unicode Security Subcommittee
President, Unicode Consortium