ICANN Email List Archives

[gnso-contactinfo-pdp-wg]



RE: [gnso-contactinfo-pdp-wg] Wednesday 12 November 23:59 UTC soft deadline for comments

  • To: Emily Taylor <emily.taylor@xxxxxxxxxxxxx>
  • Subject: RE: [gnso-contactinfo-pdp-wg] Wednesday 12 November 23:59 UTC soft deadline for comments
  • From: "Dillon, Chris" <c.dillon@xxxxxxxxx>
  • Date: Tue, 11 Nov 2014 11:57:29 +0000

Dear Emily,

I would like to thank you on behalf of the Group for this large amount of work 
both in summarizing colleagues’ comments and providing your own.

I hope both that it will make the non-mandatory arguments stronger and 
stimulate more discussion of the mandatory arguments on this list and in the 
meetings.

With all best wishes,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, 
Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) 
http://www.ucl.ac.uk/dis/people/chrisdillon

From: Emily Taylor [mailto:emily.taylor@xxxxxxxxxxxxx]
Sent: 11 November 2014 11:20
To: Dillon, Chris
Cc: Lars Hoffmann; gnso-contactinfo-pdp-wg@xxxxxxxxx
Subject: Re: [gnso-contactinfo-pdp-wg] Wednesday 12 November 23:59 UTC soft 
deadline for comments

Dear Chris

Thank you for this timely reminder.  Over the past few days, I have been 
gathering input from colleagues in the Registrar Stakeholder group.  There was 
a rich discussion on the list, with many participants.  These are less comments 
on the paper itself than contributions to the general discussion of the issues.

Here is a synthesis of the comments. I hope that they will be useful in 
cross-checking against the "arguments opposing mandatory transformation" on 
pages 11-12:

1. Costs:  This proposal essentially externalises translation costs from LEA/IP 
to Registrars, and none of the commentators were convinced that the costs for 
contracted parties are justified by benefits to others.  Those requesting the 
data can pay for the translation.

2. Scale:  Why translate/transliterate all WHOIS data, rather than simply those 
names that are of interest, on the fly?  The status quo is several orders of 
magnitude more efficient.

3. Accuracy and responsibility: If the premise of WHOIS data is that it is 
provided (and declared accurate) by the Registrant, then who accepts 
responsibility if Registrars are required to alter that data? How would the 
proposals impact whois data accuracy complaints and whois verification 
requirements?

4. Data integrity: The whois should display what the client entered.  
Any attempt on our part to interpret that leads only to more data errors and 
less accurate data. Changing what the client enters raises these questions:

a.       Will there be rules on how to transliterate non-ASCII characters so 
that it can be done programmatically? Is there some standard system to be 
used, or are we all just counting on Google Translate?

b.       If human judgment is required, who is responsible for doing it?

c.       If the registrant is responsible, what if they do not know what it 
should be?

d.       What if a third-party disagrees with the accuracy of a transliteration?

e.       Is the registrant’s consent required before a transliteration is 
published in the whois?

f.       Can a registrant withhold consent?

g.       What if a registrant wants to change an “approved” transliteration?

h.       Is a whois verification required every time one of these 
transliterated fields is updated?

i.       Where does the requirement for data transformation end? Could a 
Chinese LEA require a contracted party to translate/transliterate existing 
English contact details into Mandarin? Or, what if the original registration 
was in a third language/script (Russian Cyrillic), would that skip English and 
go directly to Chinese?

5. Compliance: "who will and how will this be policed?"  If ICANN are making 
cutbacks in their budget, how are they going to afford the human resources to 
check that every whois transliteration is correct? It doesn’t make much 
operational sense, and will likely end up with the registrant paying higher 
fees for something that they never asked for.

6. Internationalisation: The concept starts to erode the “my language, my 
Internet” / IDN principle of ICANN, by compelling the use of 
English/Latin/ASCII by people and locations not using those language/script 
combinations.  One commentator put it as: "Sadly, it is North American thinking 
I suspect. 'We must translate everything into English'."

7. Competition: If a contracted party does not want to support a language, that 
should be their prerogative. They can turn away business if they decide that 
they won’t be able to service that customer appropriately.
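
Point 4a (whether transliteration can be done programmatically) and point 4d (disagreement over a result) can be illustrated with a minimal sketch. It is not a proposal from the commentators: the function names, the sample strings and the two romanisation tables are hypothetical, and the tables are not taken from any specific standard. The first function is a naive "fold to ASCII" step; the second applies a character table.

```python
import unicodedata


def naive_ascii_fold(text: str) -> str:
    """Strip diacritics via NFKD decomposition, then drop whatever is still non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Two hypothetical romanisation tables for the same Cyrillic letters.
# They are illustrative only and do not reproduce any particular standard.
SCHEME_A = {"Ю": "Yu", "л": "l", "и": "i", "я": "ya"}
SCHEME_B = {"Ю": "Ju", "л": "l", "и": "i", "я": "ja"}


def transliterate(text: str, table: dict[str, str]) -> str:
    """Map each character through the table; unmapped characters pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # Latin script with diacritics survives folding; other scripts are lost entirely.
    for sample in ["José Müller", "Иванов", "北京"]:
        print(repr(sample), "->", repr(naive_ascii_fold(sample)))
    # The same name yields two different Latin forms depending on the table chosen.
    print(transliterate("Юлия", SCHEME_A), "vs", transliterate("Юлия", SCHEME_B))
```

Under these assumptions the naive fold only works for Latin-script names, and any table-based approach gives results that depend on which table is mandated, which is the ground for the questions in 4a to 4g.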

---------------

General comments

Taking into account the above input, I have the following observations to make 
on the draft paper.

First, thank you Chris and the ICANN team for your work in the unenviable task 
of fairly summarising the arguments on both sides.  I appreciate that this is 
an important step in the process.

A general point: I have no sense from the paper, or from the discussions in the 
group, of the scale of the problem we are addressing here.  Do we have any 
stats for the following:

(1) a breakdown of WHOIS data by country of registrant - and can we infer what 
language WHOIS data is likely to be in?  The nearest I can get is this map 
from OII, which shows the predominance of Latin script / English language 
countries in the current domain market 
(http://geography.oii.ox.ac.uk/?page=geography-of-top-level-domain-names).  
However, if you look at growth potential, clearly that is not the case.  And 
IDN registrations by country show a different pattern (see page 17 at 
http://www.eurid.eu/files/publ/IDNWorldReport2014_Interactive.pdf)

(2) an estimate of what is likely to be the language of WHOIS data if multiple 
languages were enabled in these fields.  For example, we could perhaps draw 
some inferences from the IDN registrations in ASCII TLDs.  Approximately 1% of 
.com and .net registrations are IDNs, and the majority of those are Latin 
script.  This may not be representative in that the Latin script ending for 
.com is more likely to be attractive to Latin script IDNs than, say, right to 
left scripts or pictograms.  There are currently just shy of 900,000 Russian 
ccTLD IDNs.  Of these, over 800,000 have a registrant based in Russia, and 
uptake in other countries is low (even in the former Soviet Union).  See 
http://statdom.ru/tld/%D1%80%D1%84/report/summary/. There are approximately 
12,000 IDNs in Arabic script ccTLDs.  Uptake of IDN new gTLDs has been fairly 
limited.  I don't think that anyone is claiming that the IDN market has even 
nearly fulfilled its market potential, but can we have some statement of the 
scale of the problem?

(3) Do we have a sense of how many WHOIS look-ups are performed by law 
enforcement and IP interests, what percentage that represents of all WHOIS 
look-ups, and how many prove to be problematic in terms of language of contact?  On 
the other hand, what problems are currently created by not having the ability 
to record contact details in the script of the domain name (eg for IDNs)?

(4) There have been a number of studies on different aspects of WHOIS data in 
the last couple of years - do any of these help to guide us?



Specific comments

Page 11 - as you say, there is disagreement on "ease" of search.  If English 
is your mother tongue, then it might be "easier" to understand the output of a 
search, but any string is searchable, and you can interpret the search results 
whatever their script/language.

I find the first bullet point unconvincing - it's like saying "why doesn't 
everyone just learn English?  It's such a mess having all these languages".

On the second bullet point, p11 - I appreciate that a counter argument is 
stated to the "transformation will to some extent facilitate communication" 
argument.  The communication argument is a difficult one.  On one level - as 
demonstrated within this working group and many others - we default to English 
in order to communicate with one another across different languages.  However, 
this is also (to some extent) a factor that deters input from those who are not 
confident in English as a second language - who may be able to give valuable 
insights into the debate.  I believe that this is captured in "to some extent" 
but would welcome more acknowledgement that this cuts both ways.

The third bullet point does not explain why it is also necessary to 
transliterate/translate *all* data for this benefit to be felt. We need some 
consideration of proportionality here.

Fourth bullet - define "least translatable" - for whom? Is this truly posed as 
a barrier to law enforcement and others?

To balance the "cyberflight" argument in the fourth bullet point, could we also 
point out that in general people tend to register and host locally?  This is 
perhaps a surprising phenomenon given the strength of some registrars 
internationally.  For example, on page 5 at 
http://www.eurid.eu/files/publ/IDNWorldReport2014_Interactive.pdf we have an 
analysis of country of hosting for gTLD IDNs plus .eu IDNs.  This was done 
based on the IP ranges associated with the domain names.  You can see that 
countries and regions with strong international registrars (eg North America, 
UK) don't really show any "winner" script.  In contrast, Chinese script, 
Cyrillic, Han (plus Katakana, Hiragana), Thai, Hangul, Arabic script domains 
tend to be hosted in countries where associated languages are spoken.

Could I also add that you can see within large IDN namespaces which offer 
multiple scripts (eg .com and .net) that registrations cluster strongly around 
popular scripts.  There are very small numbers indeed outside of them.  I can 
produce some more analysis on that point if people like.

I hope these inputs are helpful to the working group in its deliberations, and 
I look forward to joining the discussions.



Best wishes,

Emily

