RE: [gnso-contactinfo-pdp-wg] Transformation required at the validation stage?
- To: James Galvin <jgalvin@xxxxxxxxxxxx>
- Subject: RE: [gnso-contactinfo-pdp-wg] Transformation required at the validation stage?
- From: "Dillon, Chris" <c.dillon@xxxxxxxxx>
- Date: Fri, 18 Jul 2014 10:44:03 +0000
Dear Jim,
That should certainly have been a qualified "no".
Original data are the best for validation. For example, a Chinese address is
basically Chinese characters and that is the primary form. If data have been
transformed, the transformation has to be high quality (i.e. probably not
automated except possibly for some alphabetic scripts) and a check needs to be
made that the transformed data match the original data.
Regards,
Chris.
From: James Galvin [mailto:jgalvin@xxxxxxxxxxxx]
Sent: 17 July 2014 14:17
To: Dillon, Chris; gnso-contactinfo-pdp-wg@xxxxxxxxx
Subject: Re: [gnso-contactinfo-pdp-wg] Transformation required at the
validation stage?
On 7/14/14, 11:46 AM, Dillon, Chris wrote:
At one point Jim Galvin asked whether transformation would be required at the
validation stage. I would say no to that question. The original language/script
data are primary and suitable for validation. It's not that transformed data
cannot be validated, but they need to be high quality (outside Greek, Cyrillic
and other alphabetic scripts, often created by a human being rather than a
computer). Actually I think Rudi made a similar point later in the call, but I
just wanted to clarify that.
Perhaps I should listen to the recording to get more context. However, one
point that occurs to me is that I don't think you can give an unequivocal no
just yet.
Consider the question of whether or not a single script and language is
required. If we decide that all registration information is to be in a single
script and language, then we can consider the question of which presentation of
the data to validate: the original input data or the transformed data.
I would argue that if the chosen single script and language is to be
"official", then that presentation of the data should be subject to all the
validation requirements. The input data will need to be normalized so that it
can be transformed. We will need to keep the input form of the data for audit
purposes. However, validating the input form seems to me to be tantamount to
simply supporting all languages and scripts, which makes me wonder why we would
choose a single language and script for all data.
This is my current thinking.
Jim
Regards,
Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL,
Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599)
http://www.ucl.ac.uk/dis/people/chrisdillon