[gnso-idn-wg] Item 4.5.4
You asked the question below regarding 4.5.4.
Ram: I am in general support of this also. Are there dissenting views in our WG?
I must say that I am strongly against this. If we start straying too far from the limited cases of visually confusing spoofing to blocking "phonetically confusing" strings (i.e. ones that sound the same), we are going to run into a lot of problems. The next thing we will be blocking, after sound, is any gTLD that smells the same or elicits the same emotional response from humans (like anger or laughter).
The sounds of one language are not owned by another. Moreover, when a person sees another language and converses in it, the speaker contextually understands what is being said in that language - phonemes are processed in the context of the language being used. If that were not the case, the following situation would have merit.
First, the human voice box is limited to a small, finite set of phonetic sounds that exist across all languages. Second, I also understand that we wish to keep the number of syllables in a gTLD label short (typically one or two syllables - phonemes) - otherwise we could all be typing xn--abcgtf for the gTLD instead. Taken together, these leave us with a small set of acceptable phoneme combinations for IDN gTLDs across all Unicode languages - probably around 1,000 or so. Now I can absolutely assure you that most languages have several short pejorative terms/sounds (usually one or two syllables, presumably for efficiency of use in anger). Thus it's extremely likely that many reasonable candidates for a gTLD in one language will end up being completely unacceptable phonetically in another language. To illustrate, we can use Tamil and English, with real-life examples that are only mildly objectionable.
The Tamil word for flower (a one-character word in Tamil) sounds like "poo" (three characters in English), while in English it's baby-talk for "sh.t". Helpfully, its three-character English baby-talk cousin "pee" is, if not identical in meaning, at least in the same category in Tamil, where it means "sh.t" (again a single character in Tamil). As a native speaker of both, if for one moment I were to keep the concepts/phonemes of the first language in my head while I speak the second, I would start talking nonsense and end up embarrassing myself, figuratively, not literally :-)
So the language context takes precedence over phoneme usage itself...
Thus if we place limits based on phonetic similarity, we will find a great many things to be disallowed, and I am certain almost any one- or two-syllable string will be objectionable in at least one other language.
Therefore, on the grounds of both logic and simplicity, I strongly disagree that the notion of "phonetically confusing" should be entertained as the basis of any limiting criterion for IDN gTLD selection.
Dear Charles and WG members, Please find below my responses to your proposals made yesterday to the WG list.