About Recent Articles Regarding Phishing Using Homographs among IDNs
- Countermeasures Already in Place, and .JP Follows Them -
Recently, several articles have pointed out that the introduction of IDNs increases the possibility of phishing and other attacks using homographs. However, the essence of the problem is not rooted in IDN itself or in its applications. Rather, it lies in how domain name registries handle homographs among domain name strings. This document explains the issue from the following viewpoints:
- Root of the Problem
- Existing Countermeasures Applied to IDN Registration by Domain Name Registries
- Measures already taken in Japanese .JP domain name registration from its beginning
It is worth stating here that, although this document focuses on homographs among domain names, such visual illusion is not in itself an effective means of phishing, since actual phishing uses more sophisticated tricks such as camouflaging or concealing false URIs.
Root of the Problem
A domain name is a character string. With the introduction of IDN, the variety of characters usable in domain names expands, and hence the number of similar-looking characters may increase. The homograph phishing with IDNs reported recently is a trick in which ill-willed website owners exploit similar-looking characters. In particular, the recent articles claim that users of IDN-enabled browsers may be visually deceived and phished by a false URL containing a non-ASCII character that closely resembles an ASCII letter (for example, Cyrillic 'ａ').
The root of this problem is a visual illusion, which already existed and was not introduced by IDN. For example, among ASCII characters, 1 (digit) and l (letter l) look alike, as do 0 (digit) and O (letter O). These character pairs can be used for visual tricks. It is true, however, that the number of similar-looking character combinations increases when IDN is introduced. For example, the dash-like mark for a prolonged sound and the Kanji character for the number one, both used in Japanese, look very similar.
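The point that look-alike characters are nevertheless different characters can be illustrated with a short sketch. It uses Python's standard `unicodedata` module; the list `PAIRS` and the helper `describe` are names chosen here for illustration only, and the characters are written as escapes so that the code itself cannot be misread.

```python
import unicodedata

# Pairs of characters that look alike but are distinct code points.
PAIRS = [
    ("1", "l"),            # ASCII digit one / ASCII letter l
    ("0", "O"),            # ASCII digit zero / ASCII capital letter O
    ("a", "\u0430"),       # Latin small a / Cyrillic small a
    ("\u30fc", "\u4e00"),  # prolonged sound mark / Kanji for 'one'
]


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for left, right in PAIRS:
    print(f"{describe(left)}  vs  {describe(right)}")
```

Each pair prints two different code points, which is why a registry, rather than the human eye, is in the best position to tell such names apart.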
This problem had already been identified when IDN was standardized and introduced. Countermeasures were studied and published by the IETF as an RFC, and ICANN has set up guidelines for domain name registries to carry them out. As described, the countermeasures already exist; how effective they are depends on how domain name registries apply them in their IDN registration services, balancing the usability of IDNs against the constraints placed on them.
Existing Countermeasures Applied to IDN Registration by Domain Name Registries
As stated above, this problem had already been identified, and the following guidelines were already published to solve it:
- JET Guidelines (RFC3743)
Guidelines for IDN registration. They request registries to define languages to be registered as IDNs; define character code points allowed in IDNs; define variants (if any) to each character; and tag a language name to each IDN at registration to exclude inappropriate characters. These guidelines are defined along with table formats and algorithms.
- ICANN Guidelines
Guidelines for the Implementation of IDNs by registries. They guide registries to follow the IDN technical standards; define allowed character code points; associate a single language to each IDN; cooperate with relevant and interested stakeholders to develop language-specific registration policies, etc.
If each registry follows these guidelines in defining its IDN registration service, IDNs containing characters from two or more languages are excluded, which dramatically reduces the possibility of visual illusion through similar-looking characters. For example, if a TLD registry defines the Cyrillic character 'ａ' as a variant of ASCII 'a' following these guidelines, 'Paypａl' is regarded as identical to 'paypal' under that TLD.
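The combined effect of a language tag, a per-language table of allowed code points, and a variant table can be sketched as follows. This is a deliberately simplified illustration, not the algorithm defined in RFC3743 or the policy of any real registry: the tables `ALLOWED` and `VARIANTS`, the class `Registry`, and the language tags "en" and "ru" are all assumptions made for this example.

```python
# Hypothetical per-language code point tables; a real registry publishes its own.
ALLOWED = {
    "en": set("abcdefghijklmnopqrstuvwxyz0123456789-"),
    "ru": {chr(cp) for cp in range(0x0430, 0x0450)} | set("0123456789-"),
}

# Hypothetical variant table: Cyrillic letters folded to look-alike ASCII letters.
VARIANTS = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}


class Registry:
    """Toy registry for one TLD that applies both checks at registration."""

    def __init__(self):
        self._taken = {}  # variant-folded label -> label as registered

    def register(self, label: str, language: str) -> str:
        label = label.lower()
        # Step 1: every character must be in the table for the tagged language.
        if not label or any(ch not in ALLOWED[language] for ch in label):
            return "rejected: character not allowed for this language"
        # Step 2: fold variants so that look-alike labels share one key.
        key = "".join(VARIANTS.get(ch, ch) for ch in label)
        if key in self._taken:
            return "rejected: variant of an existing registration"
        self._taken[key] = label
        return "registered"


registry = Registry()
print(registry.register("paypal", "en"))                # registered
print(registry.register("p\u0430yp\u0430l", "en"))      # mixed scripts: rejected
print(registry.register("cope", "en"))                  # registered
print(registry.register("\u0441\u043e\u0440\u0435", "ru"))  # all Cyrillic, but a variant of "cope"
```

In this sketch the mixed-script look-alike of 'paypal' fails the language check, while an all-Cyrillic look-alike of an existing ASCII name passes the language check and is then caught by the variant table.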
Most registries currently providing IDN registration follow these guidelines or plan to do so. Therefore, for access through IDN-aware browsers, the reported kind of phishing, which uses non-ASCII characters similar to ASCII characters, is strongly suppressed.
Measures Already Taken in Japanese .JP Domain Name Registration from Its Beginning
Only Kanji, Hiragana, Katakana, and LDH (letters, digits, and hyphen), all of which are in everyday use in Japan, are allowed in Japanese JP domain names. Non-ASCII characters that are visually similar to ASCII letters, such as some Cyrillic characters, are not allowed; thus IDNs that look similar to ASCII domain names do not exist under the .JP TLD. For example, 'Paypａl.jp' cannot be registered and so cannot be used as a phishing site.
Therefore, it is very unlikely that ill-willed websites will use Japanese JP domain names containing visually similar characters from different languages.
The above-mentioned countermeasures have been applied to Japanese JP domain names since registration began. On the other hand, the problem tends to arise more easily in services run by registries that do not follow the guidelines listed above.
In summary, the problem is rooted in the IDN registration policies of each registry, not in IDN-aware applications such as browsers. Japanese JP domain names, which were introduced with the above possible problems in mind, can be used without undue worry.
- Past policy of .com
example.com: Can be registered. All characters are ASCII characters.
exａmple.com: Can be registered. The character that looks like the ASCII character 'a' is a Cyrillic character.
- Policy of .jp
example.jp: Can be registered. All characters are ASCII characters.
exａmple.jp: Cannot be registered. The character that looks like the ASCII character 'a' is a Cyrillic character.
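The .jp policy shown in the examples above amounts to a per-character check on each label. The following is a minimal sketch under stated assumptions: the Unicode ranges in `JAPANESE_RANGES` are coarse stand-ins for Hiragana, Katakana, and Kanji and are broader than any real registration table, and the names `LDH`, `is_allowed`, and `can_register` are hypothetical.

```python
# Letters, digits, and hyphen (LDH), compared after lowercasing.
LDH = set("abcdefghijklmnopqrstuvwxyz0123456789-")

# Assumption: coarse Unicode ranges standing in for the Japanese scripts.
# A real registry defines an explicit table of permitted code points.
JAPANESE_RANGES = [
    (0x3041, 0x3096),  # Hiragana letters
    (0x30A1, 0x30FA),  # Katakana letters
    (0x30FC, 0x30FC),  # prolonged sound mark
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (Kanji)
]


def is_allowed(ch: str) -> bool:
    """True if the character is LDH or falls in one of the Japanese ranges."""
    cp = ord(ch)
    return ch in LDH or any(low <= cp <= high for low, high in JAPANESE_RANGES)


def can_register(label: str) -> bool:
    """True if every character of a non-empty label is allowed."""
    label = label.lower()
    return bool(label) and all(is_allowed(ch) for ch in label)


print(can_register("example"))             # True: all ASCII letters
print(can_register("ex\u0430mple"))        # False: contains Cyrillic small a
print(can_register("\u65e5\u672c\u8a9e"))  # True: three Kanji characters
```

Because Cyrillic is simply absent from the permitted set, no variant table is needed to block the look-alike of an ASCII name.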
As described above, JPRS recognizes the problems that IDN implementation can cause and has taken measures since the start of Japanese JP domain name registration, allowing only Kanji, Hiragana, Katakana, and LDH to be registered.
Measures against the Problem of Visually Similar Characters within One Language
Apart from the reported problem of ill-willed websites using domain names that contain visually similar characters from different languages, problems may also be caused by visually similar characters within the script(s) of a single language.
For example, visually similar characters in Japanese language include:
へ (Hiragana)/ ヘ (Katakana)
ソ (Katakana)/ ン (Katakana)
ロ (Katakana)/ 口 (Kanji)
* There are other examples.
These characters are treated as different characters in the Japanese language we use daily and in Japanese writing. Therefore, imposing character restrictions that depart widely from real-life usage, or treating two or more characters as the same character, could seriously damage the usability of Japanese JP domain names. For this reason, Japanese JP domain names have no registration restrictions on visually similar characters within Japanese scripts.
If domain names that misuse these visually similar characters are registered and used with the aim of having people confuse them with a third party's trademarks or trade names, the problem can be resolved through the JP-DRP procedure, and such domain names can be cancelled or transferred.
- Make Sure You Do Not Encroach on Other People's Rights When Registering Domain Names (In Japanese)
Note: On 8 June 2006, the following descriptions were added for clarification:
- Visually confusable characters other than Kanji, Hiragana, Katakana and LDH cannot be used in Japanese JP domain names.
- In Japanese JP domain names, visually similar characters within Japanese scripts are not restricted, for the sake of usability, and disputes arising from the similarity can be resolved through JP-DRP.