Well, well, well. Every network admin will vouch for the usefulness of a domain whois query.
A quick search for online whois lookups returns an enormous list of services, but if you are a programmer who wants to develop your own whois application, whether web-based or system-based, you will admit that it is like going through hell just to get the whois client working.
A few days ago, while working on an internal research project, I needed to query whois servers to extract some data for analysis, and I hit my first stumbling block. I got hold of some sample source code, compiled it and did a few test runs for .com domains. Everything was fine until I searched for good old ‘google.com’.
To make matters worse:
1: There are numerous TLDs, and every TLD is managed by numerous registrars.
2: There are numerous ccTLDs, and how to obtain the list of their whois servers was one of the most frequently asked questions, with no viable answer. The only solution was to share txt files containing lists of individual whois servers, with no guarantee whatsoever about the authenticity of that data. Mozilla has one great list, and so does invalument.
3: There are numerous gTLDs, with the same problem again: how to retrieve the list of their whois servers.
4: Every whois server follows its own rules for data storage, with no proper guidelines. The only guideline every whois server follows religiously is that it listens on port 43.
5: ICANN, though responsible for domain names and for the guidelines it issues now and then regarding whois servers and TLDs (ccTLDs or gTLDs), doesn't offer much help on data storage.
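Since port 43 is the one thing every server agrees on, the transport side of a client is small. Here is a minimal Python sketch, assuming the server accepts a single query line terminated by CRLF and closes the connection once it has answered. The function names and the timeout value are my own choices for illustration, not part of any standard client.

```python
import socket


def encode_query(query: str) -> bytes:
    """Encode one whois query line; the trailing CRLF ends the request."""
    return (query.strip() + "\r\n").encode("utf-8")


def whois_query(server: str, query: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send a single query to a whois server and return the raw text reply."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(encode_query(query))
        # The server signals the end of the reply by closing the connection.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Replies are not guaranteed to be UTF-8, so replace undecodable bytes.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling whois_query("whois.iana.org", "nic.in") should be roughly what the command-line client does for whois -h whois.iana.org nic.in. Everything after that point, i.e. making sense of the reply, is where the real trouble starts.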
All the whois client source code I found has a hard-coded whois server list, which limits both input and output: some clients accept only .com and .net domains, while others rely on the end user to provide sufficient information, e.g. “command domainname whoisserver”. Here, the application should require only the domain name and nothing else.
Identifying the domain name within a given FQDN (Fully Qualified Domain Name) is another problem that comes up quite often.
e.g. blog.escanav.com is the FQDN, and escanav.com is the domain name that can be queried. More complicated examples are mail.nic.in and gov.bih.nic.in (in both cases nic.in is the queryable domain name). Now let us complicate this a little bit –
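The FQDN examples above boil down to one rule: find the longest suffix under which names are registered, then keep one more label to its left. Here is a minimal Python sketch, assuming a tiny hand-picked suffix set. SAMPLE_SUFFIXES and registrable_domain are hypothetical names, and the set is nowhere near complete; a real client would need a maintained list.

```python
# Illustrative only: a tiny, hand-picked set of suffixes under which names are
# registered. A real client would need a complete, maintained list.
SAMPLE_SUFFIXES = {"com", "net", "in", "gov.au"}


def registrable_domain(fqdn, suffixes=SAMPLE_SUFFIXES):
    """Return the longest known suffix plus one label to its left, or None."""
    labels = fqdn.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest, so that
    # "gov.au" would be preferred over a shorter match.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return ".".join(labels[i - 1:])
    return None
```

With this sample set, blog.escanav.com reduces to escanav.com, and both mail.nic.in and gov.bih.nic.in reduce to nic.in. The answer is only as good as the suffix list, which is exactly the data that is so hard to obtain.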
Back to the problem shown in the previous query links. At first glance it paints a very grave picture, but as the saying goes, “Truth is stranger than fiction”. Now let us explore the truth.
This happens because the whois server for .com, .net and a few other TLDs requires an additional keyword, ‘domain’, when conducting a whois query, i.e. whois domain google.com. If you forget to provide the keyword, the whois server returns a list of the names registered with it that match your query. Logically this is inconsistent: if the server requires a keyword for a lookup, then displaying the list should require a keyword too. Secondly, on close inspection we find that a lot of names starting with google.com have been registered, and to make matters worse these are all listed at the TLD level, e.g. google.com.is.hacked.com
A TLD should have only one subdomain preceding it, and any further division should be handled by sub-domaining on DNS servers.
e.g. google.com.is.hacked.com: google.com.is should reside on the DNS servers of hacked.com
A note to all programmers: consider revising your algorithm to use the ‘parameter’. Where do you get the list of parameters? From a Linux system: whois -h whois.crsnic.net help (for .com)
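In code, using the ‘domain’ keyword can be as simple as a per-server lookup before the query is sent. This is only a sketch: KEYWORD_BY_SERVER and build_query are hypothetical names, and the table holds just the one server and keyword mentioned above. Any other entries would have to come from each server's own help output.

```python
# Assumption for illustration: a table of servers that want a keyword in front
# of the name. Only whois.crsnic.net and 'domain' come from the text above.
KEYWORD_BY_SERVER = {"whois.crsnic.net": "domain"}


def build_query(server, domain):
    """Prefix the domain with the server's keyword, if one is known."""
    keyword = KEYWORD_BY_SERVER.get(server.lower())
    return f"{keyword} {domain}" if keyword else domain
```

So build_query("whois.crsnic.net", "google.com") gives the string "domain google.com", while a server that is not in the table gets the bare domain name.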
And to find the exact whois server that will provide the registrant's information for a particular TLD:
whois -h whois.iana.org nic.in
whois -h whois.iana.org google.com
whois -h whois.audns.net.au embassy.gov.au
In short, every whois client needs to connect to two different whois servers to retrieve the registrant’s information.
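The two-server dance can be sketched in Python as follows. I am assuming the first server names the next one on a line starting with "refer:" or "whois:", which is what I have seen in replies from whois.iana.org; other servers may label it differently or not at all. find_referral and lookup are hypothetical names, and the query argument stands in for whatever function actually talks to port 43.

```python
import re

# Assumption: the first reply names the next server on a "refer:" or "whois:"
# line. Servers are free to use other labels, so treat this as a best guess.
REFERRAL_LINE = re.compile(
    r"^[ \t]*(?:refer|whois):[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)


def find_referral(reply):
    """Return the whois server named in a reply, or None if there is none."""
    match = REFERRAL_LINE.search(reply)
    return match.group(1).lower() if match else None


def lookup(domain, query, root="whois.iana.org"):
    """Two-step lookup; query(server, text) must return the reply as a string."""
    first = query(root, domain)
    server = find_referral(first)
    if server is None:
        return first  # no referral found, so the first reply is all we have
    return query(server, domain)
```

Passing the transport in as a function keeps the referral logic testable without touching the network. A TLD whose whois server is not published simply falls through to the first reply.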
Last but not least: the data stored and presented by whois servers differs from server to server; no standard has been maintained or followed. Just take a look at the date formats and you will realize that it is next to impossible to programmatically retrieve the creation date of a domain with a 100% success rate.
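About the best a program can do with the dates is try one format after another. Here is a minimal Python sketch; the formats listed are illustrative picks of mine and not a survey of what servers really return, and parse_whois_date is a hypothetical name.

```python
from datetime import datetime

# Illustrative formats only; real servers use many more than these.
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d",
    "%d-%b-%Y",
    "%Y/%m/%d",
    "%d.%m.%Y",
)


def parse_whois_date(text):
    """Try each known format in turn; return a datetime, or None if all fail."""
    value = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None
```

Every new server can mean another entry in the list, and ambiguous values such as 01/02/2003 cannot be resolved by trial and error at all, which is why a 100% success rate stays out of reach.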
We have been using the Internet and domains for many years, with loads of RFCs and industry standards, but sadly IANA/ICANN has yet to set guidelines for the storage of this information. The only standard that has been successfully implemented is port 43 = whois server.
Some registrars do not even provide correct information to IANA about their whois server, e.g. .pk: the .pk registrar does not appear to have a whois server, and even if it does have one, it is not published in the recommended IANA directories.
The only website that has provided consistent information is who.is, but even this website fails in certain test scenarios.
Summary of this blog:
Unless and until there is an RFC for data storage and retrieval on whois servers, with all the registrars implementing it, only the devil’s programmer can find a way out.