Linked-in delinked! What on earth happened to it?

This morning a strange sight greeted me when I tried to get on to Linked-in – an HTTP 500 error!! Never have I encountered such a thing with any major online service that is delivered from a worldwide CDN (Content Delivery Network).

A couple of servers down here and there and minor disruptions in service (slow load times due to rerouting) are understandable, but a 500 error? As far as I know, this occurs only when something goes severely wrong on the server side, typically in the application’s code, causing the web server (Apache, for instance) to hiccup and die while trying to deliver it.

And for Linked-in to be knocked out on this scale, the same code-base must have propagated to all their servers across the world.

Linked-in down

If you ask me, it’s outright callousness on the part of their dev team to let something like this happen. Even if I were to consider that someone’s been playing naughty with ’em, I can’t imagine the scale of hacking it would take to make their whole network behave the same way!!

P.S. By the time I finished writing this post (11:20am UTC+7), they were still in the same state. Someone probably needs to stick a finger in their eyes and point it out… they can at least afford to put up a graceful “Maintenance mode” page till they rectify this.

What is www1 / www2 etc. ?

Regular surfers no doubt often come across sites which seem to defy the standard format for a web-address, i.e. www.some-site.com, and take up forms like www1.some-site.com. It gets people wondering what this www1 actually is! I’ve heard some really funny and odd explanations regarding this www1, www2 etc. and most of them revolve around it being the “second version of web/internet” 😀

In this article I’ll try to explain the concept behind this whole domain naming scheme as lucidly as possible.

First and foremost, every computer that is somehow part of a network requires:

  • A unique IP address which establishes the identity of that computer on the network. No two computers on the same network may have the same IP address. Consider this similar to having a personal phone number. An example of an IP address is 202.212.121.100.
  • Secondly, every computer requires a unique NAME too, which again is another way of identifying the computer on the network. Once again, no two computers can share the same name on a common network.
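The two rules above can be sketched in a few lines of Python. This is only a toy model: the `Network` class and its `add` method are names I made up for illustration, and only the `ipaddress` module comes from the standard library.

```python
import ipaddress


class Network:
    """Toy network where every IP address and every name must be unique."""

    def __init__(self):
        self.by_ip = {}    # IP address -> computer name
        self.by_name = {}  # computer name -> IP address

    def add(self, name, ip):
        # ip_address() raises ValueError if the string is not a valid IP.
        addr = ipaddress.ip_address(ip)
        if addr in self.by_ip:
            raise ValueError(f"IP {addr} is already taken by {self.by_ip[addr]}")
        if name in self.by_name:
            raise ValueError(f"name {name!r} is already in use")
        self.by_ip[addr] = name
        self.by_name[name] = addr


net = Network()
net.add("alpha", "202.212.121.100")     # the example address from above
try:
    net.add("beta", "202.212.121.100")  # same IP under a different name
except ValueError as err:
    print("rejected:", err)
```

The second `add` is refused because the address is already in use, just like two people can't share one phone number.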

This brings us to the topic of domain names… the various domains like .com, .net, .org etc. which you come by on the net are called Top Level Domains or TLDs. You can purchase (or rather lease, since domains aren’t really sold to you but leased for a certain period) a name under such a domain from a registrar accredited by an internet naming authority like ICANN. But a .com or .net alone isn’t sufficient to uniquely identify you on the net. Hence, under the TLD you register a secondary name (the second-level domain) which usually is related to your company or the product you’re selling or the overall theme of your website. Your secondary name and the top level domain together give you your unique web identity. An example: say I registered the secondary name mycompany under .com. So mycompany.com is what I will go by on the web.
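To make the two parts concrete, here is a small Python sketch that pulls a domain apart into its secondary name and its TLD. The function name `split_domain` is my own, and the sketch assumes a simple single-label TLD like .com; suffixes made of several parts (.co.uk, for instance) would need extra handling.

```python
def split_domain(domain):
    """Split 'mycompany.com' into (secondary name, TLD).

    Assumes a single-label TLD such as .com, .net or .org.
    """
    labels = domain.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError("expected at least a name and a TLD")
    return labels[-2], labels[-1]


name, tld = split_domain("mycompany.com")
print(name, tld)  # mycompany com
```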

The DNS (Domain Name System) is a network of servers which is sort of an online telephone directory for computers. The servers right at the top of the DNS pyramid are called root servers; they are operated by a number of organizations, ICANN among them, with ICANN coordinating the root. When the domain name of another computer is fed into your computer, the search for the target computer starts, in the worst case, at these root servers. The idea behind this is to query the DNS servers and find out the IP address of the target, which can then be used to connect directly with it. The domain name alone is worthless as it can give no indication where the actual target computer resides.

In practice the root servers do not hold the IP address of the target themselves, but that is nothing to worry about. They always contain lists of other secondary DNS servers maintained by other organizations and can redirect your query to these secondary servers. If the secondary ones cannot find your target, they hand the request down to yet a third level of (tertiary) servers and so on. Think of the whole DNS chain like a pyramid, where the root servers sit right on top. Anyway, this cycle goes on till one of the DNS servers finds the web-address you’re looking for in its own directory and then returns the result in the form of an IP address to your computer. This whole mechanism is absolutely transparent and occurs by the cartload every second, with so many millions of pages being loaded worldwide.

As a last step, if your own company or hosting service maintains its own DNS servers, this request might even reach down to those in search of the target address. This whole process of querying by domain names and getting the IP address in return is called Domain Name Resolution or simply Name Resolution.
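From a program's point of view, all of this is hidden behind a single call. A minimal Python sketch, assuming the standard library's `socket.gethostbyname` as the real lookup; the wrapper name `resolve` and its `lookup` parameter are my own additions, there only so that the lookup can be swapped out when no network is available.

```python
import socket


def resolve(name, lookup=socket.gethostbyname):
    """Return the IPv4 address for a name, or None if it cannot be resolved."""
    try:
        return lookup(name)
    except OSError:
        # socket.gaierror ("name not found") is a subclass of OSError.
        return None


# On a machine with a working internet connection you could try:
#   print(resolve("www.python.org"))
```

Whatever comes back is the end result of the whole chain of DNS queries described above; your program never sees the intermediate steps.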

The order of name resolution is right to left. For example, for the address www.mycompany.com the resolution will happen like:
.com >> .mycompany >> www
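The same right-to-left order can be spelled out with a few lines of Python. The function name `resolution_order` is just something I picked for this sketch; it shows the order in which the pieces are looked at and does no real DNS work.

```python
def resolution_order(hostname):
    """List the names that get resolved, starting from the TLD on the right."""
    labels = hostname.rstrip(".").split(".")
    steps = []
    for i in range(len(labels) - 1, -1, -1):
        steps.append(".".join(labels[i:]))
    return steps


print(" >> ".join(resolution_order("www.mycompany.com")))
# com >> mycompany.com >> www.mycompany.com
```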

Now, I explained the “com” and “mycompany” parts a short while back, but what is this www before your domain name? While being an acronym for World Wide Web, www is also the host name part of the address, and it is often set up in DNS as a CNAME (Canonical Name) record, i.e. an alias for the main domain. This name is almost always “www” as an adopted standard. You’ll notice that in 99% of the cases, both www.mycompany.com and simply mycompany.com will take you to the same site. In reality the DNS servers list your IP address against the domain itself. The higher level DNS servers don’t have anything to do with the www part. It is your own company’s/web-host’s DNS that deals with the www part.

A typical DNS server conversation might look like this…

My Computer to Root DNS: “Hello, do you know www.mycompany.com?”

Root DNS: “Sorry, I know only about .com, .net and .org, and I can see the computer you’re trying to reach is a .com but I don’t have any further information on www.mycompany. However, I have a friend who might know about all the systems under .com domains. Here’s my friend’s address…(secondary DNS).”

My Computer to Secondary DNS: “Hello, do you know www.mycompany.com?”

Secondary DNS: “Yes, I do know mycompany.com but I’m not aware of the www computer. I’ll refer you to the company’s DNS (Tertiary) who might be able to tell you further about the www.”

My Computer to Company’s DNS: “Hello, do you know www.mycompany.com?”

Tertiary DNS: “Yes, you’ve reached the right place. Hold on a moment, I’ll direct you to the www computer.”
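The conversation above can be played out as a small Python simulation. Everything in it is a toy: the server names (`root`, `com-server`, `company-server`) and the tables are invented for illustration, and real DNS servers exchange network packets rather than look things up in a dictionary. The IP address is the example one used earlier in this article.

```python
# Each toy server maps a name to either a referral or a final answer.
SERVERS = {
    "root": {"com": ("referral", "com-server")},
    "com-server": {"mycompany.com": ("referral", "company-server")},
    "company-server": {"www.mycompany.com": ("answer", "202.212.121.100")},
}


def suffixes(hostname):
    """Yield 'www.mycompany.com', 'mycompany.com', 'com' (longest first)."""
    labels = hostname.split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def walk(hostname, server="root", max_hops=10):
    """Follow referrals from the root until some server gives an answer."""
    path = [server]
    for _ in range(max_hops):
        table = SERVERS[server]
        match = next((table[s] for s in suffixes(hostname) if s in table), None)
        if match is None:
            return None, path  # this server knows nothing useful
        kind, value = match
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server the same question
        path.append(server)
    return None, path


ip, path = walk("www.mycompany.com")
print(" -> ".join(path), "=>", ip)
# root -> com-server -> company-server => 202.212.121.100
```

Note that my computer asks every server the very same question; it is only the answers that get more specific at each level.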

If you have your own DNS servers running and your site draws good traffic, you might have set up 4-5 different computers in your company’s network to handle various kinds of requests like web, ftp, email etc. individually so as to not put too much stress on a single machine.

It’s the www part that identifies the machine which is supposed to handle web (http) requests – but this happens internally on your network. DNS Servers outside your network do not need to know which machine is handling www and which one, ftp. Their job is simply to direct someone to your main IP. Your own DNS takes over from there. So www, ftp, mail, pop3 etc. actually represent various computers on your network handling those services.
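Here is a rough Python sketch of what the company’s own DNS does with the first part of the name. The table `ZONE`, the function `zone_lookup` and the 10.0.0.x addresses are placeholders I made up; a real DNS server keeps this information in its zone configuration.

```python
# Toy zone for mycompany.com: one machine per service.
# The addresses are placeholders, not real ones.
DOMAIN = "mycompany.com"
ZONE = {
    "www": "10.0.0.10",   # web (http) requests
    "ftp": "10.0.0.20",   # file transfers
    "mail": "10.0.0.30",  # outgoing email
    "pop3": "10.0.0.40",  # incoming email
}


def zone_lookup(hostname):
    """Answer for names inside DOMAIN; return None for anything else."""
    suffix = "." + DOMAIN
    if not hostname.endswith(suffix):
        return None  # not our domain, some other DNS has to answer
    host = hostname[: -len(suffix)]
    return ZONE.get(host)


for name in ("www.mycompany.com", "ftp.mycompany.com", "www.other.com"):
    print(name, "->", zone_lookup(name))
```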

As you can see, mycompany.com is good enough to reach your company’s site. The www part is pretty much unnecessary. As long as you have your own DNS servers running, you can instruct them to resolve ANYTHING in the CNAME part. You can instruct your DNS to redirect all web requests to some computer called aaa and ftp to another one called bbb. In such a case your web-address could be http://aaa.mycompany.com and ftp address could be ftp://bbb.mycompany.com. It wouldn’t matter at all as long as your own DNS can resolve it. As I said earlier, since www is a world-wide adopted standard, almost any and every web-site uses it. But occasionally some sites use www1 or www2 or even www9 … whatever they feel like using.
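To show that the name in front really is arbitrary, here is a toy Python resolver. In real DNS a name can either carry an address directly (an A record) or be an alias for another name (a CNAME record); the sketch below follows aliases until it reaches an address. The record table, the function name `resolve_alias` and the 10.0.0.x addresses are all made up for this example.

```python
# "A" maps a name to an IP address; "CNAME" says
# "this name is an alias, look up the target instead".
RECORDS = {
    "mycompany.com": ("A", "202.212.121.100"),
    "www.mycompany.com": ("CNAME", "mycompany.com"),
    "www1.mycompany.com": ("CNAME", "aaa.mycompany.com"),
    "aaa.mycompany.com": ("A", "10.0.0.10"),
    "bbb.mycompany.com": ("A", "10.0.0.20"),
}


def resolve_alias(name, max_aliases=8):
    """Follow CNAME aliases until an A record (or a dead end) is reached."""
    for _ in range(max_aliases):
        record = RECORDS.get(name)
        if record is None:
            return None
        kind, value = record
        if kind == "A":
            return value
        name = value  # CNAME: continue with the name it points to
    return None  # too many aliases, probably a loop


for host in ("www.mycompany.com", "www1.mycompany.com", "bbb.mycompany.com"):
    print(host, "->", resolve_alias(host))
```

Whether the label is www, www1 or aaa makes no difference to the resolver; all that matters is that the company’s own records contain an entry for it.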

There’s practically no limitation on what you can use as your Canonical Name. However, do not confuse this with sub-domains, which we’ll cover in another topic some other time.

Any questions / comments ?