From Entering a URL to Receiving Root File

While many of us rely on the internet, what goes on ‘behind the screen’ eludes most.

We rely on the internet for just about everything: checking email, calling an Uber, ordering food and clothes. Yet most of us do not know how the internet actually works.

Fundamentally, loading a web page can be broken down into a few steps:

  1. Get an IP address from the URL. We discuss the structure of URLs in the upcoming lesson The Anatomy of a URL.
  2. Use HTTP/S to shoot off the request to the server.
  3. Get a response from the server.
  4. Make requests for external resources from the root file.
  5. Parse - HTML, CSS, JS.
  6. Render the website on the browser.

We’ll be focusing on the first four steps in this lesson. We’ll cover the last two in the next.

DNS

So how does your browser figure out the IP address from a supplied URL? Well, that is where the internet’s directory service, the Domain Name System (DNS), comes in. DNS is used to find IP addresses from domain names.

This process is much like using a phone book to find your favorite pizza place’s phone number!

Domain name to IP address mappings are called ‘DNS records.’

These DNS records are part of a distributed database, which means that no single server stores all the records. Instead, they are spread across many servers.
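As a rough illustration of going from a URL to an IP address, the sketch below pulls the host name out of a URL and hands it to the operating system's resolver. The function name ipv4_addresses_for_url is made up for this example, and the sketch only looks at IPv4 addresses to keep things short.

```python
import socket
from urllib.parse import urlsplit


def ipv4_addresses_for_url(url: str) -> list[str]:
    """Return the IPv4 addresses the system resolver reports for a URL's host.

    Hypothetical helper written for this lesson, not part of any library.
    """
    host = urlsplit(url).hostname  # e.g. "www.example.com"
    if host is None:
        raise ValueError(f"no host found in URL: {url!r}")
    # getaddrinfo asks the operating system's resolver, which consults
    # its caches and the configured DNS server as needed.
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry ends with an (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})
```

Calling ipv4_addresses_for_url("https://www.educative.io/") on a machine with network access should return one or more addresses. The exact values depend on when and where the lookup is made.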

Checking locally

We tend to revisit websites, so DNS entries can be cached locally to avoid the latency of DNS requests over the network. The lookup proceeds in the following order:

  1. The browser’s local cache is checked.
  2. If the browser’s local cache does not have the record, the OS’s cache is checked.
  3. If the record does not reside in the OS’s cache either, the DNS server configured on the system is asked. This may be a DNS server in the home router, one run by the ISP (Internet Service Provider), or a public DNS server.
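The lookup order above can be sketched with two dictionaries standing in for the caches. Everything here is a simplification for illustration: the function name, the dictionary-based caches, and the choice to copy answers back into the faster caches are assumptions, and real caches also expire entries after a time limit, which this sketch leaves out.

```python
from typing import Callable, Optional


def resolve_with_caches(
    name: str,
    browser_cache: dict[str, str],
    os_cache: dict[str, str],
    ask_dns_server: Callable[[str], Optional[str]],
) -> tuple[Optional[str], str]:
    """Return (ip, source), checking the cheapest source first."""
    # 1. The browser's own cache.
    if name in browser_cache:
        return browser_cache[name], "browser cache"
    # 2. The operating system's cache.
    if name in os_cache:
        # Assumption: remember the answer in the browser cache for next time.
        browser_cache[name] = os_cache[name]
        return os_cache[name], "OS cache"
    # 3. The DNS server configured on the system (router, ISP, or public).
    ip = ask_dns_server(name)
    if ip is not None:
        os_cache[name] = ip
        browser_cache[name] = ip
    return ip, "configured DNS server"
```

Here ask_dns_server is any function that takes a domain name and returns an address or None. In a test it can be a stub that returns a fixed address from a documentation range such as 192.0.2.20.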

DNS servers

If the record is not found in any of these places, the DNS hierarchy itself is queried. DNS servers are divided into zones that form a hierarchy.

The servers at the top are called ‘root servers,’ and they store the IP addresses of other DNS servers, called top-level domain servers. Top-level domain (TLD) servers are divided by type, e.g. .com, .edu, .org, etc.

TLD servers have mappings to ‘second-level domain’ servers, such as the servers for wikipedia.com and educative.io. These second-level DNS servers contain mappings to the actual servers that host the website in question.

Hence, if the record cannot be found locally, a full DNS resolution is conducted as follows:

  1. The first point of contact for a full resolution is a root server. As of the writing of this course, 1084 instances of root servers exist. Check out https://root-servers.org/ for more details, including an interactive map!

  2. The root server returns the IP address of the relevant top-level domain server.

  3. The top-level domain server returns the IP address of the second-level domain server.

  4. The second-level domain server contains the DNS record of the server we are looking for. The second-level domain server returns the IP address to the browser.
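The four steps above can be mimicked with a small in-memory simulation. All of the data below is hypothetical: the server labels are invented, the addresses come from ranges reserved for documentation, and plain labels stand in for the IP addresses that real referrals would carry. No network traffic is involved.

```python
# Hypothetical zone data, invented for illustration only.
ROOT_SERVER = {"io": "io-tld-server", "com": "com-tld-server"}
TLD_SERVERS = {
    "io-tld-server": {"educative.io": "educative-dns-server"},
    "com-tld-server": {"wikipedia.com": "wikipedia-dns-server"},
}
SECOND_LEVEL_SERVERS = {
    "educative-dns-server": {"www.educative.io": "192.0.2.1"},
    "wikipedia-dns-server": {"www.wikipedia.com": "198.51.100.7"},
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Return (ip, trail), walking root -> TLD -> second-level server."""
    host = name.lower().rstrip(".")
    labels = host.split(".")
    if len(labels) < 2:
        raise ValueError(f"expected a name like www.example.com, got {name!r}")
    tld = labels[-1]                       # rightmost label, e.g. "io"
    second_level = ".".join(labels[-2:])   # e.g. "educative.io"

    # Steps 1 and 2: the root server points to the TLD server.
    tld_server = ROOT_SERVER[tld]
    # Step 3: the TLD server points to the second-level domain server.
    sld_server = TLD_SERVERS[tld_server][second_level]
    # Step 4: the second-level domain server holds the actual record.
    ip = SECOND_LEVEL_SERVERS[sld_server][host]

    trail = [
        f"root server refers .{tld} to {tld_server}",
        f"{tld_server} refers {second_level} to {sld_server}",
        f"{sld_server} answers {host} = {ip}",
    ]
    return ip, trail
```

With this made-up data, resolve("www.educative.io") returns the address 192.0.2.1 along with a three-entry trail of referrals. A name whose TLD is missing from ROOT_SERVER raises a KeyError.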

A good way to think of this is that a domain name is resolved in reverse, from right to left: for www.educative.io, the root is consulted first, then io, then educative.

Sending the request

Now that we have an IP address and we know who to fire off the request to, we can do so with HTTP. We have learned all about HTTP in previous lessons. Check them out for a refresher!
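To make this step concrete, here is a minimal sketch of what sending the request might look like at the socket level. The helper names build_get_request and fetch are invented for this example. The sketch speaks plain HTTP only, since HTTPS would also need a TLS layer, and it reads until the server closes the connection rather than parsing the response.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # tells the server which site we want
        "Connection: close",      # ask the server to close when it is done
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(ip: str, port: int, host: str, path: str = "/") -> bytes:
    """Send the request to the resolved IP address and return the raw reply."""
    with socket.create_connection((ip, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # an empty read means the server closed
                break
            chunks.append(data)
    return b"".join(chunks)
```

Note that the IP address decides where the bytes go, while the Host header carries the domain name so that a server hosting several sites knows which one is being requested.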

Making requests to other assets

In response to a request for the home page of a website, you generally get a root HTML file that references the other assets it needs, such as images. These assets are also requested via HTTP, but the browser issues these requests automatically.
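One way to picture this is to scan a root HTML file for the tags that point at other resources. The sketch below does so with Python's built-in HTML parser. The class and function names are invented for this example, only a few common tags are covered, and a real browser discovers assets in more ways than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect URLs of resources a browser would likely fetch for a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.assets: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            # Relative URLs are resolved against the page's own URL.
            self.assets.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "link" and attributes.get("href"):
            rel = (attributes.get("rel") or "").lower().split()
            # Stylesheets and icons are fetched; other link types are skipped.
            if "stylesheet" in rel or "icon" in rel:
                self.assets.append(urljoin(self.base_url, attributes["href"]))


def find_assets(html: str, base_url: str) -> list[str]:
    """Return asset URLs referenced by the HTML, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

Ordinary hyperlinks (a tags) are deliberately ignored here, because the browser only follows those when the user clicks them.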

Check out the following app. It requests an image hosted at Imgur as well as Educative’s icon. So, if it were returned as the home page of a website, the browser would automatically fetch these images.