Uniform Resource Locator

The Uniform Resource Locator, or URL, is an address that identifies a resource and where that resource can be found. URLs indicate the server, the folders, and the file name where a specific web page lives on the internet. With the rise of malware on the web, it is important to understand the format of URLs so that you are not tricked into visiting a malicious website. You need to be sure that you are going exactly where you want to go, and the only way to know that is to understand how to read a URL.

Before we get started, you will need to click on the “Title” of this post so that the URL will make sense to you. My “Home Page” will show you the URL labeled “https://macarooni.wordpress.com/” but by clicking on the title which is named “Uniform Resource Locator”, it will lead you to a page named “https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/”. You will need to see this entire address in order to comprehend what is coming up next.

The first item in a URL is the “http://” (http://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/), which stands for Hypertext Transfer Protocol. Understanding the technology behind it is not important here, but this protocol is one of the technologies that led to the creation of the World Wide Web. Web page URLs begin with “http://”, except those that use Secure Sockets Layer (SSL) encryption, which begin with “https://”.

The next item in a URL is often the “www.” (http://www.macarooni.wordpress.com/2010/01/13/uniform-resource-locator/), which stands for World Wide Web. It is a conventional label saying that the address belongs to a website. What? You may be saying, I do not see a “www.” in this post’s URL. This can be a little deceiving. Most websites do not require that the “www.” be a part of the web address. In most instances it can be left out, because the site is set up to answer with or without it. If you type “http://google.com”, Google’s servers will redirect you to “http://www.google.com”. Likewise, should you just type “Google.com” in the address bar, your browser will add the “http://” for you and take you to Google’s home page. The more popular the site is, the more likely that typing in vague URLs will direct you to the correct page, without going to a search result page.

I know what you are saying…..Why is this so confusing?

It is confusing and it will only get more so, but this is something that needs to be understood in order to surf the web safely.

In most cases, the next item is the server that you want to connect to. In the case of Google, the server name is “Google.com”. In the case of this page, where you see “macarooni.wordpress.com”, the “macarooni” names a specific area on the server to connect to. WordPress’s servers are shared by many other blogs besides this one, so each blog has its own section on the server where its data is stored. So “macarooni.wordpress.com” is sending you to my section of the wordpress.com server. Here is what it should look like: http://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/

Folders and Subfolders. These follow the name of the server and the location on the server, and they are always separated with a slash (/). So if you look at the URL for this post, you have the http://, then the www. (when there is one), then the area of the server that you are going to, which is named “macarooni”, and then the domain or server, which is “wordpress.com”, meaning that you are connecting to servers at WordPress.
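If you would like to see that split done by a program instead of by eye, here is a minimal sketch in Python. It assumes you have Python 3 available and uses the urlsplit function from the standard urllib.parse module. The variable names are my own, chosen just for illustration.

```python
from urllib.parse import urlsplit

# The address of this post, used as the example throughout.
url = "https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/"

# urlsplit breaks the address into its named pieces.
parts = urlsplit(url)

print("protocol:", parts.scheme)           # the part before "://"
print("server:", parts.hostname)           # the part between "://" and the first single slash
print("folders and file:", parts.path)     # everything from the first single slash onward
```

Note that the “www.”, when a site uses one, simply shows up as the first piece of the server name.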

I did not mention this earlier, and it is an important point. All domains consist of a name followed by a dot and then a Top Level Domain (TLD) name. Many of these are Generic Top Level Domains (gTLDs), which are website categories maintained by the IANA (Internet Assigned Numbers Authority) and used by the DNS (Domain Name System) on the internet. Some of these categories are unrestricted, like .com (commercial sites) and .info (information sites). These unrestricted categories can be used (registered) by anyone. Others are restricted, such as .gov (government), .mil (military), and .edu (education), and can only be used by sites that fall into their particular category. There are also country code domains based on location, such as .ru (Russia), .fm (Federated States of Micronesia), and .tv (Tuvalu). There are many other categories that you may encounter, but these are a few of the main ones. To get more information on the categories, see the Wikipedia article on top-level domains.
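To make the categories a little more concrete, here is a rough sketch in Python that looks up the ending of a server name in a small table. The table holds only the handful of endings named in this post, so it is nowhere near complete, and the names TLD_CATEGORIES and tld_category are ones I made up for this example.

```python
from urllib.parse import urlsplit

# Only the endings mentioned in this post; the real list is much longer.
TLD_CATEGORIES = {
    "com": "unrestricted (commercial sites)",
    "info": "unrestricted (information sites)",
    "gov": "restricted (government)",
    "mil": "restricted (military)",
    "edu": "restricted (education)",
    "ru": "location (Russia)",
    "fm": "location (Federated States of Micronesia)",
    "tv": "location (Tuvalu)",
}

def tld_category(url):
    """Return the category of the URL's ending, if it is in the short table above."""
    hostname = urlsplit(url).hostname or ""
    ending = hostname.rsplit(".", 1)[-1]  # the text after the last dot in the server name
    return TLD_CATEGORIES.get(ending, "not in this short table")

print(tld_category("https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/"))
```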

OK, back to folders and sub-folders. Like I said, these are always separated with a slash (/). The easiest way to explain them is to think of a file cabinet, and I am going to use the URL for this site as an example. Think of the domain (macarooni.wordpress.com) as being the file cabinet itself. Now this file cabinet has drawers in it. After the first slash following the domain is a folder named “2010” (https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/). That represents the drawer in your file cabinet that is labeled “2010”.

Now you open up that drawer and find that you are looking at a bunch of hanging folders. One of those folders will be labeled “01” (https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/). This is known as a sub-folder. There can be a lot of different sub-folders in a URL, but most sites try to keep them to a minimum to make things as easy as possible.

Next in this site’s URL is the “/13/”. Think of this as being a manila folder that is labeled “13”, stored within the hanging folder that is labeled “01”. https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/

The last item in the URL is the actual file that you are going to display. In the case of this site, it is a file named “uniform-resource-locator”. In our file cabinet example, this would be the actual document that you pull out of the manila folder.
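For anyone who likes to see it spelled out, the file cabinet idea can be sketched in a few lines of Python. This is only an illustration, assuming Python 3 and its standard urllib.parse module. The function name path_segments and the list of labels are my own inventions to match the file cabinet example.

```python
from urllib.parse import urlsplit

def path_segments(url):
    """Return the pieces between the slashes that follow the server name, in order."""
    return [piece for piece in urlsplit(url).path.split("/") if piece]

url = "https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/"
segments = path_segments(url)

# Labels from the file cabinet example, outermost first.
labels = ["drawer", "hanging folder", "manila folder", "document"]

print("file cabinet:", urlsplit(url).hostname)
for label, segment in zip(labels, segments):
    print(f"{label}: {segment}")
```

A URL with more or fewer pieces would need a different set of labels, since the four used here fit only this post’s address.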

The actual file that displays on your computer (uniform-resource-locator)….

is in the manila folder (13)….

in the hanging folder (01)….

in the drawer (2010)….

that is in the file cabinet (macarooni.wordpress.com).

https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/

The most important thing to understand is, first, how the Top Level Domain (the gTLD in our example) is displayed: it always follows the last dot in the server name, just before the first single slash. Secondly, all folders and sub-folders are separated by a slash (/).
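Here is a small Python sketch of those two rules, again assuming Python 3 and the standard urllib.parse module. The function name top_level_domain is my own, and the second address is a made-up one that I invented only to show that a dot appearing after the slashes does not count.

```python
from urllib.parse import urlsplit

def top_level_domain(url):
    """Return the text after the last dot in the server name (not in the folders)."""
    hostname = urlsplit(url).hostname or ""  # needs the "http://" or "https://" in front
    return hostname.rsplit(".", 1)[-1]

post_url = "https://macarooni.wordpress.com/2010/01/13/uniform-resource-locator/"
# Hypothetical address: the ".gov" here is part of a file name, not the server name.
made_up_url = "https://macarooni.wordpress.com/notes.gov"

print(top_level_domain(post_url))
print(top_level_domain(made_up_url))
```

Both addresses end up with the same answer, because only the server name, the part before the first single slash, is examined.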

Why is this so important?

Check for my next post, which will explain how these URLs can be manipulated to redirect you to sites that contain malicious software. I will also explain a little about Secure Sockets Layer (SSL) encryption.

Let me know what you think…..Does this make sense?

Feel free to leave a comment….
