HTTP Protocol overview – part I
Today I’m talking about the HTTP protocol because, in my experience, some web designers and developers build web applications without understanding how the web works. You may ask: “Why is this guy picking on me? Why do I really need to know about this protocol? If it works, it’s fine!” You’ll find the answers in this post, and if something is still unclear, feel free to post back or contact me!
First of all, what does HTTP stand for? OK, this one is too easy: HyperText Transfer Protocol. Let’s make the question harder. Which paradigm is HTTP built on? Hmm… Request/response, client/server. Obvious? I think so too.
So HTTP runs over TCP, using the well-known port 80 by default, and so far it is the most widely used protocol on the Internet. I think everyone who has a computer connected to the Internet has already used a browser to access it. I truly hope so! But few web designers or developers can say what an Internet browser is. Most of them will say “A program that accesses and displays files and other data available on the Internet and other networks” [http://www.yourdictionary.com/ahd/b/b0511650.html, 19-09-2007], but that is the answer I expect from someone who is not a web designer or developer. From a web designer I can understand it, but from a web developer or a systems engineer, no way! From someone who develops for the web, I would really like to hear that an Internet browser is a parser application for formal languages such as HTML, the kind of language whose grammar is described with a notation like BNF (Backus–Naur form). This is the essence of a browser! If you don’t know how to use this notation, I can give some advice: read Programming Languages: Principles and Practice, Second Edition, and use parser generators such as flex/Bison or ANTLR to put it into practice. My first parser was written with flex/Bison to parse an XML file with a known DTD, and it was great. Maybe I’ll post it soon!
In the beginning, HTTP had no official version number, and the first version to be formally specified was HTTP/1.0 (the first HTTP version described in an RFC). Therefore, all the work done before it was labelled HTTP/0.9. If I didn’t mention the creator of this protocol, I would be very unfair to him and to the scientific community: thank you, Tim Berners-Lee, for helping to change the world! Let’s leave the history to Wikipedia and talk about the real stuff!
It’s important to understand the basics of HTTP/0.9, because we have to learn to ride a bike before riding a motorcycle.
HTTP/0.9 is deprecated because it lacks the features needed for the web to work as we know it today. In HTTP/0.9 the only method was GET, and the messages exchanged between client and server had this form:
MESSAGE FORMAT:
HTTP-message = Simple-Request | Simple-Response
REQUEST:
Simple-Request = "GET" SP Request-URI CRLF
Request-URI = absoluteURI | abs_path
absoluteURI = scheme ":" *( uchar | reserved )
abs_path = "/" rel_path
Note: the absoluteURI form is used when the request goes through a proxy.
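To make the grammar a bit more concrete, here is a minimal sketch in Python that checks whether a line of bytes is a valid Simple-Request and extracts the Request-URI. The function name `parse_simple_request` is my own invention for illustration, and the character classes are a simplification of uchar and reserved (any printable ASCII character is accepted), not the full grammar.

```python
import re

# Simple-Request = "GET" SP Request-URI CRLF
# Request-URI is either an absoluteURI (scheme ":" ...) or an abs_path ("/" ...).
# Simplification: any printable, non-space ASCII byte is accepted inside the URI.
SIMPLE_REQUEST = re.compile(
    rb"GET (?P<uri>[A-Za-z][A-Za-z0-9+.\-]*:[\x21-\x7e]*|/[\x21-\x7e]*)\r\n"
)


def parse_simple_request(line: bytes):
    """Return the Request-URI as text, or None if the line is not a Simple-Request."""
    match = SIMPLE_REQUEST.fullmatch(line)
    if match is None:
        return None
    return match.group("uri").decode("ascii")


if __name__ == "__main__":
    print(parse_simple_request(b"GET /foo/bar/text.txt\r\n"))
    print(parse_simple_request(b"GET /foo/bar/text.txt HTTP/1.0\r\n"))
```

Note that a request line carrying a version token, as in HTTP/1.0 and later, is rejected here because the Simple-Request rule has no room for it.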
e.g.:
telnet www.example.com 80 (this request was sent to a server that speaks HTTP/1.1)
GET /foo/bar/text.txt
RESPONSE:
<HTML><HEAD>
<TITLE>404 File Not Found</TITLE>
</HEAD><BODY>
<H1>File Not Found</H1>
The requested URL /foo/bar/text.txt was not found on this server.<P>
</BODY></HTML>
As you can see, the first step is to open a TCP connection to the server on port 80 and send the request. The server responds immediately, sending either the resource or simply a message.
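If you prefer code to telnet, the same exchange can be sketched in Python with a plain TCP socket. This is only a sketch under a few assumptions: the function names `build_simple_request` and `fetch_http09` are hypothetical, only the abs_path form of the Request-URI is handled, and the end of the response is detected by the server closing the connection.

```python
import socket


def build_simple_request(path: str) -> bytes:
    """Build an HTTP/0.9 Simple-Request: "GET" SP Request-URI CRLF."""
    if not path.startswith("/") or any(ch.isspace() for ch in path):
        raise ValueError("expected an abs_path such as /foo/bar/text.txt")
    return f"GET {path}\r\n".encode("ascii")


def fetch_http09(host: str, path: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_simple_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_http09("www.example.com", "/foo/bar/text.txt")` would mirror the telnet session above. Whether a given server still answers a request in this old form depends on its configuration, so don't be surprised if you get an error page or nothing at all.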

If the URI identifies a processing unit (e.g. an executable file), the server executes it and returns its output. This technique is called CGI (Common Gateway Interface). The client can pass parameters to a CGI program through a query string, which starts after the ‘?’ and consists of several fields separated by ‘&’.
E.g.: “http://www.example.com/login?username=test&password=none&depart=5”
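To see how a program on the server side might take such a URL apart, here is a small sketch using Python’s standard `urllib.parse` module. The helper name `query_fields` is made up for this example, and it keeps only the first value of each field, which is a simplification since a field name may appear more than once.

```python
from urllib.parse import urlsplit, parse_qs

URL = "http://www.example.com/login?username=test&password=none&depart=5"


def query_fields(url: str) -> dict:
    """Split the part after '?' into its fields, which are separated by '&'."""
    query = urlsplit(url).query
    # parse_qs maps each field name to a list of values, because a name may repeat.
    return {name: values[0] for name, values in parse_qs(query).items()}


if __name__ == "__main__":
    print(urlsplit(URL).path)
    print(query_fields(URL))
```

For the URL above, the path is `/login` and the fields are `username`, `password` and `depart`, all of them readable by anyone who sees the URL.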
HTTP/0.9 was a blast, because it is difficult to come up with something as efficient and simple as this. HTTP/0.9 can carry any kind of file, and it relies on the TCP stack, sitting on top of it as an application-level protocol (layer 5).
But, as I said at the beginning, HTTP/0.9 is deprecated and obviously has several limitations. First, every request opens a connection that is closed right after the response, which means that if you have X images on your page, the browser will open X + 1 connections in order to present it. Second, HTTP/0.9 has no headers whatsoever, so no metadata can be sent; for example, there is no cache control. Third, the GET method limits the amount of data that can be sent to the server, and that data is visible to the naked eye in the URL. And so on… I think these are the main shortcomings.
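The X + 1 figure from the first limitation is easy to check for a given page. The sketch below counts the `<img>` tags in a piece of HTML with Python’s standard `html.parser` and adds one for the page itself. The names `ImageCounter` and `connections_needed` are hypothetical, and it assumes that images are the only extra resources and that every one of them costs a separate connection.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Count the <img> tags seen while parsing a page."""

    def __init__(self):
        super().__init__()
        self.images = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <IMG> and <img> are both counted.
        if tag == "img":
            self.images += 1


def connections_needed(html: str) -> int:
    """One connection for the page itself plus one per image (X + 1)."""
    counter = ImageCounter()
    counter.feed(html)
    counter.close()
    return counter.images + 1


if __name__ == "__main__":
    page = '<html><body><img src="a.gif"><img src="b.gif"></body></html>'
    print(connections_needed(page))
```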
All this is understandable, because the project wasn’t expected to achieve the importance it has today.
So, in order to address these shortcomings, the protocol evolved up to the current version, HTTP/1.1, and a new version is expected in order to compete with new protocols that are being developed.
Note that studying HTTP/1.0 and HTTP/1.1 matters for both security and efficiency. If you are building a website you have to be careful, because there are several ways to damage it just by manipulating the HTTP version or the headers of a request. In my case, when I’m programming in PHP I always validate the headers, even if it costs some efficiency, because not doing so can put all the work done in jeopardy.
The next part will be about HTTP/1.0, with more practical information. Bye!
References:
http://www2.themanualpage.org/http/http_http09.php3
MOREIRA, André, Redes de Computadores: Protocolo HTTP. Porto: Apontamentos teóricos de Redes de Computadores, Junho 2007.







