What is an HTTP server

HTTP Server or Web Server: Purpose, Functions and Role of the Server in HTTP

May 28, 2016

Hello, dear visitors of the site ZametkiNaPolyah.ru. We continue to familiarize ourselves with the HTTP protocol in the Servers and Protocols section and its HTTP protocol subsection. Let’s now take a look at HTTP servers. Generally speaking, the HTTP standard does not strictly distinguish between a server and a client; an application can often be both an HTTP server and an HTTP client at the same time. I should say right away that we will not consider HTTP clients separately: I see no practical point in it because, unlike HTTP servers, you come across clients every day.

HTTP server or web server

An HTTP server or web server is a program that understands what the client needs and returns responses, often in the form of HTML pages, which can contain various information: images, text, scripts, files, media data (video and audio), and much more. The HTTP server accepts an HTTP request from a client (the client can be a browser, a mobile phone, a TV, or even an electric kettle, if it can access the Internet) and returns an HTTP response. Note that the HTTP response does not have to be an HTML page; the data may be of a different kind.

Difference between web server and HTTP server

For further understanding, I should clarify that the concept of a web server is much broader than that of an HTTP server. A web server is the software package needed to support web protocols, together with the hardware on which these programs run (physical servers). An HTTP server is just one program that implements communication over the HTTP protocol. Within these notes on the HTTP protocol, however, when I say web server I mean exactly the HTTP server.

HTTP server functions. Web server functions

Let’s consider the main functions that the HTTP server performs:

the main function is to support interaction between computers on the network using the HTTP protocol;
the HTTP server can keep its own logs: errors, user requests, and others;
data encryption (the HTTP protocol itself does not support encryption; the SSL and TLS protocols exist for this, and you can read about them in the notes on encryption and security in HTTP);
the HTTP server must be able to balance the load;
the HTTP server can compress the content of responses;
the HTTP server can be a final (origin) server or a transit one; in the second case it is called a proxy;
the HTTP server must be able to cache;
HTTP version 1.1 servers must support persistent HTTP connections;
servers must be able to manage HTTP conversations;
and much more.

The functions of a web server in the original, broader sense are much wider. For example, the AMPPS web server bundle includes several different servers, applications, and server technologies, including the Apache HTTP server (although Apache’s own functions are somewhat broader than those of a simple HTTP server). Another example of a web server bundle is Denwer, which is heavily outdated by now.

Most Popular HTTP and Web Servers

In general, the list of functions of a web server or HTTP server depends on the program that performs them. So let’s list the most popular HTTP servers:

Apache is the most popular and widespread HTTP server; it is used mainly on Unix systems, but there are versions for Windows operating systems as well. This HTTP server is free;
IIS is a web server from Microsoft, distributed free of charge with operating systems of the Windows family;
nginx is a free HTTP server developed by the Russian programmer Igor Sysoev; it is worth noting that many large projects use it;
Google Web Server is a web server developed and maintained by Google, which reportedly took the Apache HTTP server as a basis and modified it;
Cherokee is a free web server that is managed via a web interface.

We have listed only the most popular HTTP and web servers; in fact, the list is much longer. Still, we can say the following: studying the HTTP protocol is enough to understand how to configure any web server, and understanding how one HTTP server works is enough to start working with the others. Of course, each web server has its own subtleties and features, which can be learned through experimentation or from the documentation (or, better, both), but any web server is based on the HTTP protocol and must comply with its requirements, and HTTP responses must contain all the parameters required for transmission over HTTP.

Introduction

The term “web server” can refer to both hardware and software. Or even both parts working together.

In terms of hardware, a “web server” is a computer that stores site files (HTML documents, CSS styles, JavaScript files, images, and others) and delivers them to the end user’s device (a web browser, for example). It is connected to the Internet and can be accessed through a domain name like mozilla.org.
From a software point of view, a web server includes several components that control how web users access the files located on the server. At a minimum, this is an HTTP server. An HTTP server is a piece of software that understands URLs (web addresses) and HTTP (the protocol your browser uses to view web pages).

At its most basic level, when a browser needs a file hosted on a web server, the browser requests it over HTTP. When the request reaches the right web server (hardware), the HTTP server (software) accepts the request, finds the requested document, and sends it back, also over HTTP. (If the server does not find the document, it returns a 404 response instead.)
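The lookup step described above can be sketched in Python. This is only a minimal illustration, not how any particular server is implemented: the function name serve_static, the index.html default for "/", and the temporary document root are assumptions made for the example.

```python
from pathlib import Path
import tempfile


def serve_static(root: Path, url_path: str) -> tuple[int, bytes]:
    """Return (status code, body) for a requested path under root."""
    # Assumption for this sketch: "/" maps to index.html.
    relative = url_path.lstrip("/") or "index.html"
    candidate = (root / relative).resolve()
    # Refuse paths that escape the document root (e.g. "/../secret").
    if not candidate.is_relative_to(root.resolve()):
        return 404, b"Not Found"
    if candidate.is_file():
        return 200, candidate.read_bytes()
    return 404, b"Not Found"


# Usage: a throwaway document root with a single page.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_text("<h1>Hello</h1>")
    found = serve_static(root, "/")
    missing = serve_static(root, "/no-such-page.html")
```

Here found holds the page with status 200, while missing carries status 404.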

To publish a website, you need either a static or dynamic web server.

A static web server, or stack, consists of a computer (hardware) with an HTTP server (software). We call it “static” because the server sends the hosted files to the browser “as is”.

A dynamic web server consists of a static web server and additional software, most commonly an application server and a database. We call it “dynamic” because the application server updates the hosted files before they are sent to your browser over HTTP.

For example, to get the final page that you view in a browser, the application server might populate an HTML template with data from a database. Sites like MDN or Wikipedia are made up of thousands of web pages, but these are not real HTML documents, only a few HTML templates and gigantic databases. This setup makes it easier and faster to maintain web applications and deliver content.
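A rough sketch of that idea in Python, assuming an in-memory SQLite database and a one-line template. The articles table, the PAGE template, and the render_article function are all invented for this illustration; real application servers are far more elaborate.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical template and table, invented for this sketch.
PAGE = Template("<h1>$title</h1><p>$body</p>")


def render_article(db: sqlite3.Connection, slug: str) -> str:
    """Fill the HTML template with one article row from the database."""
    row = db.execute(
        "SELECT title, body FROM articles WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        raise KeyError(slug)
    title, body = row
    # Escape the stored text so it cannot inject markup into the page.
    return PAGE.substitute(title=escape(title), body=escape(body))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
db.execute("INSERT INTO articles VALUES ('http', 'HTTP', 'A stateless protocol.')")
html_page = render_article(db, "http")
```

One template plus many database rows yields many pages, which is the point made above about MDN and Wikipedia.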

Going deeper

To load a web page, as we said, your browser sends a request to the web server, which starts looking for the requested file in its own storage space. Having found the file, the server reads it, processes it as needed, and sends it to the browser. Let’s take a closer look at these steps.

File hosting

First of all, the web server must contain the website files, namely all HTML documents and related resources, including images, CSS styles, JavaScript files, fonts and videos.

Technically, you can host all of these files on your own computer, but it’s much more convenient to store them on a dedicated web server that:

is always up and running
is always connected to the Internet
has a fixed IP address (not all providers offer a static IP address for home connections)
is maintained by a third-party company

For all of these reasons, finding a good hosting provider is a key part of building your website. Browse the many offers from companies and choose one that suits your needs and budget (offers range from free to thousands of dollars per month). You can find details in this article.

Once you have solved your hosting problem, all you need to do is upload your files to your web server.

HTTP communication

Secondly, the web server provides support for HTTP (Hypertext Transfer Protocol). As the name suggests, HTTP specifies how to transfer hypertext (i.e., linked web documents) between two computers.

A protocol is a set of rules for communication between two computers. HTTP is a text-based, stateless protocol.

Text-based: all commands are plain, human-readable text.
Stateless: neither the client nor the server remembers previous connections. For example, relying on HTTP alone, the server cannot remember the password you entered or which step of a transaction you are on. For such tasks, you need an application server. (We’ll dwell on these technologies in future articles.)

HTTP enforces strict rules for client-server communication. We’ll cover the HTTP protocol itself in a technical article a little later. For now, it’s enough to know about these rules:

Only clients can make HTTP requests, and only to servers. Servers can only respond to a client’s HTTP request.
When requesting a file via HTTP, the client must provide the file’s URL.
The web server must respond to every HTTP request, at least with an error message.

On a web server, an HTTP server is responsible for processing and responding to incoming requests.

When a request is received, the HTTP server first checks if the resource exists at the given URL.
If so, the web server sends the contents of the file back to the browser. If not, the application server generates the required resource.
If none of this is possible, the web server returns an error message to the browser, most often “404 Not Found”. (This error is so common that many web designers spend a lot of time designing 404 error pages.)
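The three steps above can be sketched as a small dispatch function. This is an illustration under simplifying assumptions: the handle function, the static_files and generators dictionaries, and the sample paths are all hypothetical stand-ins for a real file system and application server.

```python
from typing import Callable

NOT_FOUND_BODY = b"<h1>404 Not Found</h1>"


def handle(
    path: str,
    static_files: dict[str, bytes],
    generators: dict[str, Callable[[], bytes]],
) -> tuple[int, str, bytes]:
    """Return (status code, reason phrase, body) for a requested path."""
    # Step 1: a resource that exists as a file is sent back as is.
    if path in static_files:
        return 200, "OK", static_files[path]
    # Step 2: otherwise the application server may generate it.
    if path in generators:
        return 200, "OK", generators[path]()
    # Step 3: nothing matched, so report an error.
    return 404, "Not Found", NOT_FOUND_BODY


static_files = {"/logo.txt": b"static logo"}
generators = {"/time": lambda: b"generated on request"}

results = [
    handle(p, static_files, generators)
    for p in ("/logo.txt", "/time", "/missing")
]
```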

Static and Dynamic Content

Roughly speaking, the server can serve static or dynamic content. “Static” means “given as is.” Static websites are the easiest to create, so we suggest you make your first website static.

“Dynamic” means that the server processes the data or even generates it on the fly from the database. This provides more flexibility, but it is technically harder to implement and maintain, which makes building such a site considerably more complex.

Take for example the page you are currently reading. The web server where it is hosted has an application server that pulls the article content from the database, formats it, adds it to HTML templates, and sends you the result. In our case, the application server is called Kuma, it is written in the Python programming language (using the Django framework). The Mozilla team created Kuma for the specific needs of MDN, but there are many similar applications built on completely different technologies.

There are so many application servers out there that it is hard to recommend a particular one. Some application servers are tailored to specific categories of websites, such as blogs, wikis, or online stores; others, called CMSs (Content Management Systems), are more versatile. If you’re building a dynamic website, take some time to choose the tool that suits your needs. Unless you want to learn web programming (which is fun in itself!), you don’t need to create your own application server. That would just be reinventing the wheel.

Next steps

Now that you are familiar with web servers, you can:

Here’s a description of the main aspects of the HTTP protocol, the network protocol that has allowed your browser to load web pages from the early 90s to this day. This article is written for those who are just starting to work with computer networks and develop network applications, and for whom it is still difficult to read the official specifications on their own.

HTTP is a widespread data transfer protocol, originally intended for the transfer of hypertext documents (that is, documents that can contain links that allow you to organize the transition to other documents).

HTTP stands for HyperText Transfer Protocol. In terms of the OSI model, HTTP is an application-layer (top, 7th layer) protocol. The protocol version discussed here, HTTP 1.1, was described in the RFC 2616 specification, which has since been superseded by newer RFCs (currently RFC 9110 and RFC 9112).

The HTTP protocol assumes the use of a client-server data transfer structure. The client application generates a request and sends it to the server, after which the server software processes this request, generates a response and sends it back to the client. The client application can then continue to send other requests, which will be handled in a similar manner.

A problem that is traditionally solved using the HTTP protocol is the exchange of data between a user application accessing web resources (usually a web browser) and a web server. At the moment, it is thanks to the HTTP protocol that the work of the World Wide Web is ensured.

Also, HTTP is often used as a communication protocol for other application layer protocols such as SOAP, XML-RPC, and WebDAV. In this case, the HTTP protocol is said to be used as a “transport”.

The API of many software products also implies the use of HTTP for data transfer – the data itself can be in any format, for example, XML or JSON.

Typically, data is transmitted over HTTP via TCP/IP connections. Server software usually listens on TCP port 80 (and if the port is not explicitly specified, client software usually uses port 80 by default when opening HTTP connections), although any other port can be used.
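How a client might pick the port can be sketched with Python's standard urllib.parse module. The effective_port function and the example.com URLs are assumptions for illustration; port 443 for HTTPS is the usual default for secure connections.

```python
from urllib.parse import urlsplit

# Default ports: 80 for HTTP and 443 for HTTPS.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the TCP port a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL always wins over the default.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


default_port = effective_port("http://example.com/")
explicit_port = effective_port("http://example.com:8080/")
```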

How do I send an HTTP request?

The easiest way to understand the HTTP protocol is to try to access a web resource manually. Imagine that you are a browser and you have a user who really wants to read Anatoly Alizar’s articles.

Suppose he entered the following in the address bar:

Accordingly, you, as a web browser, now need to connect to the web server at alizar.habrahabr.ru.

To do this, you can use any suitable command line utility. For example telnet:

telnet alizar.habrahabr.ru 80

I’ll clarify right away that if you suddenly change your mind, you can press Ctrl+] and then Enter; this will let you close the connection. Besides telnet, you can try nc (or ncat), whichever you prefer.

After you connect to the server, you need to send an HTTP request. By the way, this is very easy – HTTP requests can consist of only two lines.

To form an HTTP request, you need to compose a start line and set at least one header: the Host header, which is required and must be present in every request. The point is that the domain name is converted to an IP address on the client side, so when you open a TCP connection, the remote server has no information about which address was used for the connection: it could be, for example, alizar.habrahabr.ru, habrahabr.ru, or m.habrahabr.ru, and in each of these cases the answer may differ. In fact, however, the network connection in all cases is opened with the node 212.24.43.44, and even if a domain name rather than this IP address was given when opening the connection, the server is not told about it. That is why the name has to be passed in the Host header.
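On the server side, using the Host header to tell sites apart might look roughly like this. It is a sketch only: the select_site function and the site descriptions in the mapping are hypothetical, and real servers handle more cases (IPv6 literals, wildcard names, default sites).

```python
from typing import Optional


def select_site(headers: dict[str, str], sites: dict[str, str]) -> Optional[str]:
    """Pick a site by the Host header, or None if the host is unknown."""
    host = headers.get("Host", "")
    # The header may carry a port ("habrahabr.ru:80"); keep only the name.
    hostname = host.split(":", 1)[0].lower()
    return sites.get(hostname)


# Hypothetical mapping: several names served from one IP address.
sites = {
    "alizar.habrahabr.ru": "user blog",
    "habrahabr.ru": "main site",
    "m.habrahabr.ru": "mobile site",
}

blog = select_site({"Host": "alizar.habrahabr.ru"}, sites)
main = select_site({"Host": "habrahabr.ru:80"}, sites)
unknown = select_site({}, sites)
```

All three requests could arrive over connections to the same IP address; only the header distinguishes them.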

The start line of an HTTP 1.1 request consists of three parts separated by spaces: the method, the URI, and the protocol version written as HTTP/Version.

For example, the start line GET / HTTP/1.1 might indicate that the site’s home page is being requested.

A method (the English-language literature uses the word method and sometimes verb) is a sequence of any characters except control characters and separators; it determines the operation to be performed on the specified resource. The HTTP 1.1 specification does not limit the number of different methods that can be used. However, to comply with common standards and stay compatible with the widest range of software, as a rule only some of the most standard methods are used, whose meaning is unambiguously defined in the protocol specification.

URI (Uniform Resource Identifier) is a path to a specific resource (for example, a document) on which the operation must be performed (for example, with the GET method, retrieving the resource is implied). Some requests may not refer to any resource; in this case an asterisk (the “*” symbol) can be placed in the start line instead of the URI. For example, it might be a request that relates to the web server itself rather than to any particular resource, with a start line such as OPTIONS * HTTP/1.1.

The version determines which version of the HTTP standard the request is based on. Specified as two numbers separated by a dot (for example, 1.1 ).
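The three parts of the start line can be pulled apart with a few lines of Python. This is a simplified sketch: parse_request_line is a name chosen for the example, it assumes the parts are separated by single spaces, and the OPTIONS line is just an illustrative request that uses an asterisk.

```python
def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split a request start line into (method, URI, version)."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed start line: {line!r}")
    method, uri, protocol = parts
    # The last part looks like "HTTP/1.1": a name, a slash, a version.
    name, _, version = protocol.partition("/")
    if name != "HTTP" or not version:
        raise ValueError(f"malformed protocol version: {protocol!r}")
    return method, uri, version


home_page = parse_request_line("GET / HTTP/1.1")
server_wide = parse_request_line("OPTIONS * HTTP/1.1")
```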

In order to access a web page at a specific address (in this case, the path to the resource is “/”), we should send the following request:

GET / HTTP/1.1
Host: alizar.habrahabr.ru

When doing this, keep in mind that each line must end with a Carriage Return (CR) followed by a Line Feed (LF). After the last header, this line break sequence is sent twice.

However, the HTTP specification recommends programming the HTTP server so that the LF character is interpreted as a line separator when processing requests, and the preceding CR character, if any, is ignored. Accordingly, in practice, most of the servers will correctly process such a request, where the headers are separated by the LF character, and it is added twice after the declaration of the last header.
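The lenient behaviour described here could be sketched as follows. The split_head function is a hypothetical helper for this example: it treats LF as the line separator, drops a preceding CR if there is one, and stops at the empty line that ends the headers.

```python
def split_head(raw: bytes) -> list[str]:
    """Split a request head into lines, accepting CRLF or bare LF."""
    lines = []
    for line in raw.split(b"\n"):
        # Ignore a CR that directly precedes the LF, if present.
        if line.endswith(b"\r"):
            line = line[:-1]
        # An empty line marks the end of the headers.
        if line == b"":
            break
        lines.append(line.decode("ascii"))
    return lines


strict = b"GET / HTTP/1.1\r\nHost: alizar.habrahabr.ru\r\n\r\n"
lenient = b"GET / HTTP/1.1\nHost: alizar.habrahabr.ru\n\n"
```

Both byte strings are split into the same two lines, which is why most servers accept either form.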

If you want to send a request exactly according to the specification, you can use the escape sequences \r and \n:

echo -en "GET / HTTP/1.1\r\nHost: alizar.habrahabr.ru\r\n\r\n" | ncat alizar.habrahabr.ru 80
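The same exchange can be approximated in Python with the standard socket module. Treat this as a sketch rather than a proper client: build_request and fetch are names chosen for the example, and the Connection: close header is an addition not present in the request above, used so that the server closes the connection and the read loop ends.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Compose a GET request with CRLF line endings."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        # Assumption of this sketch: ask the server to close the connection.
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request over TCP and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


request = build_request("alizar.habrahabr.ru")
```

fetch is defined but not called here, because it needs network access and a server that still answers at that address.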

How can I read the answer?

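The start line of the response consists of three parts, in this order: the protocol version (HTTP/Version), the status code, and the explanation of the status code.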

The protocol version is specified here in the same way as in the request.

Status code: three digits (the first of which indicates the class of the status) that determine the result of the request. For example, if the GET method was used and the server provides the resource with the specified identifier, this is indicated by code 200. If the server reports that no such resource exists, the code is 404. If the server reports that it cannot provide access to the resource because the client lacks the necessary privileges, code 403 is used. The HTTP 1.1 specification defines 40 different HTTP codes, and the protocol can be extended with additional status codes.
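Reading the class from the first digit can be sketched in Python. The status_class function is a name chosen for this example; the class names follow the five categories used by the HTTP specification.

```python
# The first digit of a status code selects one of five classes.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class of a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


classes = {code: status_class(code) for code in (200, 302, 403, 404)}
```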

Explanation of the status code (Reason Phrase): a textual explanation of the response code (it must not include the CR and LF characters), intended to make the response easier for a human to read. Client software may ignore the explanation, and in some server implementations it may differ from the standard one.

After the start line come the headers and then the body of the response.

The body of the response follows two line breaks after the last header. To determine where the response body ends, the value of the Content-Length header is used. For instance, a response whose body is the word “Wisdom” followed by a line feed character has a Content-Length of 7 octets (8-bit bytes).
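A minimal sketch of reading such a response in Python. The parse_response function and the sample response are assumptions for illustration (in particular, the Content-Type header in the sample is made up); the sketch handles only responses that carry a Content-Length header.

```python
def parse_response(raw: bytes) -> tuple[str, int, str, dict[str, str], bytes]:
    """Split a raw response into version, code, reason, headers and body."""
    # The head ends at the first empty line (two line breaks in a row).
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length says how many bytes of the rest belong to the body.
    length = int(headers.get("content-length", "0"))
    return version, int(code), reason, headers, rest[:length]


# Hypothetical response, built to match the 7-byte body described above.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 7\r\n"
    b"\r\n"
    b"Wisdom\n"
)
version, code, reason, headers, body = parse_response(sample)
```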

But for the request that we made earlier, the web server will return a response not with code 200, but with code 302. Thus, it informs the client that it is currently necessary to access this resource at a different address.

The new address is passed in the Location header. Now the URI (resource identifier) has changed to /users/alizar/, and this time you need to contact the server at habrahabr.ru (in this case it is the same server, however) and specify that name in the Host header.

GET /users/alizar/ HTTP/1.1
Host: habrahabr.ru
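Working out the follow-up request from a Location header can be sketched with urllib.parse. The next_request function is a hypothetical helper, and the Location values below are assumed for illustration, since the actual header sent by the server is not shown here.

```python
from urllib.parse import urljoin, urlsplit


def next_request(current_url: str, location: str) -> tuple[str, str]:
    """Return (Host header value, request path) for a redirect target."""
    # Location may be absolute or relative to the current URL.
    target = urlsplit(urljoin(current_url, location))
    return target.netloc, target.path or "/"


absolute = next_request(
    "http://alizar.habrahabr.ru/", "http://habrahabr.ru/users/alizar/"
)
relative = next_request("http://habrahabr.ru/", "/users/alizar/")
```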

In response to this request, the Habrahabr web server will already return a response with the code 200 and a fairly large HTML document.

If you have already managed to get used to the role, then you can now read the HTML code received from the server, take a pencil and a notebook, and draw Alizar’s profile – in principle, this is what the browser would do in your place now.

What about security?

By itself, the HTTP protocol does not provide encryption of the transmitted information. Nevertheless, there is a common extension to HTTP that wraps the transmitted data in the cryptographic SSL or TLS protocol.

The name of this extension is HTTPS (HyperText Transfer Protocol Secure). HTTPS connections usually use TCP port 443. HTTPS is widely used to protect information from interception and, as a rule, also protects against man-in-the-middle attacks, provided that the client verifies the certificate, the private key of the certificate has not been compromised, the user has not accepted an unsigned certificate, and no certificate of an attacker’s certification authority has been installed on the user’s computer.
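On the client side, the certificate check mentioned above is typically switched on by default. As a small sketch using Python's standard ssl module (the make_client_context wrapper is a name invented for this example):

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a TLS client context with certificate verification enabled."""
    # create_default_context() loads the system's trusted CA certificates
    # and turns on both certificate and hostname verification.
    return ssl.create_default_context()


context = make_client_context()
```

A socket wrapped with this context refuses a server whose certificate does not validate or does not match the requested host name, which is the condition the protection against man-in-the-middle attacks depends on.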

Currently HTTPS is supported by all popular web browsers.

Are there additional features?

The HTTP protocol offers many options for extension. In particular, the HTTP 1.1 specification provides the Upgrade header for switching to data exchange over a different protocol. A request with this header is sent by the client. If the server needs to switch to another protocol, it can return a response with the “426 Upgrade Required” status, in which case the client can send a new request, this time with the Upgrade header.

This capability is used, in particular, to set up data exchange over the WebSocket protocol (described in RFC 6455, it allows both parties to transmit data whenever they need to, without sending additional HTTP requests): the standard “handshake” comes down to sending an HTTP request with the Upgrade header set to “websocket”, to which the server returns a response with the “101 Switching Protocols” status, after which either side can start transmitting data using the WebSocket protocol.
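One detail of that handshake, not covered above, is that RFC 6455 also has the client send a Sec-WebSocket-Key header and the server answer with a Sec-WebSocket-Accept value derived from it. A sketch of that derivation in Python (accept_key is a name chosen for the example; the GUID and the sample key come from RFC 6455):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing the accept value.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the example handshake in RFC 6455.
accept = accept_key("dGhlIHNhbXBsZSBub25jZQ==")
```

The server places this value in its “101 Switching Protocols” response so the client can confirm that the server understood the WebSocket request.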

Is anything else used, by the way?

At the moment, there are other protocols designed to transfer web content. Specifically, the SPDY protocol (pronounced speedy , not an abbreviation) is a modification of the HTTP protocol that aims to reduce delays in loading web pages as well as provide additional security.

The increase in speed is achieved by compressing, prioritizing and multiplexing the additional resources required for the web page so that all data can be transferred within a single connection.

The November 2012 draft of the HTTP 2.0 protocol specification is based on the SPDY protocol specification. HTTP 2.0 is the next version of the protocol after version 1.1, whose final specification was published in 1999.

Many architectural solutions used in the SPDY protocol, as well as in other proposed implementations that the httpbis working group considered while preparing the draft of the HTTP 2.0 specification, had already been worked out during the development of the HTTP-NG protocol, but work on HTTP-NG was discontinued in 1998.

Currently, support for the SPDY protocol is available in the browsers Firefox, Chromium/Chrome, Opera, Internet Explorer, and Amazon Silk.

Is that all?

In general, yes. Specific methods and headers could be described too, but this knowledge is needed mainly if you are writing something specific (for example, a web server or client software that talks to servers over HTTP); it is not required for a basic understanding of how the protocol works. Besides, all of this is very easy to find through Google: the information is in the specifications, in Wikipedia, and in many other places.

However, if you know English and want to study in depth not only HTTP itself but also the TCP/IP protocols used to carry it, then I recommend reading this article.