In software engineering we love abstractions. They take care of the tedious details and let us put our attention where it belongs. However, there is value in understanding how they do what they do (take this advice from Joel). Following this guidance, I decided to tackle an embarrassingly long-standing gap in my knowledge: how TCP packets are translated into HTTP responses.

This post goes through the steps of writing an HTTP client similar to curl. The steps are:

  • Creating a socket
  • Establishing a TCP connection
  • Sending an HTTP request
  • Reading the HTTP response

The code examples in this post are in C. The purpose of this choice is to be as close as possible to the system calls the kernel provides.

The Boilerplate

Now that we are under the hood, we are exposed to the overhead of establishing a TCP connection. First we need to create a socket. Then we use it to initiate a TCP handshake.

int sockfd = socket(AF_INET, SOCK_STREAM, 0);

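The example above uses the socket function, which belongs to the POSIX sockets API (declared in <sys/socket.h>) rather than to the C standard library. It returns a file descriptor, or -1 on failure. The type SOCK_STREAM reflects the stream-oriented nature of TCP. This will become relevant later.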

A web server is a machine that waits for clients to connect to it. We connect to one with the connect system call, which tells the kernel to initiate the handshake mentioned above. The handshake is costly: it takes a full network round trip before any data can be sent. Hence HTTP clients usually provide ways to reduce that cost, such as reusing one connection for several requests. Our naive implementation will not do that.

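// Needs <netdb.h>, <netinet/in.h>, <stdio.h>, <stdlib.h>, <string.h>, <sys/socket.h>
int portno = 80; // default HTTP port
struct sockaddr_in serv_addr;
struct hostent *server;

// Resolve the host name to an IPv4 address
server = gethostbyname("www.wikipedia.com");
if (server == NULL) {
  fprintf(stderr, "ERROR, no such host\n");
  exit(1);
}

// Build the socket address: address family, IP address, and port
memset(&serv_addr, 0, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);
serv_addr.sin_port = htons(portno); // convert the port to network byte order

// sockfd is the descriptor returned earlier by socket()
if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
  perror("ERROR connecting");
  exit(1);
}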

What happens above is this: we resolve the host name to an address, combine that address and the port number into a socket address, and use the socket address to connect to the host. If the C language constructs are not familiar to you, do not worry. It is unlikely you'll ever need to learn them. If you are hopelessly curious, here are all the details you wish you hadn't asked for.
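As an aside, the gethostbyname man page marks that function as obsolete and points to getaddrinfo instead. The same resolve-then-connect sequence might look roughly like this with getaddrinfo. This is a sketch under assumptions, not the approach the post itself takes: connect_to_host is a hypothetical helper name, and the port is passed as a string because that is what getaddrinfo expects.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolve host and open a TCP connection to it.
 * Returns a connected socket descriptor, or -1 on failure. */
int connect_to_host(const char *host, const char *port) {
  struct addrinfo hints, *res, *p;
  int fd = -1;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
  hints.ai_socktype = SOCK_STREAM; /* stream socket, i.e. TCP */

  if (getaddrinfo(host, port, &hints, &res) != 0)
    return -1;                     /* name resolution failed */

  /* Try each resolved address until one accepts the connection. */
  for (p = res; p != NULL; p = p->ai_next) {
    fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (fd < 0) continue;          /* could not create a socket for this address */
    if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break; /* connected */
    close(fd);                     /* connect failed: try the next address */
    fd = -1;
  }
  freeaddrinfo(res);
  return fd;
}
```

With this helper, a call such as connect_to_host("www.wikipedia.com", "80") would return a descriptor that is ready for the HTTP request to be written to, or -1 if no address could be reached.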


Looking Under the Hood: HTTP Over TCP Sockets