How to only output the header when building a client side http downloader

Time:01-15

I am building a client side HTTP web downloader.

I have been able to successfully receive a reply/data from the server using this line:

char serverreply[4096];
int x = recv(sockfd, serverreply ,MAXLINE,0);

After the GET request is sent, this call receives the response shown below. The header runs up to and including the Connection: close line.

It ends with a unique \r\n\r\n.

With that, how can I only output everything before <!doctype html> to the terminal?

Thank you

Path:index.html, port: 80, Ipaddress 93.184.216.34 The message was sent
File was opened
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 27639
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Fri, 14 Jan 2022 06:16:48 GMT
Etag: "3147526947"
Expires: Fri, 21 Jan 2022 06:16:48 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (oxr/8321)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256
Connection: close

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }



Here is my while (1) loop:

while (1) {
    int x = recv(sockfd, serverreply, MAXLINE, 0);
    if (x < 0) {
        printf("\nError generated by socket layer: %d", errno);
    } else {
        if (x > 0) {
            if (argc == 4) {
                if (strcmp(forh, "-h") == 0) {
                    // this is where I need to print only the header

                    break;
                }
            }
        }
    }
}


CodePudding user response:

Keep reading until you find a \r\n\r\n in the buffer or the connection closes. When you find the \r\n\r\n, output everything up to that point.

If you don't have memmem, it may be worth implementing it. strstr is tedious to use in an application like this because it expects a null-terminated string, which recv() does not give you.
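
A minimal sketch of this approach, assuming the caller appends every recv() chunk to one growing buffer so that the \r\n\r\n can never be cut in half. The names find_seq and header_length are hypothetical helpers invented for this example; find_seq is a simple length-bounded stand-in for memmem, which is an extension and not part of ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memmem(): look for needle inside hay,
   using explicit lengths instead of relying on a terminating '\0'.
   Returns a pointer to the first match, or NULL if there is none. */
const char *find_seq(const char *hay, size_t hay_len,
                     const char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return hay;
    if (hay_len < needle_len)
        return NULL;
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    }
    return NULL;
}

/* Hypothetical helper: buf holds all bytes received so far (len of them).
   Returns the size of the header including the final "\r\n\r\n",
   or 0 if the end of the header has not arrived yet. */
size_t header_length(const char *buf, size_t len)
{
    assert(buf != NULL);
    const char *end = find_seq(buf, len, "\r\n\r\n", 4);
    return end ? (size_t)(end - buf) + 4 : 0;
}
```

Once header_length returns a nonzero value n, something like fwrite(buf, 1, n, stdout) prints only the header. While it returns 0, keep calling recv() and appending to the buffer.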

CodePudding user response:

int x = recv(sockfd, serverreply ,MAXLINE,0);

It ends with a unique \r\n\r\n.

With that, how can I only output everything before <!doctype html> to the terminal?

I have already been in the same situation multiple times and I typically used the following method to detect the \r\n\r\n sequence:

int state = 0;
...
for(/* each character received do ... */ ...)
{
    if(character == '\r' && state == 2) state = 3;
    else if(character == '\r') state = 1;
    else if(character == '\n' && state == 1) state = 2;
    else if(character == '\n' && state == 3)
    {
        /* "\r\n\r\n" has been received */
        state = 4;
        putchar('\n');
        ...
        break;
    }
    else state = 0;

    /* Print the character to the console */
    putchar(character);
}

This method has the following advantage:

Depending on the TCP connection (MTU, speed ...), recv() might return fewer than MAXLINE bytes even though more data is still coming.

This means you have to call recv() multiple times until recv() returns zero or a negative value.

It might happen that the first call of recv() returns HTTP/1.0 ... close\r\n and the second call returns \r\n<!doctype ....

In this situation, the array serverreply[] does not contain the whole sequence \r\n\r\n so searching for that sequence in the array (e.g. using memmem()) will not work.

Using my method, you set the variable state to zero before the first recv() and you don't change the value of state when recv() is called again.

For this reason, the method will also detect the \r\n\r\n if it is "split" over multiple recv() calls.

Edit

More detailed example:

char state = 0;
char character;
int index;
int length;
char serverreply[MAXLINE];

do {
    length = recv(sockfd, serverreply, MAXLINE, 0);
    for(index = 0; index < length; index++)
    {
        character = serverreply[index];
        if(state >= 4)
        {
            /* Character belongs to the data */
        }
        else
        {
            putchar(character);
            if(character == '\r' && state == 2) state = 3;
            else if(character == '\r') state = 1;
            else if(character == '\n' && state == 1) state = 2;
            else if(character == '\n' && state == 3) state = 4;
            else state = 0;
        }
    }
} while(length > 0);
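
The same state machine can also be wrapped in a function, which makes it easy to try out without a socket. This is only a sketch: struct header_scanner and scan_header are hypothetical names invented here, and the function reports how many bytes of a chunk belong to the header instead of printing them itself.

```c
#include <assert.h>
#include <stddef.h>

/* Scanner state kept across recv() calls:
   0 = nothing matched, 1 = "\r", 2 = "\r\n",
   3 = "\r\n\r", 4 = "\r\n\r\n" seen (header complete). */
struct header_scanner {
    int state;
};

/* Feed one received chunk to the scanner.
   Returns how many leading bytes of this chunk belong to the header.
   If the result is smaller than len, the remaining bytes are body data. */
size_t scan_header(struct header_scanner *s, const char *chunk, size_t len)
{
    assert(s != NULL);
    size_t i = 0;
    while (i < len && s->state < 4) {
        char c = chunk[i++];
        if (c == '\r' && s->state == 2)      s->state = 3;
        else if (c == '\r')                  s->state = 1;
        else if (c == '\n' && s->state == 1) s->state = 2;
        else if (c == '\n' && s->state == 3) s->state = 4;
        else                                 s->state = 0;
    }
    return i;
}
```

Inside the receive loop, one possible use is n = scan_header(&scanner, serverreply, length) followed by fwrite(serverreply, 1, n, stdout), with the scanner initialised to state 0 before the first recv(). Because the state lives outside the function, a \r\n\r\n that is split over two recv() calls is still detected.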