Getting started with LibCurl

Using LibCurl to download a web page in C

Nov 21, 2014

LibCurl is an open-source client-side URL transfer library supporting multiple protocols. It runs on almost every common platform, making it one of the most versatile libraries of its kind.

In this post, I will show you how to use LibCurl in C to make simple HTTP requests with its built-in cookie processor to programmatically log in to websites and download web pages.

Setup / Installation

Download the LibCurl build for your platform from the LibCurl download page. I am using it on Windows with the MinGW compiler, so I downloaded the Win32 Generic build. I extracted the zip file and copied the contents of its "include" and "lib" folders to MinGW's "include" and "lib" folders respectively. That's it, you are ready to write your C program.

Retrieving a Webpage

The program will POST the login data to a login page, keep the session cookie it receives, and use that cookie to download a web page.

First we need to include LibCurl's header file.

#include <curl/curl.h>

Now we need to initialize LibCurl globally, as below.

curl_global_init(CURL_GLOBAL_ALL);

This sets up LibCurl's internal state and needs to be done only once per program. Now we need to create a handle to make HTTP requests.

CURL *myHandle;
CURLcode result; // We’ll store the result of CURL’s webpage retrieval, for simple error checking.
myHandle = curl_easy_init();

Now we will define our user agent as some websites reject HTTP requests without a proper user agent. We’ll also specify for LibCurl to follow redirects, as many login pages redirect users to a home screen. Remember, we only need to do this once. LibCurl will keep these settings in effect unless we change them.

curl_easy_setopt(myHandle, CURLOPT_USERAGENT, "Mozilla/4.0");
curl_easy_setopt(myHandle, CURLOPT_AUTOREFERER, 1L);
curl_easy_setopt(myHandle, CURLOPT_FOLLOWLOCATION, 1L);

Next, we'll enable LibCurl's automatic processing of cookies. Passing an empty string turns on the cookie engine without reading any cookies from a file.

curl_easy_setopt(myHandle, CURLOPT_COOKIEFILE, "");

Now we’ll tell LibCurl which URL to fetch.

curl_easy_setopt(myHandle, CURLOPT_URL, "http://www.example.com");

Let’s visit the login page once to obtain a PHPSESSID cookie.

curl_easy_perform(myHandle);

Next, set the HTTP Referer header, since some websites deny the login without it.

curl_easy_setopt(myHandle, CURLOPT_REFERER, "http://www.example.com");

Now we'll prepare the data to post to the login form.

const char *data = "username=your_username_here&password=your_password_here";
curl_easy_setopt(myHandle, CURLOPT_POSTFIELDS, data);

You may have to change the data string to match the field names used by the login form. Next we'll make the second HTTP request, this time a POST, to obtain the real page.

curl_easy_perform(myHandle);
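
By default LibCurl writes the response body to standard output. To keep the page in memory instead, a write callback can be registered with CURLOPT_WRITEFUNCTION and CURLOPT_WRITEDATA. The sketch below shows only the callback side, in plain C; the struct page_buffer and the function collect_body are hypothetical names chosen for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer holding the response body. */
struct page_buffer {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* bytes stored, not counting the NUL */
};

/* Has the shape LibCurl expects for CURLOPT_WRITEFUNCTION: it is handed
   size * nmemb bytes and returns how many of them it consumed. */
static size_t collect_body(char *chunk, size_t size, size_t nmemb, void *userdata)
{
    struct page_buffer *buf = userdata;
    size_t incoming = size * nmemb;
    char *grown = realloc(buf->data, buf->len + incoming + 1);

    if (grown == NULL)
        return 0;  /* consuming fewer bytes than offered makes LibCurl abort */

    memcpy(grown + buf->len, chunk, incoming);
    buf->data = grown;
    buf->len += incoming;
    buf->data[buf->len] = '\0';
    return incoming;
}
```

To use it, declare a struct page_buffer initialized to {NULL, 0}, pass collect_body as the CURLOPT_WRITEFUNCTION option and the address of the buffer as the CURLOPT_WRITEDATA option before calling curl_easy_perform(), and free the data pointer once the page has been processed.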

Lastly, we'll clean up the handle and LibCurl's global state.

curl_easy_cleanup(myHandle);
curl_global_cleanup();

Below is the complete code.
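
What follows is a sketch of how the steps above might fit together in one file, not a definitive listing. It assumes the placeholder URL and form fields used earlier, which need to be replaced with the values of the real site, and it adds basic error checks with curl_easy_strerror() that the individual steps left out. The response body goes to standard output, which is LibCurl's default.

```c
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    /* Placeholder values from the steps above; adjust for the real site. */
    const char *login_url = "http://www.example.com";
    const char *data = "username=your_username_here&password=your_password_here";
    CURL *myHandle;
    CURLcode result;

    if (curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK) {
        fprintf(stderr, "Could not initialize LibCurl\n");
        return 1;
    }

    myHandle = curl_easy_init();
    if (myHandle == NULL) {
        fprintf(stderr, "Could not create a handle\n");
        curl_global_cleanup();
        return 1;
    }

    /* Settings that stay in effect for every request on this handle. */
    curl_easy_setopt(myHandle, CURLOPT_USERAGENT, "Mozilla/4.0");
    curl_easy_setopt(myHandle, CURLOPT_AUTOREFERER, 1L);
    curl_easy_setopt(myHandle, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(myHandle, CURLOPT_COOKIEFILE, "");  /* enable the cookie engine */

    /* First request: visit the login page to pick up the session cookie. */
    curl_easy_setopt(myHandle, CURLOPT_URL, login_url);
    result = curl_easy_perform(myHandle);

    if (result == CURLE_OK) {
        /* Second request: POST the login form, sending the cookie back. */
        curl_easy_setopt(myHandle, CURLOPT_REFERER, login_url);
        curl_easy_setopt(myHandle, CURLOPT_POSTFIELDS, data);
        result = curl_easy_perform(myHandle);
    }

    if (result != CURLE_OK)
        fprintf(stderr, "Transfer failed: %s\n", curl_easy_strerror(result));

    curl_easy_cleanup(myHandle);
    curl_global_cleanup();
    return result == CURLE_OK ? 0 : 1;
}
```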

Next we'll compile the code.

gcc downloader.c -o downloader.exe -lcurl -lwsock32 -lidn -lwldap32 -lssh2 -lrtmp -lcrypto -lz -lws2_32 -lwinmm -lssl

Running the resulting executable fetches the web page.

Epilogue

LibCurl is used in numerous real-world applications. Refer to this page for a list of such applications. You can customize the code above to build applications suited to your needs.

Originally published at pranavprakash.net on March 20, 2014.

Written by Pranav Prakash