Getting started with LibCurl

Using LibCurl to download a webpage in C

Pranav Prakash
Nov 21, 2014

LibCurl is an open-source client-side URL transfer library supporting multiple protocols. It runs on almost every common platform, making it one of the most versatile libraries of its kind.

In this post, I will show you how to use LibCurl in C to make simple HTTP requests with its built-in cookie processor to programmatically log in to websites and download web pages.

Setup / Installation

Download LibCurl for your platform here. I will be using it on Windows with the MinGW compiler, so I downloaded the Win32 - Generic build. I extracted the zip file and copied the contents of its “include” and “lib” folders to MinGW’s “include” and “lib” folders respectively. That’s it; you are ready to write your C program.

Retrieving a Webpage

The program I am going to write will POST the login data to a login page, retrieve the cookie, and use it to download a webpage.

First we need to include LibCurl’s header file, <curl/curl.h>.

Now we need to initialize LibCurl globally, as below:

curl_global_init(CURL_GLOBAL_ALL);

This initializes LibCurl’s internal modules. Now we need to create a handle to make HTTP requests.

CURL *myHandle;
CURLcode result; // We’ll store the result of CURL’s webpage retrieval, for simple error checking.
myHandle = curl_easy_init();

Now we will set our user agent, as some websites reject HTTP requests without a proper one. We’ll also tell LibCurl to follow redirects, and to fill in the Referer header when it does, as many login pages redirect users to a home screen. We only need to do this once; LibCurl keeps these settings in effect on the handle until we change them.

curl_easy_setopt(myHandle, CURLOPT_USERAGENT, "Mozilla/4.0");
curl_easy_setopt(myHandle, CURLOPT_AUTOREFERER, 1L);
curl_easy_setopt(myHandle, CURLOPT_FOLLOWLOCATION, 1L);

Next, we’ll enable LibCurl’s automatic processing of cookies.

curl_easy_setopt(myHandle, CURLOPT_COOKIEFILE, ""); // An empty file name enables the cookie engine without reading a file.

Now we’ll tell LibCurl which URL to fetch.

curl_easy_setopt(myHandle, CURLOPT_URL, "http://www.example.com");

Let’s visit the login page once to obtain a PHPSESSID cookie.

curl_easy_perform(myHandle);

Next, set the HTTP Referer header explicitly, or the website may deny the login.

curl_easy_setopt(myHandle, CURLOPT_REFERER, "http://www.example.com");

Now we'll prepare the data to post to the login form.

const char *data = "username=your_username_here&password=your_password_here";
curl_easy_setopt(myHandle, CURLOPT_POSTFIELDS, data);

You may have to change the data string to match the field names of the login form. Next we'll make the second HTTP request, this time a POST, to obtain the real page.
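
Usernames and passwords that contain characters such as &, = or spaces need to be percent-encoded before they go into the POST string. Below is a minimal sketch in plain C, assuming two hypothetical helpers written for this post, percent_encode() and build_login_fields(), and the username/password field names used above. LibCurl's own curl_easy_escape() can do the encoding instead when a handle is available.

```c
#include <stdio.h>

/* Characters that may appear unencoded (RFC 3986 "unreserved"). */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical helper: percent-encodes src into dst (capacity dst_size).
 * Returns 0 on success, -1 if dst is too small. */
static int percent_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        if (is_unreserved(c)) {
            if (out + 1 >= dst_size) return -1;   /* keep room for NUL */
            dst[out++] = (char)c;
        } else {
            if (out + 3 >= dst_size) return -1;   /* %XX plus NUL */
            dst[out++] = '%';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        }
    }
    if (out >= dst_size) return -1;               /* covers dst_size == 0 */
    dst[out] = '\0';
    return 0;
}

/* Hypothetical helper: builds "username=...&password=..." with both
 * values encoded. Returns 0 on success, -1 if anything does not fit. */
static int build_login_fields(const char *user, const char *pass,
                              char *dst, size_t dst_size)
{
    char u[256], p[256];
    int n;

    if (percent_encode(user, u, sizeof u) != 0 ||
        percent_encode(pass, p, sizeof p) != 0)
        return -1;

    n = snprintf(dst, dst_size, "username=%s&password=%s", u, p);
    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

The resulting buffer can be passed to CURLOPT_POSTFIELDS in place of the hard-coded string. LibCurl does not copy that data, so the buffer has to stay valid until the transfer has finished.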

curl_easy_perform(myHandle);

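And lastly we'll clean up, releasing first the handle and then the global state set up by curl_global_init().

curl_easy_cleanup(myHandle);
curl_global_cleanup();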

Below is the complete code.

Next we'll compile the code.

gcc downloader.c -o downloader.exe -lcurl -lwsock32 -lidn -lwldap32 -lssh2 -lrtmp -lcrypto -lz -lws2_32 -lwinmm -lssl

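Running the resulting executable fetches the webpage. By default LibCurl writes the response body to standard output, so redirect the output to a file if you want to save the page.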

Epilogue

LibCurl is used in numerous real-world applications. Refer to this page for a list of such applications. You can customize the code above to build applications suited to your needs.

Originally published at pranavprakash.net on March 20, 2014.

Pranav Prakash

Software Developer, Photographer, Computer Addict & Web Enthusiast