Meme-Scraper 🔥

Ran Crump
3 min read · Mar 4, 2019

How to pull the spiciest memes with Node.js 👌💯

Introduction 📌

⇰ About a year and a half ago, when I was getting into JavaScript and front-end web development, I ran across APIs (Application Programming Interfaces). They are very useful because you can query a server and receive data in a format such as JSON (JavaScript Object Notation). That data can be very useful, but sometimes the site you care about doesn't have an API. What are you to do? Do you want to save the data by hand? What if it needs to be asynchronous/RESTful? This is when web scraping comes in handy. You can fetch the HTML from a website and parse the data out of it to build a JSON object, store the data, or send it out like I do with my Discord bot.

Prerequisites 🤓

◈ This will be using Node.js. If you don't have it, you can download it here: https://nodejs.org/

☄ Some basic understanding of JavaScript: arrays, variables, and functions

☕ A brief understanding of Terminal or Command Prompt (DOS) commands such as change directory. If not, follow the “Getting Started” section below

❖ Any knowledge of npm (Node Package Manager), which is what we will use to install all of our dependencies. If you do not have it, you can download it either directly from https://www.npmjs.com/get-npm or from https://nodejs.org/ (npm and Node come together as a package).

Getting Started 🏃

◆ Open your Command Prompt/Terminal and change the directory to your Desktop (cd Desktop/).

◇ Next, create a new directory on your Desktop called webscraper (mkdir webscraper).

◆ Change the active directory to webscraper (cd webscraper/).

◇ In the webscraper directory, initialize a package file (npm init).

◆ It will ask you a series of questions. Just keep hitting Enter until it asks you to confirm (yes), then type “yes”.

Installing Dependencies ⚒

◇ For this project we will install two dependencies, Cheerio and Request. To install them, open your Command Prompt or Terminal and type:

npm install cheerio request

◆ This will install them into a new folder called node_modules, along with the packages they depend on.
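
To confirm that both packages are in place, a quick check like the one below can help. It is only a sketch: isInstalled is a hypothetical helper name, not something provided by either package, and it relies on Node's built-in require.resolve, which throws when a module cannot be found.

```javascript
// Hypothetical helper (not part of cheerio or request): reports whether a
// module can be resolved from the current project.
function isInstalled(name) {
  try {
    require.resolve(name); // throws if the module cannot be found
    return true;
  } catch (err) {
    return false;
  }
}

['cheerio', 'request'].forEach(function (name) {
  console.log(name + ': ' + (isInstalled(name) ? 'installed' : 'missing'));
});
```

Save it as a file inside the webscraper directory and run it with node. If either line says "missing", run the npm install command again from that directory.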

Source Code 💻

◇ The way this works is that we use the request module to fetch all of the HTML from the site's URL.

◆ We then use Cheerio to parse that HTML into selectable elements and store the result in the ‘$’ variable so we can query it later:

const $ = cheerio.load(html);

◇ We then gather all the media elements from the webpage and store their URLs in an array:

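var returnInfo = []; // collects the image URLs found on the page
$('.media-element').each(function(i, element){
  var temp = $(this).attr('src'); // URL of the current media element
  returnInfo.push(temp);
});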

◆ Next, we generate a random index between 0 and one less than the number of image URLs we collected:

var randomNum = Math.floor(Math.random() * returnInfo.length);

◇ We can then return the image URL by selecting that random index in the array:

console.log(returnInfo[randomNum]);
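
The random pick can also be wrapped in a small function. This is a sketch: pickRandom is a hypothetical helper name, and the empty-array check covers the case where the selector matched nothing on the page, which would otherwise leave you logging undefined.

```javascript
// Hypothetical helper: returns a random element, or undefined if the array is empty.
function pickRandom(items) {
  if (items.length === 0) {
    return undefined;
  }
  // Math.random() is in [0, 1), so the index always falls in [0, items.length - 1].
  const index = Math.floor(Math.random() * items.length);
  return items[index];
}

// Placeholder URLs standing in for the scraped results.
const memeUrls = ['https://example.com/a.jpg', 'https://example.com/b.jpg'];
const meme = pickRandom(memeUrls);
console.log(meme !== undefined ? meme : 'No memes found');
```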

◆ If you want to hook this up to something like a Discord bot, you can have it send returnInfo[randomNum] as a message, or embed it into a message to hide the URL link.
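
As a rough illustration of the embed idea, Discord embeds carry their picture under image.url, so a plain object like the one below is enough to show the image instead of a bare link. buildMemeEmbed and the title text are hypothetical, and how you send the object depends on the Discord library and version you use, so check its documentation for the exact call.

```javascript
// Hypothetical helper: wraps an image URL in a Discord-style embed object.
function buildMemeEmbed(imageUrl) {
  return {
    title: 'Fresh meme',      // placeholder title
    image: { url: imageUrl }  // Discord renders this URL as the embed image
  };
}

const embed = buildMemeEmbed('https://example.com/a.jpg'); // placeholder URL
console.log(JSON.stringify(embed, null, 2));
```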

Hope you enjoyed this dank explanation of meme-scraping. If you want to see more examples, I encourage you to take a look at my Discord bot on GitHub and my other web-scraping examples here:

https://github.com/Ranner198/DiscordBot
http://rancrump.com/webscrape-js/
https://github.com/Ranner198/WebscraperJS
