Emulating Reddit in the Terminal with Ruby

Marlon Merjos
Aug 24, 2017 · 3 min read

In my search for inspiration for this blog post, I remembered that a friend had recently mentioned that a Reddit page can easily be retrieved as JSON simply by adding .json to the URL. Hoping to gain more experience dealing with unfamiliar data sets and structures, I attempted to emulate Reddit in the terminal by scraping it with the RestClient and JSON gems. Naively, I thought this would be an easy task and that this blog post would be a celebration of my triumphs. To start, I looked at the raw, unparsed JSON for the front page of r/all and quickly realized that getting the data I wanted would not be easy at all.

r/all json

Don’t lose faith. This ugly duckling eventually turned into this… and today I’m going to walk you through my process.

To understand the nested structure of the JSON, I attempted to get all the pertinent information from the first post. Through an iterative process of trial and error, I was finally able to retrieve the title of the first post with @request["data"]["children"][0]["data"]["title"]. (The neighboring "name" key holds the post's internal identifier, not its title.) Success! Now what?

I began with a RedditAdapter class that pulls the JSON from the current Reddit page. In contrast to other adapters I've built, the resulting request needed to be mutable and to update whenever a user changed pages or subreddits. After much trial and error, I found that including a set_up instance method allowed the request and data to be updated dynamically within a single instance of the RedditAdapter class.
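As a rough illustration of the idea, here is a minimal sketch of an adapter whose set_up method can be called again to re-point the same instance at a different subreddit. It is not the original class: the fetcher argument, the first_title helper, and the trimmed SAMPLE_LISTING payload are all assumptions made so the example runs offline. In a live version the fetcher could be something like ->(url) { RestClient.get(url).body }.

```ruby
require "json"

# Hypothetical, heavily trimmed stand-in for a listing response (not real data).
SAMPLE_LISTING = <<~PAYLOAD
  {"kind": "Listing",
   "data": {"after": "t3_example2",
            "children": [
              {"kind": "t3", "data": {"name": "t3_example1", "title": "First post"}},
              {"kind": "t3", "data": {"name": "t3_example2", "title": "Second post"}}
            ]}}
PAYLOAD

class RedditAdapter
  BASE_URL = "https://www.reddit.com"

  attr_reader :url, :request, :posts

  # fetcher: any callable that takes a URL and returns a JSON string
  # (an assumption of this sketch, used to keep it runnable offline).
  def initialize(fetcher, subreddit = "all")
    @fetcher = fetcher
    set_up(subreddit)
  end

  # Re-point this same instance at another subreddit and refresh its data.
  def set_up(subreddit)
    @url = "#{BASE_URL}/r/#{subreddit}.json"
    @request = JSON.parse(@fetcher.call(@url))
    @posts = @request.dig("data", "children") || []
    self
  end

  # "title" is the human-readable title; "name" is the post's identifier.
  def first_title
    @request.dig("data", "children", 0, "data", "title")
  end
end

offline_fetcher = ->(_url) { SAMPLE_LISTING }
adapter = RedditAdapter.new(offline_fetcher)
puts adapter.first_title

adapter.set_up("ruby") # same object, new URL and freshly parsed data
puts adapter.url
```

Passing the fetcher in also makes the class easy to exercise without touching the network.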


To produce a faithful reproduction of Reddit in the terminal, a user needs to be able to interact with the output as if they were in the browser. This led to the creation of the PageInteraction class. This class encapsulated the functionality of a Reddit page, such as changing pages, and updated the underlying URL. The biggest challenge with the RedditAdapter and PageInteraction classes was ensuring that the resulting URL was correct.

PageInteraction class
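To make the URL bookkeeping concrete, here is a minimal sketch of what such a class could look like. The method names (next_page, change_subreddit, url) and the page size of 25 are assumptions for illustration. It relies on Reddit listings accepting count and after query parameters, where after is the "name" of the last post on the current page.

```ruby
require "uri"

class PageInteraction
  BASE_URL = "https://www.reddit.com"
  PAGE_SIZE = 25 # assumed number of posts per page

  attr_reader :subreddit, :page

  def initialize(subreddit = "all")
    @subreddit = subreddit
    @page = 0
    @after = nil
  end

  # Advance one page. last_fullname is the "name" of the last post
  # currently shown (for example "t3_abc123").
  def next_page(last_fullname)
    @page += 1
    @after = last_fullname
    self
  end

  # Switching subreddits resets the pagination state.
  def change_subreddit(name)
    @subreddit = name
    @page = 0
    @after = nil
    self
  end

  # Build the JSON URL for the current subreddit and page.
  def url
    base = "#{BASE_URL}/r/#{@subreddit}.json"
    return base if @after.nil?

    "#{base}?#{URI.encode_www_form(count: @page * PAGE_SIZE, after: @after)}"
  end
end

page = PageInteraction.new
puts page.url
puts page.next_page("t3_abc123").url
```

Keeping all URL construction in one method means there is a single place to check when a request comes back wrong.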

Next is the Post class, which takes the data of a single post as an argument and returns a hash with the relevant information. Together, these three classes feed a CLI that actively pulls data from the web.

Post class
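A minimal sketch of this idea follows. The FIELDS list and the to_h method name are assumptions about which information counts as relevant, and the sample hash is made up for the example.

```ruby
class Post
  # Keys kept from a post's "data" hash (an illustrative selection).
  FIELDS = %w[title author subreddit score num_comments permalink].freeze

  def initialize(data)
    @data = data
  end

  # Return only the selected fields, keyed by symbol.
  def to_h
    FIELDS.each_with_object({}) { |key, info| info[key.to_sym] = @data[key] }
  end
end

# Made-up post data, shaped like one entry's "data" hash.
sample = {
  "title" => "Hello", "author" => "someone", "subreddit" => "ruby",
  "score" => 42, "num_comments" => 7,
  "permalink" => "/r/ruby/comments/example/", "thumbnail" => "self"
}
post = Post.new(sample).to_h
puts post[:title]
```

Everything outside FIELDS, such as the thumbnail entry here, is simply dropped.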

Lastly, I needed an interface to present the information in a digestible fashion. The code behind it, however, was not so clean: aligning all of the output correctly required careful formatting and tracking of blank space. The post_print method limited each line of a post title to 64 characters and printed any leftover text indented directly beneath it.
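The wrapping behavior described here can be sketched roughly as follows. The rank prefix, the post_lines helper, and the hard cut at exactly 64 characters (rather than at a word boundary) are assumptions of this sketch.

```ruby
TITLE_WIDTH = 64

# Split a title into chunks of at most `width` characters and indent every
# chunk after the first so it lines up under the start of the title.
def post_lines(rank, title, width = TITLE_WIDTH)
  prefix = format("%3d. ", rank)
  indent = " " * prefix.length
  chunks = title.scan(/.{1,#{width}}/)
  chunks = [""] if chunks.empty?
  chunks.each_with_index.map { |chunk, i| (i.zero? ? prefix : indent) + chunk }
end

def post_print(rank, title)
  puts post_lines(rank, title)
end

post_print(1, "A deliberately long example title that keeps going well past the sixty-four character limit")
```

Returning the lines from a separate helper keeps the formatting logic testable without capturing terminal output.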

Although this is only a rough imitation of what Reddit does on its website, I plan to use the scraping-related classes to build a Ruby gem that pulls data from Reddit and runs analysis on its content. ^_^
