💸 How I created frugalfashion.xyz

Ihsan Etwaroo · Published in The Startup · 6 min read · Jan 30, 2019

I noticed a couple of my friends religiously check the subreddit r/frugalmalefashion for deals, in particular Common Projects drops. Over the 2018 holidays I created frugalfashion.xyz, a web application that tracks r/frugalmalefashion and r/frugalfemalefashion. Users specify products they're interested in via keywords, and the app filters the subreddits against those keywords, emailing users about relevant posts and tweeting them out! So how did I do it?

Original design

Overview

  1. A cron script polls Reddit
  2. MySQL stores new posts
  3. SQL inserts trigger downstream jobs: email and Twitter notifications
  4. A CRUD server serves the UI and fetches posts from MySQL
  5. Firebase authenticates users and stores their metadata

It turns out Firebase's free tier has more than enough space for a couple of keywords and toggles per user. The only looming limitation is the cap on concurrent HTTP connections, which is restricted to 100. In the case of frugalfashion.xyz, the average number of concurrent connections is 25, so it all works out for now 🤞

Finalized design

Polling Reddit

crontab was my weapon of choice for periodically polling Reddit. I chose cron polling because of its daemon nature: it lives in isolation, separate from the web application. The overarching vision was a stream of posts readily available for consumption, whether by web app users, email notifications, or Twitter notifications. Decoupling the polling logic from these services was necessary, and cron daemons perfectly embody this behavior. The only downside to using cron is that a minute is the finest time granularity available, capping the polling frequency at once per minute. Alternative solutions that lift the time restriction are Twisted's LoopingCall or an endless loop that sleeps, similar to how the python polling library is implemented.
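For concreteness, here's a minimal sketch of what that crontab entry could look like; the script path and log location are hypothetical, not the production setup:

```
# Run the Reddit poller every minute (cron's finest granularity).
# m h dom mon dow  command
* * * * * /usr/bin/python3 /opt/frugalfashion/poll_reddit.py >> /var/log/frugalfashion/poller.log 2>&1
```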

Storing Posts

MySQL was my database of choice due to its simplicity, ACID properties, and the abundance of binlog-processing software. The binlog is a log file MySQL writes to every time a database mutation occurs. I wanted to experiment with reactive log consumption through Kafka as a proxy for real-time notifications. Alternatively, I could have sent notifications directly from the cron script. However, that would add more logic to a cron that already has a minute-long time constraint, leave me with a larger script, and, frankly, be boring 😊.
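Binlog consumption does require MySQL to write a row-based binlog. A minimal my.cnf sketch, assuming defaults otherwise (Maxwell, introduced below, requires row-based replication):

```
# /etc/mysql/my.cnf (relevant excerpt)
[mysqld]
server_id        = 1          # required when binary logging is enabled
log-bin          = mysql-bin  # turn on the binary log
binlog_format    = ROW        # Maxwell needs row-based events
binlog_row_image = FULL       # include full row data in each event
```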

In retrospect, an Elasticsearch cluster would've been more performant for searching posts by keyword and making more complex string queries (the nature of this application); however, it's overkill for the application's current state.

Consuming binlogs

Kafka and Maxwell were the only reliable, battle-tested veterans I found for binlog consumption. Kafka is a message broker: producers write data to topics, and consumers read it back. Maxwell is our Kafka producer, a server that tails the MySQL binlog and writes to a Kafka topic on post inserts, which we consume in order to send Twitter and email notifications. I mentioned earlier that I reactively read the binlog to simulate real-time notifications; this is accomplished entirely through the producer/consumer model provided by Maxwell and Kafka. I could also cheaply create another Kafka consumer to index an Elasticsearch cluster for more performant keyword searching.
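A minimal consumer sketch in Python using the kafka-python library. Maxwell publishes JSON change events (with `database`, `table`, `type`, and `data` fields) to the `maxwell` topic by default; the table name and filtering logic here are assumptions for illustration, not the exact production code.

```python
import json

from kafka import KafkaConsumer  # pip install kafka-python

# Maxwell writes change events to the "maxwell" topic by default.
consumer = KafkaConsumer(
    "maxwell",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Only react to freshly inserted posts.
    if event.get("table") == "posts" and event.get("type") == "insert":
        post = event["data"]  # the full row, thanks to binlog_row_image=FULL
        print(f"new post: {post.get('title')}")
        # ...hand off to the Twitter/email notification logic here
```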

Notifications

Twitter and email notifications are individually set up as Kafka consumers. For Twitter, a tweet is created for each insert that has the appropriate tags. For email, a similar process is followed, with the caveat that a Firebase query is made to figure out which users want emails for all notifications, or only for notifications that contain their specified keywords. I currently use Mailgun as my dedicated SMTP server, which initially didn't work out too well since the free-tier servers have a poor sending score. Once upgraded, emails stopped bouncing and were delivered successfully.
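Sending the actual email through Mailgun's HTTP API is a few lines of Python with requests; the domain, API key, and function name below are placeholders, not the real values.

```python
import requests

MAILGUN_DOMAIN = "mg.example.com"  # placeholder domain
MAILGUN_API_KEY = "key-xxxxxxxx"   # placeholder API key


def notify_by_email(recipient: str, post_title: str, post_url: str) -> None:
    """Send a deal notification for a post that matched the user's keywords."""
    response = requests.post(
        f"https://api.mailgun.net/v3/{MAILGUN_DOMAIN}/messages",
        auth=("api", MAILGUN_API_KEY),
        data={
            "from": f"frugalfashion <deals@{MAILGUN_DOMAIN}>",
            "to": recipient,
            "subject": f"New deal: {post_title}",
            "text": f"A post matched your keywords: {post_url}",
        },
    )
    response.raise_for_status()
```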

CRUD Application

I used Tornado because I ❤ Python, and Tornado is very simple to configure. The web application is lightweight, solely serving the UI and querying MySQL for posts.
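The CRUD server boils down to something like this minimal Tornado sketch; the handler name, route, and hardcoded data are illustrative.

```python
import json

import tornado.ioloop
import tornado.web


class PostsHandler(tornado.web.RequestHandler):
    """Return recent posts as JSON for the UI to render."""

    def get(self):
        # In the real app this would query MySQL; hardcoded here for brevity.
        posts = [{"title": "Common Projects Achilles Low", "url": "https://reddit.com/..."}]
        self.set_header("Content-Type", "application/json")
        self.write(json.dumps(posts))


def make_app() -> tornado.web.Application:
    return tornado.web.Application([
        (r"/api/posts", PostsHandler),
        # The static UI bundle would also be wired up here.
    ])


if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()
```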

Beautiful UI

As a minimalist who fancies large, seemingly weightless UI elements, I used Gestalt, which is Pinterest's UI framework! I had never used Gestalt prior to this project, and it was quite simple to use, especially when contrasted with Material-UI, which implements Google's Material Design. I find value in React frameworks when they offer high customizability through component composition rather than component configuration. This means the components are thin in nature, with only the necessary props, and part of a library of very basic, generic building blocks.

Gestalt's components are generic, yet each serves its purpose with a small number of props. Composed together, they generate the list of posts, which looks great and reacts well to display events without needing heavy prop customization.

Supporting Infrastructure

I used DigitalOcean's five-dollar server to host and serve everything. Crazy, right? Although it's not all sunshine and roses: I had to sacrifice storage space for swap in order to get more available memory for Kafka, a notorious memory hog. systemd is a process manager that enables rapid process startup, restart, and termination, as well as relevant memory-usage metrics.
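Each service runs under a small systemd unit along these lines; the unit name and paths are hypothetical:

```
# /etc/systemd/system/frugalfashion-email.service (hypothetical unit)
[Unit]
Description=frugalfashion email notification consumer
After=network.target mysql.service

[Service]
ExecStart=/usr/bin/python3 /opt/frugalfashion/email_consumer.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

On recent systemd versions, `systemctl status` then reports per-service memory usage, which is where those metrics come from.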

Considerations

In the past, my app architecture was often monolithic in nature, composed of a web server, nginx, and MongoDB. This consumed a lot of memory, making fine-tuned optimizations and deploys troublesome. For example, a change to the notification system would require a whole server rebuild and deploy, which is not ideal.

In addition to the performance degradation, I would suffer from brittle logic coupling. This oftentimes killed my motivation to continue a project, let alone fix post-production problems.

For this app, I decided to go with a service-oriented structure, splitting features across multiple jobs monitored by systemd. This allows me to focus on the particular part that needs attention without being overwhelmed by the rest of the system. Each piece of the application is small and lightweight, encouraging cleaner, more concise code that I wouldn't mind dealing with in the future. I deploy each notification service and the web app separately through systemd, behind nginx.
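nginx's job here is just to reverse-proxy requests to the Tornado app; a minimal sketch of that server block, with the port matching the Tornado sketch above and everything else assumed:

```
# /etc/nginx/sites-available/frugalfashion (hypothetical config)
server {
    listen 80;
    server_name frugalfashion.xyz;

    location / {
        # Tornado app from the CRUD section, assumed to listen on 8888
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```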

Future downstream consumers are easy to create through my Kafka consumer templates, which live on their own, completely separate from the existing services. If this were a monolithic web app, I would need to read through the code to find the place where I call my services and make the necessary modifications.

Let it be made clear: this approach is far from perfect. Instead of coupled logic and large files, one has to deal with coordinating multiple services and keeping track of how the systems work with one another. It's a different problem, one that's new to me and that I currently enjoy.

Conclusion

Overall, I am pretty happy with how this turned out 🤗. The app was well received by the community, with 266 upvotes across two reddit posts, 2K page views, and 300 signups in a week. The coolest part was when a redditor retweeted my bot after it notified him of a deal he acted on. Tune in here for future feature updates and releases 💪.

If you enjoyed this article and want to continue the discussion, or poke my brain about anything, feel free to reach out at ihsanetwaroo@gmail.com
