Predicting sunset colorfulness with the weather

Kevin Xu
Published in shiftcreatorspace
May 5, 2023 · 7 min read
Screenshot of Horizonapp.in

I recently shipped Horizon, a webapp that forecasts sunset quality based on weather data. I thought I’d share my development process from working on this project for the past few months.

Problem

Last summer, my friends and I would often try to make it to vantage points in our hometown to catch sunrises and sunsets. Yet, sometimes the early mornings or evenings would be duds and other times I stayed in when I should’ve gone outside. As a hobbyist photographer, I started diving into what makes a good sunset — how can I better predict a colorful sunset?

Research

I learned from articles like this one and this one that, in addition to cloud cover, there are a number of other factors to consider, including humidity (the lower the better), cloud base altitude (the higher the better), visibility (the higher the better), and even the cloud conditions toward the horizon (you want a clear sky there).

And so, I landed on the core feature of this web app: to provide a predictive “score” for future sunsets at a user-specified location based on relevant weather data.

Along the way, I came across other sunset forecast solutions, like SkyCandy and Alpenglow (which uses SunsetWx). These models seem to have been developed manually. But this is a problem where predictions are involved (the weather forecast data), images (of sunsets) can be utilized, and there aren’t well-agreed-upon formulas (for a good sunset), so I became interested in exploring how machine learning could be leveraged for it.

What even is a good sunset?

Before I go into the development and design of this app: some comments on what I mean by “sunset quality”.

As a photographer, I value sunsets most where there are some clouds and the sky looks like it’s on fire. Like this:

And inversely, sunsets where it’s just overcast are ones I (and most people) rate the lowest.

In between these two extremes, there’s a gray area. For photographers, a clear sky doesn’t make for too interesting of an image, but others might like the pastel-colored, gradient sky more.

For the purposes of this app, I’ve decided that these clear-sky sunsets will be considered an “average” sunset. Colorful skies with clouds will generally be “above average,” and sunsets with minimal color will be “below average.” These subjective, somewhat vague definitions will be important later in the data processing stages.

Data

Because I couldn’t find any pre-existing machine learning models relating to sunset prediction or image scoring, my next steps were to find and aggregate data to develop a machine learning model from scratch.

This model’s inputs would be weather data and its output would be a predictive score, which means I needed two sets of data: sunset scores tagged with when and where each sunset happened, and the historical weather conditions for those times and places.

I figured that the best way to obtain the first set of data would be to find sunset images (that I could later score) with date and geolocation metadata, and Flickr looked like a good starting point as a source of images via its photos.search API.
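To make the image search concrete, here is a minimal sketch of how such a request URL could be built. It is not Horizon's actual code: the helper name, the search text, and the page size are assumptions for illustration, and the parameter names should be checked against Flickr's documentation for flickr.photos.search before relying on them.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; individual API methods are selected with the "method" parameter.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_sunset_search_url(api_key: str, page: int = 1, per_page: int = 250) -> str:
    """Build a photos.search URL for geotagged photos matching "sunset".

    Hypothetical helper: the search text and page size are illustrative choices.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": "sunset",
        "has_geo": 1,                # only photos that carry a location
        "extras": "geo,date_taken",  # ask for latitude/longitude and capture time
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,         # plain JSON instead of a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)
```

Paging through the results with the `page` parameter is what would run into the daily call quota mentioned later in the post.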

For each photo, I could use its metadata to get the relevant weather conditions (humidity, cloud coverage, visibility) using a historical weather API service (I landed on VisualCrossing).
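As a rough sketch of that lookup, the snippet below builds a request for one location and date and then pulls the three fields out of a response. Everything here is an assumption rather than Horizon's code: the function names are hypothetical, and the URL layout and the field names (`days`, `hours`, `datetime`, `humidity`, `cloudcover`, `visibility`) reflect my understanding of VisualCrossing's Timeline API and should be verified against its documentation.

```python
from urllib.parse import urlencode

# Assumed base URL of the VisualCrossing Timeline API.
TIMELINE_ENDPOINT = (
    "https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline"
)


def build_weather_url(lat: float, lon: float, date_iso: str, api_key: str) -> str:
    """Build a request for hourly weather at a photo's location on its capture date."""
    query = urlencode({"key": api_key, "unitGroup": "metric", "include": "hours"})
    return f"{TIMELINE_ENDPOINT}/{lat},{lon}/{date_iso}?{query}"


def features_for_hour(response: dict, hour: int) -> dict:
    """Pick the model inputs for the given hour (0-23) out of a parsed response.

    Assumes the response holds one day whose hourly records have an "HH:MM:SS" datetime.
    """
    hours = response["days"][0]["hours"]
    record = next(h for h in hours if int(h["datetime"][:2]) == hour)
    return {name: record[name] for name in ("humidity", "cloudcover", "visibility")}
```

The hour passed to `features_for_hour` would come from the photo's capture time, so each scored image is paired with the conditions around its own sunset.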

To score each image, I had a number of options: heuristics, manual scoring, or more machine learning (a CNN). Creating a CNN seemed the most complicated, and human scoring seemed the least scalable and the most time-intensive, so I opted for figuring out some rules of thumb I could apply to images to obtain a score. For example: generally, the better the sunset, the more of the image is orange, red, or pink. Or, good sunsets have less blue, grey, or green. Or, good sunsets have an average color closer to red and magenta. To start out, I implemented just the first of these heuristics.
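The first heuristic could look something like the sketch below, which measures the fraction of pixels that are orange, red, or pink. The hue range and the saturation and brightness cutoffs are assumptions picked for illustration, not the thresholds Horizon uses, and the function takes a plain list of RGB tuples so that it does not depend on any particular image library.

```python
import colorsys


def warm_fraction(pixels) -> float:
    """Return the fraction of (r, g, b) pixels (0-255 each) that look orange, red, or pink.

    Assumed thresholds: hue within 0-45 or 300-360 degrees, with saturation and
    brightness of at least 0.35 so that greys and near-black pixels do not count.
    """
    warm = 0
    total = 0
    for r, g, b in pixels:
        total += 1
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hue_deg = h * 360
        is_warm_hue = hue_deg <= 45 or hue_deg >= 300
        if is_warm_hue and s >= 0.35 and v >= 0.35:
            warm += 1
    return warm / total if total else 0.0


# Two warm pixels (orange, red) and two that are not (blue, grey).
sample = [(255, 120, 0), (255, 0, 0), (30, 60, 200), (128, 128, 128)]
print(warm_fraction(sample))  # 0.5
```

A fraction like this can be used directly as a raw score or rescaled, and the other heuristics (less blue, grey, or green; average color near red and magenta) could be added as further terms in the same loop.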

After a few days hitting the API call quota, I had around one thousand observations of scores and weather data. Researching the various ML libraries out there, I ended up choosing TensorFlow, as it seemed to have good documentation and the ability to be easily deployed for a web app. Using Google Colab to avoid setting up my own machine for ML and with the help of some TensorFlow documentation, I trained a model ready for deployment.

Design

Naming a webapp is always a fun process, whether it be iterating through various English words associated with sunsets, fabricating English respellings of words by removing vowels or adding a “y” somewhere, or trying to find related Latin or Greek words that might sound cool.

I ultimately went with “Horizon.” Sunsets and sunrises occur when the sun passes through the horizon, and the forecasting nature of the app relates to the near-future connotations of horizon (e.g. the phrase, “on the horizon”).

With inspiration from Apple’s weather app and wanting a simple interface, I designed a monochrome prototype in Figma:

Figma prototypes of Horizon

Development

For the technology stack of the web app, I started out with what seemed convenient. I didn’t need to store any data for the app, so I didn’t need a database service. I was most familiar with Node.js, so this is what I went with for my backend. For the front-end, I wanted to develop a more fundamental understanding of CSS and responsive design, so I went with good ol’ HTML/CSS/JS. I originally deployed my app on Heroku, but in the middle of development they discontinued their free services, so I switched to AWS Elastic Beanstalk.

From the development process, I learned (and banged my head) a bunch about the DOM, deploying with AWS, responsive designs, and more broadly how the internet works. From ideation to deployment, my first version of the app took around four months of on-and-off progress.

The next few months were spent iterating and improving Horizon, like implementing Google Analytics, getting advice from a UM professor (shoutout Professor Kutty!), and refining the machine learning model. I actually wound up manually scoring a thousand images, using ChatGPT to help create a scrappy image rater app that streamlined the process.

Shipping

Towards the end of this school year, I shipped Horizon on Product Hunt, which was an unexpectedly exciting experience. I got more than zero upvotes, and more than zero comments with feedback or feature requests. It was also cool to see the international traffic I was getting onto the site:

Users and countries the week after shipping

Learnings

One of my biggest challenges with this project was just making consistent progress on the app. There would be weeks or even months where I wouldn’t touch the project. Sometimes I underestimated the amount of time my classes or other life things would take and ran out of time to work on projects, while other times I just felt unmotivated, for example when I was unsure how best to move forward at some stage of the project, such as how to improve the model or which services to use. When classes eased up, when I had made some small progress on the app, or when deadlines set by a club I’m involved with on campus approached, I found myself more productive. Still, I could definitely have been better about making progress on the project.

Looking back, and also looking ahead, two lessons come to mind:

First, some progress is better than no progress. Here, I’m reminded of quotes by Donald Knuth, “Premature optimization is the root of all evil,” and Paul Graham, “Action produces information.” I’d benefit from looking for the good enough option rather than the best option, spending less time planning and more time executing.

Secondly, small victories matter. I find that making marginal progress, however small, is motivating for further progress. Had I done this consistently, I think development would’ve been faster. Along these lines, I think that making progress visible (like using GitHub or keeping a progress journal) is helpful, as you can then see “how far you’ve come.”

Next steps

There are huge areas of improvement for Horizon that I want to dive into this coming summer. I have the feedback from my Product Hunt launch to work through. There’s still a substantial amount of error in the current sunset forecast model’s predictions (some good sunsets are being scored as average or below average, and vice versa), so I want to explore what I can do with CNNs. I also want to add an informational page explaining the scoring system, and reimplementing Horizon’s frontend could be a good excuse to learn React. And are there ways I could monetize Horizon?
