HackMirror: The HackMIT 2018 Puzzle Guide Part 3

Shreyas Kapur
HackMIT Stories
Published in
6 min readOct 19, 2018

(Part 3 of 1, 2, 3)

Welcome to Part 3 of the Puzzle Guide, keep reading on for politics and piracy forensics ;)

Gerry

(website, GitHub)

If you’ve been following political news in the United States, you’ve probably heard the word Gerrymandering thrown around a lot. As strange as this word is, it’s a very real activity that takes places in many U.S. states. Congress is meant to provide representation to states proportional to population. This means that each state has some number of members of Congress that is proportional to the number of people who live in that state. These members of congress are elected by voters in a specific area in each state. The specific areas are supposed to represent congressional districts. This puzzle focuses on the layout of districts within states. The key insight is that by optimizing over the layout of districts one can increase the expected number of seats their party wins.

Puzzlers were presented with a request from their employer, E Corp, to help gerrymander for an upcoming election. The state was modeled as a grid, in which each cell contained voters biased for party A or B. The puzzlers were required to provide a labeling such that there were 20 total districts, each district contained one block, and each block belonged to a single district. Additionally, districts were required to be contiguous. Puzzlers had to provide a labeling that satisfied a local board of elections and weakly dominated another consultant’s solution. Three objectives were considered:

  • Expected number of districts for party A (approximated to the sum of probabilities of party A winning each district)
  • District population imbalance (L2-norm between district populations and (State Population)/(Number of Districts))
  • Expected Efficiency Gap (Proposed metric used to evaluate fairness of districting, would be more interesting if we had included voter turnout in our model)

This problem was open-ended. There wasn’t a single intended solution. We randomly generated population stats, drawing populations from an exponential distribution and voter proportions from a normal distribution. We used simulated-annealing to solve the optimization problem, and then slightly relaxed our objectives to set the requirements. We randomly assigned puzzlers to our generated problems so it wasn’t likely that two puzzlers would get the same problem.

Puzzlers took many different approaches. Some took a similar simulated annealing route, others used genetic algorithms, and our favorite solution was manually gerrymandering using Excel. One person even created a React app to assist them. Although we weren’t expecting people to optimize manually, it was really cool to see that Gerrymandering, something that’s been going on since the 1800s) is feasible without the assistance of advanced optimization methods. Manually packing (putting party B voters in a few small districts) and cracking (spreading party A voters across many districts such that they have a majority in each district) led to successful solves.

HackFlix

(website, GitHub)

Copyright is a serious issue. Often content creators use Watermarks to protect their content,

Usually, these watermarks are visually intrusive and detract away from the main focus of the content. As a response, content creators like Netflix resort to embedding a digital hidden watermark, embedding a few bits of information in the frequency space.

This way if someone freeboots and uploads content from Netflix to the Pirate Bays of the interwebs, Netflix knows which user stole that content. Schemes like Cinavia can even survive compression, transcoding and speed changes in the video.

Introducing HackFlix: the most premium content distribution platform out there, with the best security.

Visiting the website, the puzzlers were greeted by theHackBay™.

Past a few distractor videos, the puzzlers finally encountered the coveted Adventures at HackMIT II: A 5 second clip of a random section from some footage we shot at our high school event, Blueprint.

A few clever puzzlers inspected the network and got the raw mp4 file. But trying to upload this to the theHackBay™ resulted in the following DMCA Takedown Notice:

Uh oh, this is no good. At this point a few puzzlers noticed a little QR code in the bottom left corner of the image, which when decoded consisted of some metadata including the username.

The blending of the QR code however was not exactly reversible. The blending formula was I = min(I + α(255 -Q), 255). Where I is the final image and Q is the QR code image. While a few of pixels were recoverable by simple subtraction and trying out a few different alpha values, a few pixels were lost if they exceeded 255. If the puzzlers did this using this naive method, they got the following message:

The intended way to solve this puzzle was to use a computer vision technique called inpainting. We measured similarity using the structural similarity metric on the QR patch and then the rest of the image, tuning this metric such that simple inpainting techniques such as fast marching inpainting (found easily in OpenCV’s cv2.inpaint()) wouldn’t work.

Our solution used the new partial convolution approach, leveraging neural networks to do the inpainting. These networks often use the SSIM loss while training, which is also the metric the server was using to assess the quality of the submitted clip.

People came up with very creative ways to solve this puzzle. One puzzlers wrote a Photoshop script to automate using their proprietary inpainting method on every frame to get the job done. A few people ended up using OpenCV’s inpainting while manually fixing bad inpainted frames and getting past chroma subsampling and other compression artifacts.

While this puzzle was the most frustrating out of all the puzzles due to the open ended nature and lack of goal, a lot of puzzlers did manage to solve this, completing all the puzzles and getting a spot at HackMIT 2018. We’ll try our best to reduce the frustration for this type of puzzle next year. 😄

Conclusion

As puzzling season has come to an end, we would like to give a huge thanks and congratulations to everyone who tried the puzzle this year! With over 7500 puzzlers spending an average of 164.45 hours (nearly 1 week) hours to solve the puzzle and over 820 puzzlers collaborating on the Slack, this was one of our most exciting years to date! We also opened up guest registration this year, after feedback from last year, and were excited to see over 2500 guest users participate in the puzzle.

We had exactly 60 solvers this year; special shout out to the fastest solver, who finished in just over 32.26 hours, and the final solver, who finished in just under 774.77 hours (~32 days — huge props for perseverance!). To be in the top 50, you had to finish within 466.43 hours (~19 days). For those of you keeping score at home, we have some stats below:

Solvers Stats per Puzzle: It’s interesting to see that the solve numbers dropped off roughly in pairs
Top 50 Solve Times (in hours)
Puzzle Solve Times (in hours)

The puzzles are one of the most exciting parts of the year for all of us puzzle writers, but it’s all of your puzzlers that really make it meaningful ❤️. Congratulations to our top 50 solvers, and we hope everyone had as much fun working on the puzzles as we did making them!

--

--