Locked In and Busy: Week 3

Aaron Chen
4 min read · Apr 6, 2020


Photo by John Salvino on Unsplash

Hey folks!

This is an update on my adventures learning new data science and data engineering skills while in quarantine. You can read the previous parts at these links:

Part 1

Docker on Windows 10 Home via WSL2

Part 2

Distributed Computing Cluster

I started running out of some supplies and decided to make a big order to have shipped to me. This means that I will also get some Raspberry Pi 4s to play with! My dream was to get 8 just like in this article I found before, but I couldn’t justify spending that much money. Also, the “infrastructure” I’d have to move up to in order to expand to 8 Raspberry Pis dramatically increased the cost. I’ll break that down in a second.

So what did I order? I decided to get 2 Raspberry Pi 4 boards with 4GB of memory each, 128GB micro SD cards, one case that can hold up to 4 boards, one 4 port Power over Ethernet (PoE) gigabit switch (with +1 uplink port), and 2 PoE HATs (Hardware Attached on Top) to install on the Pi boards. Most of those items seem pretty self-explanatory (the Pi boards are the computers, the micro SD cards are storage for each board, and the case keeps everything together), but you may be wondering why I picked a PoE switch and what PoE HATs are. Raspberry Pi boards normally are powered through a USB port (you can sometimes use your phone charger), which is convenient unless you have multiple boards in close proximity. Then, the power adapters end up taking a lot of space at your power strip. You can try to use a multi-USB charger, but it has to supply enough wattage at each port AND there’s still the issue of networking the Pi boards together.

I generally prefer wired networking over wireless. It’s faster. Those claims that WiFi can deliver ultra fast speeds are theoretical and usually have test devices placed almost right next to the router. Since the point of getting these boards is to DIY a Spark cluster, I wanted to use larger datasets, and handling larger datasets over slower transfer speeds sounded…painful. Plus, the Pi boards support PoE with HATs! This meant that using PoE hardware throughout would allow for 1 cable for each board for both power and networking and simplify the wiring mess.
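To put rough numbers on "painful", here is a small back-of-the-envelope sketch. The 10 GB dataset size and the 100 Mbit/s figure for real-world WiFi are assumptions picked for illustration, not measurements, and the calculation ignores protocol overhead, so treat the results as best-case times.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Ideal time in seconds to move size_gb (decimal gigabytes)
    over a link running at link_mbps megabits per second."""
    bits = size_gb * 8 * 1000**3          # gigabytes -> bits
    return bits / (link_mbps * 1000**2)   # megabits/s -> bits/s


# Assumed numbers: a 10 GB dataset, gigabit Ethernet at its nominal
# 1000 Mbit/s, and a hypothetical 100 Mbit/s effective WiFi rate.
wired = transfer_seconds(10, 1000)
wifi = transfer_seconds(10, 100)
print(f"wired: {wired:.0f} s, wifi: {wifi:.0f} s")
```

Whatever the real WiFi rate turns out to be, the time scales inversely with it, so every shuffle of data between nodes pays that penalty again.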

Eventually, I am thinking of upgrading with 2 more boards to max out the 4 port switch, but getting 8 boards seemed way past overkill for personal projects. In addition, reliable 8 port PoE switches cost a lot more than 4 port ones, were harder to find (it seemed easier to find 16, 24, or 48 port switches than 8 port units), and were much larger. My hope was to put this pocket cluster on the side and be out of the way as much as possible. A quick glance at the hardware I ordered made it look like I could just stack the cluster case on top of the PoE switch, or even mount the switch sideways. If I got an 8 port switch with 8 boards, I think it would’ve been possible to put two cluster cases next to each other on top of the switch…but again, that’s a lot of cluster for not much reason.

While researching these items, I found that PoE implementation isn’t entirely agreed upon. The Raspberry Pi PoE HATs I ordered support 2 IEEE active PoE standards, and the switches that use those tend to cost more. The other PoE method (Passive) is more common on the less expensive networking hardware that people tend to like to use. However, there are differences between how the standards handle power delivery and that’s best summed up here.

The last sentence in the paragraph describing Passive PoE was likely summing up a problem: I kept finding 1 star reviews for Raspberry Pi PoE HATs saying that they kept getting fried or 1 star reviews for switches saying that the security cameras people hooked up to them would burn out. My guess is that people were buying less expensive switches to use with Pis and security cameras without reading the manuals. I did my best to pick components and hardware that explicitly supported one or both of the active IEEE PoE standards and keep it consistent throughout. When everything arrives in late April, I’ll be able to build and test the cluster.
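The kind of compatibility check described above can be sketched in a few lines of Python. This is only an illustration: the device names, power draws, and switch budget below are made-up numbers, not specs of the hardware in this order, and the `Device`, `Switch`, and `check_cluster` names are hypothetical. The two wattage limits are the power available at the device under IEEE 802.3af and 802.3at, which are assumed here to be the active standards in question.

```python
from dataclasses import dataclass

# Power available at the powered device under each IEEE active PoE standard (watts).
DEVICE_WATTS = {"802.3af": 12.95, "802.3at": 25.5}


@dataclass
class Device:
    name: str
    standards: frozenset   # PoE methods the device accepts
    draw_watts: float      # assumed worst-case draw (hypothetical)


@dataclass
class Switch:
    standards: frozenset   # PoE methods the switch provides
    budget_watts: float    # total PoE budget across ports (hypothetical)
    poe_ports: int


def check_cluster(switch, devices):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(devices) > switch.poe_ports:
        problems.append("more devices than PoE ports")
    total = 0.0
    for d in devices:
        # Only active IEEE standards shared by both ends count as compatible.
        shared = switch.standards & d.standards & DEVICE_WATTS.keys()
        if not shared:
            problems.append(f"{d.name}: no shared active PoE standard")
            continue
        if d.draw_watts > max(DEVICE_WATTS[s] for s in shared):
            problems.append(f"{d.name}: draws more than the shared standard allows")
        total += d.draw_watts
    if total > switch.budget_watts:
        problems.append("total draw exceeds the switch PoE budget")
    return problems


both = frozenset({"802.3af", "802.3at"})
switch = Switch(standards=both, budget_watts=60.0, poe_ports=4)
pis = [Device(f"pi-{i}", both, draw_watts=6.0) for i in range(2)]
problems = check_cluster(switch, pis)
print(problems or "no problems found")
```

A device or switch that only does passive PoE would show up with no shared active standard, which is exactly the mismatch those 1 star reviews seem to describe.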

Russian Handwriting Character Recognition Project

I’ve been writing up documentation for this! I’m also refactoring the project to use Python 3 and TensorFlow 2 explicitly throughout, which has meant reading a lot of TensorFlow documentation on migrating TF1 code to TF2.

This has also led me to queue up documentation on optimizing workflows for working with TensorFlow…I’ll write up something for that later.

Outside of Data Science

I bought buckwheat flour a while ago! I like making buttermilk pancakes to have for weekend breakfasts, and wanted to try making them a little healthier.

However…I bought too much buckwheat flour since I didn’t realize the recipe used a mix of flours. I was thinking of making homemade soba noodles since I’ve never done that before and they sound delicious!

Also, we have bananas at home that are turning into good stock for banana bread…so buckwheat banana bread made in a cast iron skillet might be on the menu. Why a cast iron skillet? Well, I prefer the rustic look of th-I don’t have a loaf pan here.

Stay healthy, wash your hands, and hope to see y’all soon!
