CS373 Fall 2020: Week 6

What Did I Do This Week?

This week, my team got straight to work on Phase 2. With a lot of our front end done in Phase 1, we moved on to the harder part: data collection. With a couple of team meetings and sub-team meetings out of the way, this week was dedicated to collecting data for our Texas politicians.

What’s In My Way?

Right now, data collection is a lot of work. Our three RESTful sources are only part of it: much of our data needs to be scraped from CSVs, Excel spreadsheets, PDFs, and HTML websites. Sourcing everything will be a team effort and will definitely take a lot of time.

What Will I Do Next Week?

In the upcoming days, I’m collecting the harder parts of the politician model. With one of my teammates pulling elected official information from the Google Civic Information API, my job is to scrape campaign finance data from OpenSecrets and get missing political challenger information from HTML pages on Ballotpedia. I’ll do this with the requests Python module, the csv module, and the html.parser module. We’ll store the results in another CSV using pandas.
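Here is a rough sketch of what the HTML-to-CSV step could look like using only html.parser and csv from the standard library. The class name, function name, and sample HTML are all made up for illustration; the real Ballotpedia markup will be messier, and in practice the HTML string would come from something like requests.get(url).text rather than a hard-coded sample.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html_text):
    """Parse table rows out of html_text and return them as CSV text."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical sample input, standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Candidate</th><th>Party</th></tr>
  <tr><td>Jane Doe</td><td>Independent</td></tr>
  <tr><td><a href="#">John Roe</a></td><td>Libertarian</td></tr>
</table>
"""

csv_text = html_table_to_csv(SAMPLE_HTML)
```

The resulting CSV text can be written to a file and loaded later with pandas. Nested tags inside a cell (like the link around the second name) are handled because the parser keeps collecting text until the cell closes.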

Thoughts on “Why Is Silicon Valley So Awful to Women?”

The stories within the article are quite depressing, and it’s pretty awful to think about how rampant this still is within the industry. To fight this, it isn’t enough to simply not be sexist: we must be anti-sexist, calling out messed-up power structures and creating an inclusive environment for all genders.

What is it Like Working in a Group?

I’ve had bad experiences with groups in the past, but so far my group is pretty on point. One of our members is pretty experienced with AWS and has become the de facto pipeline guy and Git master. The atmosphere has been super chill, and we have really smart people on all ends. Overall, this has been my favorite group project so far.

Experience with Iterators, Reduce, and Tuple

Lots of this is eye-opening for a Python noob (me). Learning about the Python class system and iterators has once again made me aware of the intricacies of program design; Python’s main philosophy seems to be keeping what a developer has to write as concise as possible. Reduce was already familiar to me as a JavaScript developer, and it was nice seeing the corresponding Python implementation. Implementing the Tuple class in 10 minutes was a challenge, though my group managed to power through and pass all test cases within the time limit!
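For anyone curious, here is a minimal sketch of how a reduce could be written on top of the iterator protocol. This is my own reconstruction and not the version from class; the name my_reduce is made up, and Python’s real version lives in functools.reduce.

```python
def my_reduce(binary_function, iterable, *initial):
    """Fold an iterable into one value, left to right.

    Mirrors the JavaScript idea of arr.reduce(fn, init): the optional
    third argument seeds the accumulator.
    """
    iterator = iter(iterable)
    if initial:
        accumulator = initial[0]
    else:
        # No seed given: the first element becomes the accumulator.
        try:
            accumulator = next(iterator)
        except StopIteration:
            raise TypeError("my_reduce() of empty iterable with no initial value") from None
    for value in iterator:
        accumulator = binary_function(accumulator, value)
    return accumulator


total = my_reduce(lambda a, b: a + b, [1, 2, 3, 4])
product = my_reduce(lambda a, b: a * b, [], 1)
```

Because it only calls iter() and next(), this works on anything iterable (lists, generators, ranges), which is the part that clicked for me coming from JavaScript arrays.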

Experience with Team Contract and Peer Review

The team contract and peer reviews were very easy to do. Since everyone in our group has been very cooperative, crafting the team contract was straightforward, and giving my teammates glowing reviews was a no-brainer.

What Made Me Happy This Week

This week was relatively busy, but seeing our site deployed did put a smile on my face.

Tip Of The Week

If any other team has to scrape a table within a PDF, the easiest way to do so is to convert the table inside the PDF into a CSV. Luckily, an open source project exists for this! Tabula was made to scrape table data in PDFs for news publications like ProPublica and The New York Times, and it’s maintained by an amazing developer community. All you have to do is run the local server, click a few buttons, and export your new CSV to your computer! Very simple.

A blog that contains graded posts for Professor Downing’s Software Engineering class at UT Austin.

Larry Win

UT Austin CS Major - Spring 2022
