How to win a hackathon.

J Spadigam
Pinboard Consulting
4 min read · Jun 21, 2022

Dear readers, the success has not completely gone to our heads.

This blog is a summary of the challenges we ploughed through in the build-up to our victory. Last month Pinboard Consulting submitted the TigerGraph for UN Data project and won second prize!! The competition in the final stage was tough, with some incredible applications of the software to the psychological and financial fields (and more).

The aim of the competition was to engineer a solution to tackle current issues, and every solution had to use TigerGraph. There are numerous reasons why the hackathon was TigerGraph native:

1) TigerGraph hosted the hackathon

2) Graph databases can easily find relationships between different data sets

3) The querying power of graph systems is unreal: functions that could take a few minutes to compute on legacy systems can be returned in a few seconds on TigerGraph

4) The data capacity of graph systems is large — especially TigerGraph. There is no limit to the number of nodes and edges for a graph hosted on TigerGraph

During the hackathon period the supportive staff at TigerGraph held regular sessions on Discord to help teams brainstorm through the issues they encountered. TigerGraph is relatively new software, and as users are still getting accustomed to the platform there were a variety of issues being raised. The mixture included problems faced by newbies in the graph industry and some API issues faced by pro coders; kudos to the TigerGraph team, who remained supportive and helpful (even when asked the same question by 10 different people).

There’s a reason the challenge was called a hackathon and not just a datathon or codefest: the competition finalists went through an iterative process of hacking data sets and connection issues before creating working prototypes. Pinboard Consulting registered for the hackathon as soon as it was announced and created a project plan with several sprints.

Stage 01) Collect and clean data

Stage 02) Create a schema

Stage 03) Load data and run GSQL analysis

Stage 04) Return JSON outputs on a user-friendly UI

Stage 05) Win the hackathon

Obviously, the process was not that easy or streamlined because data is never clean or easy to work with. The finalist pool is a group of projects that would have gone through a similar process to pave the path for future graph users.

Sprint 01) Data is messy and has lots of blank values. Drop them, especially when you’re using a data set that is not yours (like the UN Data) or when multiple data sets are combined into a single table (also like the UN Data). It is essential that all commas are replaced: commas in numbers, commas in names, commas anywhere in the CSV files. All of our test loads and querying were hosted on the TigerGraph Cloud Portal, where we loaded CSV files. CSV files, as the name states, use commas to separate columns. Very often our data sets were split in the most random locations because of an accidental comma somebody forgot to remove. Furthermore, there was an enormous amount of data being loaded into this graph; towards the end we had to upgrade our server so the incredible TigerGraph software could handle all the data.
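
For readers who want to try the same clean-up, here is a minimal sketch of the idea in Python. It is not our actual pipeline: the function names (clean_field, clean_csv) and the sample columns are made up for illustration. It assumes the source file quotes any field that contains a comma, drops every row with a blank value, strips thousands separators from numbers and swaps any other comma for a space.

```python
import csv
import io
import re

# Numbers written with thousands separators, e.g. "1,234,567" or "12,345.6".
_THOUSANDS = re.compile(r"^-?\d{1,3}(,\d{3})+(\.\d+)?$")


def clean_field(value: str) -> str:
    """Return the field with every comma removed or replaced."""
    value = value.strip()
    if _THOUSANDS.match(value):
        # "1,234,567" -> "1234567"
        return value.replace(",", "")
    # "Korea, Republic of" -> "Korea Republic of" (also collapses repeated spaces)
    return " ".join(value.replace(",", " ").split())


def clean_csv(text: str) -> str:
    """Drop rows that contain a blank value and remove commas inside fields."""
    reader = csv.reader(io.StringIO(text))  # understands quoted fields
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        cleaned = [clean_field(field) for field in row]
        if not cleaned or any(field == "" for field in cleaned):
            continue  # empty line, or a row with at least one blank value
        writer.writerow(cleaned)
    return out.getvalue()


# Hypothetical sample: one row with awkward commas, one row with a blank value.
raw = 'country,year,value\n"Korea, Republic of",2019,"1,234,567"\nFrance,2019,\n'
print(clean_csv(raw))
```

After this pass the only commas left in the file are the ones separating columns, so a loader that splits on every comma cannot cut a row in the wrong place.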

Sprint 02) Creating a schema is not a fixed process: it is iterative! In February we started with a beautiful schema where every topic had its own node and they were all connected to common nodes; by the end of the project our graph schema was heavily condensed. Each node can host different CSV files if they have a similar format (data types should be the same for columns with the same name, AND NO COMMAS ALLOWED!).
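
One way to check the "same name, same data type" rule before pointing two CSV files at the same node is sketched below. This is an illustration rather than what we ran: the helper names are hypothetical, the type inference is deliberately crude, and the labels INT, DOUBLE and STRING are only used as tags for the comparison.

```python
import csv
import io

# Wider types win when a column mixes values: INT < DOUBLE < STRING.
_RANK = {"INT": 0, "DOUBLE": 1, "STRING": 2}


def infer_type(value) -> str:
    """Guess the narrowest type label that can hold this single value."""
    for caster, label in ((int, "INT"), (float, "DOUBLE")):
        try:
            caster(value)
            return label
        except (ValueError, TypeError):
            continue
    return "STRING"


def column_types(text: str) -> dict:
    """Map each column name to the widest type seen in that column."""
    types = {}
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            if column is None:
                continue  # extra fields beyond the header row
            label = infer_type(value)
            if column not in types or _RANK[label] > _RANK[types[column]]:
                types[column] = label
    return types


def type_conflicts(first: str, second: str) -> dict:
    """Return the shared columns whose inferred types differ between two files."""
    a, b = column_types(first), column_types(second)
    return {c: (a[c], b[c]) for c in a.keys() & b.keys() if a[c] != b[c]}


# Hypothetical samples: "value" is numeric in one file and free text in the other.
topic_a = "year,value\n2019,10\n2020,12\n"
topic_b = "year,value\n2019,10.5\n2020,n/a\n"
print(type_conflicts(topic_a, topic_b))
```

An empty result suggests the two files can share a node; anything else points at the columns to fix or rename first.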

Sprint 03) Do not wait to write GSQL queries. After the data munging we were creating the load scripts and additional GSQL queries that would create endpoints for the UI to call. None of the stages needs to be done in isolation; the processes can run simultaneously at different paces. The Pinboard Consulting team had some legends working on the queries, but since they are part of the secret sauce, more details cannot be disclosed :) A word of advice for new graphists: GSQL can be intimidating to learn, but the language is very specific because the returns are specific. The more queries you practice, the more you will understand the language and the outputs.

Sprint 04) Return the output on a user-friendly platform. There are limitations everywhere! Pinboard Consulting tackles all these issues with the UI we designed. TigerGraph returns JSON, a format that is not easy to use (unless you code), whereas the UI returns outputs so that users can filter them as needed. The UN Data website only allows a maximum download of 500,000 rows of data; our UI has no such limitation. Additionally, the UN Data website only returns one table per topic, while the Pinboard UI shows users 2 different data sets in a table and compares them through a correlation graph! Anything can be compared: the number of blueberries produced in a year vs the number of homicides committed the same year (people do get angry when blueberry season is late). There were some issues with connecting the endpoints to the UI, but with help from the TigerGraph team and lots of brainstorming the issue was resolved, which gave us a working prototype.
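
To give a feel for what comparing two data sets can mean in practice, here is a small Python sketch that lines up two yearly series and computes their Pearson correlation. It is an assumption about the general approach, not the code behind our UI: the function names, the year-to-value dictionaries and the numbers are all made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_by_year(first: dict, second: dict) -> float:
    """Correlate two {year: value} series over the years they have in common."""
    years = sorted(first.keys() & second.keys())
    return pearson([first[y] for y in years], [second[y] for y in years])


# Made-up numbers, purely for illustration.
blueberries = {2018: 1.0, 2019: 2.0, 2020: 3.0}
homicides = {2018: 2.0, 2019: 4.0, 2020: 6.0, 2021: 9.0}
print(correlate_by_year(blueberries, homicides))
```

Only the years present in both series are used, so two data sets with different coverage can still be compared. A value near 1 or -1 means the series move together or in opposite directions; it says nothing about one causing the other, blueberries included.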

Sprint 05) Obviously. We won!

In the end, Pinboard Consulting took home a prize for a UI that anybody can use to export and compare data about different issues in the world. If you want to find out more or try creating your own version of our submission, check out our DevPost and GitHub pages.
