Intro to Data Structures

Robert Mundinger
Published in CodeParticles
Dec 11, 2017

Excel, CSV, XML, JSON, databases

The most familiar data structure in the world is the spreadsheet 🤓 🤘

But it’s not the only one.

One way to represent a spreadsheet is with CSV (comma-separated values), which is basically a simplified, plain-text version of an Excel file. Each row is a line of text, and each ‘cell’ value in the row is separated by a comma. Here’s the spreadsheet as a .csv file:

You can copy and paste the following CSV data:

Group,Admin,Grade,Social Studies — Number Tested,Score
CARROLL ISD,Spring 2017,8,462,4417
RICHARDSON ISD,Spring 2017,8,647,4304
ALLEN ISD,Spring 2017,8,815,4275
DALLAS ISD (057905),Spring 2017,8,236,4176
COPPELL ISD,Spring 2017,8,360,4149
MCKINNEY ISD,Spring 2017,8,926,4145
PROSPER ISD,Spring 2017,8,533,4132
FRISCO ISD,Spring 2017,8,2063,4101
HIGHLAND PARK ISD (057911),Spring 2017,8,491,4097
PLANO ISD,Spring 2017,8,1353,4057
CARROLLTON-FARMERS BRANCH ISD,Spring 2017,8,181,4026
GRAPEVINE-COLLEYVILLE ISD,Spring 2017,8,596,4016
GARLAND ISD,Spring 2017,8,569,3981
MESQUITE ISD,Spring 2017,8,185,3960

Paste it here:

codebeautify/csv-to-xml-json

Then convert it to other data structures: either JSON or XML (two other examples of data structures widely in use today). It’s the same data represented in different ways. Each has its own strengths and weaknesses, each is typically used for different purposes, and each has its own methods to query and filter the data.
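If you'd rather see what a converter like that is doing, here is a minimal sketch in Python using only the standard library. It assumes just the first three rows of the CSV above, shortens the fourth column name to "Number Tested", and picks its own XML element names (rows, row, and lowercased column names); none of that comes from the converter site itself.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# First three rows of the CSV above. The fourth column name is shortened
# to "Number Tested" here (an assumption made to keep the sketch compact).
CSV_TEXT = """Group,Admin,Grade,Number Tested,Score
CARROLL ISD,Spring 2017,8,462,4417
RICHARDSON ISD,Spring 2017,8,647,4304
ALLEN ISD,Spring 2017,8,815,4275
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row, keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def to_json(rows):
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_xml(rows):
    """Serialize the rows as XML, one <row> element per record.

    The element names are choices made for this sketch, not a standard.
    """
    root = ET.Element("rows")
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for column, value in row.items():
            # XML tag names cannot contain spaces, so "Number Tested"
            # becomes "number_tested".
            tag = column.lower().replace(" ", "_")
            ET.SubElement(row_el, tag).text = value
    return ET.tostring(root, encoding="unicode")


rows = read_rows(CSV_TEXT)
json_text = to_json(rows)
xml_text = to_xml(rows)
print(json_text)
print(xml_text)
```

Notice that every value comes out of the CSV as text. CSV has no notion of numbers versus strings, which is one of the weaknesses you trade for its simplicity.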

Most of the data that moves around the internet today is in JSON form. JSON has largely replaced XML, which used to be much more prevalent.
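To make "each has its own methods to query and filter" a bit more concrete, here is a rough sketch that asks the same question (which districts scored above 4000?) of a JSON document and an XML document. The scores are taken from the CSV above, but the exact JSON and XML layouts are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical layouts for the same three records in two formats.
JSON_TEXT = """[
  {"Group": "CARROLL ISD", "Score": 4417},
  {"Group": "PLANO ISD", "Score": 4057},
  {"Group": "GARLAND ISD", "Score": 3981}
]"""

XML_TEXT = """<rows>
  <row><group>CARROLL ISD</group><score>4417</score></row>
  <row><group>PLANO ISD</group><score>4057</score></row>
  <row><group>GARLAND ISD</group><score>3981</score></row>
</rows>"""

THRESHOLD = 4000

# JSON: parse into lists and dicts, then filter with ordinary Python.
records = json.loads(JSON_TEXT)
json_hits = [r["Group"] for r in records if r["Score"] > THRESHOLD]

# XML: parse into an element tree, then select nodes with a path expression.
# ElementTree supports only a subset of XPath, with no numeric comparisons,
# so the score check still happens in Python.
root = ET.fromstring(XML_TEXT)
xml_hits = [
    row.findtext("group")
    for row in root.findall("./row")
    if int(row.findtext("score")) > THRESHOLD
]

print(json_hits)
print(xml_hits)
```

Both lists contain the same two districts. The JSON version maps straight onto Python lists and dicts, while the XML version walks a tree of elements.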

Databases

Databases are basically just groups of spreadsheets on steroids. A database can have many tables (basically just fancy spreadsheets) which are linked to one another. You can imagine a kick-ass clothes store that sells…clothes. They might have a database with 3 tables (Customers, Products, Orders):

[Figures: Customers table, Products table, Orders table]

This is far more organized and efficient than if the ‘Orders’ table had to list each customer’s name, address, and email, plus each product’s name and price. You would have a HUGE amount of redundant information. As long as the tables have matching columns (‘keys’, as they’re called in database speak), each order only needs to store the keys of its customer and product, so everything is simpler, better organized, and uses much less space.
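Here is a small sketch of that idea using plain Python dicts and lists in place of real database tables. All of the names, emails, prices, and ID numbers are invented for illustration; the store's actual tables aren't shown here.

```python
# Hypothetical sample data: names, emails, and prices are invented.

# One flat "spreadsheet": every order repeats the customer and product details.
flat_orders = [
    {"order_id": 1, "customer_name": "Ada", "customer_email": "ada@example.com",
     "product_name": "T-shirt", "price": 20.0},
    {"order_id": 2, "customer_name": "Ada", "customer_email": "ada@example.com",
     "product_name": "Jacket", "price": 80.0},
    {"order_id": 3, "customer_name": "Grace", "customer_email": "grace@example.com",
     "product_name": "T-shirt", "price": 20.0},
]

# Three linked tables: each fact is stored once, and orders refer to it by key.
customers = {
    1: {"name": "Ada", "email": "ada@example.com"},
    2: {"name": "Grace", "email": "grace@example.com"},
}
products = {
    10: {"name": "T-shirt", "price": 20.0},
    11: {"name": "Jacket", "price": 80.0},
}
orders = [
    {"order_id": 1, "customer_id": 1, "product_id": 10},
    {"order_id": 2, "customer_id": 1, "product_id": 11},
    {"order_id": 3, "customer_id": 2, "product_id": 10},
]


def expand(order):
    """Follow the keys to rebuild the full, flat version of one order."""
    customer = customers[order["customer_id"]]
    product = products[order["product_id"]]
    return {
        "order_id": order["order_id"],
        "customer_name": customer["name"],
        "customer_email": customer["email"],
        "product_name": product["name"],
        "price": product["price"],
    }


# The linked tables carry the same information as the flat version.
rebuilt = [expand(o) for o in orders]
print(rebuilt == flat_orders)

# Changing an email now means touching one row instead of every matching order.
customers[1]["email"] = "ada@new.example.com"
print([expand(o)["customer_email"] for o in orders])
```

With only three orders the savings are tiny, but the repeated fields in the flat version grow with every order, while the linked version adds just two small keys per order.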

It’s also far easier to find information. If I want to find all orders above $30, I can write and execute a SQL query like this, where the bracketed names are placeholders:

SELECT [Columns] FROM [Table] WHERE [Column] > 30
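To see a query like that actually run, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table layouts, column names, and sample rows are all assumptions made for this example. Because the price lives in the Products table in this layout, the query joins the three tables on their keys before filtering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical schema and sample rows for the clothes store.
conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE Products  (id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES Customers(id),
    product_id  INTEGER REFERENCES Products(id)
);
INSERT INTO Customers VALUES (1, 'Ada', 'ada@example.com'),
                             (2, 'Grace', 'grace@example.com');
INSERT INTO Products  VALUES (10, 'T-shirt', 20.0),
                             (11, 'Jacket', 80.0),
                             (12, 'Jeans', 45.0);
INSERT INTO Orders    VALUES (1, 1, 10), (2, 1, 11), (3, 2, 12), (4, 2, 10);
""")

# Join orders to their customer and product via the keys, then keep
# only the orders whose product costs more than the given amount.
query = """
SELECT Orders.id, Customers.name, Products.name, Products.price
FROM Orders
JOIN Customers ON Customers.id = Orders.customer_id
JOIN Products  ON Products.id  = Orders.product_id
WHERE Products.price > ?
ORDER BY Orders.id
"""
big_orders = conn.execute(query, (30,)).fetchall()
for row in big_orders:
    print(row)

conn.close()
```

The `?` is a placeholder that sqlite3 fills in with the value 30, which is safer than pasting values directly into the query text.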

Basically the entire internet runs on databases. Facebook has a table of Users, tables of Posts, Likes, and Comments, and tables for thousands of other actions you can take on the site that they track. Every tweet you write, every Google search you type, and every link you click is entered into a database somewhere on the internet.

There are thousands of other data representations on computers that are out of scope for this article, but each essentially does the same thing…represents and links data together.

Each comes with tradeoffs: ability to query effectively, speed, file size, etc.
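As one small illustration of the file size tradeoff, this sketch writes the same three records as CSV, JSON, and XML and compares the number of bytes each one takes. The three-column layout and the XML element names are assumptions for this example, and real files will vary with formatting choices.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Three records with scores taken from the CSV earlier in the article.
rows = [
    {"Group": "CARROLL ISD", "Grade": "8", "Score": "4417"},
    {"Group": "PLANO ISD", "Grade": "8", "Score": "4057"},
    {"Group": "GARLAND ISD", "Grade": "8", "Score": "3981"},
]


def as_csv(rows):
    """Column names appear once, in the header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def as_json(rows):
    """Column names are repeated as keys in every record."""
    return json.dumps(rows, separators=(",", ":"))


def as_xml(rows):
    """Column names are repeated twice per value (opening and closing tags)."""
    root = ET.Element("rows")
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for column, value in row.items():
            ET.SubElement(row_el, column).text = value
    return ET.tostring(root, encoding="unicode")


sizes = {
    name: len(serialize(rows).encode("utf-8"))
    for name, serialize in [("csv", as_csv), ("json", as_json), ("xml", as_xml)]
}
print(sizes)
```

CSV comes out smallest here because it names each column only once, but it can't nest data or distinguish numbers from text. JSON and XML spend extra bytes on structure that makes them easier to query and extend.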

Other data representations: SQLite, protocol buffers (https://developers.google.com/protocol-buffers/), hash tables, dictionaries, lists, arrays, neural networks

Query languages: SQL, XPath

Database systems: SQL Server, MySQL, MongoDB, etc.

Big data
