Database is not always the answer

Jorge Castro
Published in Cook php
Oct 27, 2019

This is about big data versus a conventional database.

I recently finished a hackathon (www.spaceapps.cl), and my project was simple: crawl information about the weather and present it visually. The system worked. The hackathon lasted 2 days, so everything was rushed.

I didn’t win, but I am not mad.

However, one of the jurors lost her arms :-)

The steps were simple:

Collect information from a website.

Technically, crawling a site can be illegal, so I can’t share the library that I used. It is a shame, because it works really well.

Parse this information.

->enterLevel('<A HREF="http://www.nws.noaa.gov/dm-cgi-bin/nsd_lookup.pl?station=','"',false,false)
->if()
->set('myid','@_value@')
->showmessage('@myid@')
->object('myrow','stationid','@_value@','add')
->else()
->showmessage('exit')
->break()
->endif()
->exitLevel()

This is part of the code.
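Since the crawling library can't be shared, here is a minimal Python sketch of what the fragment above appears to do: find each station link and keep the text between the `station=` marker and the closing quote. The function name `extract_station_ids` and the sample HTML are assumptions made up for illustration; this is not the library's API.

```python
import re
from typing import List

# Marker copied from the fragment above; the station id runs up to the next double quote.
MARKER = '<A HREF="http://www.nws.noaa.gov/dm-cgi-bin/nsd_lookup.pl?station='

def extract_station_ids(html: str) -> List[str]:
    """Return every value found between MARKER and the next double quote."""
    pattern = re.escape(MARKER) + r'([^"]*)"'
    return re.findall(pattern, html)

# Hypothetical page content, invented for this example.
sample = (
    MARKER + 'SCEL">SCEL</A>\n'
    + MARKER + 'KJFK">KJFK</A>\n'
)

# One small record per station, loosely mirroring the 'myrow' object in the fragment.
rows = [{"stationid": sid} for sid in extract_station_ids(sample)]
print(rows)
```

A regular expression is enough here because the marker is a fixed string; a page with less regular markup would probably need a real HTML parser.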

Store it in a MySQL database.

Shrink it and process it (ETL).

And finally, display it.

I love using databases. They are ideal for data analysis.
However, when I tried to insert a lot of information into the database, it was impossible to do it in a timely manner. It took around 3 hours to store 5% of the data. At that rate, the full load would have needed about 60 hours, and the hackathon lasted only 48.

So I decided to change strategy and use the FILE SYSTEM. And, surprise:

It did the job in 5 minutes.

Why? It’s simple. Every time we insert a value into a database, the database does a lot of work: it updates the indexes, writes to the redo log, reserves space in the tablespace, and finally inserts the value. Even if we don’t use an index, the work is huge. The file system, in contrast, is simple: it stores the information as-is. The only bottleneck is the hard disk.
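To make the difference concrete, here is a rough Python sketch that stores the same records twice: once as individually committed inserts into an indexed table, and once as plain lines appended to a file. It uses SQLite from the standard library as a stand-in for MySQL, and the table name, record layout, and function names are all assumptions for illustration. The timings it prints depend entirely on the machine and the database configuration, so treat it as an experiment to run rather than a result.

```python
import os
import sqlite3
import tempfile
import time

def make_records(n):
    # Hypothetical weather readings: (station id, timestamp, temperature).
    return [("ST%04d" % (i % 100), i, 20.0 + (i % 10)) for i in range(n)]

def store_in_database(records, db_path):
    """Insert row by row, committing each insert, into an indexed table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE reading (station TEXT, ts INTEGER, temp REAL)")
    conn.execute("CREATE INDEX idx_station ON reading (station)")
    for rec in records:
        conn.execute("INSERT INTO reading VALUES (?, ?, ?)", rec)
        conn.commit()  # each commit asks the engine to make the row durable
    count = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
    conn.close()
    return count

def store_in_file(records, file_path):
    """Append each record as one CSV line: no index, no log, no transaction."""
    with open(file_path, "a", encoding="utf-8") as fh:
        for station, ts, temp in records:
            fh.write("%s,%d,%.1f\n" % (station, ts, temp))
    # Re-read the file only to report how many lines were stored.
    with open(file_path, encoding="utf-8") as fh:
        return sum(1 for _ in fh)

def compare(n=200):
    """Store n records both ways and return (db rows, file rows, db s, file s)."""
    records = make_records(n)
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        db_count = store_in_database(records, os.path.join(tmp, "weather.db"))
        db_seconds = time.perf_counter() - start
        start = time.perf_counter()
        file_count = store_in_file(records, os.path.join(tmp, "weather.csv"))
        file_seconds = time.perf_counter() - start
    return db_count, file_count, db_seconds, file_seconds

if __name__ == "__main__":
    db_count, file_count, db_s, file_s = compare()
    print("database: %d rows in %.3f s" % (db_count, db_s))
    print("file:     %d rows in %.3f s" % (file_count, file_s))
```

Moving the `commit()` out of the loop, so that all rows share one transaction, is an easy variation to try with the same script.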

I could have done it with MongoDB, but for this job the file system is way faster, even compared to MongoDB. Also, MongoDB adds a new level of complexity.

Finally, I compressed all the information and stored a consolidated version in the database, and the system works decently.
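That last step could look something like the following Python sketch: gzip the raw file to save space, then load only one aggregated row per station into the database. The CSV layout (`station,timestamp,temperature`), the `summary` table, and the choice of an average as the consolidated value are assumptions for illustration, and SQLite again stands in for MySQL.

```python
import csv
import gzip
import os
import shutil
import sqlite3
import tempfile
from collections import defaultdict

def compress_file(raw_path):
    """Write a gzip copy of raw_path next to it and return the new path."""
    gz_path = raw_path + ".gz"
    with open(raw_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path

def consolidate(gz_path):
    """Read the compressed CSV and return {station: (count, average temp)}."""
    totals = defaultdict(lambda: [0, 0.0])
    with gzip.open(gz_path, "rt", encoding="utf-8", newline="") as fh:
        for station, _ts, temp in csv.reader(fh):
            totals[station][0] += 1
            totals[station][1] += float(temp)
    return {s: (n, total / n) for s, (n, total) in totals.items()}

def store_summary(summary, conn):
    """Insert one row per station, all in a single transaction."""
    conn.execute(
        "CREATE TABLE summary "
        "(station TEXT PRIMARY KEY, readings INTEGER, avg_temp REAL)"
    )
    with conn:  # commits once when the block exits
        conn.executemany(
            "INSERT INTO summary VALUES (?, ?, ?)",
            [(s, n, avg) for s, (n, avg) in summary.items()],
        )

def demo():
    # Hypothetical raw data: three readings for two stations.
    raw = "SCEL,1,10.0\nSCEL,2,20.0\nKJFK,1,5.0\n"
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "weather.csv")
        with open(raw_path, "w", encoding="utf-8") as fh:
            fh.write(raw)
        summary = consolidate(compress_file(raw_path))
    conn = sqlite3.connect(":memory:")
    store_summary(summary, conn)
    rows = conn.execute("SELECT * FROM summary ORDER BY station").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(demo())
```

The database then holds a handful of summary rows instead of every raw reading, while the compressed files remain available if the details are needed later.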

Note: Also published on
