Comparing Apples with Massively Parallel Oranges — MPP Architecture in a Nutshell
So, you’ve got your first gig coding ETL on Netezza, or let’s face it, any of the other Massively Parallel Processing (MPP) data warehouse platforms out there (because someone will get upset if I forget them … Teradata, Exadata, APS, Greenplum, Redshift … Who did I forget?)
I digress. So, you’ve got your first Netezza gig and you’re wondering to yourself as you code the same old SQL like you did on your quaint old enterprise RDBMS, what’s the big deal with these Massively Parallel Processing platforms? It’s not like Oracle hasn’t had a parallel query engine since, like, forever — seriously, where do these guys get off with their high-falutin’ “Database Appliances” talk?
Fun fact: Fire trucks are known technically in the biz as Fire Appliances. I’m totally painting my database red.
Well I love explaining stuff, and I also love a good metaphor, so I present for your demystifying pleasure:
Uncle Exatezzadata’s Trade Unionist Apple Orchard
To be clear, what I’m about to attempt is an explanation of MPP architecture using a grossly simplified system, being an apple orchard filled with trade-unionist workers.
Uncle Exatezzadata (or “Unc”) has an apple orchard (think of that as your database platform), and while apples do grow on trees, they don’t pick themselves, so Unc is going to need some workers. Sadly, the labour market in the apple biz is heavily unionised, and every worker specialises in a single skill set to the exclusion of all others for fear of demarcation reprisals. Unc’s workforce, which is frankly struggling with demand, is currently as follows:
- Deb is an apple picker. She picks the apples off the trees (i.e. Deb is a disk drive)
- Rob is a collector. He collects apples from Deb and holds them in preparation for filling apple orders (i.e. Rob is RAM, or computer memory)
- Cal is a sorter. He knows apples upside down and inside out. When it comes to filling orders, Cal is the ultimate authority on variety, colour, size, and ripeness. He can shine them and put the little stickers on them; in fact, Cal pretty much makes all the decisions (i.e. Cal is a CPU)
Here’s how YOU MIGHT THINK a typical order works:
UNC: Deb, gimme four small, Red Delicious.
DEB: You got it, Boss!
UNC: Oh, and Debbie — make sure they’ve all got those five bumps on the top. I like the bumps. And shine em up for me — they’re for an important customer.
DEB: Consider it done.
Actually that’s how Uncle Exatezzadata wishes it worked too, but Deb is in the Orcharders’ Collective Trade Union (OCTU), and according to union regulations, she is ONLY allowed to pick. That’s it! All of the other jobs, like selecting the right variety, shining, and even counting — well they’re Cal’s job. And Cal’s got his own problems. He can’t work with Deb’s big dirty picking buckets, so he requires Rob to lay out every apple neatly on the table and hand him each one as he asks for it.
So what really happens is:
- Unc places the order with Cal
(SELECT apples FROM orchard WHERE variety='Red Delicious' AND bumps=5 LIMIT 4) | SHINE | DELIVER
- Cal sends Deb out for apples (just that, no Red Delicious, no five bumps, just apples)
- Deb brings back apples by the bucket load and dumps them on Rob’s table
- Rob hands them one by one to Cal, who examines them and discards the Pink Ladies and Granny Smiths, keeping the Red Delicious only if they have five bumps.
- If there’s not enough of the right apple, Cal sends Deb back out into the orchard and the process repeats.
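For the code-minded, the steps above can be sketched in a few lines of Python. This is only a toy model of the metaphor, not how any real database engine is written: the names `Apple`, `deb_picks`, and `cal_fills_order` are made up for illustration, and the bucket size stands in for a disk block.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Apple:
    variety: str
    bumps: int


def deb_picks(orchard: List[Apple], bucket_size: int) -> Iterator[List[Apple]]:
    """Deb (disk): hands back whole buckets (blocks), with no filtering at all."""
    for start in range(0, len(orchard), bucket_size):
        yield orchard[start:start + bucket_size]


def cal_fills_order(
    orchard: List[Apple], variety: str, bumps: int, limit: int, bucket_size: int = 4
) -> Tuple[List[Apple], int, int]:
    """Cal (CPU): inspects every apple that Rob (RAM) lays out for him.

    Returns the apples kept, the number of trips Deb made,
    and the number of apples Cal had to inspect.
    """
    kept: List[Apple] = []
    trips = 0
    inspected = 0
    for bucket in deb_picks(orchard, bucket_size):
        trips += 1
        rob_table = list(bucket)  # Rob (RAM): the whole bucket is laid out first
        for apple in rob_table:
            inspected += 1
            if apple.variety == variety and apple.bumps == bumps:
                kept.append(apple)
                if len(kept) == limit:
                    return kept, trips, inspected
    return kept, trips, inspected
```

The point of returning `trips` and `inspected` alongside the result is that they make the waste visible: Cal usually handles far more apples than he keeps, and every extra trip is Deb walking while Cal waits.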
Do we see any inefficiencies in this operation? It might seem silly, but this is how typical databases behave, because disks can’t make decisions and CPUs can’t access data until it’s in RAM. The obvious problems:
- Cal is wasting a lot of time waiting for Deb to return from the orchard. This is CPU idle time.
- Deb is spending too much time walking and not enough picking. This is disk latency.
The obvious solution is to retrain the workers but the OCTU is too strong, so Uncle Exatezzadata decides to scale things up, hiring more pickers, more collectors, and more sorters. Now the place is really humming. Things do go faster because the collectors and sorters have more apples coming in from all those pickers, but Unc has made a critical scaling error — everybody is still doing things in the same old inefficient way. The collection shed is a madhouse with pickers racing in and out, collectors stepping on each other’s toes, and sorters fighting over the best jobs, spending half their time idle because they can’t allocate work fairly. Worse, Deb and the other pickers are still spending too much time walking to and from the orchard with apples nobody wants.
This is how normal parallel processing works: same old methods, just more of them.
Desperate to scale his orchard, Uncle Exatezzadata shows up at OCTU headquarters with hat in hand, asking for their help. Can we please, PLEASE give the pickers a little autonomy to pick the right fruit? Lucky for Unc, the union has just brokered a new deal with members. It’s not what Unc wanted (multi-skilled pickers); in fact, it seems like the opposite, with even more specialised workers, but it just might do.
The Massively Parallel Processing Apple Orchard
There are two key innovations in the Massively Parallel Processing Apple Orchard, and not coincidentally, they are the same two innovations found in Massively Parallel Processing database platforms:
- Shared-Nothing Parallel Processing Units
- Accelerated Streaming Technology (does not apply to all MPP appliances)
Let me explain…
Shared-Nothing Parallel Processing Units
Rather than putting all of the orchard workers in the same shed, Unc consults the new union by-laws and sends all those new collectors and sorters out to small stand-up locations (Apple Processing Units, or APUs) distributed throughout the orchard. Each of these APUs has its own picker, its own collector, and its own sorter. That is, each APU has its own dedicated disk, RAM, and CPU. Cal and Rob are still back in the shed taking orders and shining apples, but each APU is now sending them exactly what they asked for, so their job of delivering the right fruit is now orders of magnitude easier. Furthermore, the APUs all have their own dedicated apple fields, with every apple variety present in every field, so when Unc asks for Red Delicious, every APU can get involved and none of them are colliding with each other or fighting over work.
MPP databases are the same. They have SPUs (Netezza) or AMPs/Cells/whatever — small, independent units that perform data processing tasks. You still need the top tier of processing like having Rob and Cal in the apple shed, but all of the grunt work is pushed down to these specialist units whose ONLY job is to crank out data processing tasks.
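Here is a rough Python sketch of the shared-nothing idea, under some loud assumptions: the function names (`distribute`, `unit_scan`, `coordinator`) are invented for this post, a simple modulo on the apple id stands in for whatever distribution scheme a real platform uses, and a thread pool merely imitates units working at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Row = Tuple[int, str, int]  # (apple_id, variety, bumps)


def distribute(rows: List[Row], num_units: int) -> Dict[int, List[Row]]:
    """Give every row to exactly one unit, so no two units share data."""
    slices: Dict[int, List[Row]] = {unit: [] for unit in range(num_units)}
    for row in rows:
        # Assumption: modulo on the id plays the role of a distribution key.
        slices[row[0] % num_units].append(row)
    return slices


def unit_scan(local_rows: List[Row], variety: str, bumps: int) -> List[Row]:
    """One APU: filters only its own field, with its own picker, collector, sorter."""
    return [r for r in local_rows if r[1] == variety and r[2] == bumps]


def coordinator(
    slices: Dict[int, List[Row]], variety: str, bumps: int, limit: int
) -> List[Row]:
    """The shed: fans the order out to every unit, then merges the small answers."""
    with ThreadPoolExecutor(max_workers=max(1, len(slices))) as pool:
        partials = list(
            pool.map(lambda rows: unit_scan(rows, variety, bumps), slices.values())
        )
    merged = [row for part in partials for row in part]
    return sorted(merged)[:limit]
```

Notice that the coordinator never touches a non-matching row. All it does is merge what the units send back, which is why the top tier can stay small while the grunt work scales out.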
Accelerated Streaming Technology
If Shared-Nothing Parallel Processing Units were the end of the story, you’d already get a fast and scalable solution from the efficiencies alone (especially when you add indexes to send those pesky pickers in the general direction of the right apples). Netezza has spurned the idea of indexes, and has instead introduced a further MPP innovation in Accelerated Streaming Technology (fAST). Exadata has something similar by a different name: Smart Scan.
Back to the orchard metaphor. As efficient as the new APUs have made the operation, pickers are still spending a lot of time walking to and fro with the wrong apples. Remember they cannot make decisions, so they’re still picking Pink Ladies half the time when the order asks for Red Delicious. Unc can’t teach pickers how to pick the right apple, and he can’t move the sorters closer to the trees, so instead he hires a new type of worker — a Picker’s Pal (let’s call her Fran). Fran is a bit like a mobile sorter; she can go out amongst the apple trees with Deb and help make simple decisions on variety, size etc.
Now, when Deb comes back with a bucket load of apples, they ALL match Unc’s request — Red Delicious, 5 bumps. Fran’s no sorter — she can’t put the stickers on, she can’t shine the apples, and she can’t make up complex orders of different varieties, but by goodness, she saves Deb a heck of a lot of walking by making sure she only selects the fruit that will pass inspection.
In a Netezza database appliance, Fran is known as a Field Programmable Gate Array (FPGA). Greatly simplified (because I’d get it wrong if I tried to explain it in full), an FPGA is a special-purpose computer chip that sits close to the disk; it’s not part of the operating system, and it’s not part of the database kernel. To a disk system, data is usually just a mess of bits and bytes; it doesn’t become tables, columns and rows until you introduce it to the database software. The FPGA is smarter than that. It has the capability to perform some simple operations to reduce the amount of data sent back to the database kernel, including:
- Filter non-matching rows using WHERE clauses
- Filter unwanted columns that are not required by the SQL
- Compress these bare-minimum results and pass them back to the Netezza database kernel.
Instead of getting entire blocks of raw disk dropped into memory containing many unwanted rows and columns, Netezza’s FPGAs ensure that only the barest minimum data is returned, reducing the payload on the SPU’s RAM and CPU.
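To put a rough number on that saving, here is a small Python sketch comparing a plain scan with a Fran-style scan. It is a toy model only: real FPGAs are not programmed this way, the compression step is left out entirely, and `raw_scan`, `filtered_scan`, and `payload` are hypothetical names used purely for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def raw_scan(block: Iterable[Row]) -> Iterator[Row]:
    """No Fran: every row and every column travels on to RAM and CPU."""
    yield from block


def filtered_scan(
    block: Iterable[Row], predicate: Callable[[Row], bool], columns: List[str]
) -> Iterator[Row]:
    """Fran: drops non-matching rows, then unwanted columns, before RAM sees anything."""
    for row in block:
        if predicate(row):
            yield {name: row[name] for name in columns}


def payload(rows: Iterable[Row]) -> int:
    """Crude payload measure: how many field values get handed to the CPU."""
    return sum(len(row) for row in rows)
```

Run both scans over the same block and compare the two `payload` figures: the gap between them is the walking that Fran saves Deb.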
A key tenet of database performance tuning is reducing redundancy, especially redundant I/O. In traditional database systems, indexes played a big part in this practice, and we needed to learn a host of intricate techniques to craft ever more intricate solutions, forever seeking another reduction in redundancy. Massively Parallel Processing systems in general, and Netezza in particular, turn this practice on its ear. Far from intricate slicing with a scalpel, Netezza takes to data with a great scythe, lopping off redundant limbs with alacrity.
By no means should you think that MPPs and FPGAs obviate performance tuning, but they do change the playing field, and the principles we learned with traditional DBMS offerings do not necessarily apply, or at least not in the same ways.
If there’s a learning here, then it’s this: know thine enemy. Before you launch headlong into editing your SQL and wildly restructuring Organizing Keys and Distribution Keys, get to know your database. The MPP engine is only the tip of the iceberg. Find out how Netezza goes about processing your queries, because only then will you be able to identify and tune redundancies. Anything less is not tuning, it’s guesswork.
* * * *
If you thought this was amusing, leave me a “clap”. If you thought it was informative, leave me a few. I’ll post more about Netezza query tuning in future posts with newer and more tortured metaphors, so make sure you follow the DWS+Symplicit blog to get it all in your in-box.