Lego vs SoC, Apple M1 + MT8195, Microservices and Big Data Model
This week (2020-11-10) was a really big one for System on a Chip (SoC): first came the Apple M1, followed by the MediaTek MT8195/MT8192. But why on earth would these have anything to do with lego, microservices, and even data models? It is a topic I have put off for a few years.
Lego is probably the closest thing we have to a unanimous nominee for the most powerful tool or toy ever invented. In any survey or multiple-choice quiz where “lego” appears, its flexibility and limitless creativity always stand out. Whenever I joined a discussion about the “next generation of data processing platform” or “what is the best big data architecture”, lego was often mentioned as the perfect analogy for the ideal design. However, I have always favored SoC over lego bricks for the critical big data problems.
Before any explanation, let’s joke a bit about the dark side of lego and microservices first. Lego and microservices actually share some key traits: agility, flexible scaling, easy deployment, and reusable building blocks. But over-indexing on these traits might also lead to:
- oversimplified / rigid interface: that can explain the rise of GraphQL over REST
- transportation and efficiency overhead: a 180-byte payload may travel with a 300-byte header, and call graph amplification makes it easy to get lost. High-end networks and big memory are provisioned to reduce each service/brick’s latency, yet overall hardware utilization remains very low and energy consumption stays high.
- lack of end-to-end optimization: teams might use deployment or modular agility/isolation as an excuse to draw a clear-yet-small perimeter of responsibility, and end-to-end optimization then becomes almost impossible.
- little data locality: because each service can only perform a relatively simple task, the data/payload is transferred from service to service to yet another service. Advanced CPU features (big L2 cache, deep pipelines, SIMD, AVX, out-of-order execution) and memory acceleration (shared memory, huge pages, DMA) are barely put to work.
- complex debugging and customer service: each block is simpler and easier to test, but the packaging is now more fragmented. More burden is shifted to the SRE team and to integration/acceptance testing, which have to deal with more coordination effort and a much messier lineage/call graph. Product or business partners have to deal with more technical teams to triage and fix any functional and/or engineering issues. It is not uncommon to see microservice teams kicking the ball around while the business grows frustrated.
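To put rough numbers on the transportation overhead bullet, here is a back-of-the-envelope sketch. The 180-byte payload and 300-byte header come from the list above; the fan-out of 3 and depth of 4 are made-up values chosen only for illustration, and the function names are hypothetical.

```python
# Payload/header sizes are taken from the text; fan-out and depth are assumptions.
PAYLOAD_BYTES = 180
HEADER_BYTES = 300


def header_overhead(payload: int, header: int) -> float:
    """Fraction of the bytes on the wire that are header rather than payload."""
    return header / (payload + header)


def total_calls(fan_out: int, depth: int) -> int:
    """Downstream calls triggered by one request when every service
    calls `fan_out` other services, repeated for `depth` levels."""
    return sum(fan_out ** level for level in range(1, depth + 1))


overhead = header_overhead(PAYLOAD_BYTES, HEADER_BYTES)
calls = total_calls(fan_out=3, depth=4)
wire_bytes = calls * (PAYLOAD_BYTES + HEADER_BYTES)

print(f"header share of each message: {overhead:.1%}")
print(f"calls behind one user request: {calls}")
print(f"bytes moved for one user request: {wire_bytes}")
```

Under these assumptions more than half of every message is header, and a single user request fans out into over a hundred hops before any real work is counted.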
While lego bricks and microservices sound awesome, it is interesting to see how Apple and MediaTek are doing the opposite to bring speed, efficiency, and what customers/partners want to the market: instead of breaking functions into more micro chips and outsourcing each module to a different vendor, both competitors consolidate more services into their SoC, putting the GPU, AI (Neural Engine), and Image Signal Processor onto a single piece of silicon. This makes tapeout and testing more complicated, but once the SoC is delivered (to make phones, tablets, laptops, and servers in the future), a lot more can be quickly built and evolved on top of it. In addition to putting 6 modules into the M1, Apple also tightly integrates its macOS software with the M1 to achieve the amazing CPU + GPU performance, a fan-less MacBook Air, super long battery life, and a thinner/lighter body. Yet all such tight integration and coupling might be considered anti-patterns by microservices and lego fans.
Old-school monolithic architecture is obviously bad, but there are also drawbacks in the chaos of thousands of microservices and the associated cost black hole caused by low efficiency and over-provisioning. A more balanced way is to develop and unit-test software the microservices way, but then package multiple coherent and related microservices into a single macroservice for integration testing and final deployment. More importantly, once a set of microservices are repackaged together:
- a good portion of the intercommunication can be switched from HTTP/RPC to IPC/shared-memory
- cache can be shared with much higher hit rate and utilization
- better data locality can be achieved along with a bigger bulk/batch size across CPU level, data storage level, and service level
- a bigger service interface and integration point can now hide more nitty-gritty details from consumers/clients, the high-level call graph becomes easy to follow and comprehend, and deployment coordination is also simplified
We can still scale the system by dialing the number of instances of such a macroservice up or down, even without super fine-grained control (e.g. having one microservice with more or fewer instances than another microservice). The re-packaging should be totally worth it.
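The repackaging idea can be sketched in a few lines. This is only a toy illustration, not a deployment recipe: `enrich_user`, `price_quote`, `remote_call`, and `QuoteMacroservice` are hypothetical names, and the JSON round trip merely stands in for a real HTTP/RPC hop.

```python
import json


# Two hypothetical "microservices", written and unit-tested independently.
def enrich_user(user_id: int) -> dict:
    segment = "premium" if user_id % 2 == 0 else "basic"
    return {"user_id": user_id, "segment": segment}


def price_quote(user: dict, base_price: float) -> float:
    discount = 0.2 if user["segment"] == "premium" else 0.0
    return round(base_price * (1 - discount), 2)


def remote_call(func, *args):
    """Stand-in for an HTTP/RPC hop: arguments and result are
    serialized to JSON and parsed back on every call."""
    wire_args = json.loads(json.dumps(args))
    return json.loads(json.dumps(func(*wire_args)))


def quote_via_microservices(user_id: int, base_price: float) -> float:
    """Microservice deployment: every step crosses a (simulated) network hop."""
    user = remote_call(enrich_user, user_id)
    return remote_call(price_quote, user, base_price)


class QuoteMacroservice:
    """Same two functions packaged into one process: calls are plain
    in-process function calls and the user cache is shared."""

    def __init__(self) -> None:
        self._user_cache: dict = {}

    def quote(self, user_id: int, base_price: float) -> float:
        user = self._user_cache.get(user_id)
        if user is None:
            user = self._user_cache[user_id] = enrich_user(user_id)
        return price_quote(user, base_price)


macro = QuoteMacroservice()
print(quote_via_microservices(2, 100.0), macro.quote(2, 100.0))
```

Both paths return the same answer; the difference is that the macroservice exposes one `quote` entry point, skips the serialization on every internal hop, and keeps one cache instead of one per service.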
“Data Middle Platform” or “Data Middle Office” is a concept that Alibaba has advocated and practiced since 2017. It was inspired by the amazing data-driven efficiency of Supercell, which Alibaba executives visited in 2015. Alibaba then rearchitected its data infrastructure and organizations to push the so-called “big data middle platform with small front platform” strategy: consolidating scattered and repetitive microservices and data models is the core action behind the fancy name. Though this movement is not well known to or understood by the internet giants in the US, it actually resembles a lot of the focus and practice in SoC.
In layman’s terms, it is important to spend more engineering and business effort to model and scale one or a very small number of key tables for each business core/line. The ETL, storage, and query/serving platforms are optimized end to end to support such wide/big tables. All experiment performance, analytics, decision making, and prediction are derived from such wide/big tables. The popular-yet-chaotic data democratization fashion advertised by many Hadoop vendors has fallen apart in recent years, and Databricks, Snowflake, and even Microsoft/Google are bringing back data warehouse modeling with better underlying infrastructure support. A wide/big fact table with a dozen dimension tables is developed, tested, evolved, and packaged together like a well-organized SoC to provide both higher quality and efficiency, and lower integration and support cost.
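As a minimal sketch of the wide-table idea, the snippet below uses Python's built-in `sqlite3` module to join one fact table with two dimension tables into a single wide view that every downstream query reads. The table names, columns, and rows are all invented for illustration; a real platform would have far more dimensions and a very different storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_order (
    order_id INTEGER PRIMARY KEY, user_id INTEGER, item_id INTEGER, amount REAL
);
INSERT INTO dim_user VALUES (1, 'US'), (2, 'CN');
INSERT INTO dim_item VALUES (10, 'game'), (11, 'book');
INSERT INTO fact_order VALUES (100, 1, 10, 9.5), (101, 2, 10, 4.0), (102, 2, 11, 12.0);

-- One wide table that analytics, experiments, and models all read from.
CREATE VIEW wide_order AS
SELECT f.order_id, f.amount, u.country, i.category
FROM fact_order AS f
JOIN dim_user AS u ON u.user_id = f.user_id
JOIN dim_item AS i ON i.item_id = f.item_id;
""")


def revenue_by(column: str) -> dict:
    """Total order amount grouped by one dimension column of the wide table."""
    if column not in {"country", "category"}:  # whitelist: column is interpolated
        raise ValueError(f"unknown dimension: {column}")
    query = f"SELECT {column}, SUM(amount) FROM wide_order GROUP BY {column}"
    return dict(conn.execute(query).fetchall())


print(revenue_by("country"))
print(revenue_by("category"))
```

Every question is answered from the same `wide_order` view, so the modeling and optimization effort is spent once rather than repeated in each consumer.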
While the world is enjoying the mobility of phones and tablets, SoC and its ongoing consolidation play a crucial role. People love faster devices with longer battery life and better software efficiency. This should not be a hardware-only trait. I believe that software and data architecture should also rethink the balance between microservices and macroservices, and give efficiency and quality higher priority from the consumer and business angle (instead of the producer or engineering angle), because after all, only a successful business can fuel true innovation in technology, especially in big data and AI.
(* Spark Whole-stage Java Code Generation is one example: the logic is written as multiple operators/snippets, but the system combines them into a single execution stage to achieve higher efficiency.)
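The operator-fusion idea in the note above can be imitated in plain Python. This is a toy sketch and not how Spark actually implements whole-stage code generation: `run_unfused` and `compile_stage` are hypothetical names, and the generated code is a Python loop rather than Java.

```python
def run_unfused(data):
    """Operator-at-a-time: each operator materializes an intermediate list."""
    filtered = [x for x in data if x % 2 == 0]   # filter operator
    projected = [x * 10 for x in filtered]        # projection operator
    return sum(projected)                         # aggregation operator


def compile_stage(predicate_src: str, projection_src: str):
    """Tiny 'code generator': emits one loop that applies the filter,
    projection, and sum in a single pass with no intermediate lists.
    The expression strings must come from trusted code, since they are exec'd."""
    source = (
        "def fused(data):\n"
        "    total = 0\n"
        "    for x in data:\n"
        f"        if {predicate_src}:\n"
        f"            total += {projection_src}\n"
        "    return total\n"
    )
    namespace = {}
    exec(source, namespace)
    return namespace["fused"]


fused = compile_stage("x % 2 == 0", "x * 10")
rows = list(range(10))
print(run_unfused(rows), fused(rows))
```

The logic is still written as separate small pieces (a predicate and a projection), but what runs is one tight loop, which is the same trade the article argues for at the service level.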
(* Disclaimer: The views expressed in this article are those of the author and do not reflect any policy or position of the employers of the author.)