Are Servers Next for Apple?
With the impressive performance and low power usage of the M1 chip, could Apple begin building chips for the server market? And if not, what is stopping them?
Okay, so we know that Apple’s M1 chips are really fast. What is stopping them from also taking over the server market in the future with their own chips? To answer this question I will cover a number of different topics:
- What makes a good server chip? Will a server make different demands on a microprocessor compared to a desktop computer?
- A comparison of ARM and x86 business models, to better understand how competitors in the ARM chip market operate in a fundamentally different way from x86 competitors.
- What does the ARM competition look like? We will look at some high profile ARM chip makers: Amazon and Ampere. What is their technology and business model? How do they stack up against Apple’s M1?
- A paradigm shift is ongoing in manufacturing. Chips are not made the same way anymore. In fact nothing is. How will this affect the development of the market over time?
- What advantages and disadvantages does Apple have entering the server market? We will discuss Apple’s unique vertical integration advantage.
- How Apple’s unique advantages in the consumer market may actually work against them in the server market.
What Makes a Good Server Chip?
In the server market there are many different workflows. As discussed in my Why is Apple’s M1 Chip So Fast story, servers often have very different needs from a desktop computer. There I used the example of how servers could have multiple requests from multiple users. These requests are frequently not CPU intensive. E.g. serving up a web page or getting data from a database is not CPU intensive. In this case being able to carry out multiple tasks in parallel is more important. Thus CPUs with lots of cores are very advantageous.
This is the opposite of how Apple’s M1 or A14 chips are designed. These are designed to have a few very powerful cores. That matters more when video editing, playing games, drawing etc.
Thus in this space there are other ARM CPU designs which are better. Let us look a bit at some of them and compare them with Apple’s M1 chip.
But to make sure you actually know what I am talking about, let me briefly talk about CPU instruction-sets.
Quick ARM vs x86 Comparison
Apple makes chips called Apple Silicon. AMD makes Ryzen Chips. Intel typically makes chips they call Intel Core. Qualcomm makes Snapdragon chips. Ampere makes Altra chips. However all these microprocessor producers make chips which fall into two categories: ARM and x86. This is basically the name of the instruction-sets used by these chips. You can think of this as the language used by each chip.
Here is a superficial comparison. Below you see 5 ARM instructions which load the numbers 4 and 5 into two separate registers, r1 and r2. Registers are memory locations inside the microprocessor (CPU) that allow you to perform arithmetic operations.
We add the numbers and then store the result at memory location 24 in memory (RAM). I will not go into further detail about this as it is not that important for this story.
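MOV r0, #24       ; r0 = 24, the memory address we will store to
MOV r1, #4        ; r1 = 4
MOV r2, #5        ; r2 = 5
ADD r1, r1, r2    ; r1 = r1 + r2
STR r1, [r0]      ; store r1 at the address held in r0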
On a chip that understands x86, the instructions used will be slightly different:
MOV ax, 4         ; ax = 4
MOV bx, 5         ; bx = 5
ADD ax, bx        ; ax = ax + bx
MOV [24], ax      ; store ax at memory address 24
In this case we are moving the numbers into registers called ax and bx before adding. The key takeaway here is that ARM and x86 are basically different languages. A program that was written to run on ARM cannot run on x86 or vice versa. If you want to run an x86 program on an ARM based processor you need to use a translation service such as Apple’s Rosetta 2, which translates x86 code to ARM code.
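If assembly code looks alien, here is a rough sketch of the same steps in Python. It is only a toy model I made up for illustration: the registers and RAM are plain dictionaries, and the operation names (load_const, add, store) are hypothetical, not real ARM or x86 instructions.

```python
# Toy model of the assembly example: registers and RAM are plain dictionaries.
# This only illustrates the idea; it is not how a real CPU or emulator works.

def run(program):
    registers = {"r0": 0, "r1": 0, "r2": 0}
    memory = {}  # maps a memory address to the value stored there
    for op, *args in program:
        if op == "load_const":    # put a constant into a register
            reg, value = args
            registers[reg] = value
        elif op == "add":         # dest = a + b
            dest, a, b = args
            registers[dest] = registers[a] + registers[b]
        elif op == "store":       # write a register to the address held in another register
            src, addr_reg = args
            memory[registers[addr_reg]] = registers[src]
        else:
            raise ValueError(f"unknown operation: {op}")
    return registers, memory

# Hypothetical operation names that mirror the five assembly instructions.
program = [
    ("load_const", "r0", 24),   # the address we will store to
    ("load_const", "r1", 4),
    ("load_const", "r2", 5),
    ("add", "r1", "r1", "r2"),  # r1 = 4 + 5
    ("store", "r1", "r0"),      # memory[24] = r1
]

registers, memory = run(program)
print(memory)
```

The point is the same as before: numbers are first placed in registers, the arithmetic happens on registers, and the result is finally written back to memory.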
Intel and AMD make x86 CPUs, which means even if you wrote a program to run on an Intel processor it will also run on an AMD processor. Likewise, since they share the ARM instruction-set, code written for Apple Silicon could also run on Qualcomm Snapdragon, Ampere Altra chips and Amazon’s AWS Graviton chips.
That is one of the advantages of ARM over x86: there are a lot of different companies making ARM chips. There is another element of complexity to this story.
You see, the ARM Ltd. company doesn’t just design the ARM instruction-set. They also create whole designs, or what we call micro-architectures, for ARM chips. Modern CPUs are really made up of multiple CPU cores, and ARM will design such cores. For example, ARM has designed a CPU core called the Neoverse N1. Amazon and Ampere pay ARM Ltd. to use this design to create their own CPUs. It is up to them to decide how many of these Neoverse N1 cores they want on their chip, how much cache memory they want, and how these cores should be connected. They may buy a license for a design to do that from somewhere else.
But this is not the only way to do it. Apple’s M1 and A14 processors, e.g., do use the ARM instruction-set licensed from ARM Ltd. However, Apple is using their own custom designed CPU cores called Firestorm and Icestorm.
This is important to be aware of when making comparisons. A lot of ARM chips from different companies will be quite similar in performance at the same number of cores. Apple is a bit of an outlier.
In the x86 world there is no sharing of core designs. Both Intel and AMD make their own cores and thus the performance characteristics can be quite different between them.
Okay, now we have some background, so we can begin to compare Apple’s M1 to existing ARM chips aimed at the server market.
Amazon’s AWS Graviton Processors
Amazon has been making their own ARM processors for the AWS cloud service, called AWS Graviton. The latest incarnation is Graviton2. These processors are based on licensing Neoverse N1 cores from ARM Ltd.
So the effort that has gone into making these is not really comparable to what Apple has been doing. And in terms of performance these cores are nothing like the Firestorm cores on the M1. AnandTech has done a number of performance tests of the Graviton2 compared to the competition which you can find here. However this was released before the M1, so I have taken AnandTech’s later performance tests of the M1 and added them to the same plot for comparison below.
Okay, what does all this geeky stuff mean, such as 444.namd? These are names of specific programs written in the C++ programming language. When doing performance tests these same programs are run on different microprocessors. By always using exactly the same programs we are able to compare microprocessors. Each program has been made to challenge the microprocessor in different ways. A CPU isn’t equally good at everything.
Thus each column represents the score each microprocessor got on running a specific program. You can see the Apple M1 in yellow and the Ryzen 9 5950X in red cover the largest area on each column because they got the highest scores.
The Graviton2 in blue, in contrast, has quite a small area. This performance test is not of the whole chip but of a single CPU core. In this case we can see that the Neoverse N1 cores used by Graviton2 are significantly weaker than Apple’s Firestorm cores and AMD’s Zen3 cores.
So it looks like pretty much everybody crushed Amazon. Could e.g. AMD just move in and steal the market with their flagship Zen3 based CPUs? Not so fast! Amazon is not stupid. There is a lot more to take into consideration here. As already mentioned, the number of cores matters a lot in the cloud. Here M1 comes up short with just 4 Firestorm cores, while Amazon’s Graviton2 has 64 Neoverse N1 cores.
But more importantly, microprocessors today have gotten so cheap relative to how powerful they are that over the lifetime of a chip you are going to pay a lot more for the electricity to drive these chips than for the chips themselves.
InfoQ discusses this in more detail, but let me cherry pick a bit. For instance, the Intel Xeon solution compared with the Graviton2 consumes 420 Watt, while the EPYC from AMD consumes 180 Watt. In contrast, the Graviton2 consumes a mere 80–110 Watt for its 64 cores.
These kinds of savings on power and cooling translate into the ability for Amazon to offer more computational power on ARM than on x86. According to Amazon, their Graviton2 based instances offer up to 40% better price performance than comparable x86 solutions:
Amazon EC2 T4g, M6g, C6g, and R6g instances, and their variants with local NVMe-based SSD storage, that provide up to 40% better price performance over comparable current generation x86-based instances
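To get a feel for how the power numbers above turn into money, here is a small back-of-the-envelope sketch. The wattages are the ones quoted in this story (using the upper 110 Watt figure for Graviton2). The five year lifetime and the price of $0.10 per kWh are purely my own assumptions for illustration, and cooling and other data center overhead are left out, so treat the output as a relative comparison only.

```python
HOURS_PER_YEAR = 24 * 365

def lifetime_energy_cost(watts, years, price_per_kwh):
    """Electricity cost of running a chip at a constant power draw, 24/7."""
    kwh = watts / 1000 * HOURS_PER_YEAR * years
    return kwh * price_per_kwh

# Power draw as quoted in the text; 110 W is the upper end of the Graviton2 range.
power_watts = {"Intel Xeon": 420, "AMD EPYC": 180, "Graviton2": 110}

YEARS = 5             # assumed service lifetime, not from any source
PRICE_PER_KWH = 0.10  # assumed electricity price in USD, not from any source

for name, watts in power_watts.items():
    cost = lifetime_energy_cost(watts, YEARS, PRICE_PER_KWH)
    print(f"{name}: ${cost:,.0f} in electricity over {YEARS} years")
```

Since the cost scales linearly with wattage, the ratio between the chips stays the same whatever price and lifetime you plug in. Multiply the difference by the thousands of servers in a data center and the incentive becomes clear.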
Towards a New Manufacturing Paradigm
This also underscores the changing computing landscape we are currently part of. Large companies such as Amazon are now increasingly able to build in-house solutions tailored to their specific computing needs. Graviton2 is not something you can buy off the shelf. Because IP (intellectual property) for different chips can now be bought all over the place and combined to create your own microprocessor, the threshold for anyone to make a chip has been significantly reduced.
This isn’t unique to the computer industry. This is happening all over the place. Will Chapman e.g. was a software developer, who discovered that despite being just one guy, he could enter the manufacturing business. He founded BrickArms, which makes historical plastic weapons for Lego minifigs. Today you can design a part with 3D modeling software at home and upload your design to a factory in China which will mass produce it for you. These factories now produce products in small runs so you don’t have to be a big enterprise to get started.
Ponoko exemplifies this trend. They call themselves your personal factory. You can upload designs and their facilities will 3D print and laser cut parts according to your designs which others can buy. Thus anyone sitting at home alone with minimal capital can get in the business of manufacturing things and selling them. You see the same with books, where Amazon will print books on demand. Thus even if your books sell few copies, it is still possible to have them printed.
Google, Tesla, Facebook, Amazon and others are simply taking advantage of this global trend. They all make their own hardware today. Google makes specialized machine learning hardware used in their data centers called Tensor Processing Units. These are similar in usage to the Neural Engine on Apple’s M1 chip.
Tesla also makes their own car computers to accelerate the running of machine learning models, to help with the massive processing power required for their self driving service. Don’t confuse this with the Tesla cards from Nvidia.
And this trend is simply accelerating. Google is busy making their own ARM chips for their Android phones. I think we will increasingly see large tech companies designing their own custom hardware tailored to their needs, because you can so easily buy ready made designs. When you have designed a chip, it is just a matter of transmitting the blueprint to a large chip foundry, such as TSMC, to build it, just like BrickArms owner Will Chapman can ship his Lego weapon blueprints to some factory in China, which immediately starts cranking out the pieces using plastic injection moulding.
Okay, let us look at one of the other competitors in the ARM server chip market.
Ampere Computing is a new company started in 2018, which aims to offer ARM server processors to all the cloud providers who are not Amazon and consequently cannot get hold of a Graviton2.
However there are a lot of interesting things about Ampere which make it worth talking about, especially when comparing it to the M1. Performance per core will likely be similar to Graviton2 since it is based on the Neoverse N1 CPU cores designed by ARM Ltd.
But this is where the similarities end. Ampere has made some unique choices:
- Lots and lots of cores. Their current Ampere Altra has a whopping 80 cores. But they are busy making Altra Max which will feature 128 cores.
- You can put two of these Ampere chips on their motherboards. Hence you can get a total of 160 cores in one computer. Their Mt. Jade Platform below is an example of this. With the Altra Max, which is backwards compatible, you can get 256 cores in one computer!
- Ampere Altra is designed to give deterministic performance per core. What does that mean? Often when you have multiple cores the activity of one core can cause another one to suddenly drop in performance. Cloud customers don’t want sudden drops in performance due to activities of other customers. Thus this is a major selling point for Ampere Computing.
- Massive expansion abilities. You can connect a lot of memory, hard drives, graphics cards, neural engine style cards and other forms of accelerator cards to it. Accelerators are specialized cards to make particular tasks run faster such as video encoding or machine learning.
Thus if we are to compare to Apple’s M1 we can see some stark differences. Both the Graviton2 and Ampere may have significantly weaker cores but M1 has only 4 fast cores and 4 slow ones.
Ampere chips, unlike M1, have not jumped on the heterogeneous computing bandwagon. There are no specialized chips for encryption, machine learning, image processing or video encoding. In my Apple M1 story I remarked that it would be difficult for the PC industry to copy Apple’s heterogeneous strategy as there would be numerous players in the industry, which would need to coordinate and agree on what these specialized chips should be. Otherwise one creates complete chaos.
This is indeed a problem facing Ampere, since they only make the hardware and don’t control the software like Apple. Thus Ampere Computing has stated that they thought it was too early to add specialized co-processors. They are waiting for standards to establish themselves in the industry, something Apple does not need to do. Apple makes the software that accesses e.g. their Neural Engine, and thus they can make sure the software and the hardware match up. No coordination between hardware and software vendor is needed.
Ampere Computing’s solution to this thus far is to put in support for lots of PCIe lanes. What is a PCIe lane you ask? You can think of a PCIe lane as a data pipe or a data tube. The more of these tubes you have, the more external hardware you can connect. The PCIe standard lets you connect up to 16 of these tubes to one external hardware card such as a graphics card or, say, a hard drive. They have also added support for a standard called CCIX, which as far as I understand, allows something akin to Apple’s Unified Memory Architecture. It allows the GPU on an external graphics card and the Ampere CPU to share data in memory. You don’t need to explicitly copy blocks of data from one memory location to another.
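The lane arithmetic is simple enough to sketch in a few lines of Python. Note that the total of 128 lanes below is just a figure I picked for illustration, and the function name is made up. The link widths are the common ones up to the 16 lane maximum mentioned above.

```python
# Common PCIe link widths, up to the 16 lanes one card can use.
VALID_WIDTHS = (1, 2, 4, 8, 16)

def max_cards(total_lanes, lanes_per_card):
    """How many cards of a given link width fit within a lane budget."""
    if lanes_per_card not in VALID_WIDTHS:
        raise ValueError(f"unsupported link width: x{lanes_per_card}")
    return total_lanes // lanes_per_card

TOTAL_LANES = 128  # assumed lane budget, for illustration only

for width in VALID_WIDTHS:
    print(f"x{width} cards: {max_cards(TOTAL_LANES, width)}")
```

A graphics card typically wants the full 16 lanes while a fast SSD gets by with 4, so the more lanes a chip offers, the more accelerators and drives you can hang off a single server.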
Apple Advantages and Disadvantages in Entering the Server Space
What we have discussed thus far may start to give you some intuition about the kind of tradeoffs we are facing.
Apple has an advantage in that they control the whole widget, but that only applies if both their hardware and software are used. Thus to get the full advantage of heterogeneous computing power from Apple, you actually need to run macOS in the cloud, not Linux, not FreeBSD and certainly not Windows.
If not, Apple would have to wait for industry standards supported by Linux, BSD, Windows and others to emerge and then tailor their hardware to those standards. This is unlikely to be something Apple agrees with.
Also I am skeptical that Apple would want to sell solutions not running their software. This puts potential users in a bind. It helps that macOS is a Unix operating system. That means a lot of Linux and BSD software will run fine on it with minimal change. Yet macOS is not really optimized for server use. Linux kernel developers are very focused on this and that drives their development efforts. macOS e.g. is highly tuned towards things like low latency to deal with things such as real time audio and video. These are use cases which matter to professionals working on video and audio. That is a deep part of the Apple DNA and heritage.
Thus developers will potentially face the choice of running on macOS and getting superior performance, but missing out on the customization and full openness they love about Linux. Keep in mind that crucial technology for cloud software such as Docker requires a Linux kernel to run.
Sure, one can run Linux on macOS through virtualization, but then you have also lost access to Apple specific frameworks such as Core Audio, Core ML etc., which utilize custom Apple co-processors.
For server workloads which cannot utilize Apple’s specialized hardware there may not be a strong reason to pick an Apple solution. This is what we need to keep in mind. Making large high performance cores, like Apple has done, is not a magic solution. It is a deliberate choice because their users don’t have workloads which can easily utilize lots of CPU cores.
For situations where you can utilize a lot of cores such as for cloud services, the Apple solution likely doesn’t have an advantage. Amazon and Ampere can simply match Apple in performance by having more cores. Currently Apple only wins this competition in workloads where only 4–8 cores can be utilized such as in modern computer games.
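The few-fast-cores versus many-slow-cores tradeoff can be sketched with a very crude model. The core counts (4 Firestorm cores and 64 Neoverse N1 cores) come from this story, but the per-core scores are hypothetical numbers I made up, where a slow core does half the work of a fast one. The model also ignores the Icestorm cores, memory bandwidth and everything else that matters in practice.

```python
def aggregate_throughput(cores, per_core_score, parallel_tasks):
    """Work done per unit of time when at most `parallel_tasks` can run at once."""
    busy_cores = min(cores, parallel_tasks)  # idle cores contribute nothing
    return busy_cores * per_core_score

# Core counts from the text; per-core scores are hypothetical relative values.
few_fast = {"cores": 4, "per_core_score": 1.0}
many_slow = {"cores": 64, "per_core_score": 0.5}

for tasks in (1, 4, 8, 64):
    fast = aggregate_throughput(parallel_tasks=tasks, **few_fast)
    slow = aggregate_throughput(parallel_tasks=tasks, **many_slow)
    print(f"{tasks:>2} parallel tasks: few fast cores = {fast}, many slow cores = {slow}")
```

With these made-up numbers the fast cores win as long as there are only a handful of tasks to run at once, while the 64 core chip pulls far ahead once the workload can keep all its cores busy. That is the whole argument in a nutshell.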
Read more: 32-Core Apple Silicon Macs in 2021?
Where Apple Might Shine
Not all cloud services are the same. For high performance computing and machine learning services, we want massive number crunching capability at lower power consumption. The M1 chip may be able to beat everybody in this regard. The kind of performance it can offer in doing highly intensive math operations is remarkable, thanks in large part to its special co-processors.
Rumors also have it that Apple will be making Apple Silicon chips with up to 128 GPU cores. GPUs tend to give a major advantage in many scientific number crunching applications. Ampere and Amazon in contrast don’t have any integrated GPUs to leverage.
Researchers may simply have to suck up the fact that they need to run macOS, because there may simply be too much money to save by using an Apple solution.
Hence if Apple is to enter the server space, they should not be targeting the normal server market, hosting web sites, databases etc. They should focus on:
- Data analysis.
- High Performance Computing (HPC).
- Machine Learning (ML).
Why I Don’t Think Apple Will Succeed in the Server Space
While I have laid out reasons why Apple may offer a strong advantage, and I am a huge Apple fan, I must try to be realistic with respect to what I know about Apple history. Apple simply does not have a good track record.
They gave up on their Mac OS X Server. They ditched their special server rack computers. And until iCloud their services for syncing data had a rather poor track record.
Apple is really good at what they do, but in areas that Steve Jobs never cared passionately for, such as business to business and gaming, Apple has never managed to make it big. The Apple TV e.g. could have been a good gaming console that could have competed with Nintendo, but Apple totally failed to grab that opportunity by delivering a standard controller almost entirely unsuited to playing games.
Apple is not a good business to business company for several reasons, strongly tied to their DNA. What is it that makes Apple? What we know and love about Apple is the mystique. They work in secret for years on new revolutionary products. Then suddenly they burst onto the stage and show us something we have never seen before, before announcing that you can buy it tomorrow (or in half a year).
For the excitement and awe to actually work, you need full secrecy. You need to have developed a product really far before it gets announced. We see this in typical Apple development. They make a shock announcement with a product that leapfrogs everybody else. We all line up to get it. But as the years pass we notice that Apple products start lagging behind the competition because they simply don’t do frequent updates.
Part of the reason why Apple products are so good is that Apple takes their time to really polish and perfect their products. But that means there will not be a steady stream of improved versions pushed out at regular intervals. In this regard Apple operates a lot like the console market. Often when a new console is released, it utterly destroys gaming PCs in a similar price range. Yet over time PCs tend to close that gap.
Consoles, like Apple products, are the ultimate consumer products. Customers want to be delighted and dazzled. That is why these things bring massive improvements and beautiful new designs when released.
This is not how the server market works. The industry wants clear roadmaps and transparency. They would want to know what Apple is planning, what they are currently working on and what they can expect. These guys have to make long term strategic decisions. Waiting to get surprised at Christmas is not how you run a successful business.
These kinds of completely different needs have always made it difficult for Apple to deal with business customers. Apple wants tight control and streamlined products, whereas businesses want open and flexible solutions. To succeed in the server market or in the business to business market, Apple would have to become a lot more like Microsoft, and that, I believe, is the last thing Apple or their fans would want. Microsoft is the antithesis of what Apple stands for.
Microsoft, e.g., protects backwards compatibility at all costs. That has a tremendous value in business. But it also drags down Microsoft and creates messy convoluted solutions hampered by legacy. Apple in contrast brutally breaks with the past to create new shiny solutions. They piss off a lot of customers on the way, but after we get our hands on their new shiny products we quickly forget these transgressions.
What Role Can Apple Play in the Server Space?
Instead of making server solutions themselves, I think it is far more realistic that Apple cooperates with Amazon, Ampere, Google and others to establish ARM as a strong alternative to x86. They all have an interest in this. With ARM all the big players can build their own custom solutions tailored to their needs in a way they never could with x86.
But there are two parts to this puzzle. You need the server hardware but you also need popular desktop and laptop computers running ARM. If these are not prevalent, then developers will not develop sufficient experience with ARM. Linus Torvalds has been quite clear that your home computer needs to run the same hardware as your server.
With the excitement we now see around the M1, there could be a major uptake of ARM based computers, which means Amazon and others will finally have a large population of developers with ARM based machines who can develop services running on their Graviton2 chips. This will get the snowball rolling. It is all about reaching critical mass. Once enough people are running on ARM, one will reach an inflection point where everybody will want to transition their services to the much cheaper ARM cloud services.
That could mean a massive influx of developers and other professionals onto the Apple platform, at least until the PC industry is able to come up with a viable ARM alternative, which, as I have discussed in earlier stories, I think will take time. Apple now likely has several years where they will enjoy a durable advantage over the rest of the PC industry.
Eventually the PC industry will standardize and catch up. However people don’t generally leave the Mac platform once they have become accustomed to it. Thus this may mean a permanently higher market share for Apple.