Project Ember

The Plan: Design and implement a fully functional “retro” video game console, with a custom CPU, GPU, Audio Processor, and development architecture from scratch. Engineering background, while helpful, is not required to follow along. Please join us!

The Flame GPU — Initial Design Part 1: Basic Requirements

8 min read · Mar 14, 2025


Flaming CPU (Dabarti CGI / Shutterstock)

Now that we are sufficiently far along with the Ember CPU design, we will begin looking at the design of the Graphics Processing Unit, or GPU.

Before the turn of the century, most graphics or display chips, especially in home consoles and PCs, processed only very simple 2-dimensional images. 3D triangle rendering only became available on home devices in the second half of the 1990s and into the new century, though triangle-based 3D games had already appeared in arcade hardware throughout the 90s. Even the more advanced consoles of the 90s, like the Sega Genesis and Super Nintendo Entertainment System, could only display a few background images and a small number of sprites (movable 2D images) each frame, though they could apply quite a few effects to these, like rotation and scrolling.

To properly imitate these machines, we want our GPU to be able to display one or more color backgrounds, scroll, rotate, and apply effects to them, and draw some number of movable color sprites over the backgrounds. However, just like with the Ember CPU, we will start with a limited set of capabilities and then gradually evolve this over time, adding new features only as needed.

Most consoles and home computers of the time supported multiple graphics modes, typically referred to by number (Mode 0, Mode 7, etc.). These were available to programmers and offered various capabilities and tradeoffs, like higher resolution or more colors, depending on the needs of a particular game.

The very first mode we need is a simple text display mode that will allow us to display console output and interact with the machine during initial development. This will allow us to make progress and get the basic system working, without needing to deal with much more complicated rendering details. We can later develop more advanced modes, adding colors, higher resolutions, and more backgrounds and effects.

Retro Displays

To better understand how the Ember display might work, let us first take a quick look at some background on displays of the time.

TV Displays

Cathode Ray Tube (CRT) televisions of the 20th century could display about 480 interlaced scan lines at 60 Hz (NTSC/North America) or about 576 scan lines at 50 Hz (PAL). Home game machines typically used fewer lines for various reasons, including performance, memory limitations, and even the physical limitations of early TVs, whose rounded screen edges tended to cut off some of the screen area. For TV shows, losing the edges of the picture was simply expected; for video games, however, it would be annoying if your score was cut off at the top of the screen!

RGB CRT with Shadow Mask

In addition to the clipped edges, these CRT TVs, when displaying broadcast programs, did not draw the entire frame at once but in two passes called fields. First, half of the image (every other line) was displayed, followed by the other half, offset by half a scan line so that it drew in between the lines of the first pass. As a result, the full set of interlaced scanlines was displayed only 30 or 25 times per second. Computers, home game consoles, and most VCRs (Video Cassette Recorders), however, typically displayed only one set of scanlines every frame, choosing not to offset the next field by half a scanline. This produced a true 60 or 50 frames per second, but at half the vertical resolution. TVs handled this just fine, as they were driven directly by the voltages sent to them.

Horizontal resolution was a bit more complicated, limited to some extent by the size of the phosphor dots the cathode ray was illuminating, the size and focus of the beam itself, and how quickly the brightness of the beam could change as it scanned the line. In color CRTs, three synchronized electron beams pass through a metal mask perforated with holes that line up exactly with the phosphors for each color: red, green, and blue. As the beams trace each scanline, each one shines only through its respective holes and onto phosphors of its matching color. What we think of as a pixel could be made up of multiple sets of three colored phosphor dots, known as triads.

Simple CRT Scanning Diagram

What made this more complicated was that even though the mask was very fine, the size of the beams and the nature of the analog timing signals caused light to bleed into the triads around the desired pixel, blurring the output or changing the resulting color of some or all of the pixel. For this reason, most computer display modes allowed only much lower horizontal resolutions. For NTSC, the highest practical horizontal resolution was about 320 pixels, though most hardware supported only about 240 or so (288 for PAL). To save cost and memory, many early home consoles used even lower resolutions; the Atari 2600 could only display 40x192 for backgrounds and 160x192 for sprites!

VGA

Another display technology that became popular with home computers in the 1990s was VGA. CRT-based VGA displays were capable of much higher resolutions thanks to a higher density of RGB phosphor triads, much tighter beams, and a new signaling interface that defined higher-frequency horizontal and vertical clock rates. Unlike NTSC or PAL, VGA allowed multiple pre-defined resolutions, refresh rates, and color depths, which led to a proliferation of new display modes, starting at around 320x400 (similar to TV) and going all the way up to 2048x1536 in theory, though most monitors of the time topped out at 1280x1024, or perhaps 1920x1200 in the early 2000s, before LCD displays took over with much higher resolutions. In fact, most early LCD displays supported at least one VGA input, scaling it to the panel's native resolution for better or worse.

Unfortunately, the high cost and generally smaller sizes made these displays uncommon for home game consoles, which were designed to display on larger home TV sets. VGA displays were used almost exclusively with computers on a desk. It wasn’t until the widespread availability of larger LCD displays that this started to change.

Initial Resolutions

For our use case, since we will almost certainly be using modern LCD or LED displays with HDMI or DisplayPort connections, we aren't technically subject to these limitations. However, in the spirit of a retro console, and because of the retro-inspired games we want to design for the system, we should stick to something era-appropriate, especially if we are going to implement the design on consumer FPGAs and emulate period home and arcade games that used CRTs as displays. We can always add new modes with higher resolutions and capabilities if required.

As mentioned, for development purposes, we will first want at least one simple text-only display mode for debugging and system console use. Since this text mode is for development and will typically be attached to a modern LCD display through VGA or HDMI, we can comfortably support a 16x16 pixel font, giving an 80x60 character grid on a 1280x1024 display or 80x48 on 1280x768.
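As a quick sanity check on those numbers, here is the arithmetic sketched in C (simple integer division; note that an 80x60 grid of 16x16 characters covers 1280x960 pixels, leaving a small border on a 1280x1024 display):

```c
#include <stdio.h>

/* Back-of-the-envelope check of the text-mode grids above:
   how many 16x16 characters fit on each candidate display? */
int main(void) {
    const int font_w = 16, font_h = 16;
    const int modes[][2] = { {1280, 1024}, {1280, 768} };

    for (int i = 0; i < 2; i++) {
        int cols = modes[i][0] / font_w;   /* 1280 / 16 = 80   */
        int rows = modes[i][1] / font_h;   /* 64 and 48 rows   */
        printf("%4dx%-4d -> %d x %d characters\n",
               modes[i][0], modes[i][1], cols, rows);
    }
    return 0;
}
```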

Once we start to look at graphics-mode resolutions for games, we might note that even arcade machines of the time typically supported at most 320x240, more often 240x320 when the CRT was mounted sideways in an upright cabinet. Pac-Man, for example, ran at 224x288, and the home SNES console supported 256x224 (or 256x240 for PAL).

Fonts and Video Memory

Modern text fonts are rendered directly from mathematical descriptions of how each letter should appear at any resolution, with various effects applied, like italics or bold. Back in the 8-bit era, however, most text, especially in games, was displayed using pre-made images of each letter at fixed sizes. For text modes, these font bitmaps came in varying sizes, often 8 or 9 pixels wide and 9, 11, 12, 14, or 16 pixels high.
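For example, a 16x16 character stored at 1 bit per pixel fits in sixteen 16-bit rows. Here is a hypothetical glyph for 'A' in C (this particular bitmap is made up for illustration; the actual Flame font is not defined yet):

```c
#include <stdint.h>

/* A hypothetical 16x16, 1-bit-per-pixel glyph for 'A'.
   Each uint16_t is one row; bit 15 is the leftmost pixel. */
static const uint16_t glyph_A[16] = {
    0x0000, /* ................ */
    0x0180, /* .......##....... */
    0x03C0, /* ......####...... */
    0x07E0, /* .....######..... */
    0x0E70, /* ....###..###.... */
    0x1C38, /* ...###....###... */
    0x381C, /* ..###......###.. */
    0x381C, /* ..###......###.. */
    0x3FFC, /* ..############.. */
    0x3FFC, /* ..############.. */
    0x381C, /* ..###......###.. */
    0x381C, /* ..###......###.. */
    0x381C, /* ..###......###.. */
    0x381C, /* ..###......###.. */
    0x0000, /* ................ */
    0x0000, /* ................ */
};
```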

On Western PCs, ASCII (American Standard Code for Information Interchange) quickly became the standard for text encoding, while East Asian computers initially used many competing encodings. Unicode has since become the primary text representation today, with support for around 150,000 characters across many different languages. For the Ember and Flame implementations, we will stick with ASCII for now, at least for debugging purposes and our proposed Text Mode 0.

Of the 128 base ASCII codes, the first 32 are control characters like carriage return and backspace, followed by symbols including quotation and exclamation marks, with the digits starting at value 48. The capital letter A is value 65, while lowercase letters start at 97, allowing an application to add or subtract 32 to convert between upper and lower case.
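Since upper and lower case differ by exactly 32 (bit 5), case conversion is a single add, subtract, or XOR, as this small C illustration shows:

```c
#include <stdio.h>

/* ASCII upper/lower case differ only by bit 5 (value 32), so
   adding or subtracting 32 (or XORing with 0x20) flips case. */
int main(void) {
    char c = 'A';              /* value 65          */
    char lower = c + 32;       /* 'a' = 97          */
    char upper = lower - 32;   /* back to 'A' = 65  */
    printf("%c %c %c\n", c, lower, upper);  /* A a A */
    printf("%c\n", 'g' ^ 0x20);             /* G     */
    return 0;
}
```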

Tilemap, Bitmap Font, and Memory Layout for the proposed Text Mode 0

If we create a single-color text mode supporting 16x16 pixel characters, where each pixel is represented by 1 bit in memory, we can store each character in 32 bytes, so 128 characters will take up 4K of memory. This block of memory is called the Tilesheet.
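In C terms, the sizes work out as follows (these macro names are ours, just for illustration, not part of the final design):

```c
/* Tilesheet sizing for the proposed Text Mode 0. */
#define GLYPH_W         16
#define GLYPH_H         16
#define GLYPH_BYTES     (GLYPH_W * GLYPH_H / 8)     /* 256 bits = 32 bytes   */
#define NUM_GLYPHS      128
#define TILESHEET_BYTES (NUM_GLYPHS * GLYPH_BYTES)  /* 128 * 32 = 4096 = 4K  */
```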

In a text mode made up of 40x30 characters (640x480 pixels), there is an array of 40x30, or 1,200, locations in memory, each describing a single character; this is called the Tilemap. Each location could hold one byte or several, depending on the data needed to display the characters; sometimes extra bytes specify attributes for each character, like color, bold, italics, or inverse. In our debugging text mode, we'll just use one byte per character, so the tilemap at that resolution will need 1,200 bytes of memory.
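As a sketch (assuming the tilemap sits in plain memory the CPU can write; the real memory map comes in Part 2), printing text is just storing ASCII codes into the tilemap:

```c
#include <stdint.h>

#define MAP_W 40
#define MAP_H 30

/* One byte (an ASCII code) per character cell: 1,200 bytes total. */
static uint8_t tilemap[MAP_H][MAP_W];

/* "Print" by writing character codes into the map;
   the GPU redraws whatever codes are there each frame. */
static void put_string(int col, int row, const char *s) {
    while (*s && col < MAP_W)
        tilemap[row][col++] = (uint8_t)*s++;
}
```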

Since this is a single-color text mode, we only need two colors for the entire screen: Background and Foreground. In the future, for full-color graphics modes, we will need to introduce the concept of palettes for colors, which is another level of indirection, where the values in the tilesheet itself refer to another array of colors for each pixel in the tile. But for now, we can ignore that additional level of complexity.
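To make the whole path concrete, here is a hedged software sketch of how one pixel might be fetched in this mode, combining the tilemap, the tilesheet, and the two colors (the actual Flame pipeline and register layout come later in the series):

```c
#include <stdint.h>

extern uint8_t  tilemap[30][40];     /* one ASCII code per 16x16 cell  */
extern uint16_t tilesheet[128][16];  /* 16 one-bit rows per character  */

/* Color of screen pixel (x, y) in the 640x480 text mode: find the cell,
   fetch the glyph row, test the bit, pick foreground or background. */
uint32_t pixel_at(int x, int y, uint32_t fg, uint32_t bg) {
    uint8_t  code = tilemap[y / 16][x / 16];   /* which character cell    */
    uint16_t row  = tilesheet[code][y % 16];   /* which row of the glyph  */
    int      bit  = 15 - (x % 16);             /* bit 15 = leftmost pixel */
    return ((row >> bit) & 1) ? fg : bg;
}
```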

Next Steps

In the next installment, we will look at the GPU's memory map and how the CPU tells the GPU what to display using registers mapped into the memory address space of the Ember Console. Later, we will also need to discuss movable sprites for characters, as well as various effects that can be applied to backgrounds like rotation or scrolling, though those are not needed for the initial text mode.

Next post in this series:

The Flame GPU — Initial Design Part 2: Tilesheets, Tilemaps, and Graphics Registers

Ember Design Series

Back to the beginning of Project Ember:

Going Old-School: Designing A Custom Homebrew Retro Video Game Console From Scratch

https://buymeacoffee.com/emberproject

Enjoying my content? Support me by buying me a coffee, clapping for this post, and following my page. Thanks so much, and stay safe!
