ZX GIFs: 127 colors on the Spectrum 48K… ultra low resolution!
During the Covid pandemic, lockdown days and all, a lot of people decided to occupy their time learning how to make sourdough bread and such. I decided to turn my nerd dial up to 11.
My pandemic project started by wondering what I could do with a ZX Spectrum 48K, using the tools available today (2021). The result was much better than I expected :)
For those who are interested, I’ll tell you the process. For the rest of you, I’m sorry to tell you that you’ve reached the end of the Internet; there’s nothing left to do here, you can turn off the screen and go out and enjoy the day.
1. The Idea
The Spectrum 48K was a very limited device: only about 40K of usable RAM (the rest is used by the screen), tape storage, 256x192 pixels… which could be colored, not individually, but in blocks of 8 x 8, choosing for each block a pair of colors from two palettes of 8 (with or without “brightness”).
The graphics were also terribly slow. It occurred to me that I could kill a couple of birds with one stone by shooting them with another bird: by planting a raster on the bitmap, and working only with the attribute map (“BRIGHT”, “INK” and “PAPER”), I could very quickly achieve combinations of those colors, full screen. At the cost, of course, of lowering the resolution to 32 x 24.
A “checkerboard” pattern (50%) is perhaps the most pleasing to the eye, but results in two identical combinations of INK & PAPER, wasting one:
A pattern of one pixel out of three (33%) would allow us to take advantage of the four possibilities that working on the attribute map gives us, resulting in an iRrGgBb color space. That is, two bits for each RGB component, plus the brightness bit, giving 7 bits in total = 128 theoretical colors.
Considering that we will have two compositions for black, in practice we will get 127 colors… which is quite a bit more than the 15 (iRGB x 2, minus a duplicate composition for black) offered by the Spectrum:
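The count can be checked with a short sketch. It assumes the standard Spectrum attribute layout (bit 6 BRIGHT, bits 5-3 PAPER, bits 2-0 INK, each color as green/red/blue bits), that one pixel in three shows INK and the other two PAPER, and that BRIGHT makes no visible difference to black. The function names are made up for this illustration.

```javascript
// Sketch: count the distinct colors produced by the 33% pattern.
function mixedColor(attr) {
  const bright = (attr >> 6) & 1;  // bit 6: BRIGHT
  const paper = (attr >> 3) & 7;   // bits 5-3: PAPER (G, R, B)
  const ink = attr & 7;            // bits 2-0: INK (G, R, B)
  // Per channel: 2 parts PAPER + 1 part INK gives a level from 0 to 3,
  // i.e. the two bits per component of the iRrGgBb model.
  const level = (bit) => 2 * ((paper >> bit) & 1) + ((ink >> bit) & 1);
  const g = level(2), r = level(1), b = level(0);
  const isBlack = r + g + b === 0;
  // Assumption: bright black looks the same as plain black.
  return [isBlack ? 0 : bright, r, g, b].join(",");
}

function countDistinctColors() {
  const seen = new Set();
  for (let attr = 0; attr < 128; attr++) {  // FLASH (bit 7) left at 0
    seen.add(mixedColor(attr));
  }
  return seen.size;
}

console.log(countDistinctColors()); // 127
```

The 128 attribute values collapse to 127 because the two all-black combinations (with and without BRIGHT) are the only pair that coincide.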
To evaluate the idea, we ran a proof of concept in BASIC, using UDGs (8 x 8) to create the patterns; the 33% pattern (9 x 9) shows obvious moiré. At this point, we decided that this was something we could solve later, and considered the proof sufficient to conclude that this option was worth trying.
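One plausible reading of the moiré is that a 1-in-3 pattern cannot repeat cleanly every 8 pixels, since 8 is not a multiple of 3. The sketch below compares an 8-bit row tiled across cells with a 9-bit rotation through carry, which is what the MKPPAT33 routine in the assembler listing does with RRA. The helper names are hypothetical, and the model assumes the carry flag is clear on entry.

```javascript
// Sketch: tiled 8-bit row versus 9-bit rotation (A plus carry).
function toBits(byteValues) {
  return byteValues.map((v) => v.toString(2).padStart(8, "0")).join("");
}

// Repeating the same 8-bit row, as neighbouring character cells would.
function tiledRow(pattern, count) {
  return toBits(new Array(count).fill(pattern));
}

// Model of "LD (HL),A / RRA" repeated: rotate right through carry.
function rotatedRow(pattern, count) {
  let a = pattern;
  let carry = 0; // assumption: carry clear on entry
  const out = [];
  for (let i = 0; i < count; i++) {
    out.push(a);
    const newCarry = a & 1;
    a = (carry << 7) | (a >> 1);
    carry = newCarry;
  }
  return toBits(out);
}

console.log(tiledRow(146, 3));   // 100100101001001010010010
console.log(rotatedRow(146, 3)); // 100100100100100100100100
```

In the tiled row, two set bits end up only two pixels apart at each cell boundary; the rotated row keeps exactly one set bit in every three.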
2. Creating images
The next step was to find some way to assemble the images with modern tools (e.g. Photoshop) and transfer them. For that, it was necessary to map the RGB colors resulting from the iRrGgBb model. Or rather, iGgRrBb, because the Spectrum has the components in another order.
A good job for a spreadsheet.
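The same mapping can be sketched in code instead of a spreadsheet. This is only an illustration: the intensity values (0xD7 for normal colors, 0xFF with BRIGHT) vary between emulators and are an assumption here, as is the simple linear averaging of two PAPER pixels and one INK pixel. `paletteEntry` is a hypothetical name.

```javascript
// Sketch: RGB code for each attribute byte under the 33% pattern.
function paletteEntry(attr) {
  const full = (attr >> 6) & 1 ? 0xff : 0xd7; // assumed intensities
  const paper = (attr >> 3) & 7;
  const ink = attr & 7;
  const channel = (bit) => {
    const level = 2 * ((paper >> bit) & 1) + ((ink >> bit) & 1); // 0..3
    return Math.round((level * full) / 3);
  };
  const hex = (v) => v.toString(16).padStart(2, "0");
  // Spectrum color bits: bit 2 = green, bit 1 = red, bit 0 = blue.
  return "#" + hex(channel(1)) + hex(channel(2)) + hex(channel(0));
}

// One entry per attribute value, so palette index == attribute byte.
const palette = Array.from({ length: 128 }, (_, attr) => paletteEntry(attr));
console.log(palette[0], palette[0b0010000], palette[0b1111111]);
// #000000 #8f0000 #ffffff
```

Keeping the palette index equal to the attribute byte is what later allows the image bytes to be copied to the attribute map without any lookup.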
At this point, we can pass the colors to an image, simply by writing an HTML table with the RGB codes, and taking a screenshot. Unfortunately, the table of colors that Photoshop builds based on an image is not in the order in which they appear:
If the color indices of the palette do not match those resulting from the iGgRrBb model of the Spectrum, we can end up with results like this:
The solution is to create the colormap directly in code. BMP turns out to be an appropriate format for the whole process: we can create a color table by intervening with a hexadecimal editor, create indexed color images without compression, then take the image bytes, and apply them to the Spectrum attribute map.
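As a rough sketch of that step, the color table can be generated as hex text and pasted with a hex editor. BMP palette entries are 4 bytes each, in blue, green, red, reserved order. The offset calculation assumes the common 40-byte BITMAPINFOHEADER, which agrees with the 566 bytes that the converter in this article skips. Function names are made up for the example.

```javascript
// Sketch: serialise an RGB palette into BMP color-table bytes (hex text).
function bmpColorTableHex(rgbPalette) {
  const hex = (v) => v.toString(16).padStart(2, "0").toUpperCase();
  return rgbPalette
    .map(([r, g, b]) => hex(b) + hex(g) + hex(r) + "00") // B, G, R, reserved
    .join("");
}

// Where the pixel data starts: 14-byte file header
// + 40-byte info header (assumed) + 4 bytes per palette entry.
function pixelDataOffset(paletteSize) {
  return 14 + 40 + 4 * paletteSize;
}

console.log(bmpColorTableHex([[0x12, 0x34, 0x56], [255, 255, 255]]));
// 56341200FFFFFF00
console.log(pixelDataOffset(128)); // 566
```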
The result is getting close to what we are looking for, but it is not exactly what we wanted:
The problem is that BMP defines pixels from bottom to top, rather than top to bottom. With JavaScript we can read the BMP files (or take the hexadecimal data copied from an editor), discard the data we don’t need, and reverse the order of the byte rows:
<table><tr><td>
Input - BMP file 32x24, indexed color (iGgRrBb palette):<br>
<input type="file" id="filein" oninput="readBMP()" />
<br><br>
Input - BMP data:<br>
<textarea id="datain" cols="64" rows="24" oninput="bmp2attrData()"></textarea>
</td><td>
Output - ATTR data for ZX Spectrum, to be applied over 33% pattern:<br>
<textarea id="dataout" cols="64" rows="24"></textarea>
</td></tr></table>
<script>
function readBMP() {
// based on https://jsfiddle.net/Lv5y9m2u/
var input = document.getElementById("filein");
var output = document.getElementById("datain");
if (input.files.length === 0) { return; }
var fr = new FileReader();
fr.onload = function() {
var data = fr.result;
var array = new Uint8Array(data);
var result = "";
for (let leByte of array) {
result += toHex(leByte);
}
output.value = result;
bmp2attrData();
document.getElementById("dataout").focus();
document.getElementById("dataout").select();
};
fr.readAsArrayBuffer(input.files[0]);
}
function toHex(d) {
return ("0" + (Number(d).toString(16))).slice(-2).toUpperCase()
}
function bmp2attrData() {
// inBMPdata: bytes of the BMP as hex text: header + color table + image rows, from bottom left to top right.
var inBMPdata = document.getElementById("datain").value;
inBMPdata = inBMPdata.replace(/\s+/g, ""); // strip the spaces and line breaks that a hex editor adds
inBMPdata = inBMPdata.substring(566 * 2, inBMPdata.length - 4); // skip the 566 bytes of header + color table and the last 2 bytes (without checking if the format is correct, etc).
// outATTRdata: will have the final data, from top left to bottom right.
var outATTRdata = "";
for (var leRow = 0; leRow < 24; leRow++) {
for (var leCol = 0; leCol < 32; leCol++) {
var leBMPbyte = (23 - leRow) * 32 + leCol; // index of the source byte (BMP rows are stored bottom-up); doubled below because each byte is two hex digits
outATTRdata += inBMPdata.substring(leBMPbyte * 2, leBMPbyte * 2 + 2);
} //end for leCol
} //end for leRow
document.getElementById("dataout").value = outATTRdata;
}
</script>
And now, we have achieved what we wanted:
3. Animating frame by frame
With all this behind us, let’s go for more: how many frames per second could we achieve in Z-80 assembler, transferring data to the attributes area?
A small experiment using ROM data shows that the only brake on assembler speed is the HALT instruction needed to maintain synchronization with the TV signal.
That is, in a single second we would consume 50 frames… which, at 768 bytes each, is precisely all we can store in RAM, leaving little more than 3 KB for BASIC and assembler routines.
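The arithmetic can be laid out as a small sketch. The frame start address (27136, or 6A00 hex) comes from the assembler source at the end of this article; the start of the BASIC area (23755) is the usual value on a 48K without extra hardware and is an assumption here.

```javascript
// Sketch: memory budget for the frames.
const FRAME_BYTES = 32 * 24;  // one attribute byte per 8x8 cell = 768
const FRAMES_START = 27136;   // 0x6A00, from the assembler source
const RAM_END = 65536;        // end of the 48K address space
const PROG_START = 23755;     // assumption: usual start of BASIC

function frameBudget() {
  const frames = Math.floor((RAM_END - FRAMES_START) / FRAME_BYTES);
  return {
    frames,
    bytesForFrames: frames * FRAME_BYTES,
    bytesLeft: FRAMES_START - PROG_START, // BASIC + assembler routines
  };
}

console.log(frameBudget());
// { frames: 50, bytesForFrames: 38400, bytesLeft: 3381 }
```

The 50 frames end exactly at the top of memory, which is why there is no room for a 51st.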
By using PAUSE in the FOR…NEXT from which we call our memory block transfer subroutine, we can reach very acceptable and less… voracious fps:
- 50 fps (without PAUSE)
- 24 fps approx. (PAUSE 1)
- 16 fps approx. (PAUSE 2)
- 12 fps approx. (PAUSE 3)
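These figures are close to what a very simple model predicts. The sketch assumes that the HALT in the transfer routine costs one TV frame and that PAUSE n adds n more, ignoring the time BASIC itself takes, which would explain why the measured values come out slightly lower. `approxFps` is a hypothetical helper, and a value of 0 stands for "no PAUSE at all" rather than the BASIC statement PAUSE 0.

```javascript
// Idealised model: 1 TV frame for HALT + n TV frames for PAUSE n.
function approxFps(pauseFrames, tvFps = 50) {
  return tvFps / (1 + pauseFrames);
}

for (const n of [0, 1, 2, 3]) {
  console.log(`${n} pause frames: ${approxFps(n).toFixed(1)} fps`);
}
// 0 pause frames: 50.0 fps
// 1 pause frames: 25.0 fps
// 2 pause frames: 16.7 fps
// 3 pause frames: 12.5 fps
```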
To take full advantage of the 50 frames we can store, we decided to stay at 12 fps, and apply some additional techniques:
- instead of using all the frames from beginning to end, use sub-loops.
- animate the sub-loops in both directions (back and forth),
- develop another assembler routine to transfer the images, mirrored.
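The first two techniques amount to choosing the order in which stored frames are shown. A minimal sketch, with hypothetical helper names, of a sub-loop played back and forth and of the address each frame would have (frames are stored from 27136, 768 bytes apart, as in the assembler source):

```javascript
// Frame order for a sub-loop played forwards, then backwards.
// The end frames are not repeated, so the loop can cycle smoothly.
function pingPong(first, last) {
  const order = [];
  for (let f = first; f <= last; f++) order.push(f);
  for (let f = last - 1; f > first; f--) order.push(f);
  return order;
}

// Source address of a frame, to be POKEd into the transfer routine.
function frameAddress(frame) {
  return 27136 + 768 * frame;
}

console.log(pingPong(10, 13)); // [ 10, 11, 12, 13, 12, 11 ]
console.log(frameAddress(49)); // 64768
```

Four stored frames yield a six-step cycle this way, and the mirrored transfer can double the variety again without using more memory.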
And that concludes our nostalgic adventure :)
The video capture that illustrates the beginning of this article was created running the program in the Retro Virtual Machine emulator.
4. Final result (for running on an emulator!)
If you want to run the program on your own (and/or modify it, etc), you can download the final result with:
- .tap file to run with an emulator (the images in this article were taken using Fuse and Retro Virtual Machine),
- Tools to create your own animations: an example BMP image (from which you can take the palette) and the HTML/JavaScript utility I made to convert BMPs to attribute data.
- Source code of the Z-80 assembler subroutines and a couple of comments on the BASIC program (reproducing what was shown in this article).
5. Can anything else be done?
With all the steps solved and tested, you could automate the process to generate .taps automatically from a GIF or video.
Or create a system of sprites with transparency, indicating as “transparent color” the second #000 of the table (with brightness, 0b01000000)… leaving us the Flash bit to indicate that the color of the sprite is mixed (averaged) with the screen color. The latter could be achieved by adding separately the pairs of GgRrBb bits, and discarding the last bit of the result:
- 00 + 01 = 001 → 00
- 01 + 01 = 010 → 01
- 10 + 01 = 011 → 01
- 10 + 10 = 100 → 10
- 10 + 11 = 101 → 10
- 11 + 11 = 110 → 11
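The averaging rule for a single two-bit component can be sketched as follows. This covers only the per-component arithmetic, not a sprite system, and the function names are invented for the example.

```javascript
// Average two 2-bit components: add them, then discard the last bit.
function mixComponent(a, b) {
  const sum = a + b; // fits in 3 bits (0..6)
  return sum >> 1;   // dropping the last bit halves the sum, rounding down
}

const bin = (v, width) => v.toString(2).padStart(width, "0");

// Formats one row in the same style as the list above.
function mixRow(a, b) {
  return `${bin(a, 2)} + ${bin(b, 2)} = ${bin(a + b, 3)} → ${bin(mixComponent(a, b), 2)}`;
}

console.log(mixRow(0b10, 0b01)); // 10 + 01 = 011 → 01
console.log(mixRow(0b11, 0b11)); // 11 + 11 = 110 → 11
```

Because the result is rounded down, mixing a component with itself leaves it unchanged, and mixing neighbours settles on the darker of the two.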
All this, as they say in the manuals, is left as an exercise for the reader :)
Post mortem: revisiting the ‘80s
In this experience, I was struck by how little it took me to recover my muscle memory when it came to hunting for key tokens.
On the Spectrum 48K, by decisions dating back to the ZX-80, BASIC is not entered by typing directly. Instead, you have to know how to set the keyboard to the correct mode, and then press the key with the desired “keyword” once. For example, “LET a = INT b” is obtained by pressing the keys: L, a, [symbol shift + L], [symbol shift+caps shift to enter Extended Mode], R, b.
It took me even less time to relive the relentlessness of programming in assembler. The slightest mistake usually has catastrophic consequences. And even with the possibility today of inspecting memory and running a real-time disassembler, or even assembling Z-80s on a web page, the only thing that works is patience. Here, unraveling the routine that transfers mirrored blocks, with a step-by-step schematic developed in a spreadsheet:
An hour or so later I found the problem: I had typed “LD HL,DE” instead of “ADD HL,DE”.
Mind you: when things finally work, the thrill is… heroic. Those were the days!
Postscript: Z-80 assembler source code with comments and references
(Part of the ZIP with the final result)
The BASIC listing (with several REMs for clarity) is easily obtained from the .tap file: after loading the program in an emulator, it is enough to “break” and ask for LIST.
The BASIC is saved from line 9900 with LINE 9800, pointing to the block that loads the machine code and frames. The animation starts at line 1000; the subroutines are at the beginning of the program, following the recommendations that Juan Antonio Fernández Madrigal develops in his blog. For convenience, line 1 is GO TO 1000, so the program can be started with GO TO 1 (not RUN).
Here are the routines written in assembler with their comments and references:
; ///////////////
; Routines for ZX GIFs:
; Low-res 127 colors for Spectrum 48K
; Santiago Bustelo, November 2021
;
; We will store 50 frames from 6A00 (27136).
; For these routines we choose the address 26880,
; which will allow us a maximum of 256 bytes.
.ORG 26880
; ////////// Fast LDIR: transfer frames to the ATTR map
FASTLDIR16:
; Transfer a block of RAM to attributes
;
; Initial values
FASTLDIR_HL:
LD hl,00 ; source (for POKE: ORG+1 = 26881)
LD de,22528 ; destination - ATTR map
LD bc,768 ; for BC=768 to 1 (ATTRs)
HALT ; sync: wait for the electron beam of the TV to be at the end of the screen, avoiding flickering when manipulating the image map.
; Here we could do:
; LDIR
; but there is a faster way documented in:
; http://map.grauw.nl/articles/fast_loops.php
;
FASTLDIR16_LOOP:
; Unrolled (rickrolled?) LDIR:
LDI ; 16x LDIs
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI
LDI ; LDI leaves flag P/V set while BC != 0, so JP pe keeps looping
JP pe,FASTLDIR16_LOOP ; end loop (falls through when BC reaches 0)
FASTLDIR16_END:
RET
; ////////// Fast LDIR with FLIP: transfer mirrored frames
FASTLDIR32FLIP:
; Transfer block RAM to attributes, mirrored
; Initial values:
LD hl,( FASTLDIR_HL + 1 ) ; we take HL value from the same place as FASTLDIR16, in order to use the same POKE in BASIC.
LD de,22528 ; destination
LD bc,768 ; for BC=768 to 1 (ATTRs)
HALT ; sync
; We prepare HL: To start 1st iteration (after the initial "carriage return" of the loop) it must be 31.
; Since the initial "carriage return" of the loop for the 2nd and subsequent iterations is 62, we anticipate by subtracting 31.
; So we need to make HL -= 31.
; The "clean" alternative uses 7 bytes and 46 T-states:
; PUSH DE
; LD DE,31
; SBC HL,DE
; POP DE
; The alternative (equivalent to the non-existent SBC HL,A)
; from https://plutiedev.com/z80-add-8bit-to-16bit
; uses 10 bytes and 39 T-states:
; //Z80 adding 8-bit to 16-bit: Unsigned subtraction
LD A,31
; Since we know that the value of A is not zero, we can skip the condition (JP z, ...).
; // If A=0 do nothing, otherwise flip A's sign. Since the upper byte becomes -1, also subtract 1 from H.
NEG
; JP z, SBC_HL_A_SKIP ; We don't need this.
DEC h
; // Now add the low byte as usual. Two's complement takes care of ensuring the result is correct
ADD a,l
LD l,a
ADC a,h
SUB l
LD h,a
; SBC_HL_A_SKIP: ; We don't need this.
FASTLDIR32FLIP_LOOP:
; Initial loop's "carriage return" for 2nd and subsequent iterations:
; HL+= 62: move HL from col 1 row N,
; to col 31 row N+1.
; The "clean" alternative uses 6 bytes and 42 T-states:
; PUSH DE
; LD DE,62 ; 30(end row N)+ 32(row below)
; ADD HL,DE
; POP DE
; The alternative (equivalent to the non-existent ADD HL,A)
; from https://plutiedev.com/z80-add-8bit-to-16bit
; uses 7 bytes and just 27 T-states:
LD A,62
ADD a,l ; A = A+L
LD l,a ; L = A+L
ADC a,h ; A = A+L+H+carry
SUB l ; A = H+carry
LD h,a ; H = H+carry
; We now move (HL) to (DE).
; Unrolled (rickrolled?) LDIR results in HL++ and DE++.
; We need two DEC HLs after each LDI to transfer the mirrored line:
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
DEC HL
DEC HL
LDI
; If we were to do two DEC HLs here, the "carriage return"
; in the next iteration would be 64.
; Instead, we save those two DEC HLs and
; do a carriage return of 62.
; LDI leaves P/V set while BC != 0, so the loop ends when BC reaches 0:
JP pe,FASTLDIR32FLIP_LOOP ; end loop
FASTLDIR32FLIP_END:
RET
; ////////// Make 33% pixel pattern: //////////
MKPPAT33:
; Create 33% pattern: one bit on every three.
; RRA allows us to work with 9 bits,
; passing through Carry.
LD HL,16384 ; Spectrum's (pixel) screen memory
LD A,146 ; pattern 1/3
; loop 6144 times: 32 cols x 192 rows… in three thirds
LD d,3 ; for D = 3 to 1 (three thirds)
MKPPAT33_LOOPD:
; Based on http://map.grauw.nl/articles/fast_loops.php
LD c,8 ; for C=8 to 0 step -1, loop's MSB + 1st loop
LD b,0 ; for B=256 to 1 step -1, loop's LSB
MKPPAT33_LOOPCB:
LD (HL),A
INC HL
RRA
DJNZ MKPPAT33_LOOPCB ; end for B
DEC c
JP nz,MKPPAT33_LOOPCB ; end for C
RLA
DEC d
JP nz,MKPPAT33_LOOPD ; end for D
;
MKPPAT33_END:
RET
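The pointer bookkeeping in FASTLDIR32FLIP is easy to get wrong, so it can help to model it outside the Spectrum first. The following is a sketch in JavaScript, not a Z-80 emulator: it reproduces only the HL, DE and BC arithmetic described in the comments above, and it uses 16-bit cells purely so that every test value is unique. The function name and the test data are made up for this illustration.

```javascript
// Model of the mirrored transfer: same pointer steps as the routine.
function mirroredTransfer(mem, source, dest = 22528, count = 768) {
  let hl = source, de = dest, bc = count;
  hl -= 31;                  // initial HL -= 31
  do {
    hl += 62;                // "carriage return" at the top of the loop
    for (let i = 0; i < 32; i++) {
      mem[de] = mem[hl];     // LDI: (DE) <- (HL), then HL++, DE++, BC--
      hl++; de++; bc--;
      if (i < 31) hl -= 2;   // two DEC HL after all but the last LDI
    }
  } while (bc !== 0);        // JP pe loops while BC != 0
  return mem;
}

// Fill one frame with its own cell indices and transfer it mirrored.
const SRC = 27136;
const mem = new Uint16Array(65536); // 16-bit cells only for unique test values
for (let i = 0; i < 768; i++) mem[SRC + i] = i;
mirroredTransfer(mem, SRC);

// First destination row: source columns 31 down to 0.
console.log(Array.from(mem.slice(22528, 22528 + 4))); // [ 31, 30, 29, 28 ]
```

Stepping through a model like this, row by row, is a quicker way to spot an off-by-one in the "carriage return" than single-stepping the real routine.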