(A repost of my older pieces about ZX Spectrum programming and hacking.)
Everyone who once owned a ZX Spectrum has surely seen an amazing loading effect at least once and wondered: How are they doing this? So, come and have a read…
Because the ZX Spectrum had no dedicated circuit for storing and retrieving programs on cassette tape (unlike, for example, the PMD), it had to do it purely in software. At the highest level, programs were stored as a “header + data” pair. The header held information about the file (name, type, length, etc.) and the data block held the actual content.
The header is 17 bytes long and has the following structure:
Position | Length | Description |
---|---|---|
0 | 1 | Type (0=program, 1=number array, 2=character array, 3=code) |
1 | 10 | Name (right-padded with spaces to 10 characters) |
11 | 2 | Length of data block |
13 | 2 | Parameter 1. E.g. for programs it is the LINE parameter, for ‘code’ it is the start address of the code block |
15 | 2 | Parameter 2. E.g. for programs it is the beginning of variables |
The data block has no such structure; it just contains data.
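To make the layout concrete, here is a minimal Python sketch that packs and parses such a 17-byte header. The field order follows the table above; the two-byte fields are little-endian, as usual on the Z80. The names `HEADER`, `TYPES` and `parse_header` and the example values are made up for illustration.

```python
import struct

# Layout from the table: type, 10-character name, length, parameter 1, parameter 2.
HEADER = struct.Struct("<B10sHHH")   # little-endian, 17 bytes in total
TYPES = {0: "program", 1: "number array", 2: "character array", 3: "code"}

def parse_header(raw: bytes) -> dict:
    """Split a 17-byte header (without flag byte and checksum) into its fields."""
    kind, name, length, param1, param2 = HEADER.unpack(raw)
    return {
        "type": TYPES.get(kind, "unknown"),
        "name": name.decode("ascii", errors="replace").rstrip(" "),
        "length": length,
        "param1": param1,
        "param2": param2,
    }

# Hypothetical header: a 6912-byte 'code' block (one screen) loaded to address 16384.
example = HEADER.pack(3, b"screen".ljust(10), 6912, 16384, 32768)
print(parse_header(example))
```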
Let us descend one level: each block was preceded by one flag byte ($00 for a header, $ff for data) and terminated with a checksum (all bytes, including the flag byte, XORed together). On tape the header was therefore actually 19 bytes long: it began with 0 and ended with the checksum, but these two bytes were “consumed” by the system and you couldn’t get your hands on them.
This is as deep as you could get by calling routines from the ROM. The rest of it was just
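The framing is easy to reproduce. The following Python sketch (function names are mine, not from the ROM) wraps a payload with the flag byte and the XOR checksum, and checks a block the same way the loader does: XOR everything together, checksum included, and expect zero.

```python
from functools import reduce

def xor_all(data: bytes) -> int:
    """XOR of all bytes in the sequence."""
    return reduce(lambda a, b: a ^ b, data, 0)

def make_block(flag: int, payload: bytes) -> bytes:
    """Flag byte + payload + checksum, in the order the bytes sit on tape."""
    body = bytes([flag]) + payload
    return body + bytes([xor_all(body)])

def block_is_valid(block: bytes) -> bool:
    """The running XOR over the whole block, checksum included, must end at zero."""
    return xor_all(block) == 0

header_block = make_block(0x00, bytes(17))        # 17 header bytes become 19 on tape
data_block = make_block(0xFF, b"\x01\x02\x03")
print(len(header_block), data_block.hex(), block_is_valid(data_block))
```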
Ones and zeroes
Data on the tape was written as a sequence of pulses. The processor itself generated these pulses, including the proper timing.
Each block was preceded by a so-called pilot tone. It was a square wave (ON and OFF states alternating regularly): the MIC output stayed ON for 2168 T (T is one processor clock cycle) and then OFF for 2168 T as well. The pilot tone let the input circuits of the tape recorder (if it had any) settle on the correct volume level. The pilot tone before a header lasted about 5 seconds, before a data block about two seconds. Its duration was determined by the flag byte: values below 128 meant a “long” pilot tone, values of 128 and above a short one.
After the pilot tone a sync pulse was generated. Its purpose was to let the loading routine know that the pilot tone had ended and that data processing should begin. The sync pulse was 667 T ON and 735 T OFF.
The sync pulse was followed by the actual bytes of data: first the flag byte, then the contents, and finally the checksum. Bytes were sent bit by bit, starting with the most significant one. Every bit was represented by one pulse (an ON state followed by an OFF state), and the two values differed in length. For a logical 0 it was 855T ON and 855T OFF; for a logical 1 it was doubled, thus 1710T and 1710T.
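As a rough illustration of the encoding, this Python sketch turns a block into a list of pulse lengths in T states. The timings are the ones quoted above; the clock frequency (3.5 MHz) and the number of pilot pulses in the demo are my assumptions, picked only to get a printable result.

```python
PILOT, SYNC_ON, SYNC_OFF = 2168, 667, 735   # pulse lengths in T states, from the text
ZERO, ONE = 855, 1710

def byte_pulses(value):
    """Two equal pulses per bit, most significant bit first."""
    pulses = []
    for bit in range(7, -1, -1):
        length = ONE if (value >> bit) & 1 else ZERO
        pulses += [length, length]
    return pulses

def block_pulses(block, pilot_pulses):
    """Pilot tone, sync pulse, then every byte of the block (flag ... checksum)."""
    pulses = [PILOT] * pilot_pulses + [SYNC_ON, SYNC_OFF]
    for value in block:
        pulses += byte_pulses(value)
    return pulses

CPU_HZ = 3_500_000   # assumption: 3.5 MHz clock of the 48K model
demo = block_pulses(bytes([0xFF, 0x55, 0xAA]), pilot_pulses=3000)  # pilot count is made up
print(f"{len(demo)} pulses, {sum(demo) / CPU_HZ:.2f} s on tape")
```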
When reading data (which is what we are after here), the processor measured the time between changes of the input signal (i.e. the time between transitions, the so-called “edges”). The recording was therefore not sensitive to polarity (unlike the aforementioned PMD in its first version); however, the processor did not have much time to do anything else, since it had to listen to the EAR input, count loop passes until the signal changed, and check whether the SPACE key was pressed (which interrupted the loading, as you surely remember).
Assembler time
Now this is the moment to extract the ROM contents (taken from ‘The Complete Spectrum ROM Disassembly’ by Dr. Ian Logan and Dr. Frank O’Hara, as published by Melbourne House in 1983, available at WOS):
THE 'LD-BYTES' SUBROUTINE
This subroutine is called to LOAD the header information (from 076E) and later
LOAD, or VERIFY, an actual block of data (from 0802).
0556 LD-BYTES INC D This resets the zero flag (D cannot
hold +FF)
EX AF,AF' The A register holds +00 for a header
and +FF for a block of data
The carry flag is reset for VERIFYing
and set for LOADing
DEC D Restore D to its original value
DI The maskable interrupt is now disabled
LD A,+0F The border is made WHITE
OUT (+FE),A
LD HL,+053F Pre-load the machine stack with the
PUSH HL address - SA/LD-RET
IN A,(+FE) Make an initial read of port 254
RRA Rotate the byte obtained
AND +20 but keep only the EAR bit
OR +02 Signal RED border
LD C,A Store the value in the C register
(+22 for 'off' and +02 for 'on' - the
present EAR state)
CP A Set the zero flag
The first stage of reading a tape involves showing that a pulsing signal
actually exists. (i.e. 'On/off' or 'off/on' edges.)
056B LD-BREAK RET NZ Return if the BREAK key is being pressed
056C LD-START CALL 05E7,LD-EDGE-1 Return with the carry flag reset if
JR NC,056B,LD-BREAK there is no 'edge' within approx.
14,000 T states. But if an 'edge' is
found the border will go CYAN
The next stage involves waiting a while and then showing that the signal is
still pulsing.
LD HL,+0415 The length of this waiting period will
0574 LD-WAIT DJNZ 0574,LD-WAIT be almost one second in duration.
DEC HL
LD A,H
OR L
JR NZ,0574,LD-WAIT
CALL 05E3,LD-EDGE-2 Continue only if two edges are found
JR NC,056B,LD-BREAK within the allowed time period.
Now accept only a 'leader signal'.
0580 LD-LEADER LD B,+9C The timing constant
CALL 05E3,LD-EDGE-2 Continue only if two edges are found
JR NC,056B,LD-BREAK within the allowed time period
LD A,+C6 However the edges must have been found
CP B within about 3,000 T states of each
JR NC,056C,LD-START other
INC H Count the pair of edges in the H
JR NZ,0580,LD-LEADER register until 256 pairs have been found
After the leader come the 'off' and 'on' parts of the sync pulse.
058F LD-SYNC LD B,+C9 The timing constant
CALL 05E7,LD-EDGE-1 Every edge is considered until two edges
JR NC,056B,LD-BREAK are found close together - these will be
LD A,B the start and finishing edges of the
CP +D4 'off' sync pulse
JR NC,058F,LD-SYNC
CALL 05E7,LD-EDGE-1 The finishing edge of the 'on' pulse
RET NC must exist
(Return carry flag reset)
The bytes of the header or the program/data block can now be LOADed or VERIFied.
But the first byte is the flag byte.
LD A,C The border colours from now on will be
XOR +03 BLUE & YELLOW
LD C,A
LD H,+00 Initialize the 'parity matching' byte
to zero
LD B,+B0 Set the timing constant for the flag
byte.
JR 05C8,LD-MARKER Jump forward into the byte LOADing loop
The byte LOADing loop is used to fetch the bytes one at a time. The flag byte is
first. This is followed by the data bytes and the last byte is the 'parity'
byte.
05A9 LD-LOOP EX AF,AF' Fetch the flags
JR NZ,05B3,LD-FLAG Jump forward only when handling the
first byte
JR NC,05BD,LD-VERIFY Jump forward if VERIFYing a tape
LD (IX+00),L Make the actual LOAD when required
JR 05C2,LD-NEXT Jump forward to LOAD the next byte
05B3 LD-FLAG RL C Keep the carry flag in a safe place
temporarily
XOR L Return now if the flag byte does not
RET NZ match the first byte on the tape
(Carry flag reset)
LD A,C Restore the carry flag now
RRA
LD C,A
INC DE Increase the counter to compensate for
JR 05C4,LD-DEC its decrease after the jump
If a data block is being verified then the freshly loaded byte is tested against
the original byte.
05BD LD-VERIFY LD A,(IX+00) Fetch the original byte
XOR L Match it against the new byte
RET NZ Return if 'no match' (Carry flag reset)
A new byte can now be collected from the tape.
05C2 LD-NEXT INC IX Increase the 'destination'
05C4 LD-DEC DEC DE Decrease the 'counter'
EX AF,AF' Save the flags
LD B,+B2 Set the timing constant
05C8 LD-MARKER LD L,+01 Clear the 'object' register apart from
a 'marker' bit
The 'LD-8-BITS' loop is used to build up a byte in the L register.
05CA LD-8-BITS CALL 05E3,LD-EDGE-2 Find the length of the 'off' and 'on'
pulses of the next bit
RET NC Return if the time period is exceeded
(Carry flag reset)
LD A,+CB Compare the length against approx.
CP B 2,400 T states; resetting the carry flag
for a '0' and setting it for a '1'
RL L Include the new bit in the L register
LD B,+B0 Set the timing constant for the next bit
JP NC,05CA,LD-8-BITS Jump back whilst there are still bits to
be fetched
The 'parity matching' byte has to be updated with each new byte.
LD A,H Fetch the 'parity matching' byte and
XOR L include the new byte
LD H,A Save it once again
Passes round the loop are made until the 'counter' reaches zero. At that point
the 'parity matching' byte should be holding zero.
LD A,D Make a further pass if the DE register
OR E pair does not hold zero
JR NZ,05A9,LD-LOOP
LD A,H Fetch the 'parity matching' byte
CP +01 Return with the carry flag set if the
RET value is zero (Carry flag reset if in
error)
THE 'LD-EDGE-2' and 'LD-EDGE-1' SUBROUTINES
These two subroutines form the most important part of the LOAD/VERIFY operation.
The subroutines are entered with a timing constant in the B register, and the
previous border colour and 'edge-type' in the C register.
The subroutines return with the carry flag set if the required number of 'edges'
have been found in the time allowed; and the change to the value in the B
register shows just how long it took to find the 'edge(s)'.
The carry flag will be reset if there is an error. The zero flag then signals
'BREAK pressed' by being reset, or 'time-up' by being set.
The entry point LD-EDGE-2 is used when the length of a complete pulse is
required and LD-EDGE-1 is used to find the time before the next 'edge'.
05E3 LD-EDGE-2 CALL 05E7,LD-EDGE-1 In effect call LD-EDGE-1 twice;
RET NC returning in between if there is an
error.
05E7 LD-EDGE-1 LD A,+16 Wait 358 T states before entering the
05E9 LD-DELAY DEC A sampling loop
JR NZ,05E9,LD-DELAY
AND A
The sampling loop is now entered. The value in the B register is incremented for
each pass; 'time-up' is given when B reaches zero.
05ED LD-SAMPLE INC B Count each pass
RET Z Return carry reset & zero set if
'time-up'.
LD A,+7F Read from port +7FFE
IN A,(+FE) i.e. BREAK and EAR
RRA Shift the byte
RET NC Return carry reset & zero reset if BREAK
was pressed
XOR C Now test the byte against the 'last
AND +20 edge-type'
JR Z,05ED,LD-SAMPLE Jump back unless it has changed
A new 'edge' has been found within the time period allowed for the search.
So change the border colour and set the carry flag.
LD A,C Change the 'last edge-type' and border
CPL colour
LD C,A
AND +07 Keep only the border colour
OR +08 Signal 'MIC off'
OUT (+FE),A Change the border colour (RED/CYAN or
BLUE/YELLOW)
SCF Signal the successful search before
RET returning
Note: The LD-EDGE-1 subroutine takes 465 T states, plus an additional 58 T
states for each unsuccessful pass around the sampling loop.
For example, therefore, when awaiting the sync pulse (see LD-SYNC at 058F)
allowance is made for ten additional passes through the sampling loop.
The search is thereby for the next edge to be found within, roughly, 1,100 T
states (465 + 10 * 58 + overhead).
This will prove successful for the sync 'off' pulse that comes after the long
'leader pulses'.
There is no space in this article for a detailed description of how the algorithm works, but let’s quickly go over some facts.
The routine disables interrupts. This is logical, since it depends on exact timing. That’s also the reason why loaders stored in the slow (contended) RAM ($4000-$7FFF) will not work.
The last state of the EAR input is stored in the C register, shifted one bit to the right (into bit 5), and the lowest three bits contain the border colour. For the leader it is 02 (red) / 05 (cyan), for data it is 01 (blue) / 06 (yellow). Why the right shift? During loading, the sequence LD A,$7F and IN A,($FE) is executed. The value $7F goes to the upper 8 bits of the address, so the read selects the keyboard row B – N – M – Symbol Shift – Space. The state of these keys is in the lowest five bits (0-4) and bit 6 holds the EAR state. The RRA instruction moves the lowest bit (the state of the SPACE key) into the CY flag and the EAR state into bit 5. Therefore, if CY is zero, SPACE was pressed and loading ends.
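The bit juggling can be followed in a few lines of Python. This is only a model of the RRA / RET NC / XOR C / AND $20 sequence from the listing, applied to a byte as it would be read from port $7FFE; the function name and the sample values are hypothetical.

```python
def sample_port(value, last_c):
    """Model of RRA / RET NC / XOR C / AND $20 on one byte read from port $7FFE."""
    break_pressed = (value & 0x01) == 0       # bit 0 (SPACE) falls into CY; 0 = pressed
    shifted = value >> 1                      # EAR moves from bit 6 to bit 5
    edge_found = ((shifted ^ last_c) & 0x20) != 0
    return break_pressed, edge_found

C_LEADER = 0x22   # EAR bit set, border colour 2 (red), as stored in register C

print(sample_port(0x5F, C_LEADER))   # EAR unchanged, SPACE released
print(sample_port(0x1F, C_LEADER))   # EAR flipped, so an edge is reported
print(sample_port(0x5E, C_LEADER))   # SPACE held down
```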
The routine begins by detecting a signal that corresponds to the pilot tone (LD-LEADER). Once 256 such pairs of edges have been read, the signal is accepted as a pilot tone and the routine waits for the shorter sync pulse. When it arrives (LD-SYNC), data can be read.
LD-MARKER reads one byte into register L. The register starts at 01, which serves as a counter. Bits are shifted in from the right by the RL L instruction and the highest bit passes into CY. While CY = 0, loading continues; once CY = 1 (the marker bit has fallen out), a complete set of eight bits has been read.
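The marker-bit trick may be easier to see in a small Python model. It mimics RL L on an 8-bit register that starts at 01 and stops as soon as the marker is shifted out; the function name is mine.

```python
def collect_byte(bits):
    """Model of the LD-8-BITS loop: shift bits in until the marker bit falls out."""
    reg_l = 0x01                               # the marker bit
    for bit in bits:
        carry_out = reg_l >> 7                 # what RL L pushes into CY
        reg_l = ((reg_l << 1) | bit) & 0xFF    # new bit enters from the right
        if carry_out:                          # marker left the register: byte complete
            return reg_l
    raise ValueError("ran out of bits before a full byte was collected")

print(hex(collect_byte([1, 0, 1, 0, 0, 1, 0, 1])))
```

No separate bit counter is needed, which saves both a register and a few T states in the loop.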
The key routines are LD-EDGE-n (where n is 1 or 2). LD-EDGE-1 first waits for a fixed period (the delay loop; together with the rest of the routine it adds up to 465T) and then repeatedly reads the EAR input and compares it with the last stored value (in register C, see above). If it has not changed, the loop is repeated. On each pass the value in register B is increased. Once it wraps to zero, it means “timeout”: the edge did not come within the expected time limit.
If the edge is found, the content of register C is complemented. This changes both the stored EAR value and the border colour.
LD-EDGE-2 simply performs LD-EDGE-1 twice in a row.
LD-EDGE output is as follows:
- CY = 0, Z = 1 – during a time interval EAR change did not come (“timeout”)
- CY = 0, Z = 0 – SPACE was pressed (BREAK)
- CY = 1 – edge was found, the current value of the counter is in B
The counter in B counts, as I wrote, upwards. On every loop pass, which takes 58 processor ticks, B is incremented by one. For example, when reading a bit, LD-EDGE-2 is called, thus seeking two edges. The counter in B is set to $B0. This means that the timeout comes after $4F passes ($FF-$B0). That represents 2 × 465T of fixed overhead + 79 × 58T = 5512T. After all this time, the routine reports a signal failure.
The resulting counter value is compared to $CB. If it is smaller, the pulse pair is evaluated as a logical 0; if it is larger, as a logical 1. The value $CB means that the loop ran 27 times ($CB-$B0), i.e. that the two edges came in less than 2496T. Let’s recall: for a logical 0 the two pulses take 1710T in total, for a logical 1 they take 3420T. The midpoint between these two times is 2565T, which is roughly what I came up with. The threshold is a bit lower because of overhead (subroutine calls, evaluation etc., see LD-8-BITS).
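The arithmetic above is easy to re-check. A short Python sketch, using the 465T and 58T figures quoted in the text (the helper name is made up):

```python
EDGE_BASE = 465    # T states for one LD-EDGE-1 call with no extra sampling passes
LOOP_PASS = 58     # T states for each additional pass of the sampling loop

def edge2_time(start, end):
    """T states spent in LD-EDGE-2 while B counts from start up to end."""
    return 2 * EDGE_BASE + (end - start) * LOOP_PASS

timeout = edge2_time(0xB0, 0xFF)        # B runs out
threshold = edge2_time(0xB0, 0xCB)      # boundary between a 0 bit and a 1 bit
midpoint = (2 * 855 + 2 * 1710) // 2    # halfway between the nominal bit lengths
print(timeout, threshold, midpoint)
```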
Loader hacking
This was not so difficult, was it? So now, let’s try some of those tricks …
First, the reading routine is not completely “T-pedantic”, so a few Ts here or there do not pose any major problem. If we want a simple effect, we can add it without any complicated adjustments.
Border effects
We can for example change color of the stripes, if we do not like the default two-tone ones. How about a rainbow? Simply rewrite the end of the routine LD-EDGE:
LD A,C
INC A
XOR $20
AND $27
LD C,A
AND $07
OR $08
OUT ($FE),A
SCF
RET
What’s going on here? Instead of complementing the contents of register C, we increase its value by 1 and flip bit 5. Masking with $27 prevents the increment from overflowing into the other bits and corrupting the EAR bit. The colour therefore cycles through the range 0-7 and creates a rainbow border.
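A quick Python model of the INC A / XOR $20 / AND $27 sequence shows what happens to register C over eight consecutive edges. The starting value $22 is the one the ROM routine puts into C; the helper name is mine.

```python
def next_c(c):
    """INC A / XOR $20 / AND $27 applied to the value held in register C."""
    return ((c + 1) ^ 0x20) & 0x27

c = 0x22                      # EAR bit set, colour 2
colours, ear_bits = [], []
for _ in range(8):
    c = next_c(c)
    colours.append(c & 0x07)          # what ends up on the border
    ear_bits.append((c >> 5) & 1)     # the stored "last edge type"
print(colours, ear_bits)
```

The colour walks through all eight values while the EAR bit still toggles on every edge, which is all the sampling loop needs.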
Note: If you intend to try this, be sure to place the loading routine in the upper 32 Kbytes of RAM!
If we add the pair of instructions XOR A; OUT ($FE),A at the end, just before SCF, the stripes in the border turn into short lines on a black background.
Here we can put any effects that affect a color border either by amending it, or by using it. What about the effect that Busy used with loader for Song In Lines 3 (rounded corners), it looks impressive, huh?
Yet it is not very hard… Four squares in the corners contain a simple pattern (“rounding”). Pixels that are equal to 1 (the color INK) will look as if they were part of the border and will show streaks. I did not examine how Busy does it, but I’d bet that in principle it’s done somehow like this:
LD A,C ;Change of the last "edge type"
CPL ;as well as BORDER color.
LD C,A
AND $07 ;Taking just only the BORDER color
OR $08 ;MIC off
OUT ($FE),A ; Change border color
OR $30 ; A contains: 0 0 1 1 1 b b b (b = border color)
; so PAPER=7, INK=border color,
; BRIGHT 0, FLASH 0
LD ($5800),A ;left top corner attribute
LD ($581F),A ;right top corner
LD ($5AE0),A ;left bottom
LD ($5AFF),A ;right bottom
SCF ;Set CY=1 as a "success"
RET ;before return
Border effects are mostly simple and fast enough, so we can squash them here and not worry about the timing too much. Usually they fit within tolerance.
Simple effects with loaded content
During loading we can certainly manage simple operations on the loaded data, at either the bit or the byte level. The Digisynth demo loader was able to decompress data in real time using Huffman decoding (Huffman suits this purpose quite well: you just need the decoding tree stored and walk through it according to each loaded bit). For those interested, I have prepared a reconstruction of the loader. But we will look at another case, the well-known Mad Load: a routine that loads an image square by square in a chosen order. The video shows its improved version.
Mad Load used a very simple data format. After the flag byte came the data for each square. Each took 11 bytes: the low and high byte of the screen address where the square should be stored, then 8 bytes of bitmap data and 1 attribute byte. This was followed by another square …
Frantisek Fuka (FUXOFT) modified the loading routine slightly: he turned LD-8-BITS into a subroutine that reads one byte into register L. This subroutine is then used by another one that retrieves a single square (MAD_SQUARE). The square-loading subroutine is called over and over for as long as there is data coming from the tape; when there is none, it stops and returns. No checksum is verified.
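Before diving into the assembly, here is what one 11-byte record does, sketched in Python on a plain 64 KB byte array standing in for the Spectrum's memory. The function names are mine; the attribute address is derived from the high byte of the screen address, assuming the record points at the top pixel row of a character square.

```python
def attribute_address(screen_addr):
    """Attribute cell belonging to the character square that starts at screen_addr."""
    high, low = screen_addr >> 8, screen_addr & 0xFF
    return ((0x58 | ((high >> 3) & 0x03)) << 8) | low

def place_square(memory, record):
    """Apply one record: address (low, high), 8 bitmap bytes, 1 attribute byte."""
    addr = record[0] | (record[1] << 8)
    for row in range(8):
        memory[addr + row * 256] = record[2 + row]   # next pixel row is 256 bytes on
    memory[attribute_address(addr)] = record[10]

memory = bytearray(65536)
# Hypothetical record: top-left square, checkerboard bitmap, bright white ink on black.
place_square(memory, bytes([0x00, 0x40] + [0xAA, 0x55] * 4 + [0x47]))
print(hex(memory[0x4000]), hex(memory[0x4100]), hex(memory[0x5800]))
```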
Here is the Mad Loader source code. I only commented on the parts that differ from the standard code.
LD HL,MAD_RETURN
PUSH HL
JP MAD_LOAD
NOP
NOP
NOP
NOP
MAD_RETURN:
EI
RET
MAD_LOAD:
DI
IN A,($FE)
RRA
AND $20
LD C,A
CP A
LD_BREAK:
RET NZ
LD_START:
CALL LD_EDGE_1
JR NC,LD_BREAK
LD HL,$0415
LD_WAIT:
DJNZ LD_WAIT
DEC HL
LD A,H
OR L
JR NZ,LD_WAIT
CALL LD_EDGE_2
JR NC,LD_BREAK
LD_LEADER:
LD B,$9C
CALL LD_EDGE_2
JR NC,LD_BREAK
LD A,$C6
CP B
JR NC,LD_START
INC H
JR NZ,LD_LEADER
LD_SYNC: LD B,$C9
CALL LD_EDGE_1
JR NC,LD_BREAK
LD A,B
CP $D4
JR NC,LD_SYNC
CALL LD_EDGE_1
RET NC
; Mad Load itself
CALL LD_ONE_BYTE ; The first one is a flag byte. Drop it!
MAD_LOOP:
CALL MAD_SQUARE
RET NC
JR MAD_LOOP
MAD_SQUARE:
CALL LD_ONE_BYTE ; Lower byte of address
LD A,L
EX AF,AF' ; save into AF'
CALL LD_ONE_BYTE ; Upper address byte
RET NC
EX AF,AF'
LD H,L ; to the H register
LD L,A ; and the first one to the L, so I have a full address in HL
LD B,$08 ;8 bitmap bytes for each square
MAD_SCRN:
PUSH HL ;save the address
PUSH BC ;and the counter
CALL LD_ONE_BYTE ; read 1 byte
POP BC ; restore the counter
LD A,L ; byte to accumulator
POP HL ; restore address for this byte
RET NC ; If some error occurred, return
LD (HL),A ;Else store the byte into screen memory
INC H ; addr + 256 - it means "next screen microline"
DJNZ MAD_SCRN ; repeat for all 8 bytes
LD A,H ; Convert address from screen memory to attribute memory
SUB $08 ; First, sub 8 to get the original value
RRA
RRA
RRA ; H div 8
AND $03 ; lowest 2 bits
OR $58 ;$58, $59 or $5a - attribute memory address
LD H,A ; So now I have an attribute address in HL
PUSH HL ; save it
CALL LD_ONE_BYTE ; read one byte
LD A,L ; save it to accumulator
POP HL ; restore the address
LD (HL),A ; and put the attribute byte to a proper place
RET ; Finished, one square is done
LD_ONE_BYTE:
LD B,$B2
LD L,$01
LD_8_BITS:
CALL LD_EDGE_2
RET NC
LD A,$CB
CP B
RL L
LD B,$B0
JR NC,LD_8_BITS
SCF
RET
;-----------------------------------
LD_EDGE_2:
CALL LD_EDGE_1
RET NC
LD_EDGE_1:
LD A,$16
LD_DELAY:
DEC A
JR NZ,LD_DELAY
AND A
LD_SAMPLE:
INC B
RET Z
LD A,$7F
IN A,($FE)
RRA
XOR C
AND $20
JR Z,LD_SAMPLE
LD A,C
INC A
XOR $20
AND $27
LD C,A
NOP
NOP
AND $07
OR $08
OUT ($FE),A
SCF
RET
Not so visual, but useful loader
I unexpectedly discovered one nice piece about loaders which I would like to share. It’s not very impressive and its magic is hidden inside, but who knows, maybe someone will find it useful.
While elsewhere we often go from theory to practice, let’s do it the other way around this time and go straight to practice. Here’s a .tzx file, try to run it in the emulator…
(a necessary break to download the file, start the emulator, start loading… wait a bit… well… hmm… and what is this, is this all?)
Yeah, that’s all. It is just screens loading from tape. I told you, there’s not much of an effect. But try to look at the TAP file. Sure, it’s BASIC, loader, individual screens, yep, 2682 bytes, 2189, 3340, 4522 bytes, ah… they are compressed, but they are being loaded directly… the loader must be decompressing them on the fly!
Well there you go!
In the previous article I dandily claimed that DigiSynth was unpacking data during loading. Wow, really? Wasn’t I just dreaming that? I do have a leaky memory after all… So I downloaded DigiSynth, stared into the code for a while, and then I saw a familiar part: No, I wasn’t dreaming! Good, I wanted to let it go, but you know how this goes… In a subway, I started tinkering: After all, it cannot be that difficult to reconstruct the loader and make a packer for it … and Huffman compression is simple enough…
Huffman compression
…is really simple. At least once you know a bit about compression methods. If not, let me give you a quick 101:
The simplest compression methods are those which eliminate long sequences of bytes of the same value (RLE). They are quick and easy, so they can be used in a loader and deployed into a copying program (so you can fit more stuff into a memory because during loading it compresses data simply like this: “The following block contains sixty zeros!”)
Better compression methods exploit the fact that some sequences are often repeated. They therefore create a dictionary of repeating sequences, and replace those sequences with short codes. That is why they are called “dictionary methods”. They are based on ancient dictionary algorithms, namely LZ (Lempel-Ziv).
Mr. Huffman chose another approach: he suggested compression based not on sequences but on how frequently individual values occur. In short, it takes all values in the file (e.g. bytes, so 0 to 255) and counts how many times each value occurs in the input. From this it creates a code (a sequence of bits) for each value, with the property that the more frequent the value, the shorter the code. For example, those screens are full of zeroes. If zero really is a frequent value, it gets encoded into some two bits, or even one in an extreme case. On the other hand, less frequent values can easily occupy twelve or fifteen bits. But this loss is more than made up for by the frequent values.
If the input file contains different values and all of them have approximately the same count, then Huffman compression becomes ineffective, but it gives back good results for regular files. It is also often used to compress the values of LZ compression dictionary (LZHUF). It is also used in JPEG algorithm…
I will not go into the implementation details, it’s enough to know that the algorithm creates a binary tree, which eventually has exactly the same property that I described above.
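For the curious, the tree-building step fits in a few lines of Python. This is a generic textbook sketch using a priority queue, not the code of my JavaScript packer; ties between equal counts may be resolved differently from the example below, which changes individual codes but not the total length.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Return {byte value: bit string}; frequent values get shorter codes."""
    if not data:
        return {}
    counts = Counter(data)
    # Heap items: (weight, tie-breaker, {symbol: code built so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:                      # only one distinct value in the input
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)   # the two least frequent subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

text = b"ABRAKADABRA"
codes = huffman_codes(text)
total_bits = sum(len(codes[s]) for s in text)
print({chr(s): c for s, c in codes.items()}, total_bits)
```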
The disadvantage is that the decoder needs this binary tree before it can decode anything, so the tree has to be stored along with the data. This can be bypassed by adaptive Huffman coding, but that is computationally demanding during decompression and therefore not very suitable for Spectrum loaders…
The decompression tree, as implemented here, is in principle a very simple structure. Each record has two entries, one for a zero bit, the other for a one bit. Each entry is either a reference to another record (if the code continues) or the resulting value.
Illustratively
Suppose we have a string ABRAKADABRA. You can see the tree here, and the resulting code is:
A: 0
B: 100
R: 11
K: 1011
D: 1010
The decompression tree will therefore look like this:
0: [*, A | ., 1]
1: [., 2 | *, R]
2: [*, B | ., 3]
3: [*, D | *, K]
Do not worry, I will explain in a jiffy. Decompression always starts on the first item. If the first bit is zero, the left part is taken (*, A); if the bit is a one, the right part is taken (., 1). The asterisk means “I already know this character, it’s this one!” Dot means “continue with that entry”.
Suppose incoming bits will be 0, 1, 0, 0, 1, 1, 0, … What will happen?
- Start with record number 0.
- Bit=0: the left side is telling us that we have found the character A. We have a first byte and we are starting again with record #0
- Bit=1: the right part says that we should continue with record #1
- Bit=0: the left side (.,2) says that we should continue with record #2
- Bit=0: the left part of record 2 says that we have found the letter B. We have a second byte and we are starting again with record #0
- Bit=1: the right part says that we should continue with record #1
- Bit=1: the right part says that we found a character R. Again begin from record #0
- Bit=0: the left part says that we have found a character A.
- … And so on.
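The walk-through above maps directly onto a table lookup. A minimal Python sketch of the same decoder, with the four records written as tuples (the representation is mine; the loader stores them differently):

```python
# Each record: (entry for a 0 bit, entry for a 1 bit); an entry is (is_value, payload).
TREE = [
    ((True, "A"), (False, 1)),
    ((False, 2), (True, "R")),
    ((True, "B"), (False, 3)),
    ((True, "D"), (True, "K")),
]

def decode(bits, tree=TREE):
    """Follow the records bit by bit; restart at record 0 after every symbol."""
    out, record = [], 0
    for bit in bits:
        is_value, payload = tree[record][bit]
        if is_value:
            out.append(payload)
            record = 0
        else:
            record = payload
    return "".join(out)

CODES = {"A": "0", "B": "100", "R": "11", "K": "1011", "D": "1010"}
stream = [int(b) for ch in "ABRAKADABRA" for b in CODES[ch]]
print(decode([0, 1, 0, 0, 1, 1, 0]), decode(stream))
```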
Implementation
I wrote the compression algorithm in JavaScript. You can use it yourself, it is a standard HTML page, where you use drag and drop to transfer the file you want to compress, and it returns .tap file with the result, suitable for the loader. Warning: IE3 running on Pentium MMX will probably not work. It doesn’t even work with new IE. Use Chrome or Firefox, thanks.
The resulting file has the following format:
- A flag byte. I am ignoring it
- Checksum. XOR of all the values of the input file
- Length of the decompression table (number of records)
- Data for decompression table. Each entry is stored as a 2×9 bits. The first bit is an attribute, the following eight bits are a value. The attribute determines whether it is a target value (1) or a reference (0). The first 9 bits is for a zero bit, the remaining 9 for a one bit.
- The length of the file in bytes. 2 bytes.
- Compressed file as bitstream
- Padding bits up to a whole byte, so there are no problems when copying files
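The format can be sketched as a small Python packer. It is a reading of the list above, not the code of my JavaScript tool: the low-byte-first order of the length matches how the loader reads it, while padding with zero bits and all names are assumptions.

```python
def to_bits(value, width=8):
    """Most significant bit first, the way bytes travel on tape."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]

def pack_stream(flag, checksum, table, length, code_bits):
    """table: list of ((is_value, byte), (is_value, byte)) records."""
    bits = to_bits(flag) + to_bits(checksum) + to_bits(len(table))
    for record in table:
        for is_value, payload in record:         # entry for a 0 bit, then for a 1 bit
            bits += [1 if is_value else 0] + to_bits(payload)
    bits += to_bits(length & 0xFF) + to_bits(length >> 8)    # low byte first
    bits += list(code_bits)
    bits += [0] * (-len(bits) % 8)               # assumption: pad with zero bits
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )

# Hypothetical two-symbol file "AB": one record, code bits 0 and 1, checksum $41 ^ $42.
packed = pack_stream(0xFF, 0x03, [((True, 0x41), (True, 0x42))], 2, [0, 1])
print(packed.hex())
```

Note that the 9-bit table entries push everything after them off the byte boundary, which is fine for a loader that reads the tape bit by bit anyway.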
The custom loader starts out as a copy of the standard ROM loader described previously. To read a whole byte I use a slightly modified LD_8_BITS routine, which puts the result into register E instead of register L. LD_EDGE is not part of the routine; I call it from the ROM (because I do not create any special effect; additionally, emulators work better this way and can apply various accelerated-loading tricks).
For the decompression table we need 1 kB of space starting at an address aligned to $400. I chose $FC00, but you can select a different one. When the table is stored, the value/reference attribute is kept as a whole byte (00/FF), which makes testing simpler (a single rotation copies either zero or one into the CY flag).
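The alignment requirement comes from the way the loader turns a record number and a bit into a memory address with just two additions (ADC HL,HL followed by ADD HL,HL in the listing). A small Python model of that calculation, with a helper name of my own:

```python
HUF_TABLE = 0x3F            # table address / $400, as in the loader source

def entry_address(record, bit):
    """Address of the attribute byte for the given record and incoming bit."""
    hl = (HUF_TABLE << 8) | record        # LD H,HUF_TABLE with the record number in L
    hl = (2 * hl + bit) & 0xFFFF          # ADC HL,HL with CY = bit
    hl = (2 * hl) & 0xFFFF                # ADD HL,HL
    return hl                             # the value/reference byte follows at hl + 1

print(hex(entry_address(0, 0)), hex(entry_address(0, 1)), hex(entry_address(255, 1)))
```

Each record occupies four bytes (attribute and value for a zero bit, then the same for a one bit), so 256 records fill exactly 1 kB.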
The routine checks neither the flag byte nor the data length; all it needs is the address for storing data in register IX.
The routine uses no special tricks, everything is as straightforward as I described above.
Oh, and if you want, you can use it for your own creations, it is licensed under CC-0 (Public Domain) licence. Direct your thanks to the author of the original routine from DigiSynth…
Of course, it is possible to improve the compression, remove repetitive sequences, precompress, thereby resulting in improved compression ratio. However, my goal wasn’t a beefy compressor, but to show you how you can incorporate an interesting functionality into a loader.
PS: Manic Miner, the Huffman way
.ORG $f000 ;61440
.ENGINE zxs
;loader test
AGAIN:
LD ix,$4000
CALL LD_BYTES
JP again
;---- more or less copies of ROM routines - ignoring flag byte and attributes
LD_BYTES:
DI ;disable interrupts
LD A,$0F ;white BORDER
OUT ($FE),A
LD HL,$053F ;address of SA/LD_RET
PUSH HL ;onto the stack
IN A,($FE) ;read port $FE
RRA ;rotate the byte read
AND $20 ;and keep just the EAR bit
OR $02 ;red BORDER signal is also stored in
LD C,A ;register C ($22 for OFF and $02 for ON state of the EAR input)
CP A ;set the zero flag
;First task during loading is to determine
;whether a pulse signal exists (therefore edges
;on-off and off-on).
LD_BREAK:
RET NZ ;return from BREAK.
LD_START:
CALL LD_EDGE_1 ;if there is no edge within about 14000 T,
JR NC,LD_BREAK ;LD_EDGE_1 returns with CY=0 and we go back to LD_BREAK;
;otherwise BORDER is set to cyan.
;next we wait and check for signal presence
LD HL,$0415 ;the wait period is nearly a second
LD_WAIT:
DJNZ LD_WAIT
DEC HL
LD A,H
OR L
JR NZ,LD_WAIT ;waiting loop
CALL LD_EDGE_2 ;continue when catching two subsequent
JR NC,LD_BREAK ;edges in current period
;now just the loading signal will be accepted
LD_LEADER:
LD B,$9C ;timing constant
CALL LD_EDGE_2 ;continue when catching two subsequent
JR NC,LD_BREAK ;edges in current period
LD A,$C6 ;these edges must be caught during
CP B ;3000 T.
JR NC,LD_START
INC H ;number of pairs of edges is stored into H
JR NZ,LD_LEADER ;until there is 256 of them
;parts off and on of pulse sync come after boot signal
LD_SYNC:
LD B,$C9 ;timing constant.
CALL LD_EDGE_1 ;every edge is checked
JR NC,LD_BREAK ;until two edges are found close to each other
LD A,B ;(starting sync pulse).
CP $D4
JR NC,LD_SYNC
CALL LD_EDGE_1 ;at the end there must be a final edge of the on part
RET NC ;of the sync pulse
;now header or program bytes can be loaded
;during operations LOAD, VERIFY.
;the first byte defines a type
LD A,C ;BORDER to green / magenta
XOR $06
LD C,A
;---------
; this is where actual loading begins
; first the decompression tree gets created
; it is stored from address FC00 (must be divisible by $400)
; constant HUF_TABLE is this address / $400
HUF_TABLE EQU $3F ; $3F * $400 => $FC00
LD hl,HUF_TABLE * $400
CALL LD_byte ;flag byte -> E
RET nc
;throw away flag
CALL LD_byte ;checksum -> E
RET nc
;store checksum into A'
LD a,e
EX af,af'
CALL LD_byte ; number of quartlets in compression tree
RET nc
LD d,e
; D values for table
HUF_DECODE:
LD B,$B2
CALL LD_EDGE_2 ;locate length of pulses of each bit
RET NC ;return if wrong (longer) pulse length (then CY=0)
LD A,$CB ;compare length to about 2400 T,
CP B ;when for a zero bit is CY=0 and for a one bit is CY=1.
; first bit of recording is an attribute value / reference
SBC a,a ; CY=0 -> 00, CY=1 -> FF
LD (hl),a ; store into memory attribute as 00 or FF
INC hl ; after it the first value / reference
CALL ld_byte ; load a byte
LD (hl),e ; and store
INC hl ; both will repeat for a one bit
LD B,$AF
CALL LD_EDGE_2 ;locate length of pulses of individual bits
RET NC ;return if wrong (longer) pulse length (then CY=0)
LD A,$CB ;compare length to about 2400 T,
CP B ;when for a zero bit is CY=0 and for a one bit is CY=1.
; one bit of attribute
SBC a,a
LD (hl),a
INC hl
CALL ld_byte
LD (hl),e
INC hl ; quartlet done
DEC d ; already a completed tree?
JR nz,huf_decode ; not yet, keep reading
; table is ready, we can load data now
; first length into registers DE
CALL ld_byte
LD h,e
CALL ld_byte
LD d,e
LD e,h
LD A,C ;BORDER to blue and yellow
XOR $05
LD C,A
; main loading loop
HUF_LOAD:
LD b,$b2
LD l,0
HUF_BIT:
CALL LD_EDGE_2
RET nc
LD a,b
CP $cc ; slightly modified constant
LD h,HUF_TABLE ; H is an upper byte of address / 4
; L lower (reference to a record)
CCF
ADC hl,hl
ADD hl,hl
; HL = table address * 4 + 2 * CY
; therefore for CY=0 it is 4*HL, for CY=1 it is 4*HL+2
RRC (hl) ; attribute value/reference into CY
INC hl
LD l,(hl) ; in L is now value or reference
LD b,$b1 ; set up a timing constant
JR nc,huf_bit ; if it was a reference, continue with next bit
LD (ix+0),l ; if not, L holds a value to store
INC ix ; address++
DEC de ; counter--
EX af,af'
XOR l ; checksum in A'
EX af,af'
LD a,d ; all bytes read?
OR e
JR nz,huf_load ; not yet!
EX af,af'
CP 01 ; if A' not zero, then CY=0 - therefore an error
RET
; byte load into E
LD_BYTE:
LD B,$B2 ;timing constant.
LD_MARKER:
LD E,$01 ;store a marker bit
;this loop assembles the loaded byte in register E
LD_8_BITS:
CALL LD_EDGE_2 ;measure the pulse length of the next bit
RET NC ;return if the pulse was too long (then CY=0)
LD A,$CB ;compare the length to about 2400 T:
CP B ;CY=0 for a zero bit, CY=1 for a one bit.
RL E ;rotate the new bit into register E.
LD B,$B0 ;timing constant for the next bit
JR NC,LD_8_BITS ;the marker has not come out yet, so this was not the 8th bit:
;jump back into the loop.
;
RET ;return with CY=1
LD_EDGE_2 EQU $05e3
LD_EDGE_1 EQU $05e7
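The table walk in HUF_BIT may be easier to follow in a higher-level language. The following is only a rough Python model of the listing above, not the loader itself: the names `huf_decode`, `LEAF` and `REF` are made up for this sketch, the bits arrive as a plain list instead of tape pulses, and the three-symbol tree is a hypothetical example.

```python
# Each table record ("quartet") is 4 bytes: [flag0, item0, flag1, item1].
# flag 0xFF: the item is a decoded byte; flag 0x00: the item is the index
# of the next record to visit. (Names are illustrative, not from the loader.)
LEAF, REF = 0xFF, 0x00


def huf_decode(table, bits, count):
    """Decode `count` bytes from an iterable of 0/1 bits."""
    out = []
    bit_iter = iter(bits)
    while len(out) < count:
        index = 0                          # LD L,0: each symbol starts at record 0
        while True:
            bit = next(bit_iter)
            offset = 4 * index + 2 * bit   # ADC HL,HL / ADD HL,HL
            flag, item = table[offset], table[offset + 1]
            if flag & 1:                   # RRC (HL) moves bit 0 of the flag into CY
                out.append(item)           # a value: store it
                break
            index = item                   # a reference: follow it with the next bit
    return out


# Hypothetical tree: "0" -> 'A', "10" -> 'B', "11" -> 'C'
table = bytes([
    LEAF, ord("A"), REF, 1,                # record 0
    LEAF, ord("B"), LEAF, ord("C"),        # record 1
])
decoded = huf_decode(table, [0, 1, 0, 1, 1, 0], 4)
print(bytes(decoded))
```

In the real loader the table sits at $FC00, so "4 * index" is simply the low byte of the address shifted left twice, which is why the table address has to be divisible by $400.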
Each tick counts
In this part, we’ll need to count and weigh each and every processor tick. This time, our closest ally will be the waiting loop in the LD_EDGE_1 routine. Let’s recall that routine:
LD_EDGE_2:
CALL LD_EDGE_1 ;calling, in fact, LD_EDGE_1 once more
RET NC ;return if error
LD_EDGE_1:
LD A,$16 ;7T
LD_DELAY:
DEC A ;4T
JR NZ,LD_DELAY ;12T / 7T
AND A ;4T
LD_SAMPLE:
Do you see LD_DELAY there? Yes? That is the spot! It takes 7T + 21*(4T+12T) + (4T+7T) = 354T, which is quite a lot of ticks for our purposes. Which purposes, you ask? Most likely a counter showing how much data remains to be loaded.
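If you want to re-check this arithmetic for other loop counts, a few lines of Python are enough. This is just a sketch of the sum above; the function name `delay_loop_tstates` is invented here, and the per-instruction timings are the ones annotated in the listing.

```python
def delay_loop_tstates(count):
    """T-states of: LD A,count / loop: DEC A / JR NZ,loop.

    Valid for count 1..255 (a count of 0 would make the Z80 loop 256 times).
    """
    ld_a_n = 7                      # LD A,n
    dec_a = 4                       # DEC A
    jr_taken, jr_not_taken = 12, 7  # JR NZ when it jumps / when it falls through
    return ld_a_n + (count - 1) * (dec_a + jr_taken) + (dec_a + jr_not_taken)


rom_delay = delay_loop_tstates(0x16)    # the ROM loop
short_delay = delay_loop_tstates(0x03)  # a much shorter loop, for comparison
print(rom_delay, short_delay)
```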
Graphical indicator
In my text game Poradce (The Consultant) I used, besides some scary flashing, a kind of “thermometer” that showed how much remained to be loaded. It was that little something on the left side of the screen…
This effect is simpler than you might expect. The length of the column is almost exactly “file length / 512” (I say “almost exactly” because I painted it a little shorter…). So I take the upper byte of the remaining byte count (which is in register pair DE), divide it by two, add an offset from the bottom of the screen, convert the coordinate to a screen address and mask the byte there with the value %11100111. Everything in that byte stays as it was except for the two points in the middle, which, coincidentally, are exactly our “thermometer”, and those get erased. For the conversion I even use the PIXEL-ADD routine ($22AA) from the Spectrum’s ROM, which takes coordinates in B and C (the same coordinate system PLOT uses) and returns the screen address (HL) and the position of the point within the byte (A).
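To make the coordinate juggling concrete, here is a small Python model of one thermometer step. It is a sketch under a few assumptions: `pixel_address` only imitates the address part of PIXEL-ADD using the standard Spectrum screen layout, `thermometer_step` is a made-up name, the offset of 2 and the column x = 0 are read from the listing that follows, and the screen is a plain bytearray.

```python
def pixel_address(x, y):
    """Screen address for PLOT coordinates (x, y), with y = 0 at the bottom
    of the 176-line PLOT area (a model of what PIXEL-ADD returns in HL)."""
    line = 175 - y                       # PLOT counts from the bottom
    return (0x4000
            | ((line & 0xC0) << 5)       # screen third
            | ((line & 0x07) << 8)       # pixel row inside the character
            | ((line & 0x38) << 2)       # character row inside the third
            | (x >> 3))                  # byte column


def thermometer_step(screen, remaining):
    """Erase the two middle pixels at the height given by the remaining length."""
    y = ((remaining >> 8) >> 1) + 2      # LD B,D / SRL B / INC B / INC B
    addr = pixel_address(0, y)           # LD C,0 / CALL PIXEL-ADD
    screen[addr] &= 0xE7                 # AND %11100111
    return addr


screen = bytearray([0xFF]) * 0x10000     # pretend the column is fully drawn
first = thermometer_step(screen, 0xFFFF)
last = thermometer_step(screen, 0x0000)
print(hex(first), hex(last))
```

As the remaining length shrinks, the erased spot moves down the screen, so the column appears to dwindle from the top.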
LD_EDGE_2:
CALL LD_EDGE_1
RET NC
LD_EDGE_1:
JP LD_OWN
LD_BACK:
LD A,03 ;7T
LD_DELAY:
DEC A ;4T
JR NZ,LD_DELAY ;12T/7T
;7T + 2*(4+12) + (4+7) = 50T
AND A
LD_SAMPLE:
INC B
RET Z
LD A,7f
IN A,(fe)
RRA
RET NC
XOR C
AND 20
JR Z,LD_SAMPLE
LD A,C
CPL
LD C,A
LD A,06
RRCA
AND 07
OR 08
OUT (fe),A
SCF
RET
LD_OWN:
PUSH DE ;11T
PUSH HL ;11T
PUSH BC ;11T
PUSH AF ;11T
LD B,D ;4T
SRL B ;8T
NOP ;4T
INC B ;4T
INC B ;4T
LD C,00 ;7T
CALL 22aa ;17T + 132T routine
LD A,(HL) ;7T
AND e7 ;7T
LD (HL),A ;7T
POP AF ;10T
POP BC ;10T
POP HL ;10T
POP DE ;10T
JP LD_BACK ;10T
As you can see, I interfered with the timing loop. At its beginning I jump out into my own routine, which drives the effect (it takes 305T including the diversion), and after returning I still have some time left, so I wait another 50T. That adds up to 355T, which is pretty much exactly our magic number of 354T!
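The tally can be double-checked mechanically. The Python snippet below simply adds up the T-state annotations from the listing (plus 10T for the JP LD_OWN diversion itself, the usual cost of JP nn); the 132T for the ROM routine is the figure quoted in the listing, not something measured here.

```python
# (instruction, T-states) for the detour through LD_OWN, as annotated above
LD_OWN_TSTATES = [
    ("JP LD_OWN", 10),
    ("PUSH DE", 11), ("PUSH HL", 11), ("PUSH BC", 11), ("PUSH AF", 11),
    ("LD B,D", 4), ("SRL B", 8), ("NOP", 4), ("INC B", 4), ("INC B", 4),
    ("LD C,00", 7),
    ("CALL 22aa", 17), ("PIXEL-ADD body", 132),
    ("LD A,(HL)", 7), ("AND e7", 7), ("LD (HL),A", 7),
    ("POP AF", 10), ("POP BC", 10), ("POP HL", 10), ("POP DE", 10),
    ("JP LD_BACK", 10),
]

diversion = sum(t for _, t in LD_OWN_TSTATES)
remaining_delay = 7 + 2 * (4 + 12) + (4 + 7)   # LD A,03 plus the shortened loop
total = diversion + remaining_delay
print(diversion, remaining_delay, total)
```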
Numeric indicator
If you would rather show a number than a dwindling column, things get more difficult. Each digit has to be drawn on the screen, which means eight writes per digit. For a three-digit counter that takes quite a lot of time, and it won’t fit into 354T. So you need to chop the algorithm into parts that fit within this time limit and call them in sequence. The alternate register set and the index register IY will be invaluable tools for this.
Games from Hewson and Czech programs from Universum (I think) had some very nice counters; they used their own font and also showed a flipping-digits effect. In another text game I used just a simple counter of “the number of bytes / 64”, which for a 48 kB block still fits into three digits. At the beginning I prepared the desired digits (in decimal!) in registers D and E (or rather in their alternate-set counterparts):
PUSH DE
EXX
POP HL
LD DE,00
LD BC,$40
LD_DIV:
AND A
SBC HL,BC
JR C,LD_DIV2 ;(raw address b0be in the original dump)
LD A,E
ADD A,01
DAA
LD E,A
LD A,D
ADC A,00
DAA
LD D,A
JR LD_DIV
LD_DIV2:
LD H,3d
LD BC,4001
EXX
LD IY, LD_CHARS ; coming up next...
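For readers who do not speak DAA fluently, the LD_DIV loop can be modelled in a few lines of Python. This is a sketch only: the function names are invented, and `bcd_increment` stands in for the ADD A,01 / DAA / ADC A,00 / DAA sequence on the register pair DE.

```python
def bcd_increment(value):
    """Add 1 to a 4-digit packed-BCD number (wraps from 9999 to 0000)."""
    digits = [(value >> shift) & 0xF for shift in (0, 4, 8, 12)]
    for i in range(4):
        digits[i] += 1
        if digits[i] < 10:
            break               # no decimal carry into the next digit
        digits[i] = 0           # carry on, as DAA + ADC would
    return sum(d << (4 * i) for i, d in enumerate(digits))


def bytes_to_bcd_blocks(remaining, block=0x40):
    """Count how many whole 64-byte blocks fit, keeping the count in BCD."""
    counter = 0                 # LD DE,00
    while remaining >= block:   # SBC HL,BC / JR C,...
        remaining -= block
        counter = bcd_increment(counter)
    return counter


full_block = bytes_to_bcd_blocks(48 * 1024)
print(hex(full_block))          # D and E would hold the two bytes of this value
```

The repeated subtraction looks wasteful, but it runs only once, before the loading starts, so its speed does not matter.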
This effect did not take place directly in LD_EDGE but in LD_8_BITS instead. However, LD_EDGE has been modified so as to use significantly different timing values during loading:
LD B,b0
LD L,01
LD_8_BITS:
CALL LD_EDGE_2M
RET NC
LD A,d4
CP B
RL L
LD B,b0
JP NC,LD_LONGWAY
LD A,H
XOR L
LD H,A
;... the rest of it is normal...
;LD_EDGE_2M : 468 + 58 * B
;shortened by 462T
LD_EDGE_2M:
CALL LD_EDGE_1M
RET NC
LD A,10
DEC A
JR NZ,$-1 ; 254T
LD_EDGE_1M:
AND A
INC B
RET Z
LD A,7f
IN A,(fe)
RRA
XOR C
AND 20
JR Z,b15b
LD A,C
CPL
LD C,A
AND 04
OR 08
OUT (fe),A
SCF
RET
The effect itself then played out as follows:
LD_LONGWAY:
EXX
DEC C
JP Z,LD_SUB1
JP (IY)
LD_RETHERE:
POP IY
EXX
JP LD_8_BITS
LD_SUB1:
LD C,07
DEC B
JP Z,LD_SUB2
LD A,04
DEC A
JR NZ,b1a9 ;wait 59T
EXX
JP LD_8_BITS
LD_SUB2:
LD A,E
SUB 01
DAA
LD E,A
LD A,D
SBC A,B ;in B there is 0
DAA
LD D,A
LD B,$40
LD IY,LD_CHARS
LD HL,$3d00
EXX
JP LD_8_BITS
;1st digit
LD_CHARS:
LD A,E
ADD A,A
ADD A,A
ADD A,A
OR $81
LD L,A
LD A,(HL)
LD (50fd),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (51fd),A
INC L
LD A,(HL)
LD (52fd),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (53fd),A
INC L
LD A,(HL)
LD (54fd),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (55fd),A
INC L
LD A,(HL)
LD (56fd),A
CALL LD_RETHERE
;2nd digit
LD A,E
AND $f0
RRA
OR $81
LD L,A
LD A,(HL)
LD (50fc),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (51fc),A
INC L
LD A,(HL)
LD (52fc),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (53fc),A
INC L
LD A,(HL)
LD (54fc),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (55fc),A
INC L
LD A,(HL)
LD (56fc),A
CALL LD_RETHERE
;3rd digit
LD A,D
ADD A,A
ADD A,A
ADD A,A
OR $81
LD L,A
LD A,(HL)
LD (50fb),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (51fb),A
INC L
LD A,(HL)
LD (52fb),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (53fb),A
INC L
LD A,(HL)
LD (54fb),A
CALL LD_RETHERE
INC L
LD A,(HL)
LD (55fb),A
INC L
LD A,(HL)
LD (56fb),A
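LD (56fb),A ;repeated write, apparently timing padding:
NOP ;13T + 4T + 14T (LD IY) = 31T, same as CALL (17T) + POP IY (14T)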
LD IY,LD_CHARS
EXX
JP LD_8_BITS
This, of course, deserves a few explanatory notes:
The register pair DE contains the counter itself, stored in BCD. Register C is a bit counter and register B is a byte counter. Once they count down to zero, the counter is decreased by 1, the result is adjusted by the DAA instruction and everything is prepared for the actual printing of characters. HL holds the value $3D00, which is the address of the character set in the ROM (the digit glyphs start at $3D80, hence the OR $81), and IY holds the address of the routine LD_CHARS, which displays the digits.
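The way LD_CHARS turns a BCD digit into a font address can be written out in Python. This is only an illustration of the bit tricks in the listing; the function names are made up, and the only outside fact relied on is that the ROM font holds 8 bytes per character starting with the space at $3D00.

```python
FONT_PAGE = 0x3D00   # H = $3D in the listing; the glyph for '0' starts at $3D80


def low_digit_row_address(bcd_byte):
    """ADD A,A three times, then OR $81: second row of the low digit's glyph."""
    low = ((bcd_byte << 3) & 0xFF) | 0x81
    return FONT_PAGE | low


def high_digit_row_address(bcd_byte):
    """AND $F0, RRA, then OR $81: second row of the high digit's glyph."""
    low = ((bcd_byte & 0xF0) >> 1) | 0x81
    return FONT_PAGE | low


def rom_glyph_address(char):
    """Reference formula: 8 bytes per character, starting with the space."""
    return FONT_PAGE + (ord(char) - 0x20) * 8


e_register = 0x68    # hypothetical counter value: the digits "6" and "8"
print(hex(low_digit_row_address(e_register)),
      hex(high_digit_row_address(e_register)))
```

The +1 hidden in the OR $81 skips the first row of each glyph, which is blank for the digits, so only seven rows have to be copied.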
When you chop a routine into pieces, you are left with two options. You either modify the code and rewrite the address you need to jump to next, or you store that address somewhere. Here it lives in the IY register. The simple trick CALL LD_RETHERE (as in RETurn HERE) takes you back to LD_8_BITS, while the return address is popped into IY, so that on the next pass JP (IY) continues with the next piece. Note how the printing of characters is chopped into small pieces by the calls to LD_RETHERE.
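If the CALL LD_RETHERE trick looks like black magic, it may help to see the same idea as a Python generator, where each `yield` plays the role of CALL LD_RETHERE and `next()` plays the role of JP (IY). This is an analogy rather than a translation: the names are invented, the piece sizes (1, 2, 2, 2 rows) follow the listing, and the "screen" is just a list.

```python
def draw_digit(screen, glyph_rows):
    """Copy 7 glyph rows in pieces of 1, 2, 2 and 2 rows, pausing in between."""
    row = 0
    for size in (1, 2, 2, 2):
        for _ in range(size):
            screen[row] = glyph_rows[row]
            row += 1
        yield                       # CALL LD_RETHERE: remember where to resume


def load_bits(n_bits, screen, glyph_rows):
    """Run one piece of the drawing routine after every loaded bit."""
    painter = draw_digit(screen, glyph_rows)      # LD IY,LD_CHARS
    pieces_run = 0
    for _ in range(n_bits):
        # ... the tape edge for this bit would be timed here ...
        try:
            next(painter)                         # JP (IY)
            pieces_run += 1
        except StopIteration:                     # routine finished:
            painter = draw_digit(screen, glyph_rows)   # LD IY,LD_CHARS again
    return pieces_run


glyph = [1, 2, 3, 4, 5, 6, 7]       # stand-in for seven font bytes
partial = [0] * 7
load_bits(2, partial, glyph)        # only the first two pieces get drawn
complete = [0] * 7
load_bits(4, complete, glyph)       # all four pieces
print(partial, complete)
```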
Time indicator and more
Calculating the remaining time is more complicated than counting bytes. You either have to count each bit differently (a one bit takes twice as long as a zero bit), or use an interrupt (yes, an interrupt in a loader!).
I was looking for an example of the first approach, but before I could comment on one, Busy called me and said that he had found the source code of his Overscan loader, asking whether I wanted to publish it. So here we go, courtesy of Busy: the Overscan loader!
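To get a feeling for the first approach, here is a rough Python estimate of how long a block takes on tape. It assumes the standard ROM timings (a zero bit is two pulses of 855T, a one bit two pulses of 1710T) and a 3.5 MHz clock; it ignores the pilot tone, the sync pulses and the flag and checksum bytes, and the function names are made up for this sketch.

```python
CPU_HZ = 3_500_000          # assumed 48K Spectrum clock
ZERO_BIT_T = 2 * 855        # two pulses per bit, standard ROM timing
ONE_BIT_T = 2 * 1710        # a one bit takes twice as long


def byte_tstates(value):
    """T-states needed to play one byte from tape."""
    ones = bin(value).count("1")
    return ones * ONE_BIT_T + (8 - ones) * ZERO_BIT_T


def block_seconds(data):
    """Playing time of the data bytes only, in seconds."""
    return sum(byte_tstates(b) for b in data) / CPU_HZ


# Hypothetical 6912-byte screen: all zeros versus all ones
fastest = block_seconds(bytes(6912))
slowest = block_seconds(bytes([0xFF]) * 6912)
print(round(fastest, 1), round(slowest, 1))
```

The spread between the two extremes is exactly why a byte counter cannot double as a clock: the remaining time depends on the data itself.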
(Time calculation and graphical effects are included!)
And here is the source code with Busy’s comments (Thanks!) Because it is a bit long, I shall say my goodbyes right now and wish you good luck in your own experiments.
5b00 *a
5b00 *s
5b00 ;===============================================================;
5b00 ;== Version 16 == Loader for Overscan == 12.08.1991 Busy soft ==;
5b00 ;===============================================================;
5b00 znaky = #8800 modified charset
5b00 org #8200,0
8200 f3 p di
8201 310082 ld sp,p
8204 fd217f00 ld iy,#7f timers init for time and auto kitt
8208 210080 ld hl,#8000 IM2 vector init
820b 110180 ld de,#8001
820e 011001 ld bc,#0110
8211 3681 ld (hl),#81
8213 edb0 ldir
8215 212183 ld hl,rut relocation of operation routine IM2
8218 118181 ld de,#8181
821b 010800 ld bc,load-rut
821e edb0 ldir
8220 067f i ld b,#7f
8222 216711 ld hl,#1167
8225 11d685 ld de,k
8228 d5 push de
8229 7e ll14 ld a,(hl) random memory inserts
822a ad xor l (to fool an enemy)
822b 12 ld (de),a
822c 02 ld (bc),a
822d a9 xor c
822e 0b dec bc
822f 02 ld (bc),a
8230 0b dec bc
8231 13 inc de
8232 2b dec hl
8233 7c ld a,h
8234 b5 or l
8235 20f2 jr nz,ll14
8237 e1 pop hl
8238 012a7a ld bc,-k
823b edb0 ldir
823d 3e80 rst ld a,#80 IM2 launch
823f ed47 ld i,a
8241 ed5e im2
8243 fb ei
8244 cd0c85 call zn charset conversion
8247 3e48 ld a,#48
8249 32b784 ld (n20+2),a
824c 210040 ld hl,#4000 screen init
824f 110140 ld de,#4001
8252 010018 ld bc,#1800
8255 71 ld (hl),c
8256 edb0 ldir
8258 010603 ld bc,#0306
825b 71 ld (hl),c
825c edb0 ldir
825e 21a341 o ld hl,#41a3 squares under kitt effect
8261 0e06 ld c,#06
8263 061a ll11 ld b,#1a
8265 e5 push hl
8266 367e ll10 ld (hl),#7e
8268 2c inc l
8269 10fb djnz ll10
826b e1 pop hl
826c 24 inc h
826d 0d dec c
826e 20f3 jr nz,ll11
8270 210059 ld hl,#5900 red frame in the middle
8273 011208 ld bc,#0812
8276 5d ld e,l
8277 73 ll12 ld (hl),e
8278 2c inc l
8279 71 ld (hl),c
827a 7d ld a,l
827b c61d add a,#1d
827d 6f ld l,a
827e 71 ld (hl),c
827f 2c inc l
8280 73 ld (hl),e
8281 2c inc l
8282 10f3 djnz ll12
8284 21e15a ld hl,#5ae1
8287 1e44 ld e,#44
8289 cd0083 call sap
828c 21e158 ld hl,#58e1
828f cdfe82 call pas
8292 21015a ld hl,#5a01
8295 cdfe82 call pas
8298 215f85 ld hl,firma texts printout
829b cd0783 call text
829e 217685 ld hl,demo
82a1 cd0783 call text
82a4 219185 ld hl,loatim
82a7 cd0783 call text
82aa cd2983 call load calling the loader
82ad 08 ex af,af
82ae af xor a
82af cdf884 call out Border 0
82b2 08 ex af,af
82b3 3839 jr c,ok if we loaded a block correctly, then jump
82b5 af error xor a if error during loading
82b6 210048 ld hl,#4800 then show message
82b9 77 ll13 ld (hl),a
82ba 23 inc hl
82bb cb64 bit 4,h
82bd 28fa jr z,ll13
82bf 112548 ld de,#4825
82c2 212515 ld hl,#1525
82c5 cd0b83 call txt
82c8 21d285 ld hl,krik
82cb cd0b83 call txt
82ce 21a585 ld hl,rew
82d1 cd0783 call text
82d4 21b785 ld hl,reload
82d7 cd0783 call text
82da af xor a
82db 32b784 ld (n20+2),a
82de 06ff press ld b,#ff waiting for keypress
82e0 cdcf83 call mmm we call loader during waiting
82e3 af xor a so our effects still run
82e4 dbfe in a,(#fe)
82e6 f6e0 or #e0
82e8 3c inc a
82e9 28f3 jr z,press
82eb c33d82 jp rst
82ee
82ee 210058 ok ld hl,#5800 if we loaded a block correctly
82f1 110158 ld de,#5801 erase all attributes
82f4 0603 ld b,#03
82f6 edb0 ldir
82f8 cde6c3 call 50150 demo decompression
82fb c376e9 jp 59766 demo launch
82fe
82fe 1e12 pas ld e,#12 some help subroutines
8300 061e sap ld b,#1e for frame drawing
8302 73 pp1 ld (hl),e
8303 2c inc l
8304 10fc djnz pp1
8306 c9 ret
8307
8307 5e text ld e,(hl) text printout
8308 23 inc hl HL = text address
8309 56 ld d,(hl) DE = text position on screen
830a 23 inc hl
830b e5 txt push hl
830c d5 push de
830d 6e ld l,(hl)
830e 2688 ld h,>znaky
8310 0608 ld b,#08
8312 7e xtx ld a,(hl) print of one character
8313 12 ld (de),a
8314 14 inc d
8315 24 inc h
8316 10fa djnz xtx
8318 d1 pop de
8319 e1 pop hl
831a 1c inc e
831b cb7e bit 7,(hl) text ends with char with set bit7=1
831d 23 inc hl
831e 28eb jr z,txt
8320 c9 ret
8321
8321 08 rut ex af,af routine from interrupt
8322 fd24 inc yh YH = timer for kitt effect
8324 fd2c inc yl YL = timer for counting and time printing
8326 08 ex af,af
8327 fb ei
8328 c9 ret
8329
8329 af load xor a THIS IS WHERE LOADER BEGINS!!
832a cdf884 call out
832d 211f40 rohy ld hl,#401f drawing of cut off corners
8330 11ff57 ld de,#57ff
8333 7b ld a,e
8334 77 ld (hl),a
8335 12 ld (de),a
8336 d9 exx
8337 010207 ld bc,#0702
833a 210040 ld hl,#4000
833d 11e057 ld de,#57e0
8340 77 ld (hl),a
8341 12 ld (de),a
8342 7e rr1 ld a,(hl)
8343 24 inc h
8344 15 dec d
8345 cb27 sla a
8347 77 ld (hl),a
8348 12 ld (de),a
8349 d9 exx
834a 7e ld a,(hl)
834b 24 inc h
834c 15 dec d
834d cb3f srl a
834f 77 ld (hl),a
8350 12 ld (de),a
8351 d9 exx
8352 10ee djnz rr1
8354 nnn
8354 3ef8 rrr ld a,#f8 catching the leader tone
8356 32f484 ld (xor+1),a
8359 2600 ld h,#00
835b 06ff djnz ld b,#ff
835d cdcf83 call mmm
8360 25 dec h
8361 20f8 jr nz,djnz
8363 cdcf83 call mmm
8366 30ec jr nc,nnn
8368 cdcb83 call ppp
836b 30e7 jr nc,nnn
836d 069c sss ld b,#9c
836f cdcb83 call ppp
8372 30e0 jr nc,nnn
8374 3ec6 ld a,#c6
8376 b8 cp b
8377 30db jr nc,rrr
8379 24 inc h
837a 20f1 jr nz,sss
837c 3efc ld a,#fc
837e 32f484 ld (xor+1),a
8381 06c9 ttt ld b,#c9
8383 cdcf83 call mmm
8386 30cc jr nc,nnn
8388 78 ld a,b
8389 fed4 cp #d4
838b 30f4 jr nc,ttt
838d cdcf83 call mmm
8390 d0 ret nc
8391 79 ld a,c leader and sync-pulse OK, we can start loading
8392 ee03 xor #03
8394 4f ld c,a
8395 cdb583 call byte flagbyte Load
8398 d0 ret nc
8399
8399 dd21e6c3 loa ld ix,50150 demo bytes loading loop
839d 116a24 ld de,9322
83a0 ; ld ix,#4000 during testing of effects in loader
83a0 ; ld de,#1b00 the block was loaded onto the screen
83a0 cdb583 loa1 call byte
83a3 d0 ret nc
83a4 dd7500 ld (ix+#00),l
83a7 dd23 inc ix
83a9 1b dec de
83aa 7a ld a,d
83ab b3 or e
83ac 20f2 jr nz,loa1
83ae cdb583 call byte load parity
83b1 d0 ret nc
83b2 fe01 cp #01
83b4 c9 ret
83b5
83b5 06b2 byte ld b,#b2 Load of one byte
83b7 2e01 ld l,#01
83b9 cdcb83 loa9 call ppp
83bc d0 ret nc
83bd 3ecb kk ld a,#cb
83bf b8 cp b
83c0 cb15 rl l
83c2 06b0 ld b,#b0
83c4 30f3 jr nc,loa9
83c6 7c ld a,h
83c7 ad xor l
83c8 67 ld h,a
83c9 37 scf
83ca c9 ret
83cb
83cb cdcf83 ppp call mmm load of one full period (two halfperiods)
83ce d0 ret nc
83cf d9 mmm exx load of one halfperiod
83d0 c3 db #c3 my special subroutines run here instead of the waiting loop
83d1 ec83 skok dw n1 jump to part of code to execute next
83d3
83d3 7e n1sub ld a,(hl) /0
83d4 81 add a,c time value recalculation
83d5 fe0a cp #0a
83d7 0e00 ld c,#00
83d9 3802 jr c,n1aa
83db 0c inc c
83dc af xor a
83dd 77 n1aa ld (hl),a
83de 23 inc hl \50
83df 7e ld a,(hl)
83e0 81 add a,c
83e1 fe06 cp #06
83e3 0e00 ld c,#00
83e5 3802 jr c,n1bb
83e7 0c inc c
83e8 af xor a
83e9 77 n1bb ld (hl),a
83ea 23 inc hl \100
83eb c9 ret \110+call=117
83ec
83ec fd7d n1 ld a,yl
83ee fe33 cp 51 \29
83f0 3866 jr c,n10 Test if a second has passed
83f2 0e01 ld c,#01 so that we could draw another time value
83f4 fd7d ld a,yl
83f6 d632 sub 50
83f8 fd6f ld yl,a
83fa 215a85 ld hl,cas \76
83fd cdd383 call n1sub
8400 3e04 ld a,#04
8402 110784 ld de,n3
8405 181a jr jpde \222 +/-
8407
8407 23 n3 inc hl every second we also change
8408 cdd383 call n1sub the effect in the middle third
840b ed5f ld a,r
840d 32b984 ld (udaj+1),a
8410 3a5b85 ld a,(cas+1)
8413 0f rrca
8414 9f sbc a,a
8415 e60f and #0f
8417 f607 or #07
8419 32cc84 ld (rot),a
841c 3e05 ld a,#05
841e 112a84 ld de,n2
8421 ed53d183 jpde ld (skok),de
8425 1ef9 ld e,#f9
8427 c3e084 jp wait \222 +/-
842a
842a 2b n2 dec hl
842b cb7e bit 7,(hl) \32
842d 2808 jr z,n2aa prepare to print one character
842f 3e0e ld a,14
8431 01ec83 ld bc,n1
8434 c3dc84 jp buduci
8437 e5 n2aa push hl
8438 6e ld l,(hl)
8439 2688 ld h,>znaky
843b 1650 ld d,#50
843d 0602 ld b,#02 /2*88=202
843f 7e dd1 ld a,(hl) print of one character
8440 12 ld (de),a
8441 24 inc h
8442 14 inc d
8443 7e ld a,(hl)
8444 12 ld (de),a
8445 24 inc h
8446 14 inc d
8447 7e ld a,(hl)
8448 12 ld (de),a
8449 24 inc h
844a 14 inc d
844b 7e ld a,(hl)
844c 12 ld (de),a
844d 24 inc h
844e 14 inc d
844f 10ee djnz dd1
8451 1c inc e
8452 e1 pop hl
8453 3e01 ld a,#01 \299
8455 c3e084 jp wait
8458
8458 fd7c n10 ld a,yh Auto-Kitt effect
845a fe02 cp #02 \56
845c 3857 jr c,n20 test if we can go into another phase
845e c3 db #c3
845f 9584 jump dw n12 \73
8461
8461 21a358 n11 ld hl,#58a3
8464 3647 ld (hl),#47
8466 2c incdec inc l
8467 7d ld a,l
8468 326284 ld (n11+1),a
846b 216684 ld hl,incdec \124
846e febc cp #bc
8470 3f ccf
8471 3802 jr c,iidd
8473 fea4 cp #a4
8475 9f iidd sbc a,a \151
8476 e601 and #01
8478 ae xor (hl)
8479 77 ld (hl),a
847a 218484 ld hl,jr+1
847d 3e03 ld a,#03
847f ae xor (hl)
8480 77 ld (hl),a
8481 3ea2 ld a,#a2 \210
8483 1803 jr jr #03
8485 329684 ld (n12+1),a
8488 fd2600 ld yh,#00
848b 3e02 ld a,#02
848d 219584 ld hl,n12
8490 225f84 buduce ld (jump),hl
8493 184b jr wait \267
8495
8495 216458 n12 ld hl,#5864 \73
8498 7d ld a,l
8499 febc cp #bc \94
849b 3011 jr nc,n12bb
849d 014104 ld bc,#0441 \111
84a0 7e n12aa ld a,(hl) /47
84a1 b9 cp c
84a2 3801 jr c,#01
84a4 3d dec a
84a5 77 ld (hl),a
84a6 2c inc l
84a7 10f7 djnz n12aa \4*47=188
84a9 229684 ld (n12+1),hl
84ac 1835 jr end
84ae
84ae 3e08 n12bb ld a,#08
84b0 216184 ld hl,n11
84b3 18db jr buduce /\174
84b5
84b5 210048 n20 ld hl,#4800 filling the middle third of the screen
84b8 3e33 udaj ld a,#33 \82
84ba 772c772c dw #2c77,#2c77 8*[LD (HL),A : INC L]
84be 772c772c dw #2c77,#2c77
84c2 772c772c dw #2c77,#2c77 8*11=88
84c6 772c772c dw #2c77,#2c77 \
84ca 2002 jr nz,n21 /
84cc 0f rot rrca
84cd 24 inc h
84ce cba4 n21 res 4,h
84d0 cbdc set 3,h
84d2 22b684 ld (n20+1),hl
84d5 32b984 ld (udaj+1),a
84d8 3e03 ld a,3
84da 1804 jr wait \68
84dc
84dc ed43d183 buduci ld (skok),bc setting of what part of code to call next
84e0 3d wait dec a
84e1 20fd jr nz,wait
84e3 d9 end exx all code parts last about 200 cycles (give or take)
84e4 a7 zzz and a end of my special subroutines
84e5 04 l222 inc b
84e6 c8 ret z
84e7 3eff ld a,#ff #FF instead of #7F does not respond to SPACEBAR
84e9 dbfe in a,(#fe)
84eb 1f rra
84ec d0 ret nc
84ed a9 xor c
84ee e620 and #20
84f0 28f3 jr z,l222
84f2 79 ld a,c
84f3 eefc xor xor #fc #FC instead of CPL makes prettier colours
84f5 4f ld c,a red-yellow boot tone and blue-blue bytes
84f6 e607 and #07
84f8 320058 out ld (#5800),a +52
84fb 321f58 ld (#581f),a setting of corner attributes
84fe f608 or #08
8500 d3fe out (#fe),a
8502 e607 and #07
8504 32e05a ld (#5ae0),a
8507 32ff5a ld (#5aff),a
850a 37 scf
850b c9 ret
850c
850c 110088 zn ld de,znaky charset conversion
850f d5 ll1 push de makes outlines from the ROM font
8510 7b ld a,e the font also gets reorganized so that
8511 fe20 cp ' ' print routines can run faster
8513 3002 jr nc,#02
8515 c630 add a,'0'
8517 87 add a,a
8518 6f ld l,a
8519 260f ld h,#0f
851b 29 add hl,hl
851c 29 add hl,hl
851d cd4985 call read
8520 4f ld c,a
8521 23 inc hl
8522 cd5085 call write
8525 0606 ld b,#06
8527 cd4985 ll2 call read
852a 4f ld c,a
852b 2b dec hl
852c cd4985 call read
852f b1 or c
8530 4f ld c,a
8531 23 inc hl
8532 23 inc hl
8533 cd5085 call write
8536 10ef djnz ll2
8538 cd4985 call read
853b 4f ld c,a
853c 2b dec hl
853d cd4985 call read
8540 23 inc hl
8541 cd5485 call rit
8544 d1 pop de
8545 1c inc e
8546 20c7 jr nz,ll1
8548 c9 ret
8549
8549 7e read ld a,(hl)
854a 0f rrca
854b 0f rrca
854c b6 or (hl)
854d 07 rlca
854e b6 or (hl)
854f c9 ret
8550
8550 cd4985 write call read
8553 2b dec hl
8554 b1 rit or c
8555 ae xor (hl)
8556 12 ld (de),a
8557 23 inc hl
8558 14 inc d
8559 c9 ret
855a 00000a00 cas db 0,0,10,0,0 Buffer for time value
855f 4550 firma dw #5045
8561 42757379 db 'Busy software '
856f 70726573 db 'presen','t'+#80
8576 8350 demo dw #5083
8578 3e3e3e20 db '>>> THE OVER'
8584 5343414e db 'SCAN DEMO <<'
8590 bc db '<'+#80
8591 e250 loatim dw #50e2
8593 4c6f6164 db 'Loading'
859a 20636361 db ' cca 1 min'
85a4 ae db '.'+#80
85a5 8848 rew dw #4888
85a7 52657769 db 'Rewing the tape'
85b6 ac db ','+#80
85b7 c448 reload dw #48c4
85b9 70726573 db 'press any key '
85c7 616e6420 db 'and reload'
85d1 ae db '.'+#80
85d2 202121a1 krik db ' !!','!'+#80
85d6 k
85d6 l = k-p label "l" will be total code length
85d6
85d6 org #6000,0 the following routines are not used
6000 f3 mrs di they were debug routines
6001 ed56 im1
6003 af xor a
6004 ed47 ld i,a
6006 c3b3f4 jp #f4b3 jump into MRS
6009
6009 cd0c85 zs call zn Test of conversion of charset
600c 210088 ld hl,znaky printing of all characters
600f 110040 ld de,#4000 on the screen
6012 010008 ld bc,#0800
6015 edb0 ldir
6017 c9 ret
6018
6018 3e0c poke ld a,12 sets testing im2 vector into ROM
601a d317 out (23),a (Outs for writing into ROM on MB01)
601c 218181 ld hl,#8181
601f 22ff3a ld (#3aff),hl
6022 3e04 ld a,4
6024 d317 out (23),a
6026 c9 ret
6027 end
(Translated by Ondřej Ficek)