ZX Spectrum tape loader effects

(A repost of my older pieces about ZX Spectrum programming and hacking.)

Everyone who ever owned a ZX Spectrum has surely seen an amazing loading effect at least once and wondered: how are they doing this? So, come and have a read…

Because the ZX Spectrum had no dedicated circuit for storing programs on a cassette tape and retrieving them (as the PMD had), it had to do everything purely in software. At the highest level, programs were loaded as a “header + data” pair. The header held information about the file (name, type, length, etc.) and the data block contained the actual contents.

The header is 17 bytes long and has the following structure:

Position  Length  Description
0         1       Type (0=program, 1=number array, 2=character array, 3=code)
1         10      Name (right padded with spaces to 10 characters)
11        2       Length of data block
13        2       Parameter 1. E.g. for programs it is the LINE parameter, for ‘code’ it is the start address of the code block
15        2       Parameter 2. E.g. for programs it is the beginning of variables

The data block has no such structure; it just contains data.
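The header layout maps directly onto a fixed-size record. Here is a minimal Python sketch (obviously not something that ran on the Spectrum) that packs and unpacks it. The names `make_header` and `parse_header` are made up for this illustration, and the 16-bit fields are taken to be little-endian, the Z80's native byte order.

```python
import struct

# type (1 byte), name (10 bytes), data length, parameter 1, parameter 2
HEADER_FORMAT = "<B10sHHH"

def make_header(file_type, name, length, param1, param2):
    """Pack the 17-byte header body (no flag byte, no checksum)."""
    padded = name.encode("ascii")[:10].ljust(10, b" ")
    return struct.pack(HEADER_FORMAT, file_type, padded, length, param1, param2)

def parse_header(raw):
    """Unpack a 17-byte header body into a dictionary."""
    file_type, name, length, param1, param2 = struct.unpack(HEADER_FORMAT, raw)
    return {
        "type": file_type,
        "name": name.decode("ascii").rstrip(" "),
        "length": length,
        "param1": param1,
        "param2": param2,
    }

# Example values only: a 6912-byte 'code' block that loads to address 16384.
header = make_header(3, "screen", 6912, 16384, 32768)
```

The `<` prefix in the format string switches off padding, so the record comes out at exactly 17 bytes.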

Let us descend one level: each block was preceded by one flag byte ($00 for a header, $FF for data) and terminated with a checksum (all bytes, including the flag byte, XORed together). On tape, the header was therefore actually 19 bytes long: it began with $00 and ended with the checksum, but these two bytes were “consumed” by the system and you couldn’t get your hands on them.
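The wrapping is easy to model. A small Python sketch, with function names of my own choosing, that adds the flag byte and the XOR checksum and checks a block the way the loader effectively does (the XOR of everything, checksum included, must be zero):

```python
from functools import reduce

def xor_all(data):
    """XOR all bytes together."""
    return reduce(lambda a, b: a ^ b, data, 0)

def make_block(flag, payload):
    """Wrap a payload in a flag byte and a trailing XOR checksum."""
    body = bytes([flag]) + bytes(payload)
    return body + bytes([xor_all(body)])

def block_is_valid(block):
    """A block is intact when the XOR over all its bytes is zero."""
    return xor_all(block) == 0

# A 17-byte header body becomes 19 bytes on tape.
tape_header = make_block(0x00, bytes(17))
```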

This is as deep as you could get by calling routines from the ROM. Below this level, it was all just

Ones and zeroes

Data on the tape was written as a sequence of pulses. The processor itself took care of generating these pulses, including the proper timing.

Each block was preceded by a so-called pilot tone. It was a square wave (ON and OFF states alternating regularly): the MIC output stayed ON for 2168 T (T is one processor clock cycle) and then OFF for another 2168 T. The pilot tone let the input circuits in the tape recorder (if there were any) settle at the correct level. It lasted 5 seconds before a header and 2 seconds before a data block. Its duration was determined by the flag byte: values below 128 meant a “long” pilot tone, values of 128 and above a short one.

After the pilot tone came a sync pulse. Its purpose was to tell the loading routine that the pilot tone had ended and that it should start processing data. The sync pulse was 667 T ON and 735 T OFF.

The sync pulse was followed by the actual data bytes: first the flag byte, then the contents, and finally the checksum. Bytes were sent bit by bit, starting with the most significant one. Every bit was represented by one pulse (the signal going ON and then OFF), and the two values differed in length: a logical 0 was 855 T ON and 855 T OFF, a logical 1 was twice as long, 1710 T and 1710 T.
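To get a feel for the numbers, here is a minimal Python sketch that turns a block into a list of pulse lengths in T states. The 3.5 MHz clock and the helper names are my assumptions, and the number of pilot pulses is derived from the durations given above rather than taken from the ROM's own counters.

```python
PILOT = 2168                  # length of one pilot pulse in T states
SYNC_ON, SYNC_OFF = 667, 735  # the two halves of the sync pulse
BIT0, BIT1 = 855, 1710        # pulse length for a 0 bit and a 1 bit
CPU_HZ = 3_500_000            # assumed clock of a 48K Spectrum

def byte_pulses(value):
    """Two equally long pulses per bit, most significant bit first."""
    pulses = []
    for bit in range(7, -1, -1):
        length = BIT1 if (value >> bit) & 1 else BIT0
        pulses += [length, length]
    return pulses

def block_pulses(block, pilot_seconds):
    """Pilot tone, sync pulse, then every byte of the block."""
    pilot_count = round(pilot_seconds * CPU_HZ / PILOT)
    pulses = [PILOT] * pilot_count + [SYNC_ON, SYNC_OFF]
    for value in block:
        pulses += byte_pulses(value)
    return pulses

def seconds(pulses):
    """Total duration of a pulse list in seconds."""
    return sum(pulses) / CPU_HZ
```

A byte made only of zeroes takes 16 × 855 T, a byte made only of ones twice as long, so the loading time depends on the data.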

When reading data (which is what we are after here), the processor measured the time between consecutive changes of the input signal (i.e. the time between transitions, the so-called “edges”). The recording was therefore not sensitive to polarity (unlike on the aforementioned PMD in its first version). However, the processor had no time left for anything else, since it had to watch the EAR input, count loop passes until the signal changed, and check whether the SPACE key was pressed (which interrupted loading, as you surely remember).

Assembler time

Now is the moment for an extract from the ROM (taken from ‘The Complete Spectrum ROM Disassembly’ by Dr. Ian Logan and Dr. Frank O’Hara, as published by Melbourne House in 1983, available at WOS):

THE 'LD-BYTES' SUBROUTINE
This subroutine is called to LOAD the header information (from 076E) and later
LOAD, or VERIFY, an actual block of data (from 0802).

0556 LD-BYTES   INC  D                  This resets the zero flag (D cannot
                                        hold +FF)
                EX   AF,AF'             The A register holds +00 for a header
                                        and +FF for a block of data
                                        The carry flag is reset for VERIFYing
                                        and set for LOADing
                DEC  D                  Restore D to its original value
                DI                      The maskable interrupt is now disabled
                LD   A,+0F              The border is made WHITE
                OUT  (+FE),A
                LD   HL,+053F           Pre-load the machine stack with the
                PUSH HL                 address - SA/LD-RET
                IN   A,(+FE)            Make an initial read of port 254
                RRA                     Rotate the byte obtained
                AND  +20                but keep only the EAR bit
                OR   +02                Signal RED border
                LD   C,A                Store the value in the C register
                                        (+22 for 'off' and +02 for 'on' - the
                                        present EAR state)
                CP   A                  Set the zero flag

The first stage of reading a tape involves showing that a pulsing signal
actually exists. (i.e. 'On/off' or 'off/on' edges.)

056B LD-BREAK   RET  NZ                 Return if the BREAK key is being pressed
056C LD-START   CALL 05E7,LD-EDGE-1     Return with the carry flag reset if
                JR   NC,056B,LD-BREAK   there is no 'edge' within approx.
                                        14,000 T states. But if an 'edge' is
                                        found the border will go CYAN

The next stage involves waiting a while and then showing that the signal is
still pulsing.

                LD   HL,+0415           The length of this waiting period will
0574 LD-WAIT    DJNZ 0574,LD-WAIT       be almost one second in duration.
                DEC  HL
                LD   A,H
                OR   L
                JR   NZ,0574,LD-WAIT
                CALL 05E3,LD-EDGE-2     Continue only if two edges are found
                JR   NC,056B,LD-BREAK   within the allowed time period.

Now accept only a 'leader signal'.

0580 LD-LEADER  LD   B,+9C              The timing constant
                CALL 05E3,LD-EDGE-2     Continue only if two edges are found
                JR   NC,056B,LD-BREAK   within the allowed time period
                LD   A,+C6              However the edges must have been found
                CP   B                  within about 3,000 T states of each
                JR   NC,056C,LD-START   other
                INC  H                  Count the pair of edges in the H
                JR   NZ,0580,LD-LEADER  register until 256 pairs have been found

After the leader come the 'off' and 'on' parts of the sync pulse.

058F LD-SYNC    LD   B,+C9              The timing constant
                CALL 05E7,LD-EDGE-1     Every edge is considered until two edges
                JR   NC,056B,LD-BREAK   are found close together - these will be
                LD   A,B                the start and finishing edges of the
                CP   +D4                'off' sync pulse
                JR   NC,058F,LD-SYNC
                CALL 05E7,LD-EDGE-1     The finishing edge of the 'on' pulse
                RET  NC                 must exist
                                        (Return carry flag reset)

The bytes of the header or the program/data block can now be LOADed or VERIFied.
But the first byte is the flag byte.

                LD   A,C                The border colours from now on will be
                XOR  +03                BLUE & YELLOW
                LD   C,A
                LD   H,+00              Initialize the 'parity matching' byte
                                        to zero
                LD   B,+B0              Set the timing constant for the flag
                                        byte.
                JR   05C8,LD-MARKER     Jump forward into the byte LOADing loop

The byte LOADing loop is used to fetch the bytes one at a time. The flag byte is
first. This is followed by the data bytes and the last byte is the 'parity'
byte.

05A9 LD-LOOP    EX   AF,AF'             Fetch the flags
                JR   NZ,05B3,LD-FLAG    Jump forward only when handling the
                                        first byte
                JR   NC,05BD,LD-VERIFY  Jump forward if VERIFYing a tape
                LD   (IX+00),L          Make the actual LOAD when required
                JR   05C2,LD-NEXT       Jump forward to LOAD the next byte
05B3 LD-FLAG    RL   C                  Keep the carry flag in a safe place
                                        temporarily
                XOR  L                  Return now if the flag byte does not
                RET  NZ                 match the first byte on the tape
                                        (Carry flag reset)
                LD   A,C                Restore the carry flag now
                RRA
                LD   C,A
                INC  DE                 Increase the counter to compensate for
                JR   05C4,LD-DEC        its decrease after the jump

If a data block is being verified then the freshly loaded byte is tested against
the original byte.

05BD LD-VERIFY  LD   A,(IX+00)          Fetch the original byte
                XOR  L                  Match it against the new byte
                RET  NZ                 Return if 'no match' (Carry flag reset)

A new byte can now be collected from the tape.

05C2 LD-NEXT    INC  IX                 Increase the 'destination'
05C4 LD-DEC     DEC  DE                 Decrease the 'counter'
                EX   AF,AF'             Save the flags
                LD   B,+B2              Set the timing constant
05C8 LD-MARKER  LD   L,+01              Clear the 'object' register apart from
                                        a 'marker' bit

The 'LD-8-BITS' loop is used to build up a byte in the L register.

05CA LD-8-BITS  CALL 05E3,LD-EDGE-2     Find the length of the 'off' and 'on'
                                        pulses of the next bit
                RET  NC                 Return if the time period is exceeded
                                        (Carry flag reset)
                LD   A,+CB              Compare the length against approx.
                CP   B                  2,400 T states; resetting the carry flag
                                        for a '0' and setting it for a '1'
                RL   L                  Include the new bit in the L register
                LD   B,+B0              Set the timing constant for the next bit
                JP   NC,05CA,LD-8-BITS  Jump back whilst there are still bits to
                                        be fetched

The 'parity matching' byte has to be updated with each new byte.

                LD   A,H                Fetch the 'parity matching' byte and
                XOR  L                  include the new byte
                LD   H,A                Save it once again

Passes round the loop are made until the 'counter' reaches zero. At that point
the 'parity matching' byte should be holding zero.

                LD   A,D                Make a further pass if the DE register
                OR   E                  pair does not hold zero
                JR   NZ,05A9,LD-LOOP
                LD   A,H                Fetch the 'parity matching' byte
                CP   +01                Return with the carry flag set if the
                RET                     value is zero (Carry flag reset if in
                                        error)


THE 'LD-EDGE-2' and 'LD-EDGE-1' SUBROUTINES
These two subroutines form the most important part of the LOAD/VERIFY operation.
The subroutines are entered with a timing constant in the B register, and the
previous border colour and 'edge-type' in the C register.
The subroutines return with the carry flag set if the required number of 'edges'
have been found in the time allowed; and the change to the value in the B
register shows just how long it took to find the 'edge(s)'.
The carry flag will be reset if there is an error. The zero flag then signals
'BREAK pressed' by being reset, or 'time-up' by being set.
The entry point LD-EDGE-2 is used when the length of a complete pulse is
required and LD-EDGE-1 is used to find the time before the next 'edge'.

05E3 LD-EDGE-2  CALL 05E7,LD-EDGE-1     In effect call LD-EDGE-1 twice;
                RET  NC                 returning in between if there is an
                                        error.
05E7 LD-EDGE-1  LD   A,+16              Wait 358 T states before entering the
05E9 LD-DELAY   DEC  A                  sampling loop
                JR   NZ,05E9,LD-DELAY
                AND  A

The sampling loop is now entered. The value in the B register is incremented for
each pass; 'time-up' is given when B reaches zero.

05ED LD-SAMPLE  INC  B                  Count each pass
                RET  Z                  Return carry reset & zero set if
                                        'time-up'.
                LD   A,+7F              Read from port +7FFE
                IN   A,(+FE)            i.e. BREAK and EAR
                RRA                     Shift the byte
                RET  NC                 Return carry reset & zero reset if BREAK
                                        was pressed
                XOR  C                  Now test the byte against the 'last
                AND  +20                edge-type'
                JR   Z,05ED,LD-SAMPLE   Jump back unless it has changed

A new 'edge' has been found within the time period allowed for the search.
So change the border colour and set the carry flag.

                LD   A,C                Change the 'last edge-type' and border
                CPL                     colour
                LD   C,A
                AND  +07                Keep only the border colour
                OR   +08                Signal 'MIC off'
                OUT  (+FE),A            Change the border colour (RED/CYAN or
                                        BLUE/YELLOW)
                SCF                     Signal the successful search before
                RET                     returning


Note: The LD-EDGE-1 subroutine takes 464 T states, plus an additional 59 T
states for each unsuccessful pass around the sampling loop.
For example, therefore, when awaiting the sync pulse (see LD-SYNC at 058F)
allowance is made for ten additional passes through the sampling loop.
The search is thereby for the next edge to be found within, roughly, 1,100 T
states (464 + 10 * 59 overhead).
This will prove successful for the sync 'off' pulse that comes after the long
'leader pulses'.

This article has no room for a detailed description of how the algorithm works, but let’s quickly go over some facts.

The routine disables interrupts. This is logical, since it depends on exact timing. That’s also the reason why loaders stored in the slow (contended) RAM ($4000-$7FFF) will not work.

The last state of the EAR input is stored in the C register, shifted one bit to the right, and the lowest three bits of C contain the border color. For the leader it is 02 (red) / 05 (cyan), for data it is 01 (blue) / 06 (yellow). Why the right shift? During loading, the sequence LD A,$7F and IN A,($FE) is executed. The value $7F goes to the upper 8 bits of the address, so the keyboard read selects the row of keys B – N – M – Symbol Shift – Space. The status of these keys is in the lowest five bits (0-4), and bit 6 contains the EAR state. The RRA instruction moves the lowest bit (the status of the SPACE key) into the CY flag and the EAR state into bit 5. Therefore, if CY is zero, SPACE was pressed and loading ends.
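The bit juggling is perhaps easier to follow in a few lines of Python. This is only a model of the port read with the shift and mask from the start of LD-BYTES; the function name and the sample port values are made up for the illustration.

```python
def decode_port(value, color):
    """Model IN A,($FE) followed by RRA, AND $20 and OR color.

    Returns (space_pressed, c_register).
    """
    # Bit 0 is the SPACE key; RRA moves it into CY. Keys read as 0 when pressed.
    space_pressed = (value & 0x01) == 0
    # The shift moves the EAR state from bit 6 to bit 5.
    shifted = value >> 1
    c_register = (shifted & 0x20) | color
    return space_pressed, c_register

# With red (02) as the border color, C ends up as $22 or $02
# depending on the EAR level, just as the ROM comment says.
high = decode_port(0xFF, 0x02)
low = decode_port(0xBF, 0x02)
```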

The routine begins by detecting a signal that corresponds to the pilot tone (LD-LEADER). Once 256 pairs of such edges have been read, the signal is taken to be a pilot tone and the routine waits for the shorter sync pulse. When it arrives (LD-SYNC), data can be read.

LD-MARKER starts reading one byte into register L. L begins with the value 01, which serves as a counter. The RL L instruction gradually fills it with bits from the right, while the highest bit passes into CY. As long as CY = 0, loading continues; once CY = 1 (the marker bit has been pushed out), a complete set of eight bits has been read.
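The marker-bit trick saves a separate bit counter. Here is a small Python model of it (my own illustration, not a translation of the whole loop): the register starts at 01, and the byte is complete at the moment the marker falls out of the top.

```python
def load_byte(bits):
    """Collect one byte from an iterator of 0/1 bits, highest bit first.

    Mimics LD L,$01 followed by repeated RL L.
    """
    l = 0x01                        # the marker bit
    for bit in bits:
        carry_out = l >> 7          # what RL L pushes into CY
        l = ((l << 1) | bit) & 0xFF
        if carry_out:               # marker fell out: eight bits are in
            return l
    raise ValueError("ran out of bits before the byte was complete")

stream = iter([1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0])
first = load_byte(stream)    # consumes exactly eight bits
second = load_byte(stream)   # continues with the next eight
```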

The key routines are LD-EDGE-n (where n is 1 or 2). LD-EDGE-1 first waits for a fixed period (358 T according to the listing above) and then repeatedly reads the EAR input and compares it against the last stored value (in register C, see above). If it has not changed, the loop is repeated. On each pass the value in register B is increased. Once it reaches zero, it means “timeout”: the edge did not arrive within the expected time limit.

If an edge is found, the content of register C is inverted. This changes both the stored EAR value and the border color.

LD-EDGE-2 actually performs two LD-EDGE-1 in a sequence.

LD-EDGE output is as follows:

  • CY = 0, Z = 1 – no EAR change arrived within the time limit (“timeout”)
  • CY = 0, Z = 0 – SPACE was pressed (BREAK)
  • CY = 1 – edge was found, the current value of the counter is in B

The counter in B counts, as I wrote, upwards. On every pass through the sampling loop, which takes 59 T states, B is incremented by one. For example, when reading a bit, the routine LD-EDGE-2 is called, so it looks for two edges. The counter in B is set to $B0. This means that the timeout comes after $4F passes ($FF-$B0). That is 2 × 464 T of fixed overhead + 79 × 59 T = 5589 T. After that time, the routine reports a signal failure.

The resulting counter value is compared to $CB. If it is smaller, the bit is evaluated as the two short pulses of a logical 0; if it is larger, as a logical 1. The value $CB means that the loop ran 27 times ($CB-$B0), i.e. that the two edges arrived in less than about 2 × 464 T + 27 × 59 T = 2521 T. Let’s recall: for a logical 0 the two pulses take 1710 T, for a logical 1 they take 3420 T. The midpoint between these two times is 2565 T, which is roughly the threshold I came up with. It is a bit less because of overhead (subroutine calls, evaluation etc., see LD-8-BITS).
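The same arithmetic as a rough Python model, using the 464 T and 59 T figures from the ROM note quoted earlier. Treat it as an estimate, not a cycle-exact simulation; the function name is mine.

```python
EDGE_1_BASE = 464   # T states for one LD-EDGE-1 call, per the ROM note
PASS_COST = 59      # extra T states per unsuccessful sampling pass

def edge_2_time(b_start, b_end):
    """Approximate T states spent in LD-EDGE-2 while B goes b_start -> b_end."""
    passes = (b_end - b_start) & 0xFF
    return 2 * EDGE_1_BASE + passes * PASS_COST

timeout = edge_2_time(0xB0, 0xFF)     # the counter is about to wrap to zero
threshold = edge_2_time(0xB0, 0xCB)   # boundary between a 0 bit and a 1 bit
midpoint = (2 * 855 + 2 * 1710) // 2  # halfway between the two bit lengths
```

The threshold lands slightly below the midpoint, which is what you would expect once the time spent outside the sampling loop is counted.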

Loader hacking

This was not so difficult, was it? So now, let’s try some of those tricks…

First, the loading routine is not completely “T-pedantic”, so a few T states here or there do not pose any major problem. If we want a simple effect, we can add it without any complicated adjustments.

Border effects

We can, for example, change the color of the stripes if we do not like the default two-tone ones. How about a rainbow? Simply rewrite the end of the LD-EDGE-1 routine:

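          LD      A,C     ;last "edge type" and border color
          INC     A       ;next border color
          XOR     $20     ;flip the stored EAR bit
          AND     $27     ;keep only the EAR bit and the color
          LD      C,A 
          AND     $07     ;just the border color
          OR      $08     ;MIC off
          OUT     ($FE),A ;change the border color
          SCF             ;CY=1 means "edge found"
          RET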

What’s going on here? Instead of inverting the contents of register C, we increase its value by 1 and flip bit 5. Masking with $27 stops an overflow of the color from ever reaching the EAR bit. The color therefore cycles through the range 0-7 and creates a rainbow border.
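If you would rather see the sequence than trust me, this little Python model replays the register update for a few edges (the starting value $22 is just an example, and the function names are mine):

```python
def next_c(c):
    """INC A / XOR $20 / AND $27 applied to the C register."""
    return ((c + 1) ^ 0x20) & 0x27

def border_sequence(c, edges):
    """Border colors produced by the given number of detected edges."""
    colors = []
    for _ in range(edges):
        c = next_c(c)
        colors.append(c & 0x07)
    return colors

colors = border_sequence(0x22, 8)
```

After eight edges the color is back where it started, while bit 5 (the stored EAR state) has flipped on every single edge.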

Note: If you intend to try this, be sure to place the loading routine in the upper 32 Kbytes of RAM!

If we add the pair of instructions XOR A; OUT ($FE),A at the end, just before the SCF instruction, the stripes in the border turn into short lines on a black background.

This is the place for any effect that changes the border color or makes use of it. What about the effect Busy used in the loader for Song In Lines 3 (rounded corners)? It looks impressive, huh?

Yet it is not very hard… The four squares in the corners contain a simple pattern (the “rounding”). Pixels equal to 1 (INK color) look as if they were part of the border and show the stripes. I did not examine how Busy does it, but I’d bet that in principle it goes something like this:

          LD      A,C ;Change of the last "edge type"
          CPL     ;as well as BORDER color.
          LD      C,A 
          AND     $07 ;Taking just only the BORDER color
          OR      $08 ;MIC off
          OUT     ($FE),A ; Change border color
          OR      $30 ; A contains: 0 0 1 1 1 b b b (b = border color)
                  ; so PAPER=7, INK=border color, 
                  ; BRIGHT 0, FLASH 0
          LD      ($5800),A ;left top corner attribute
          LD      ($581F),A ;right top corner
          LD      ($5AE0),A ;left bottom
          LD      ($5AFF),A ;right bottom

          SCF     ;Set CY=1 as a "success"
          RET     ;before return

Border effects are mostly simple and fast enough that we can squeeze them in here without worrying about the timing too much. Usually they fit within the tolerance.

Simple effects with loaded content

During loading we can certainly manage simple operations on the loaded data, at either the bit or the byte level. The loader of the DigiSynth demos was able to unpack data in real time using Huffman decompression (Huffman suits this purpose quite well: you only need to have the decompression tree stored and walk through it according to the loaded bits). For those interested, I have prepared a reconstruction of the loader. But we will look at another case, the well-known Mad Load: a routine that loads an image in squares in a certain order. The video shows its improved version.

Mad Load used a very simple data format. After the flag byte came the data for each square. Each square took 11 bytes: the low and high byte of the screen address where the square should be stored, then 8 bytes of pixel data and 1 attribute byte. This was followed by the next square…
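Producing such records on the PC side is simple. The following Python sketch (my own helper names, not part of the original tool) builds one 11-byte record for a character cell and also mirrors the screen-to-attribute address conversion that the loader performs after storing the eight pixel bytes.

```python
def screen_address(row, col):
    """Address of the top pixel line of a character cell (row 0-23, col 0-31)."""
    high = 0x40 | (row & 0x18)          # which third of the screen
    low = ((row & 0x07) << 5) | col     # row within the third, column
    return (high << 8) | low

def attribute_address(screen_addr):
    """High byte divided by 8, lowest two bits kept, ORed with $58."""
    high = (((screen_addr >> 8) >> 3) & 0x03) | 0x58
    return (high << 8) | (screen_addr & 0xFF)

def mad_record(row, col, bitmap, attribute):
    """Address (low, high), eight pixel bytes, one attribute byte."""
    if len(bitmap) != 8:
        raise ValueError("a square needs exactly 8 pixel bytes")
    addr = screen_address(row, col)
    return bytes([addr & 0xFF, addr >> 8]) + bytes(bitmap) + bytes([attribute])

record = mad_record(9, 5, [0xFF] * 8, 0x38)
```

Because every record carries its own address, the squares can be written to the tape in any order you like, which is the whole charm of the effect.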

The loading routine was slightly modified by Frantisek Fuka (FUXOFT): he turned LD-8-BITS into a subroutine that reads one byte into register L. This subroutine is then used by another subroutine that loads one square (MAD_SQUARE). The square-loading subroutine is called over and over as long as there is data coming from the tape; when there is none, it stops and returns. No checksum is verified at all.

Here is the Mad Loader source code. I commented only the parts that differ from the standard code.

          LD      HL,MAD_RETURN 
          PUSH    HL 
          JP      MAD_LOAD 
          NOP     
          NOP     
          NOP     
          NOP     
MAD_RETURN:       
          EI      
          RET     

MAD_LOAD:         
          DI      
          IN      A,($FE) 
          RRA     
          AND     $20 
          LD      C,A 
          CP      A 
LD_BREAK:         
          RET     NZ 
LD_START:         
          CALL    LD_EDGE_1 
          JR      NC,LD_BREAK 
          LD      HL,$0415 
LD_WAIT:          
          DJNZ    LD_WAIT 
          DEC     HL 
          LD      A,H 
          OR      L 
          JR      NZ,LD_WAIT 
          CALL    LD_EDGE_2 
          JR      NC,LD_BREAK 

LD_LEADER:        
          LD      B,$9C 
          CALL    LD_EDGE_2 
          JR      NC,LD_BREAK 
          LD      A,$C6 
          CP      B 
          JR      NC,LD_START 
          INC     H 
          JR      NZ,LD_LEADER 

LD_SYNC:  LD      B,$C9 
          CALL    LD_EDGE_1 
          JR      NC,LD_BREAK 
          LD      A,B 
          CP      $D4 
          JR      NC,LD_SYNC 
          CALL    LD_EDGE_1 
          RET     NC 

                  ; Mad Load itself
          CALL    LD_ONE_BYTE ; The first one is a flag byte. Drop it!
MAD_LOOP:         
          CALL    MAD_SQUARE 
          RET     NC 
          JR      MAD_LOOP 

MAD_SQUARE:       
          CALL    LD_ONE_BYTE ; Lower byte of address
          LD      A,L 
          EX      AF,AF' ; save into AF'
          CALL    LD_ONE_BYTE ; Upper address byte
          RET     NC 
          EX      AF,AF' 
          LD      H,L ; to the H register
          LD      L,A ; and the first one to the L, so I have a full address in HL
          LD      B,$08 ;8 bitmap bytes for each square
MAD_SCRN:         
          PUSH    HL ;save the address
          PUSH    BC ;and the counter
          CALL    LD_ONE_BYTE ; read 1 byte
          POP     BC ; restore the counter
          LD      A,L ; byte to accumulator
          POP     HL ; restore address for this byte
          RET     NC ; If some error occured, return
          LD      (HL),A ;Else store the byte into screen memory
          INC     H ; addr + 256 - it means "next screen microline"
          DJNZ    MAD_SCRN ; repeat for all 8 bytes
          LD      A,H ; Convert address from screen memory to attribute memory
          SUB     $08 ; First, sub 8 to get the original value
          RRA     
          RRA     
          RRA     ; H div 8
          AND     $03 ; lowest 2 bits
          OR      $58 ;$58, $59 or $5a - attribute memory address
          LD      H,A ; So now I have an attribute address in HL
          PUSH    HL ; save it
          CALL    LD_ONE_BYTE ; read one byte
          LD      A,L ; save it to accumulator
          POP     HL ; restore the address
          LD      (HL),A ; and put the attribute byte to a proper place
          RET     ; Finished, one square is done



LD_ONE_BYTE:      
          LD      B,$B2 
          LD      L,$01 
LD_8_BITS:        
          CALL    LD_EDGE_2 
          RET     NC 
          LD      A,$CB 
          CP      B 
          RL      L 
          LD      B,$B0 
          JR      NC,LD_8_BITS 
          SCF     
          RET     

                  ;-----------------------------------

LD_EDGE_2:        
          CALL    LD_EDGE_1 
          RET     NC 
LD_EDGE_1:        
          LD      A,$16 
LD_DELAY:         
          DEC     A 
          JR      NZ,LD_DELAY 
          AND     A 
LD_SAMPLE:        
          INC     B 
          RET     Z 
          LD      A,$7F 
          IN      A,($FE) 
          RRA     
          XOR     C 
          AND     $20 
          JR      Z,LD_SAMPLE 
          LD      A,C 
          INC     A 
          XOR     $20 
          AND     $27 
          LD      C,A 
          NOP     
          NOP     
          AND     $07 
          OR      $08 
          OUT     ($FE),A 
          SCF     
          RET     

Not so visual, but useful loader

I unexpectedly came across one more nice thing about loaders that I would like to share. It’s not very showy and its magic is hidden inside, but who knows, maybe someone will find it useful.

While elsewhere we often go from theory to practice, let’s do it the other way around this time and go straight to practice. Here’s a .tzx file, try to run it in the emulator…

(a necessary break to download the file, start the emulator, start loading… wait a bit… well… hmm… and what is this, is this all?)

Yeah, that’s all. It is just screens loading from tape. I told you, there’s not much of an effect. But try to look at the TAP file. Sure, it’s BASIC, loader, individual screens, yep, 2682 bytes, 2189, 3340, 4522 bytes, ah… they are compressed, but they are being loaded directly… the loader must be decompressing them on the fly!

Well there you go!

In the previous article I breezily claimed that DigiSynth was unpacking data during loading. Wow, really? Wasn’t I just dreaming that? I do have a leaky memory after all… So I downloaded DigiSynth, stared into the code for a while, and then I saw a familiar part: no, I wasn’t dreaming! Good. I could have let it go at that, but you know how this goes… On the subway, I started tinkering: after all, it cannot be that difficult to reconstruct the loader and make a packer for it… and Huffman compression is simple enough…

Huffman compression

…is really simple. At least once you know a bit about compression methods. If not, let me give you a quick 101:

The simplest compression methods are those that eliminate long runs of bytes of the same value (RLE). They are quick and easy, so they can be built into a loader or a copying program (which then fits more into memory, because during loading it compresses the data simply like this: “The following block contains sixty zeros!”).

Better compression methods exploit the fact that some sequences are often repeated. They therefore create a dictionary of repeating sequences, and replace those sequences with short codes. That is why they are called “dictionary methods”. They are based on ancient dictionary algorithms, namely LZ (Lempel-Ziv).

Mr. Huffman chose another approach: he proposed compression based not on sequences but on how frequently individual values occur. In short, it takes all values in the file (e.g. bytes, so 0 to 255) and counts how many times each value occurs in the input. Using this information it creates a code (a sequence of bits) for each value, with the property that the more frequent the value, the shorter its code. For example, those screens very often have zeroes in them. If zero really is that frequent, it gets encoded into some two bits, or even one in an extreme case. On the other hand, the less frequent values can easily occupy twelve or fifteen bits. But this loss is more than made up for by the frequent values.

If the input file contains many different values and all of them occur approximately equally often, Huffman compression becomes ineffective, but it gives good results for ordinary files. It is also often used to compress the values of an LZ compression dictionary (LZHUF). It is also used in the JPEG algorithm…

I will not go into the implementation details; it’s enough to know that the algorithm creates a binary tree that has exactly the property I described above.
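For the curious, the textbook construction fits into a few lines of Python. This is a generic sketch of Huffman's algorithm, not the code of my packer; ties between equal counts can be broken in different ways, so the exact bit patterns may differ from one implementation to another while the total length stays the same.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(data):
    """Return a dict mapping each symbol to its code as a string of 0s and 1s."""
    order = count()  # tie-breaker so the heap never has to compare nodes
    heap = [(n, next(order), sym) for sym, n in Counter(data).items()]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate input: one symbol only
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Merge the two least frequent nodes into one.
        n0, _, left = heapq.heappop(heap)
        n1, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n0 + n1, next(order), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):         # inner node: (zero branch, one branch)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf: an input symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes

codes = huffman_codes("ABRAKADABRA")
packed_bits = sum(len(codes[ch]) for ch in "ABRAKADABRA")
```

For this 11-character string the packed data takes 23 bits instead of 88, not counting the tree.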

The disadvantage is that the decoder needs this binary tree before it can decode anything, so the tree has to be stored with the data. Adaptive Huffman coding gets around this, but it is computationally demanding during decompression and therefore not very suitable for Spectrum loaders…

The decompression tree, as implemented here, is in principle a very simple structure. Each record has two entries, one for a zero bit, the other for a one bit. Each entry is either a reference to another record (if the code continues) or the resulting value.

Illustratively

Suppose we have a string ABRAKADABRA. You can see the tree here, and the resulting code is:

A: 0
B: 100
R: 11
K: 1011
D: 1010

The decompression tree will therefore look like this:

0: [*, A | ., 1]
1: [., 2 | *, R]
2: [*, B | ., 3]
3: [*, D | *, K]

Do not worry, I will explain in a jiffy. Decompression always starts at the first record. If the first bit is zero, the left part is taken (*, A); if the bit is a one, the right part is taken (., 1). The asterisk means “I already know this character, it’s this one!” The dot means “continue with that record”.

Suppose incoming bits will be 0, 1, 0, 0, 1, 1, 0, … What will happen?

  • Start with record number 0.
  • Bit=0: the left part tells us that we have found the character A. We have the first byte and start again with record #0
  • Bit=1: the right part says that we should continue with record #1
  • Bit=0: the left part (., 2) says that we should continue with record #2
  • Bit=0: the left part of record #2 says that we have found the letter B. We have the second byte and start again with record #0
  • Bit=1: the right part says that we should continue with record #1
  • Bit=1: the right part of record #1 says that we have found the character R. Again we start from record #0
  • Bit=0: the left part says that we have found the character A.
  • … And so on.
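The walk through the table above can be written down almost word for word. A Python sketch, with the table from the ABRAKADABRA example typed in by hand ("*" marks a finished character, "." a reference to another record):

```python
TREE = [
    (("*", "A"), (".", 1)),    # record 0
    ((".", 2), ("*", "R")),    # record 1
    (("*", "B"), (".", 3)),    # record 2
    (("*", "D"), ("*", "K")),  # record 3
]

def decode(bits, tree=TREE):
    """Follow the table bit by bit and collect the decoded characters."""
    out = []
    record = 0
    for bit in bits:
        kind, value = tree[record][bit]   # left part for 0, right part for 1
        if kind == "*":
            out.append(value)             # character found, start over
            record = 0
        else:
            record = value                # continue with the referenced record
    return "".join(out)

start = decode([0, 1, 0, 0, 1, 1, 0])
whole = decode([int(b) for b in "01001101011010100100110"])
```

The loader does the same thing, only the bits come from the tape one at a time instead of from a list.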

Implementation

I wrote the compression algorithm in JavaScript. You can use it yourself: it is a standard HTML page where you drag and drop the file you want to compress, and it returns a .tap file with the result, suitable for the loader. Warning: IE3 running on a Pentium MMX will probably not work. It doesn’t even work with the new IE. Use Chrome or Firefox, thanks.

The resulting file has the following format:

  1. A flag byte. The loader ignores it.
  2. Checksum: XOR of all the bytes of the input file.
  3. Length of the decompression table (number of records), 1 byte.
  4. Data for the decompression table. Each record is stored as 2×9 bits. The first bit of each nine is an attribute, the following eight bits are a value. The attribute determines whether it is a target value (1) or a reference (0). The first 9 bits are for a zero bit, the remaining 9 for a one bit.
  5. The length of the file in bytes: 2 bytes, low byte first.
  6. The compressed file as a bitstream.
  7. Padding bits up to a whole byte, so there are no problems when copying files.
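
To make the layout more tangible, here is a small Python sketch that packs such a block. It is not my JavaScript compressor, just an illustration under a few assumptions: the function name pack_container is made up, the flag value $FF and the zero padding bits are my choice, and bits go out most significant first because that is how the loader shifts them in.

```python
def pack_container(table, payload_bits, original, flag=0xFF):
    """Pack the block: flag, checksum, table, length, bitstream, padding."""
    bits = []

    def put(value, width):
        # Append `width` bits of `value`, most significant bit first.
        for shift in range(width - 1, -1, -1):
            bits.append((value >> shift) & 1)

    checksum = 0
    for byte in original:               # XOR of all bytes of the input file
        checksum ^= byte

    put(flag, 8)                        # 1. flag byte (ignored by the loader)
    put(checksum, 8)                    # 2. checksum
    put(len(table) & 0xFF, 8)           # 3. number of table records
    for record in table:                # 4. each record is 2 x 9 bits
        for is_value, value in record:
            put(1 if is_value else 0, 1)
            put(value, 8)
    put(len(original) & 0xFF, 8)        # 5. file length, low byte first
    put(len(original) >> 8, 8)
    bits.extend(payload_bits)           # 6. compressed bitstream
    bits.extend([0] * (-len(bits) % 8)) # 7. padding to a whole byte

    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# The ABRAKADABRA table, with "ABRA" as the file and its 7 code bits.
table = [
    ((True, ord("A")), (False, 1)),
    ((False, 2), (True, ord("R"))),
    ((True, ord("B")), (False, 3)),
    ((True, ord("D")), (True, ord("K"))),
]
block = pack_container(table, [0, 1, 0, 0, 1, 1, 0], b"ABRA")
print(block.hex())
```

Note that the table takes 18 bits per record, so everything after it is no longer byte-aligned; the whole block is really one continuous bitstream.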

At the beginning, the custom loader is a copy of the standard ROM loader, as described previously. To read a whole byte I use a slightly modified LD_8_BITS routine, which puts the result into register E instead of register L. The LD_EDGE routines are not part of my code; I call them from the ROM (I am not creating any special effect here, and in addition emulators cope better this way and can apply various accelerated-loading tricks).

The decompression table needs 1 kB of space at an address aligned to $400. I chose $FC00, but you can pick a different one. In memory, the value/reference attribute is stored as a whole byte (00/FF), which makes testing simpler: a single rotation copies either zero or one into the carry flag (CY).
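
The alignment is what makes the lookup so cheap: the loader keeps "address / 4" in HL and shifts the incoming bit in from the right. A minimal Python model of that arithmetic might look like this (the function name entry_address is mine; the constant matches HUF_TABLE in the listing below).

```python
HUF_TABLE = 0x3F  # table base address divided by $400, i.e. $FC00 / $400


def entry_address(record, bit):
    """Address of the attribute byte for `record` and the incoming `bit`."""
    hl = (HUF_TABLE << 8) | record   # H = HUF_TABLE, L = record number
    hl = (2 * hl + bit) & 0xFFFF     # ADC HL,HL with the bit in carry
    hl = (2 * hl) & 0xFFFF           # ADD HL,HL
    return hl                        # = $FC00 + 4 * record + 2 * bit


print(hex(entry_address(0, 0)))  # 0xfc00
print(hex(entry_address(3, 1)))  # 0xfc0e
```

Each record therefore occupies four bytes: attribute and value for a zero bit, then attribute and value for a one bit.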

The routine checks neither the flag byte nor the data length; all it needs is the address for storing the data in register IX.

The routine uses no special tricks, everything is as straightforward as I described above.

Oh, and if you want, you can use it for your own creations; it is released under the CC0 (Public Domain) licence. Direct your thanks to the author of the original routine from DigiSynth.

Of course, the compression could be improved, for example by removing repetitive sequences or by precompressing the data, which would give a better compression ratio. However, my goal wasn’t a beefy compressor, but to show you how you can incorporate an interesting piece of functionality into a loader.

PS: Manic Miner, the Huffman way

.ORG    $f000 ;61440
          .ENGINE zxs 
 
                  ;loader test
AGAIN:            
          LD      ix,$4000 
          CALL    LD_BYTES 
          JP      again 
 
                  ;---- more or less copies of ROM routines - ignoring flag byte and attributes
 
LD_BYTES:         
          DI      ;disable interrupts
          LD      A,$0F ;white BORDER.
          OUT     ($FE),A 
          LD      HL,$053F ;address of SA/LD_RET
          PUSH    HL ;onto the stack
          IN      A,($FE) ;read port $FE
          RRA     ;rotate the byte that was read
          AND     $20 ;and keep only the EAR bit
          OR      $02 ;red BORDER signal is also stored in
          LD      C,A ;register C. ($22 for OFF and $02 for ON state of the EAR input)
          CP      A ;zero flag is set to 1.
 
                  ;First task during loading is to determine
                  ;whether a pulse signal exists (therefore edges
                  ;on-off and off-on).
 
LD_BREAK:         
          RET     NZ ;return from BREAK.
LD_START:         
          CALL    LD_EDGE_1 ;if there is no edge within about 14,000 T,
          JR      NC,LD_BREAK ;return with CY=0;
                  ;otherwise BORDER is set to cyan.
 
                  ;next we wait and check for signal presence
 
          LD      HL,$0415 ;the wait period is nearly a second 
LD_WAIT:          
          DJNZ    LD_WAIT 
          DEC     HL 
          LD      A,H 
          OR      L 
          JR      NZ,LD_WAIT ;waiting loop
          CALL    LD_EDGE_2 ;continue when catching two subsequent
          JR      NC,LD_BREAK ;edges in current period
 
                  ;now just the loading signal will be accepted
 
LD_LEADER:        
          LD      B,$9C ;timing constant
          CALL    LD_EDGE_2 ;continue when catching two subsequent
          JR      NC,LD_BREAK ;edges in current period
          LD      A,$C6 ;these edges must be caught during
          CP      B ;3000 T.
          JR      NC,LD_START 
          INC     H ;number of pairs of edges is stored into H
          JR      NZ,LD_LEADER ;until there are 256 of them
 
                  ;the off and on parts of the sync pulse come after the pilot tone 
 
LD_SYNC:          
          LD      B,$C9 ;timing constant.
          CALL    LD_EDGE_1 ;every edge is checked
          JR      NC,LD_BREAK ;until two edges are found close to each other
          LD      A,B ;(starting sync pulse).
          CP      $D4 
          JR      NC,LD_SYNC 
          CALL    LD_EDGE_1 ;at the end there must be a final edge of the on part 
          RET     NC ;of the sync pulse
 
                  ;now header or program bytes can be loaded
                  ;during operations LOAD, VERIFY.
                  ;the first byte defines a type 
 
          LD      A,C ;BORDER to green / magenta
          XOR     $06 
          LD      C,A 
 
                  ;---------
                  ; this is where actual loading begins
                  ; first the decompression tree gets created
                  ; it is stored from address FC00 (must be divisible by $400)
                  ; constant HUF_TABLE is this address / $400
 
HUF_TABLE EQU     $3F ; $3F * $400 => $FC00
 
          LD      hl,HUF_TABLE * $400 
          CALL    LD_byte ;flag byte -> E
          RET     nc 
                  ;throw away flag
          CALL    LD_byte ;checksum -> E
          RET     nc 
                  ;store checksum into A'
          LD      a,e 
          EX      af,af' 
          CALL    LD_byte ; number of 4-byte records in the decompression table
          RET     nc 
          LD      d,e 
                  ; D = number of table records still to read
HUF_DECODE:       
          LD      B,$B2 
          CALL    LD_EDGE_2 ;locate length of pulses of each bit
          RET     NC ;return if wrong (longer) pulse length (then CY=0)
          LD      A,$CB ;compare length to about 2400 T,
          CP      B ;when for a zero bit is CY=0 and for a one bit is CY=1.
                  ; first bit of the record is the attribute (value / reference)
          SBC     a,a ; CY=0 -> 00, CY=1 -> FF
          LD      (hl),a ; store into memory attribute as 00 or FF
          INC     hl ; after it the first value / reference
          CALL    ld_byte ; load a byte
          LD      (hl),e ; and store
          INC     hl ; both will repeat for a one bit
          LD      B,$AF 
          CALL    LD_EDGE_2 ;locate length of pulses of individual bits
          RET     NC ;return if wrong (longer) pulse length (then CY=0)
          LD      A,$CB ;compare length to about 2400 T,
          CP      B ;when for a zero bit is CY=0 and for a one bit is CY=1.
                  ; one bit of attribute
          SBC     a,a 
          LD      (hl),a 
          INC     hl 
          CALL    ld_byte 
          LD      (hl),e 
          INC     hl ; one 4-byte record done
          DEC     d ; already a completed tree?
          JR      nz,huf_decode ; not yet, keep reading
 
                  ; table is ready, we can load data now
                  ; first length into registers DE
          CALL    ld_byte 
          LD      h,e 
          CALL    ld_byte 
          LD      d,e 
          LD      e,h 
 
          LD      A,C ;BORDER to blue and yellow
          XOR     $05 
          LD      C,A 
 
                  ; main loading loop
HUF_LOAD:         
          LD      b,$b2 
          LD      l,0 
HUF_BIT:          
          CALL    LD_EDGE_2 
          RET     nc 
          LD      a,b 
          CP      $cc ; slightly modified constant
          LD      h,HUF_TABLE ; H is an upper byte of address / 4
                  ; L lower (reference to a record)
          CCF     
          ADC     hl,hl 
          ADD     hl,hl 
 
                  ; HL = table address * 4 + 2 * CY
                  ; therefore for CY=0 it is 4*HL, for CY=1 it is 4*HL+2
 
          RRC     (hl) ; attribute value/reference into CY
          INC     hl 
          LD      l,(hl) ; in L is now value or reference
          LD      b,$b1 ; set up a timing constant
          JR      nc,huf_bit ; if it was a reference, continue with next bit
          LD      (ix+0),l ; if not, in L is a value to store
          INC     ix ; address++
          DEC     de ; counter--
          EX      af,af' 
          XOR     l ; checksum in A'
          EX      af,af' 
          LD      a,d ; all bytes read?
          OR      e 
 
          JR      nz,huf_load ; not yet!
 
          EX      af,af' 
          CP      01 ; if A' not zero, then CY=0 - therefore an error
 
          RET     
 
                  ; byte load into E
 
LD_BYTE:          
          LD      B,$B2 ;timing constant.
LD_MARKER:        
          LD      E,$01 ;storing of a marker bit 
 
                  ;this loop assembles the loaded byte in register E
 
LD_8_BITS:        
          CALL    LD_EDGE_2 ;locate length of pulses of each bit
          RET     NC ;return if wrong (longer) pulse length (then CY=0)
          LD      A,$CB ;compare length to about 2400 T,
          CP      B ;when for a zero bit is CY=0 and for a one bit is CY=1.
          RL      E ;store the new bit in register E.
          LD      B,$B0 ;timing constant for next bit
          JR      NC,LD_8_BITS ;it was not the last 8th bit
                  ;jump back into the loop.
                  ; 
          RET     ;return with CY=1
 
LD_EDGE_2 EQU     $05e3 
LD_EDGE_1 EQU     $05e7 

Each tick counts

In this part, we’ll need to count and weigh each and every processor tick. This time, our closest ally will be the waiting loop in the LD_EDGE_1 routine. Let’s recall that routine:

LD_EDGE_2:
          CALL    LD_EDGE_1 ;calling, in fact, LD_EDGE_1 once more
          RET     NC ;return if error
LD_EDGE_1:
          LD      A,$16 ;7T
LD_DELAY:
          DEC     A ;4T
          JR      NZ,LD_DELAY ;12T / 7T
          AND     A ;4T

LD_SAMPLE:

Do you see LD_DELAY there? Yes? That’s it right there! It takes 7T + 21*(4T+12T) + (4T+7T) = 354T, which is quite a lot of ticks for our purposes. Which purposes, you ask? Well, most likely a counter showing how much data remains to be loaded.
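
If you want to play with different loop counts, the same sum can be written as a tiny Python helper. It is just a sketch of the arithmetic above; the function name delay_loop_tstates is made up, and the T-state figures are the ones from the listing.

```python
def delay_loop_tstates(count):
    """T-states of LD A,count followed by the DEC A / JR NZ loop."""
    LD_A_N = 7          # LD A,n
    DEC_A = 4           # DEC A
    JR_TAKEN = 12       # JR NZ when the jump is taken
    JR_NOT_TAKEN = 7    # JR NZ on the last pass
    looping = (count - 1) * (DEC_A + JR_TAKEN)
    last_pass = DEC_A + JR_NOT_TAKEN
    return LD_A_N + looping + last_pass


print(delay_loop_tstates(0x16))  # 354
```

The helper assumes a count of at least 1; a count of 0 would make the real loop run 256 times.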

Graphical indicator

In my text game Poradce (The Consultant) I used, besides some scary flashing, a kind of “thermometer” that showed how much remained to be loaded. It was that little something on the left side of the screen…

This effect is simpler than you might expect. The length of the column is almost exactly “file length / 512” (I say “almost exactly” because I painted it a little shorter…). So I take the upper byte of the number of remaining bytes (which is in the register pair DE), divide it by two, add an offset from the bottom of the screen, convert the coordinate to a screen address and mask the byte there with %11100111. Everything stays as it was except for the two pixels in the middle, which, coincidentally, are exactly our “thermometer”, and those get cleared. For the conversion I even use the PIXEL-ADD routine ($22AA) from the Spectrum’s ROM, which takes the coordinates in B and C (the same coordinate system PLOT uses) and returns the screen address (HL) and the position of the pixel within that byte (A).
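
Here is a rough Python model of one step of the thermometer. It is a sketch under assumptions: pixel_address is my own re-creation of what PIXEL-ADD computes (PLOT coordinates, with y = 0 on screen line 175), thermometer_step is a made-up name, and the screen is just a 64 kB bytearray.

```python
def pixel_address(x, y):
    """Screen byte address for PLOT-style coordinates (y = 0..175, origin bottom-left)."""
    line = 175 - y                      # PLOT counts from the bottom
    return (0x4000
            | ((line & 0xC0) << 5)      # which third of the screen
            | ((line & 0x07) << 8)      # pixel row inside the character cell
            | ((line & 0x38) << 2)      # character row inside the third
            | (x >> 3))                 # byte column


def thermometer_step(memory, remaining):
    """Clear the two middle pixels of the column byte for `remaining` bytes."""
    y = ((remaining >> 8) >> 1) + 2     # upper byte / 2, plus offset from the bottom
    addr = pixel_address(0, y)
    memory[addr] &= 0xE7                # %11100111 keeps everything but bits 3 and 4
    return addr


memory = bytearray([0xFF]) * 0x10000
print(hex(thermometer_step(memory, 0x1B00)))  # 0x5080
```

Because the column was drawn in advance, clearing one line per 512 loaded bytes makes it shrink as the remaining count drops.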

LD_EDGE_2:        
          CALL    LD_EDGE_1 
          RET     NC 
LD_EDGE_1:        
          JP      LD_OWN 
LD_BACK:          
          LD      A,03 ;7T
LD_DELAY:         
          DEC     A ;4T
          JR      NZ,LD_DELAY ;12T/7T
                  ;7T + 2*(4+12) + (4+7) = 50T
          AND     A 
LD_SAMPLE:        
          INC     B 
          RET     Z 
          LD      A,$7f 
          IN      A,($fe) 
          RRA     
          RET     NC 
          XOR     C 
          AND     $20 
          JR      Z,LD_SAMPLE 
          LD      A,C 
          CPL     
          LD      C,A 
          LD      A,$06 
          RRCA    
          AND     $07 
          OR      $08 
          OUT     ($fe),A 
          SCF     
          RET     

LD_OWN:           
          PUSH    DE ;11T
          PUSH    HL ;11T
          PUSH    BC ;11T
          PUSH    AF ;11T
          LD      B,D ;4T
          SRL     B ;8T
          NOP     ;4T
          INC     B ;4T
          INC     B ;4T
          LD      C,$00 ;7T
          CALL    $22aa ;17T + 132T routine (PIXEL-ADD)
          LD      A,(HL) ;7T
          AND     $e7 ;7T
          LD      (HL),A ;7T
          POP     AF ;10T
          POP     BC ;10T
          POP     HL ;10T
          POP     DE ;10T
          JP      LD_BACK ;10T

As you can see, I interfered with the timing loop. At its beginning I jump out into my own routine, which takes care of the effect (305T including the diversion), and after returning I still have time to spare, so I wait another 50T. That adds up to 355T, which is pretty much exactly our magic number!
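
When you build your own effect, it pays to keep the budget in a little table and let the computer add it up. A minimal Python sketch, using the T-state figures from the comments in LD_OWN above (the variable names are mine):

```python
# (instruction, T-states) for the detour, as annotated in the listing.
LD_OWN = [
    ("PUSH DE", 11), ("PUSH HL", 11), ("PUSH BC", 11), ("PUSH AF", 11),
    ("LD B,D", 4), ("SRL B", 8), ("NOP", 4), ("INC B", 4), ("INC B", 4),
    ("LD C,00", 7), ("CALL 22aa", 17), ("PIXEL-ADD body", 132),
    ("LD A,(HL)", 7), ("AND e7", 7), ("LD (HL),A", 7),
    ("POP AF", 10), ("POP BC", 10), ("POP HL", 10), ("POP DE", 10),
    ("JP LD_BACK", 10),
]

JP_LD_OWN = 10          # the jump from LD_EDGE_1 into the detour
SHORT_DELAY = 50        # the shortened LD_DELAY loop

detour = JP_LD_OWN + sum(t for _, t in LD_OWN)
total = detour + SHORT_DELAY

print(detour, total)    # 305 355
```

The single extra tick against the original 354T is small compared to the tolerance of the loader's timing constants.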

Numeric indicator

If you would rather show a number than a dwindling column, things get more difficult. Each digit has to be drawn on the screen, which means eight writes per digit. For a three-digit counter that takes quite a lot of time and won’t fit into 354T. So you need to chop the algorithm into parts that do fit within this limit and call them in sequence. The alternate register set and the index register IY will be invaluable tools for this.

Games from Hewson and Czech programs from Universum (I think) had some very nice counters; they used their own font and even showed a flipping-digits effect. In another text game I used just a simple counter of “number of bytes / 64”, which still fits into three digits for a 48 kB block. At the beginning I prepared the desired digits (in decimal!) in registers D and E (or rather in their counterparts in the alternate register set):

          PUSH    DE
          EXX
          POP     HL
          LD      DE,$0000 ;BCD counter starts at 000
          LD      BC,$40
LD_DIV:
          AND     A
          SBC     HL,BC ;subtract 64 until we go below zero
          JR      C,LD_DIV2
          LD      A,E
          ADD     A,$01
          DAA
          LD      E,A
          LD      A,D
          ADC     A,$00
          DAA
          LD      D,A
          JR      LD_DIV
LD_DIV2:
          LD      H,$3d
          LD      BC,$4001
          EXX
          LD      IY, LD_CHARS ; coming up next...
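
The same preparation step, modelled in Python for those who do not think in DAA. This is a sketch only: bcd_add and blocks_as_bcd are names I made up, and the BCD adjustment is written out by hand instead of emulating the flags.

```python
def bcd_add(byte, addend):
    """Add 0 or 1 to a packed-BCD byte; returns (result, carry)."""
    low = (byte & 0x0F) + addend
    high = byte >> 4
    if low > 9:             # low digit overflowed, carry into the high digit
        low -= 10
        high += 1
    carry = 0
    if high > 9:            # high digit overflowed, carry out of the byte
        high -= 10
        carry = 1
    return (high << 4) | low, carry


def blocks_as_bcd(length, block=0x40):
    """Count how many whole blocks fit into `length`, as BCD in (D, E)."""
    d = e = 0
    hl = length
    while True:
        hl -= block         # SBC HL,BC
        if hl < 0:          # JR C: nothing left to subtract
            break
        e, carry = bcd_add(e, 1)    # ADD A,1 / DAA on E
        d, _ = bcd_add(d, carry)    # ADC A,0 / DAA on D
    return d, e


d, e = blocks_as_bcd(48 * 1024)
print(f"{d:02x}{e:02x}")  # 0768
```

Repeated subtraction looks wasteful, but it runs only once before loading starts, so nobody is counting ticks yet.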

This effect did not take place in LD_EDGE itself but in LD_8_BITS. However, LD_EDGE was modified as well, so the loader uses significantly different timing values:

          LD      B,$b0
          LD      L,$01
LD_8_BITS:
          CALL    LD_EDGE_2M
          RET     NC
          LD      A,$d4
          CP      B
          RL      L
          LD      B,$b0
          JP      NC,LD_LONGWAY
          LD      A,H
          XOR     L
          LD      H,A
;... the rest of it is normal...

;LD_EDGE_2M : 468 + 58 * B
;shortened by 462T
LD_EDGE_2M:
          CALL    LD_EDGE_1M
          RET     NC
          LD      A,10
          DEC     A
          JR      NZ,$-1 ;  254T
LD_EDGE_1M:
          AND     A
          INC     B
          RET     Z
          LD      A,$7f
          IN      A,($fe)
          RRA
          XOR     C
          AND     $20
          JR      Z,LD_EDGE_1M ;58T per pass of the sampling loop
          LD      A,C
          CPL
          LD      C,A
          AND     04
          OR      08
          OUT     (fe),A
          SCF
          RET

The effect itself then played out as follows:

LD_LONGWAY:
          EXX
          DEC     C
          JP      Z,LD_SUB1
          JP      (IY)

LD_RETHERE:
          POP     IY
          EXX
          JP      LD_8_BITS

LD_SUB1:
          LD      C,07
          DEC     B
          JP      Z,LD_SUB2
          LD      A,04
          DEC     A
          JR      NZ,$-1 ;wait 59T
          EXX
          JP      LD_8_BITS

LD_SUB2:
          LD      A,E
          SUB     01
          DAA
          LD      E,A
          LD      A,D
          SBC     A,B ;in B there is 0
          DAA
          LD      D,A
          LD      B,$40
          LD      IY,LD_CHARS
          LD      HL,$3d00
          EXX
          JP      LD_8_BITS

                  ;1st digit
LD_CHARS:
          LD      A,E
          ADD     A,A
          ADD     A,A
          ADD     A,A
          OR      $81
          LD      L,A
          LD      A,(HL)
          LD      (50fd),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (51fd),A
          INC     L
          LD      A,(HL)
          LD      (52fd),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (53fd),A
          INC     L
          LD      A,(HL)
          LD      (54fd),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (55fd),A
          INC     L
          LD      A,(HL)
          LD      (56fd),A
          CALL    LD_RETHERE

                  ;2nd digit
          LD      A,E
          AND     $f0
          RRA
          OR      $81
          LD      L,A
          LD      A,(HL)
          LD      (50fc),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (51fc),A
          INC     L
          LD      A,(HL)
          LD      (52fc),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (53fc),A
          INC     L
          LD      A,(HL)
          LD      (54fc),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (55fc),A
          INC     L
          LD      A,(HL)
          LD      (56fc),A
          CALL    LD_RETHERE

                  ;3rd digit
          LD      A,D
          ADD     A,A
          ADD     A,A
          ADD     A,A
          OR      $81
          LD      L,A
          LD      A,(HL)
          LD      (50fb),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (51fb),A
          INC     L
          LD      A,(HL)
          LD      (52fb),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (53fb),A
          INC     L
          LD      A,(HL)
          LD      (54fb),A
          CALL    LD_RETHERE
          INC     L
          LD      A,(HL)
          LD      (55fb),A
          INC     L
          LD      A,(HL)
          LD      (56fb),A
          LD      (56fb),A
          NOP
          LD      IY,LD_CHARS
          EXX
          JP      LD_8_BITS

This, of course, deserves a few explanatory notes:

The register pair DE contains the counter itself, stored in BCD. Register C holds a bit counter and register B a byte counter. Once both count down to zero, the counter in DE is decreased by 1, the result is adjusted with the DAA instruction and everything is prepared for the actual drawing of the digits. HL holds $3D00, which conveniently is the address in the ROM where the character set begins (the digit glyphs start at $3D80), and IY holds the address of the routine LD_CHARS, which displays the digits.

When you chop a routine into pieces, you are left with two options. You either patch the code and rewrite the address of the next jump, or you store the address somewhere. Here it lives in the IY register. With the simple trick of CALL LD_RETHERE (as in RETurn HERE) you go back to LD_8_BITS while the return address ends up in IY, so that on the next pass JP (IY) continues with the next piece. Note how the drawing of the characters is chopped into small pieces by the calls to LD_RETHERE.
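
If you know generators from modern languages, this is the same idea done by hand. The Python analogy below is only that, an analogy: ld_chars is a made-up name, the font is a fake list, and each yield plays the role of CALL LD_RETHERE, while next() stands in for JP (IY).

```python
def ld_chars(font, digit, screen, base):
    """Draw glyph rows 1..7 of one digit in four short slices."""
    src = digit * 8 + 1                     # skip the blank top row of the glyph
    for rows in ((0,), (1, 2), (3, 4), (5, 6)):
        for r in rows:
            # each screen pixel row is $100 bytes further on
            screen[base + 0x100 * r] = font[src + r]
        yield                               # "CALL LD_RETHERE": back to the loader


font = list(range(80))                      # fake font: 10 digits x 8 rows
screen = {}
painter = ld_chars(font, 3, screen, 0x50FD)

for bit_number in range(6):                 # pretend we are loading six bits
    next(painter, None)                     # "JP (IY)": run the next slice, if any
    print(bit_number, len(screen))
```

Every slice is short enough to fit between two edges on the tape, and the loader never needs to know how far the drawing has got.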

Time indicator and more

Calculating the remaining time is more complicated than counting bytes. You either have to weigh every bit differently (a one bit takes twice as long as a zero bit), or use an interrupt (yes, an interrupt in a loader!).
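
The first approach is easy to sketch on a PC. The Python below is an illustration under assumptions: the pulse lengths (855T per half-period of a zero bit, 1710T for a one bit) and the 3.5 MHz clock are the standard ROM figures, which this part of the text does not repeat, and the function names are made up.

```python
T_ZERO = 2 * 855        # a zero bit: two pulses of 855 T (standard ROM timing)
T_ONE = 2 * 1710        # a one bit: two pulses of 1710 T
CLOCK_HZ = 3_500_000    # assumed 48K Spectrum CPU clock


def block_tstates(data):
    """T-states needed to play the data bits of `data` from tape."""
    total = 0
    for byte in data:
        ones = bin(byte).count("1")
        total += ones * T_ONE + (8 - ones) * T_ZERO
    return total


def block_seconds(data):
    return block_tstates(data) / CLOCK_HZ


print(block_tstates(b"\x00"), block_tstates(b"\xff"))  # 13680 27360
```

A loader, of course, cannot look ahead like this; it can only add up the weight of each bit as it arrives, which is why a countdown in seconds needs the total known in advance.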

I was looking for an example of the first approach, but before I could comment on one, Busy called me and said that he had found the source code of his Overscan loader, asking whether I wanted to publish it. So here we go, courtesy of Busy: the Overscan loader!

(Time calculation and graphical effects are included!)

And here is the source code with Busy’s comments (Thanks!) Because it is a bit long, I shall say my goodbyes right now and wish you good luck in your own experiments.

5b00            *a
5b00            *s
5b00            ;===============================================================;
5b00            ;== Version 16 == Loader for Overscan == 12.08.1991 Busy soft ==;
5b00            ;===============================================================;
5b00            znaky  =    #8800	modified charset
5b00                   org  #8200,0
8200 f3         p      di
8201 310082            ld   sp,p
8204 fd217f00          ld   iy,#7f	timers init for time and auto kitt
8208 210080            ld   hl,#8000	IM2 vector init
820b 110180            ld   de,#8001
820e 011001            ld   bc,#0110
8211 3681              ld   (hl),#81
8213 edb0              ldir
8215 212183            ld   hl,rut	relocation of operation routine IM2
8218 118181            ld   de,#8181
821b 010800            ld   bc,load-rut
821e edb0              ldir
8220 067f       i      ld   b,#7f
8222 216711            ld   hl,#1167
8225 11d685            ld   de,k
8228 d5                push de
8229 7e         ll14   ld   a,(hl)	random memory inserts
822a ad                xor  l		(to fool an enemy)
822b 12                ld   (de),a
822c 02                ld   (bc),a
822d a9                xor  c
822e 0b                dec  bc
822f 02                ld   (bc),a
8230 0b                dec  bc
8231 13                inc  de
8232 2b                dec  hl
8233 7c                ld   a,h
8234 b5                or   l
8235 20f2              jr   nz,ll14
8237 e1                pop  hl
8238 012a7a            ld   bc,-k
823b edb0              ldir
823d 3e80       rst    ld   a,#80	IM2 launch
823f ed47              ld   i,a
8241 ed5e              im2
8243 fb                ei
8244 cd0c85            call zn		charset conversion
8247 3e48              ld   a,#48
8249 32b784            ld   (n20+2),a
824c 210040            ld   hl,#4000	screen init
824f 110140            ld   de,#4001
8252 010018            ld   bc,#1800
8255 71                ld   (hl),c
8256 edb0              ldir
8258 010603            ld   bc,#0306
825b 71                ld   (hl),c
825c edb0              ldir
825e 21a341     o      ld   hl,#41a3	squares under kitt effect
8261 0e06              ld   c,#06
8263 061a       ll11   ld   b,#1a
8265 e5                push hl
8266 367e       ll10   ld   (hl),#7e
8268 2c                inc  l
8269 10fb              djnz ll10
826b e1                pop  hl
826c 24                inc  h
826d 0d                dec  c
826e 20f3              jr   nz,ll11
8270 210059            ld   hl,#5900	red frame in the middle
8273 011208            ld   bc,#0812
8276 5d                ld   e,l
8277 73         ll12   ld   (hl),e
8278 2c                inc  l
8279 71                ld   (hl),c
827a 7d                ld   a,l
827b c61d              add  a,#1d
827d 6f                ld   l,a
827e 71                ld   (hl),c
827f 2c                inc  l
8280 73                ld   (hl),e
8281 2c                inc  l
8282 10f3              djnz ll12
8284 21e15a            ld   hl,#5ae1
8287 1e44              ld   e,#44
8289 cd0083            call sap
828c 21e158            ld   hl,#58e1
828f cdfe82            call pas
8292 21015a            ld   hl,#5a01
8295 cdfe82            call pas
8298 215f85            ld   hl,firma	texts printout
829b cd0783            call text
829e 217685            ld   hl,demo
82a1 cd0783            call text
82a4 219185            ld   hl,loatim
82a7 cd0783            call text
82aa cd2983            call load		calling the loader
82ad 08                ex   af,af
82ae af                xor  a
82af cdf884            call out		Border 0
82b2 08                ex   af,af
82b3 3839              jr   c,ok		if we loaded a block correctly, then jump
82b5 af         error  xor  a		if error during loading
82b6 210048            ld   hl,#4800	then show message
82b9 77         ll13   ld   (hl),a
82ba 23                inc  hl
82bb cb64              bit  4,h
82bd 28fa              jr   z,ll13
82bf 112548            ld   de,#4825
82c2 212515            ld   hl,#1525
82c5 cd0b83            call txt
82c8 21d285            ld   hl,krik
82cb cd0b83            call txt
82ce 21a585            ld   hl,rew
82d1 cd0783            call text
82d4 21b785            ld   hl,reload
82d7 cd0783            call text
82da af                xor  a
82db 32b784            ld   (n20+2),a
82de 06ff       press  ld   b,#ff	waiting for keypress
82e0 cdcf83            call mmm		we call loader during waiting
82e3 af                xor  a		so our effects still run
82e4 dbfe              in   a,(#fe)
82e6 f6e0              or   #e0
82e8 3c                inc  a
82e9 28f3              jr   z,press
82eb c33d82            jp   rst
82ee
82ee 210058     ok     ld   hl,#5800	if we loaded a block correctly
82f1 110158            ld   de,#5801	erase all attributes
82f4 0603              ld   b,#03
82f6 edb0              ldir
82f8 cde6c3            call 50150       demo decompression
82fb c376e9            jp   59766	demo launch
82fe
82fe 1e12       pas    ld   e,#12	some help subroutines
8300 061e       sap    ld   b,#1e	for frame drawing
8302 73         pp1    ld   (hl),e
8303 2c                inc  l
8304 10fc              djnz pp1
8306 c9                ret
8307
8307 5e         text   ld   e,(hl)	text printout
8308 23                inc  hl		HL = text address
8309 56                ld   d,(hl)	DE = text position on screen
830a 23                inc  hl
830b e5         txt    push hl
830c d5                push de
830d 6e                ld   l,(hl)
830e 2688              ld   h,>znaky
8310 0608              ld   b,#08
8312 7e         xtx    ld   a,(hl)	print of one character
8313 12                ld   (de),a
8314 14                inc  d
8315 24                inc  h
8316 10fa              djnz xtx
8318 d1                pop  de
8319 e1                pop  hl
831a 1c                inc  e
831b cb7e              bit  7,(hl)	text ends with char with set bit7=1
831d 23                inc  hl
831e 28eb              jr   z,txt
8320 c9                ret
8321
8321 08         rut    ex   af,af	routine from interrupt
8322 fd24              inc  yh		YH = timer for kitt effect
8324 fd2c              inc  yl		YL = timer for counting and time printing
8326 08                ex   af,af
8327 fb                ei
8328 c9                ret
8329
8329 af         load   xor  a		THIS IS WHERE LOADER BEGINS!!
832a cdf884            call out
832d 211f40     rohy   ld   hl,#401f	drawing of cut off corners
8330 11ff57            ld   de,#57ff
8333 7b                ld   a,e
8334 77                ld   (hl),a
8335 12                ld   (de),a
8336 d9                exx
8337 010207            ld   bc,#0702
833a 210040            ld   hl,#4000
833d 11e057            ld   de,#57e0
8340 77                ld   (hl),a
8341 12                ld   (de),a
8342 7e         rr1    ld   a,(hl)
8343 24                inc  h
8344 15                dec  d
8345 cb27              sla  a
8347 77                ld   (hl),a
8348 12                ld   (de),a
8349 d9                exx
834a 7e                ld   a,(hl)
834b 24                inc  h
834c 15                dec  d
834d cb3f              srl  a
834f 77                ld   (hl),a
8350 12                ld   (de),a
8351 d9                exx
8352 10ee              djnz rr1
8354            nnn
8354 3ef8       rrr    ld   a,#f8	catching the leader tone
8356 32f484            ld   (xor+1),a
8359 2600              ld   h,#00
835b 06ff       djnz   ld   b,#ff
835d cdcf83            call mmm
8360 25                dec  h
8361 20f8              jr   nz,djnz
8363 cdcf83            call mmm
8366 30ec              jr   nc,nnn
8368 cdcb83            call ppp
836b 30e7              jr   nc,nnn
836d 069c       sss    ld   b,#9c
836f cdcb83            call ppp
8372 30e0              jr   nc,nnn
8374 3ec6              ld   a,#c6
8376 b8                cp   b
8377 30db              jr   nc,rrr
8379 24                inc  h
837a 20f1              jr   nz,sss
837c 3efc              ld   a,#fc
837e 32f484            ld   (xor+1),a
8381 06c9       ttt    ld   b,#c9
8383 cdcf83            call mmm
8386 30cc              jr   nc,nnn
8388 78                ld   a,b
8389 fed4              cp   #d4
838b 30f4              jr   nc,ttt
838d cdcf83            call mmm
8390 d0                ret  nc
8391 79                ld   a,c		leader and sync-pulse OK, we can start loading
8392 ee03              xor  #03
8394 4f                ld   c,a
8395 cdb583            call byte         flagbyte Load
8398 d0                ret  nc
8399
8399 dd21e6c3   loa    ld   ix,50150	demo bytes loading loop
839d 116a24            ld   de,9322
83a0            ;      ld   ix,#4000	during testing of effects in loader
83a0            ;      ld   de,#1b00	the block was loaded onto the screen
83a0 cdb583     loa1   call byte
83a3 d0                ret  nc
83a4 dd7500            ld   (ix+#00),l
83a7 dd23              inc  ix
83a9 1b                dec  de
83aa 7a                ld   a,d
83ab b3                or   e
83ac 20f2              jr   nz,loa1
83ae cdb583            call byte          load parity
83b1 d0                ret  nc
83b2 fe01              cp   #01
83b4 c9                ret
83b5
83b5 06b2       byte   ld   b,#b2	Load of one byte
83b7 2e01              ld   l,#01
83b9 cdcb83     loa9   call ppp
83bc d0                ret  nc
83bd 3ecb       kk     ld   a,#cb
83bf b8                cp   b
83c0 cb15              rl   l
83c2 06b0              ld   b,#b0
83c4 30f3              jr   nc,loa9
83c6 7c                ld   a,h
83c7 ad                xor  l
83c8 67                ld   h,a
83c9 37                scf
83ca c9                ret
83cb
83cb cdcf83     ppp    call mmm		load of one bit (two half-periods)
83ce d0                ret  nc
83cf d9         mmm    exx		    load of one half-period
83d0 c3                db   #c3		my special subroutines run here instead of the waiting loop
83d1 ec83       skok   dw   n1		jump to part of code to execute next
83d3
83d3 7e         n1sub  ld   a,(hl)              /0
83d4 81                add  a,c		time value recalculation
83d5 fe0a              cp   #0a
83d7 0e00              ld   c,#00
83d9 3802              jr   c,n1aa
83db 0c                inc  c
83dc af                xor  a
83dd 77         n1aa   ld   (hl),a
83de 23                inc  hl                  \50
83df 7e                ld   a,(hl)
83e0 81                add  a,c
83e1 fe06              cp   #06
83e3 0e00              ld   c,#00
83e5 3802              jr   c,n1bb
83e7 0c                inc  c
83e8 af                xor  a
83e9 77         n1bb   ld   (hl),a
83ea 23                inc  hl                  \100
83eb c9                ret                      \110+call=117
83ec
83ec fd7d       n1     ld   a,yl
83ee fe33              cp   51                  \29
83f0 3866              jr   c,n10	Test if a second has passed
83f2 0e01              ld   c,#01	so that we could draw another time value
83f4 fd7d              ld   a,yl
83f6 d632              sub  50
83f8 fd6f              ld   yl,a
83fa 215a85            ld   hl,cas              \76
83fd cdd383            call n1sub
8400 3e04              ld   a,#04
8402 110784            ld   de,n3
8405 181a              jr   jpde                \222 +/-
8407
8407 23         n3     inc  hl		every second we also change
8408 cdd383            call n1sub	an effect in the middle third
840b ed5f              ld   a,r
840d 32b984            ld   (udaj+1),a
8410 3a5b85            ld   a,(cas+1)
8413 0f                rrca
8414 9f                sbc  a,a
8415 e60f              and  #0f
8417 f607              or   #07
8419 32cc84            ld   (rot),a
841c 3e05              ld   a,#05
841e 112a84            ld   de,n2
8421 ed53d183   jpde   ld   (skok),de
8425 1ef9              ld   e,#f9
8427 c3e084            jp   wait                \222 +/-
842a
842a 2b         n2     dec  hl
842b cb7e              bit  7,(hl)              \32
842d 2808              jr   z,n2aa	prepare to print one character
842f 3e0e              ld   a,14
8431 01ec83            ld   bc,n1
8434 c3dc84            jp   buduci
8437 e5         n2aa   push hl
8438 6e                ld   l,(hl)
8439 2688              ld   h,>znaky
843b 1650              ld   d,#50
843d 0602              ld   b,#02               /2*88=202
843f 7e         dd1    ld   a,(hl)	print one character
8440 12                ld   (de),a
8441 24                inc  h
8442 14                inc  d
8443 7e                ld   a,(hl)
8444 12                ld   (de),a
8445 24                inc  h
8446 14                inc  d
8447 7e                ld   a,(hl)
8448 12                ld   (de),a
8449 24                inc  h
844a 14                inc  d
844b 7e                ld   a,(hl)
844c 12                ld   (de),a
844d 24                inc  h
844e 14                inc  d
844f 10ee              djnz dd1
8451 1c                inc  e
8452 e1                pop  hl
8453 3e01              ld   a,#01               \299
8455 c3e084            jp   wait
8458
8458 fd7c       n10    ld   a,yh		KITT (Knight Rider car) effect
845a fe02              cp   #02                 \56
845c 3857              jr   c,n20	test whether we can move to the next phase
845e c3                db   #c3
845f 9584       jump   dw   n12                 \73
8461
8461 21a358     n11    ld   hl,#58a3
8464 3647              ld   (hl),#47
8466 2c         incdec inc  l
8467 7d                ld   a,l
8468 326284            ld   (n11+1),a
846b 216684            ld   hl,incdec           \124
846e febc              cp   #bc
8470 3f                ccf
8471 3802              jr   c,iidd
8473 fea4              cp   #a4
8475 9f         iidd   sbc  a,a                 \151
8476 e601              and  #01
8478 ae                xor  (hl)
8479 77                ld   (hl),a
847a 218484            ld   hl,jr+1
847d 3e03              ld   a,#03
847f ae                xor  (hl)
8480 77                ld   (hl),a
8481 3ea2              ld   a,#a2               \210
8483 1803       jr     jr   #03
8485 329684            ld   (n12+1),a
8488 fd2600            ld   yh,#00
848b 3e02              ld   a,#02
848d 219584            ld   hl,n12
8490 225f84     buduce ld   (jump),hl
8493 184b              jr   wait                \267
8495
8495 216458     n12    ld   hl,#5864            \73
8498 7d                ld   a,l
8499 febc              cp   #bc                 \94
849b 3011              jr   nc,n12bb
849d 014104            ld   bc,#0441            \111
84a0 7e         n12aa  ld   a,(hl)              /47
84a1 b9                cp   c
84a2 3801              jr   c,#01
84a4 3d                dec  a
84a5 77                ld   (hl),a
84a6 2c                inc  l
84a7 10f7              djnz n12aa               \4*47=188
84a9 229684            ld   (n12+1),hl
84ac 1835              jr   end
84ae
84ae 3e08       n12bb  ld   a,#08
84b0 216184            ld   hl,n11
84b3 18db              jr   buduce              /\174
84b5
84b5 210048     n20    ld   hl,#4800	filling the middle third of the screen
84b8 3e33       udaj   ld   a,#33               \82
84ba 772c772c          dw   #2c77,#2c77	8*[LD (HL),A : INC L]
84be 772c772c          dw   #2c77,#2c77
84c2 772c772c          dw   #2c77,#2c77         8*11=88
84c6 772c772c          dw   #2c77,#2c77         \
84ca 2002              jr   nz,n21              /
84cc 0f         rot    rrca
84cd 24                inc  h
84ce cba4       n21    res  4,h
84d0 cbdc              set  3,h
84d2 22b684            ld   (n20+1),hl
84d5 32b984            ld   (udaj+1),a
84d8 3e03              ld   a,3
84da 1804              jr   wait                \68
84dc
84dc ed43d183   buduci ld   (skok),bc	set which part of the code to call next
84e0 3d         wait   dec  a
84e1 20fd              jr   nz,wait
84e3 d9         end    exx		every code part takes roughly 200 cycles (give or take)
84e4 a7         zzz    and  a		end of my special subroutines
84e5 04         l222   inc  b
84e6 c8                ret  z
84e7 3eff              ld   a,#ff	#FF instead of #7F, so the loader ignores SPACE
84e9 dbfe              in   a,(#fe)
84eb 1f                rra
84ec d0                ret  nc
84ed a9                xor  c
84ee e620              and  #20
84f0 28f3              jr   z,l222
84f2 79                ld   a,c
84f3 eefc       xor    xor  #fc		#FC instead of CPL makes prettier colours:
84f5 4f                ld   c,a		red-yellow pilot tone and blue-cyan bytes
84f6 e607              and  #07
84f8 320058     out    ld   (#5800),a           +52
84fb 321f58            ld   (#581f),a	set the corner attributes
84fe f608              or   #08
8500 d3fe              out  (#fe),a
8502 e607              and  #07
8504 32e05a            ld   (#5ae0),a
8507 32ff5a            ld   (#5aff),a
850a 37                scf
850b c9                ret
850c
850c 110088     zn     ld   de,znaky	charset conversion:
850f d5         ll1    push de		turns the ROM font into outlines;
8510 7b                ld   a,e		the font also gets reorganized so that
8511 fe20              cp   ' '		the print routines can run faster
8513 3002              jr   nc,#02
8515 c630              add  a,'0'
8517 87                add  a,a
8518 6f                ld   l,a
8519 260f              ld   h,#0f
851b 29                add  hl,hl
851c 29                add  hl,hl
851d cd4985            call read
8520 4f                ld   c,a
8521 23                inc  hl
8522 cd5085            call write
8525 0606              ld   b,#06
8527 cd4985     ll2    call read
852a 4f                ld   c,a
852b 2b                dec  hl
852c cd4985            call read
852f b1                or   c
8530 4f                ld   c,a
8531 23                inc  hl
8532 23                inc  hl
8533 cd5085            call write
8536 10ef              djnz ll2
8538 cd4985            call read
853b 4f                ld   c,a
853c 2b                dec  hl
853d cd4985            call read
8540 23                inc  hl
8541 cd5485            call rit
8544 d1                pop  de
8545 1c                inc  e
8546 20c7              jr   nz,ll1
8548 c9                ret
8549
8549 7e         read   ld   a,(hl)
854a 0f                rrca
854b 0f                rrca
854c b6                or   (hl)
854d 07                rlca
854e b6                or   (hl)
854f c9                ret
8550
8550 cd4985     write  call read
8553 2b                dec  hl
8554 b1         rit    or   c
8555 ae                xor  (hl)
8556 12                ld   (de),a
8557 23                inc  hl
8558 14                inc  d
8559 c9                ret
855a 00000a00   cas    db   0,0,10,0,0	Buffer for time value
855f 4550       firma  dw   #5045
8561 42757379          db   'Busy software '
856f 70726573          db   'presen','t'+#80
8576 8350       demo   dw   #5083
8578 3e3e3e20          db   '>>> THE OVER'
8584 5343414e          db   'SCAN DEMO <<'
8590 bc                db   '<'+#80
8591 e250       loatim dw   #50e2
8593 4c6f6164          db   'Loading'
859a 20636361          db   ' cca 1 min'
85a4 ae                db   '.'+#80
85a5 8848       rew    dw   #4888
85a7 52657769          db   'Rewing the tape'
85b6 ac                db   ','+#80
85b7 c448       reload dw   #48c4
85b9 70726573          db   'press any key '
85c7 616e6420          db   'and reload'
85d1 ae                db   '.'+#80
85d2 202121a1   krik   db   ' !!','!'+#80
85d6            k
85d6            l      =    k-p		label "l" will be total code length
85d6
85d6                   org  #6000,0	the following routines are not used;
6000 f3         mrs    di		they were debug routines
6001 ed56              im1
6003 af                xor  a
6004 ed47              ld   i,a
6006 c3b3f4            jp   #f4b3	jump into MRS
6009
6009 cd0c85     zs     call zn		test of the charset conversion:
600c 210088            ld   hl,znaky	prints all characters
600f 110040            ld   de,#4000	on the screen
6012 010008            ld   bc,#0800
6015 edb0              ldir
6017 c9                ret
6018
6018 3e0c       poke   ld   a,12		puts a test IM 2 vector into ROM
601a d317              out  (23),a	(the OUTs enable writing into ROM on MB01)
601c 218181            ld   hl,#8181
601f 22ff3a            ld   (#3aff),hl
6022 3e04              ld   a,4
6024 d317              out  (23),a
6026 c9                ret
6027                   end
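
The charset conversion at `zn` is dense, so here is a rough model of what it appears to do, written in Python purely for readability (the loader itself is Z80 code). The function names `smear`, `outline`, `rom_address` and `table_address` are made up for this sketch and do not exist in the listing; the logic is my reading of the `read`, `write`/`rit` and `zn` routines above.

```python
def smear(row):
    """Model of `read`: OR a font row with itself rotated one pixel
    right and one pixel left (RRCA/RLCA rotate, so bits wrap around)."""
    ror1 = ((row >> 1) | (row << 7)) & 0xFF
    rol1 = ((row << 1) | (row >> 7)) & 0xFF
    return row | ror1 | rol1


def outline(glyph):
    """Model of the `zn` loop body for one 8-row glyph.

    Each output row is the smeared row ORed with its smeared neighbours
    above and below (where they exist), XORed with the original row.
    The result is the contour around the original pixels."""
    smeared = [smear(r) for r in glyph]
    result = []
    for i, original in enumerate(glyph):
        grown = smeared[i]
        if i > 0:
            grown |= smeared[i - 1]
        if i < len(glyph) - 1:
            grown |= smeared[i + 1]
        result.append(grown ^ original)
    return result


def rom_address(code):
    """Where `zn` reads the source glyph from.

    Codes below 32 are shifted by '0' (so 0..9 show digits, 10 shows ':'),
    then ADD A,A / LD H,#0F / two ADD HL,HL give #3C00 + 4 * (2*code mod 256)."""
    if code < 0x20:
        code += ord('0')
    low = (2 * code) & 0xFF
    return 0x3C00 + 4 * low


def table_address(code, row, base=0x8800):
    """Where row `row` of character `code` lands in the converted font.

    Rows of one character are 256 bytes apart (INC D), matching the
    INC H / INC D stepping of the print loop `dd1`."""
    return base + 256 * row + code
```

For a glyph with a single pixel, `outline([0, 0, 0, 0x10, 0, 0, 0, 0])` gives a ring of eight pixels around an empty centre. Storing the rows 256 bytes apart is what lets the print loop copy a character with nothing but `inc h` and `inc d`, since screen pixel rows within a character cell are also 256 bytes apart.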

(Translated by Ondřej Ficek)

Martin Maly

Programmer, journalist, writer and electronic hobbyist. Vintage CPU lover. Creating new computers with the spirit of 80's.
Czechia