Getting a foothold into 6502 machine language, part 4

Yeah, I know, I know. I promised a completed function to test if a number is prime. However, I want to write an article a day, and there’s only so much one can code in a day. This time I used code by other programmers, because, you know, I’m a novice. I did, however, try to understand the code, so I could rewrite where it was needed.

[ part 1 | part 2 | part 3 | part 5 | bonus ]

As established in part 3, we need two functions, one to square a number and one to take the modulo between two numbers. These will be the subjects of today’s article. I will analyze the original code, rewrite it, and test it with a Basic program, containing a machine code loader in data statements.

Here we go!

Binary division and the binary square

Suppose we wanted a 16-bit unsigned number to be divided by a non-zero 8-bit number to obtain its quotient and remainder. We are particularly interested in the remainder, since it’s the result of the modulo operation dividend mod divisor, and we need it for checking divisibility (true if the remainder of a division is zero, otherwise false).

On the website LLX.com about the Apple II, I found an article called Multiplying and Dividing on the 6502, written by Neil Parker. In the section “Dividing Arbitrary Numbers” it gets interesting for our purposes (modulo).

We will shift the bits of the dividend, one at a time, into a work area, and then try subtracting the divisor from the work area. If the subtraction succeeded, we replace the work area with the result of the subtraction and record a 1 bit in the quotient. If the subtraction failed, we discard its result and record a 0 bit in the quotient. The process is repeated until all bits of the dividend are used up.

Here are the steps:

clear the remainder
set the bit counter to 16
shift the dividend to the left into the remainder
trial subtract the divisor from the remainder
was there a borrow (negative result), then step 8
store the result in remainder
increment the dividend (quotient)
decrement bit counter
if bit counter not zero, then step 3

There are three 16-bit registers, remainder, dividend and divisor. After exiting the routine, the dividend has become the quotient.


            LDA #0      ;Initialize REM to 0
            STA REM
            STA REM+1
            LDX #16     ;There are 16 bits in NUM1
    L1      ASL NUM1    ;Shift hi bit of NUM1 into REM
            ROL NUM1+1  ;(vacating the lo bit, which will be used for the quotient)
            ROL REM
            ROL REM+1
            LDA REM
            SEC         ;Trial subtraction
            SBC NUM2
            TAY
            LDA REM+1
            SBC NUM2+1
            BCC L2      ;Did subtraction succeed?
            STA REM+1   ;If yes, save it
            STY REM
            INC NUM1    ;and record a 1 in the quotient
    L2      DEX
            BNE L1

Since we aren’t interested in the quotient, just the remainder, we can eliminate the INC NUM1 the listing above. The divisor is smaller than 256, so an 8-bit version will do. We still need to go through 16 bits, though. We can set NUM2+1 in Neil’s listing above to zero, by replacing SBC NUM2+1 with SBC #$00.

Here’s the new routine, I call “modulo”. For testing it in basic with a SYS command, I’ll use the freely available zero page addresses $FA through $FE to store the values. We start by POKEing the dividend in locations 250 and 251 (lo/hi bytes) and the divisor in location 252. The result will be in locations 253 and 254 (lo/hi bytes).


; modulo.asm

; Calculate the modulo of an 8-bit number with a 16-bit number.
; Destroys all registers.

        num1 = $fa      ; 16-bit dividend
        num2 = $fc      ; 8-bit divisor
        rem = $fd       ; 16-bit remainder

        *=$c000         ; can be called from Basic with SYS 49152

modulo:
        lda #$00        ; clear remainder
        sta rem
        sta rem+1
        ldx #16         ; do 16 bits
modloop:
        asl num1        ; shift num1 to the left
        rol num1+1
        rol rem         ; into remainder
        rol rem+1
        sec
        lda rem         ; do trial subtraction
        sbc num2        ; with divisor
        tay
        lda rem+1
        sbc #$00        ; divisor is 8-bit value
        bcc modskip     ; was there a borrow, then next bit
        sta rem+1       ; save the result
        sty rem
modskip:
        dex             ; repeat for 16 bits
        bne modloop
        rts

Here’s my test program, called “testmod 0.1”. It contains the converted hexdump from the Virtual 6502 Assembler after pasting in the above source listing “modulo.asm”.


10 rem testmod 0.1
20 forn=49152to49187:readb:poken,b:next:rem read in machine code
30 data 169,000,133,253,133,254,162,016
31 data 006,250,038,251,038,253,038,254
32 data 165,253,056,229,252,168,165,254
33 data 233,000,144,004,133,254,132,253
34 data 202,208,229,096
40 input "dividend";dd
50 input "divisor";dv
60 poke 251, int(dd/256):poke 250,dd-256*peek(251): rem poke dividend to memory
70 poke 252,dv:rem poke divisor to memory
80 sys 49152
90 rm=peek(253)+256*peek(254)
100 print "remainder is";rm
110 print "press any key to repeat"
120 poke 198,0:wait 198,1:poke 198,0:goto 40

Here is some input and output I created in the C64 Basic interpreter of the V.I.C.E. C64 emulator on my Raspberry Pi-400, running Raspberry Pi OS.


run
dividend? 61
divisor? 3
remainder is 1
press any key to repeat
dividend?  65535
divisor? 16
remainder is 15
press any key to repeat
dividend?  17
divisor? 4
remainder is 1
press any key to repeat

break in 120
ready.

It seems to work, nice job, me 😊

The same article on the site LLX.com mentions how to do multiplication, but what we need is a square function (a number multiplied by itself). So I found another resource on 6502.org, called Square Calculator and was written by by Lee Davison. It deals with signed integers, but we only need unsigned 8-bit integers, so there’s probably some rewriting to do, simplifying the code.

Here’s the full listing.


; Calculates the 16 bit unsigned integer square of the signed 16 bit integer in
; Numberl/Numberh.  The result is always in the range 0 to 65025 and is held in
; Squarel/Squareh
;
; The maximum input range is only +/-255 and no checking is done to ensure that
; this is so.
;
; This routine is useful if you are trying to draw circles as for any circle
;
; x^2+y^2=r^2 where x and y are the co-ordinates of any point on the circle and
; r is the circle radius
;
; Destroys all registers
    
    *= 8000                     ; these must be in RAM
    
    Numberl                     ; number to square low byte
    Numberh = Numberl+1         ; number to square high byte
        .word $FFFF
    
    Squarel                     ; square low byte
    Squareh = Squarel+1         ; square high byte
        .word $FFFF
    
    Tempsq:                     ; temp byte for intermediate result
        .byte $00
    
    *= 8192                     ; any address will do
    
    Square:
            LDA #$00            ; clear A
            STA Squarel         ; clear square low byte
                                ; (no need to clear the high byte, it gets
                                ; shifted out)
            LDA Numberl         ; get number low byte
            LDX Numberh         ; get number high  byte
            BPL NoNneg          ; if +ve don't negate it
                                ; else do a two's complement
            EOR #$FF            ; invert
            SEC                 ; +1
            ADC #$00            ; and add it
    
    NoNneg:
            STA Tempsq          ; save ABS(number)
            LDX #$08            ; set bit count
    
    Nextr2bit:
            ASL Squarel         ; low byte *2
            ROL Squareh         ; high byte *2+carry from low
            ASL A               ; shift number byte
            BCC NoSqadd         ; don't do add if C = 0
            TAY                 ; save A
            CLC                 ; clear carry for add
            LDA Tempsq          ; get number
            ADC Squarel         ; add number^2 low byte
            STA Squarel         ; save number^2 low byte
            LDA #$00            ; clear A
            ADC Squareh         ; add number^2 high byte
            STA Squareh         ; save number^2 high byte
            TYA                 ; get A back
    
    NoSqadd:
            DEX                 ; decrement bit count
            BNE Nextr2bit       ; go do next bit
            RTS

I like the coding practices of this author. The source code is easy to read and to the point.

After modifying Lee’s excellent source code listing, I came up with this listing, called “square.asm”. It squares an 8-bit unsigned integer into a 16-bit unsigned integer. Since I wanted to test it in Basic, I used the zero page addresses $FA through $FD, that are unused by the OS.


; square.asm
;
; Calculates the 16-bit unsigned square of an 8-bit unsigned integer.
; Destroys all registers.

        number = $fa            ; 8-bit number to be squared

        squarel = $fb           ; 16-bit squared number
        squareh = squarel+1

        tempsq = $fd            ; temporary storage of original number

        *=$c000                 ; can be called from Basic with SYS 49152

        square:
                lda #$00        ; clear A
                sta squarel     ; clear square low byte
                                ; (no need to clear the high byte,
                                ; it gets shifted out)
                lda number      ; get number in .A
                sta tempsq      ; save original for later use
                ldx #$08        ; 8 bits to process
        sqloop:
                asl squarel     ; shift square to left
                rol squareh     ; (i.e. multiply by 2)
                asl a           ; get next highest bit of number
                bcc sqnoadd     ; don't do add if carry is zero
                tay             ; save .A for later use
                clc
                lda tempsq      ; add original number value
                adc squarel     ; to square
                sta squarel
                lda #$00        ; add a possible carry
                adc squareh     ; to the high byte of square
                sta squareh
                tya             ; get .A back
        sqnoadd:
                dex             ; decrement bit counter
                bne sqloop      ; go do next bit
                rts

Again, I assembled the assembly listing into a hexdump, pasting it into the source code field of the Virtual 6502 Assembler. I converted the hexdump into data statements for the following Basic program, called “tsquare 0.1”.


0 rem tsquare 0.1
10 forn=49152to49187:readb:poken,b:next:rem read in machine code
20 data 169,000,133,251,165,250,133,253
21 data 162,008,006,251,038,252,010,144
22 data 015,168,024,165,253,101,251,133
23 data 251,169,000,101,252,133,252,152
24 data 202,208,231,096
30 input "number";n
40 poke 250,n
50 sys 49152
60 s = peek(251)+256*peek(252)
70 print "square is";s
80 print "press any key to repeat"
90 poke 198,0:wait 198,1:poke 198,0:goto 30

Here’s some values I tried:


run
number? 2
square is 4
press any key to repeat
number? 5
square is 25
press any key to repeat
number? 49
square is 2401
press any key to repeat
number? 255
square is 65025
press any key to repeat
number? 0
square is 0
press any key to repeat

break in 90
ready.

I love when code works at first attempt. Of course, I had lots of help from people from the past. Still, it’s satisfying if things work as they should.