Getting a foothold into 6502 machine language, part 4
Yeah, I know, I know. I promised a completed function to test if a number is prime. However, I want to write an article a day, and there’s only so much one can code in a day. This time I used code by other programmers, because, you know, I’m a novice. I did, however, try to understand the code, so I could rewrite where it was needed.
[ part 1 | part 2 | part 3 | part 5 | bonus ]
As established in part 3, we need two functions, one to square a number and one to take the modulo between two numbers. These will be the subjects of today’s article. I will analyze the original code, rewrite it, and test it with a Basic program, containing a machine code loader in data statements.
Here we go!
Binary division and the binary square
Suppose we wanted a 16-bit unsigned number to be divided by a non-zero 8-bit number to obtain its quotient and remainder. We are particularly interested in the remainder, since it’s the result of the modulo operation dividend mod divisor
, and we need it for checking divisibility (true if the remainder of a division is zero, otherwise false).
On the website LLX.com about the Apple II, I found an article called Multiplying and Dividing on the 6502, written by Neil Parker. In the section “Dividing Arbitrary Numbers” it gets interesting for our purposes (modulo).
We will shift the bits of the dividend, one at a time, into a work area, and then try subtracting the divisor from the work area. If the subtraction succeeded, we replace the work area with the result of the subtraction and record a 1 bit in the quotient. If the subtraction failed, we discard its result and record a 0 bit in the quotient. The process is repeated until all bits of the dividend are used up.
Here are the steps:
- clear the remainder
- set the bit counter to 16
- shift the dividend to the left into the remainder
- trial subtract the divisor from the remainder
- was there a borrow (negative result), then step 8
- store the result in remainder
- increment the dividend (quotient)
- decrement bit counter
- if bit counter not zero, then step 3
There are three 16-bit registers, remainder, dividend and divisor. After exiting the routine, the dividend has become the quotient.
LDA #0 ;Initialize REM to 0
STA REM
STA REM+1
LDX #16 ;There are 16 bits in NUM1
L1 ASL NUM1 ;Shift hi bit of NUM1 into REM
ROL NUM1+1 ;(vacating the lo bit, which will be used for the quotient)
ROL REM
ROL REM+1
LDA REM
SEC ;Trial subtraction
SBC NUM2
TAY
LDA REM+1
SBC NUM2+1
BCC L2 ;Did subtraction succeed?
STA REM+1 ;If yes, save it
STY REM
INC NUM1 ;and record a 1 in the quotient
L2 DEX
BNE L1
Since we aren’t interested in the quotient, just the remainder, we can eliminate the INC NUM1
the listing above. The divisor is smaller than 256, so an 8-bit version will do. We still need to go through 16 bits, though. We can set NUM2+1
in Neil’s listing above to zero, by replacing SBC NUM2+1
with SBC #$00
.
Here’s the new routine, I call “modulo”. For testing it in basic with a SYS
command, I’ll use the freely available zero page addresses $FA through $FE to store the values. We start by POKEing the dividend in locations 250 and 251 (lo/hi bytes) and the divisor in location 252. The result will be in locations 253 and 254 (lo/hi bytes).
; modulo.asm
; Calculate the modulo of an 8-bit number with a 16-bit number.
; Destroys all registers.
num1 = $fa ; 16-bit dividend
num2 = $fc ; 8-bit divisor
rem = $fd ; 16-bit remainder
*=$c000 ; can be called from Basic with SYS 49152
modulo:
lda #$00 ; clear remainder
sta rem
sta rem+1
ldx #16 ; do 16 bits
modloop:
asl num1 ; shift num1 to the left
rol num1+1
rol rem ; into remainder
rol rem+1
sec
lda rem ; do trial subtraction
sbc num2 ; with divisor
tay
lda rem+1
sbc #$00 ; divisor is 8-bit value
bcc modskip ; was there a borrow, then next bit
sta rem+1 ; save the result
sty rem
modskip:
dex ; repeat for 16 bits
bne modloop
rts
Here’s my test program, called “testmod 0.1”. It contains the converted hexdump from the Virtual 6502 Assembler after pasting in the above source listing “modulo.asm”.
10 rem testmod 0.1
20 forn=49152to49187:readb:poken,b:next:rem read in machine code
30 data 169,000,133,253,133,254,162,016
31 data 006,250,038,251,038,253,038,254
32 data 165,253,056,229,252,168,165,254
33 data 233,000,144,004,133,254,132,253
34 data 202,208,229,096
40 input "dividend";dd
50 input "divisor";dv
60 poke 251, int(dd/256):poke 250,dd-256*peek(251): rem poke dividend to memory
70 poke 252,dv:rem poke divisor to memory
80 sys 49152
90 rm=peek(253)+256*peek(254)
100 print "remainder is";rm
110 print "press any key to repeat"
120 poke 198,0:wait 198,1:poke 198,0:goto 40
Here is some input and output I created in the C64 Basic interpreter of the V.I.C.E. C64 emulator on my Raspberry Pi-400, running Raspberry Pi OS.
run
dividend? 61
divisor? 3
remainder is 1
press any key to repeat
dividend? 65535
divisor? 16
remainder is 15
press any key to repeat
dividend? 17
divisor? 4
remainder is 1
press any key to repeat
break in 120
ready.
It seems to work, nice job, me 😊
The same article on the site LLX.com mentions how to do multiplication, but what we need is a square function (a number multiplied by itself). So I found another resource on 6502.org, called Square Calculator and was written by by Lee Davison. It deals with signed integers, but we only need unsigned 8-bit integers, so there’s probably some rewriting to do, simplifying the code.
Here’s the full listing.
; Calculates the 16 bit unsigned integer square of the signed 16 bit integer in
; Numberl/Numberh. The result is always in the range 0 to 65025 and is held in
; Squarel/Squareh
;
; The maximum input range is only +/-255 and no checking is done to ensure that
; this is so.
;
; This routine is useful if you are trying to draw circles as for any circle
;
; x^2+y^2=r^2 where x and y are the co-ordinates of any point on the circle and
; r is the circle radius
;
; Destroys all registers
*= 8000 ; these must be in RAM
Numberl ; number to square low byte
Numberh = Numberl+1 ; number to square high byte
.word $FFFF
Squarel ; square low byte
Squareh = Squarel+1 ; square high byte
.word $FFFF
Tempsq: ; temp byte for intermediate result
.byte $00
*= 8192 ; any address will do
Square:
LDA #$00 ; clear A
STA Squarel ; clear square low byte
; (no need to clear the high byte, it gets
; shifted out)
LDA Numberl ; get number low byte
LDX Numberh ; get number high byte
BPL NoNneg ; if +ve don't negate it
; else do a two's complement
EOR #$FF ; invert
SEC ; +1
ADC #$00 ; and add it
NoNneg:
STA Tempsq ; save ABS(number)
LDX #$08 ; set bit count
Nextr2bit:
ASL Squarel ; low byte *2
ROL Squareh ; high byte *2+carry from low
ASL A ; shift number byte
BCC NoSqadd ; don't do add if C = 0
TAY ; save A
CLC ; clear carry for add
LDA Tempsq ; get number
ADC Squarel ; add number^2 low byte
STA Squarel ; save number^2 low byte
LDA #$00 ; clear A
ADC Squareh ; add number^2 high byte
STA Squareh ; save number^2 high byte
TYA ; get A back
NoSqadd:
DEX ; decrement bit count
BNE Nextr2bit ; go do next bit
RTS
I like the coding practices of this author. The source code is easy to read and to the point.
After modifying Lee’s excellent source code listing, I came up with this listing, called “square.asm”. It squares an 8-bit unsigned integer into a 16-bit unsigned integer. Since I wanted to test it in Basic, I used the zero page addresses $FA through $FD, that are unused by the OS.
; square.asm
;
; Calculates the 16-bit unsigned square of an 8-bit unsigned integer.
; Destroys all registers.
number = $fa ; 8-bit number to be squared
squarel = $fb ; 16-bit squared number
squareh = squarel+1
tempsq = $fd ; temporary storage of original number
*=$c000 ; can be called from Basic with SYS 49152
square:
lda #$00 ; clear A
sta squarel ; clear square low byte
; (no need to clear the high byte,
; it gets shifted out)
lda number ; get number in .A
sta tempsq ; save original for later use
ldx #$08 ; 8 bits to process
sqloop:
asl squarel ; shift square to left
rol squareh ; (i.e. multiply by 2)
asl a ; get next highest bit of number
bcc sqnoadd ; don't do add if carry is zero
tay ; save .A for later use
clc
lda tempsq ; add original number value
adc squarel ; to square
sta squarel
lda #$00 ; add a possible carry
adc squareh ; to the high byte of square
sta squareh
tya ; get .A back
sqnoadd:
dex ; decrement bit counter
bne sqloop ; go do next bit
rts
Again, I assembled the assembly listing into a hexdump, pasting it into the source code field of the Virtual 6502 Assembler. I converted the hexdump into data statements for the following Basic program, called “tsquare 0.1”.
0 rem tsquare 0.1
10 forn=49152to49187:readb:poken,b:next:rem read in machine code
20 data 169,000,133,251,165,250,133,253
21 data 162,008,006,251,038,252,010,144
22 data 015,168,024,165,253,101,251,133
23 data 251,169,000,101,252,133,252,152
24 data 202,208,231,096
30 input "number";n
40 poke 250,n
50 sys 49152
60 s = peek(251)+256*peek(252)
70 print "square is";s
80 print "press any key to repeat"
90 poke 198,0:wait 198,1:poke 198,0:goto 30
Here’s some values I tried:
run
number? 2
square is 4
press any key to repeat
number? 5
square is 25
press any key to repeat
number? 49
square is 2401
press any key to repeat
number? 255
square is 65025
press any key to repeat
number? 0
square is 0
press any key to repeat
break in 90
ready.
I love when code works at first attempt. Of course, I had lots of help from people from the past. Still, it’s satisfying if things work as they should.