While reading the graphics chapter of the Commodore 64 Programmer’s Reference Guide, I wondered how I could POKE screen codes to the screen, so to speak, in 6502 assembly. Here is what I came up with.

First of all, what do I mean by “POKE” and “screen codes”?

POKE is a CBM Basic 2.0 command, also available in many Microsoft Basic variants on home computers of the 1980s and 1990s. It lets the programmer place a value anywhere in the 64 KB of address space available to the Central Processing Unit (CPU, here the 6510, a variant of the 6502). Since the screen in the default configuration is located at memory addresses 1024 to 2023, one can put a value A into column C of row R as follows:


POKE 1024 + 40*R + C, A

where:

  • 0 <= R <= 24
  • 0 <= C <= 39
  • 0 <= A <= 255
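
For a position that is already known when the program is assembled, the machine-code counterpart of a single POKE is just a load and a store. A minimal sketch, assuming DASM-style syntax and the default screen at $0400; the names SCREEN, ROW and COL are made up for this illustration:

```asm
SCREEN          equ $0400               ; default screen base (1024), assumed
ROW             equ 12                  ; example row, 0..24
COL             equ 20                  ; example column, 0..39

                lda #1                  ; screen code 1 = letter 'A'
                sta SCREEN+40*ROW+COL   ; same address as POKE 1024+40*R+C,A
```

Here the assembler works out the address. That only covers a fixed position; the Basic command has to do the same arithmetic at run time, for any row and column.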

POKE is all well and good, but it seems rather slow, even if the command were invoked from 6502 machine code. I’ll come to that later.

Screen codes are Commodore-specific values that are used internally to represent characters. There is a direct correlation between a screen code and its position in the character ROM: within a character set, the 8-byte pattern for screen code N sits at offset 8 * N. Screen codes differ from the codes used for printing, which are collectively called PETSCII, Commodore’s own version of ASCII. For instance, the letter ‘A’ is 1 as a screen code and 65 in PETSCII.
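
To make the difference concrete, here is a small sketch of a conversion routine. It only handles PETSCII $20 to $5F (space, digits, punctuation and the unshifted letters); the other ranges need further cases that are left out here. The label names are made up, and DASM-style syntax is assumed:

```asm
; convert PETSCII to screen code, range $20..$5F only
; in:  .A PETSCII code
; out: .A screen code
; modifies .A, flags

pet2scr:        cmp #$40        ; below '@' ?
                bcc pet2done    ; yes: $20..$3F are the same in both sets
                sbc #$40        ; carry is set here, so this subtracts exactly $40
pet2done:       rts
```

With .A = 65 ($41, the letter ‘A’) the routine returns 1, matching the example above.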

Why do I even want to POKE screen codes to the screen, if the same can be done by printing PETSCII? Well, I want to put blocks of screen codes on the screen, using a self-defined character set, for a video game. For that, the Basic commands and even the routines in the underlying operating system (the Kernal) are just too slow. I need custom routines to put screen codes on the screen.

So I started to explore what is needed.

It seems the video chip, the VIC II, can only “see” a quarter of the 64 KB address space at a time, i.e. 16 KB, in one of four banks:

  • bank 0, $0000 - $3FFF (default)
  • bank 1, $4000 - $7FFF
  • bank 2, $8000 - $BFFF
  • bank 3, $C000 - $FFFF

This is controlled through bits 0 and 1 of data register A (DRA, at $DD00) of the second Complex Interface Adapter chip (CIA 2). There are four possible values:

  • %11 VIC II bank 0 (default)
  • %10 VIC II bank 1
  • %01 VIC II bank 2
  • %00 VIC II bank 3

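So bits 0 and 1 have to be inverted to get the binary value of the VIC II bank. A note on the data direction register for data register A (DDRA, at $DD02): a 0 there makes the matching bit of DRA an input and a 1 makes it an output, but reading DRA returns the state of the port pins either way, so the bank bits can be read without changing DDRA. Clearing bits 0 and 1 of DDRA would turn those pins into inputs, which are pulled high and read as %11 (bank 0), so that is only harmless while the default bank is in use. Whatever is done to bits 0 and 1, all the other bits of DDRA have to be left alone.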

Within the 16 KB available to the VIC II at any one time, there are 16 possible relative locations for screen memory (25 lines of 40 characters, or 1000 bytes), one every $0400 bytes:

  • $0000 - $03E7
  • $0400 - $07E7 (default)
  • $0800 - $0BE7
  • $0C00 - $0FE7
  • $1000 - $13E7
  • $1400 - $17E7
  • $1800 - $1BE7
  • $1C00 - $1FE7
  • $2000 - $23E7
  • $2400 - $27E7
  • $2800 - $2BE7
  • $2C00 - $2FE7
  • $3000 - $33E7
  • $3400 - $37E7
  • $3800 - $3BE7
  • $3C00 - $3FE7

The relative base address of screen memory is controlled by the upper 4 bits of register 24 ($D018) of the VIC II. Only these bits should be manipulated to select the screen location; the lower 4 bits, which among other things select character memory, should not be changed. Of course, I am only reading the bits, so I’m safe in that regard.

The Kernal keeps (half) a pointer to the beginning of screen memory: the high byte of the full address, stored in location 648 ($288). The low byte is always zero. I can either use that, or calculate it myself.

  • $288 (648): high-byte of pointer to the base of screen memory

Of course, if I relocated the screen, I would have to update this pointer in my code, so there is value in knowing how to calculate the high byte, even though the calculation isn’t needed as long as the screen stays where it is. In that case I could just as well use the value in $288.
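
Using the Kernal’s value directly could look something like the sketch below. It assumes DASM-style syntax and that the zero-page pair $FB/$FC is free for user programs; the names hibase and scrptr are made up for this illustration:

```asm
hibase          equ $0288       ; Kernal: hi byte of screen base
scrptr          equ $fb         ; 2-byte pointer in zero page, assumed free

                lda #0
                sta scrptr      ; lo byte: screen memory starts on a page boundary
                lda hibase
                sta scrptr+1    ; hi byte as kept by the Kernal
                lda #1          ; screen code for 'A'
                ldy #0          ; offset 0 = top left corner
                sta (scrptr),y  ; store through the pointer
```

This only stays correct while $288 really matches where the VIC II looks for the screen.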

I calculated the high-byte of this pointer as follows:


; some addresses defined

vicmemptr       eqm $d018       ; VIC II memory pointer register
cia2rega        eqm $dd00       ; CIA 2 register A
cia2ddra        eqm $dd02       ; CIA 2 data direction register A

; determine location of screen memory
; returns .A hi byte of base address of screen memory
; modifies .A, flags

screenbas:

                ; determine the VIC II base address

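                ; DDRA ($dd02) is left untouched: reading reg A returns the
                ; pin state for input and output bits alike, while clearing
                ; DDRA bits 0, 1 would turn the pins into inputs that read
                ; as %11 and so force the VIC II back to bank 0
                lda cia2rega    ; read reg A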
                and #%00000011  ; only bits 0, 1
                eor #%00000011  ; invert bits 0, 1 -> bank number 0..3
                lsr             ; optimization -> rotate into bits 6, 7
                ror             ; through the carry, which takes
                ror             ; three shifts; in effect, multiply by 2^6
                sta screenb1+1  ; self-mod code

                ; determine location of screen memory inside the 16 KB VIC II bank

                lda vicmemptr   ; get
                and #%11110000  ; upper 4 bits
                lsr
                lsr             ; divide by 2^2 to get the hi byte

                ; add base address of VIC II to relative base address of screen memory

                clc
screenb1:       adc #$ff        ; self-modded
                rts

I hope you don’t mind my self-modifying code trick; it works as long as the code runs from RAM. In ROM one would do it differently, naturally.
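
A ROM-safe variant might keep the intermediate value in a scratch byte instead of patching an instruction. The sketch below is one way to do that, not the only one. It assumes DASM-style syntax and that zero-page location $FB is free; the names vicreg24, cia2pra, scratch and screenrom are made up so they do not clash with the listing above. It reads the port without touching DDRA and uses three shifts to bring the bank number into bits 6 and 7:

```asm
vicreg24        equ $d018       ; VIC II memory pointer register
cia2pra         equ $dd00       ; CIA 2 data register A
scratch         equ $fb         ; zero-page scratch byte, assumed free

; returns .A hi byte of base address of screen memory
; modifies .A, flags, scratch

screenrom:      lda cia2pra     ; read the port, DDRA is not touched
                and #%00000011  ; only bits 0, 1
                eor #%00000011  ; invert -> bank number 0..3
                lsr             ; three shifts through the carry
                ror             ; bring the bank number
                ror             ; into bits 6, 7
                sta scratch     ; hi byte of the bank base
                lda vicreg24
                and #%11110000  ; upper 4 bits = screen slot 0..15
                lsr
                lsr             ; slot * 4 = hi byte of the relative address
                ora scratch     ; the bit fields do not overlap, so OR acts as ADD
                rts
```

With the power-on defaults (bank 0, screen slot 1) this returns $04, the same value the Kernal keeps in $288.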

Now that I know where the screen is located in memory, how can I put a screen code into a particular column and row? There are 40 columns and 25 rows, numbered from 0 to 39 and from 0 to 24, respectively. This means the memory location for a given column and row can be calculated as follows:

  • screen base + row * 40 + column

I’ll use indirect indexed addressing to point to this address, where the .Y register serves as the index:


                 sta (putadr),y ; store it in the yth column

where putadr holds the address of the first column of the row on screen. The .Y register then selects the Y-th column of that row in the instruction above. It makes sense to store the column value in the .Y register (the same as the Kernal does).

Adding address values is pretty straightforward, but how do I calculate times forty?

Remember that the 6502 has a shift left instruction, which is equivalent to multiplying by two. So, with some shifting and adding it is possible to get to times forty quicker than with a general multiplication (of two 8-bit values into a 16-bit value):

  • value * 40 = ( value + (value * 2^2) ) * 2^3

That is 5 left shifts and 1 addition (a second addition brings in the screen base). However, while five times a row value still fits into a single byte (24 * 5 = 120), multiplying that by 8 doesn’t fit into a single byte anymore (120 * 8 = 960, or $03C0). At that point two bytes are needed.

Here is the code:


; put screen code onto screen
; parameters:
;     .A screen code
;     .Y screen column, between 0 and 39
;     .X screen row, between 0 and 24
; returns:
;     carry flag set means an error occurred
; all registers are affected
; calls screenbas, whose side effects apply as well
; zero page $02, $03 are used

putadr:          eqm $02        ; 2-byte screen address

putonscn:        cpy #40        ; column >= 40 ?
                 bcs putexit    ; yes, then exit w/ error
                 cpx #25        ; row >= 25 ?
                 bcs putexit    ; yes, then exit w/ error
                 pha            ; save screen code for later
                 stx putadr     ; keep a copy of the row
                 txa            ; transfer row to .A
                 asl
                 asl            ; row * 2^2 (at most 96, so carry is clear)
                 adc putadr     ; row * 2^2 + row = row * 5 (at most 120)
                 ldx #0         ; zero hi byte
                 stx putadr+1   ; of screen address
                 asl            ; three 16-bit left shifts:
                 rol putadr+1   ; bits leaving .A enter the hi byte
                 asl
                 rol putadr+1
                 asl
                 rol putadr+1
                 sta putadr     ; lo byte of (row * 5) * 2^3
                 jsr screenbas  ; hi byte of screen base, carry clear on return
                 adc putadr+1   ; add hi byte of row * 40
                 sta putadr+1   ; baseadr + (row * 40)
                 pla            ; get screen code back
                 sta (putadr),y ; store it in the yth column
                 clc            ; no errors
putexit:         rts

To check whether this code actually works, I wrapped it in a combination of assembly and Basic, so it can be started from the Basic prompt on the Commodore 64 with the RUN command.
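
Such a wrapper could look something like the sketch below. It is not the exact stub I used, just a minimal version assuming DASM syntax and the usual Basic start address $0801; the line number, the labels basend and start, and the test values are made up for this illustration:

```asm
                processor 6502
                org $0801       ; start of Basic program memory

; Basic line: 10 SYS 2061
                dc.w basend     ; pointer to the next Basic line
                dc.w 10         ; line number
                dc.b $9e        ; token for SYS
                dc.b "2061"     ; 2061 = $080D, the address of start
                dc.b 0          ; end of line
basend:         dc.w 0          ; end of program

start:          lda #1          ; screen code for 'A'
                ldy #20         ; column
                ldx #12         ; row
                jsr putonscn    ; carry set on return would mean an error
                rts             ; back to Basic

; the definitions and routines from the listings above go here
```

The stub takes 12 bytes, which is why start ends up at $080D. After loading, RUN should put an ‘A’ near the middle of the screen.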

It worked, which made me a happy coder.