This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Simulating decimal mode

Simulating decimal mode
by on (#106375)
I'm trying to find a way to replace the missing decimal mode on the 2A03. Currently I handle scores with each of the 6 digits separately, but I figured out that I could pack 2 digits in one byte, so it'd take 3 bytes of RAM instead of 6.

For this, I'd need a routine that reliably adds 2 packed BCD numbers together and sets the carry if the result is 100 or greater, while keeping the low 2 digits correct. I don't care about the results for invalid BCD numbers.

Now the question is, is it more efficient (ROM- and speed-wise) to do so?

1st idea: Do a binary addition, then do a software "decimal adjust"
Code:
           lda _packed_bcd1
           clc
           adc _packed_bcd2
           php

           sta _packed_result
           eor _packed_bcd1
           eor _packed_bcd2    ;Check if half-carry was set
           and #$10
           bne halfcarryset
           lda _packed_result
           and #$0f                ;Check if any non-valid BCD number
           cmp #$0a
           bcc nohalfcarry
        halfcarryset:
           lda #$06               ;If any of those was true -> decimal adjust low digit
           clc
           adc _packed_result
           sta _packed_result
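           bcc nohalfcarry
           pla                    ;The +6 carried out of the byte: drop the saved flags...
           php                    ;...and save the current ones instead (carry is set here)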
        nohalfcarry:

           lda _packed_result
           plp
           bcs carryset            ;Check if full carry was set
           cmp #$a0               ;Check for invalid high digit
           bcc nocarry
        carryset:
           adc #$60-1            ;If any of those true -> decimal adjust high digit (carry is set, so this adds $60)
           sta _packed_result
           sec
        nocarry:


2nd idea: Handle each digit separately:
Code:
           lda _packed_bcd2
           and #$0f
           sta _c
           lda _packed_bcd1
           and #$0f
           clc
           adc _c
           cmp #$0a
           bcc nouovfl
           sbc #$0a
        nouovfl:
           sta _packed_result
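           lda #$00
           bcc nodadjust
           lda #$0f               ;Carry is set here, so $0f + carry adds $10 to the high digit
        nodadjust:
           adc _packed_bcd1
           and #$f0
           adc _packed_bcd2       ;Carry is clear here for valid BCD input
           and #$f0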
           bcs dovfl
           cmp #$a0
           bcc nodovfl
        dovfl:
           sbc #$a0
           sec
        nodovfl:
           ora _packed_result
           sta _packed_result


I think both solutions are somewhat complex and inelegant.
Re: Simulating decimal mode
by on (#106377)
Bregalad wrote:
I'm trying to find a way to replace the missing decimal mode on the 2A03. Currently I handle scores with each of the 6 digits separately, but I figured out that I could pack 2 digits in one byte, so it'd take 3 bytes of RAM instead of 6.

For this, I'd need a routine that reliably adds 2 packed BCD numbers together and sets the carry if the result is 100 or greater, while keeping the low 2 digits correct.

You could look at what Thwaite does: the score is built from bytes of value 0-99 ($00-$63) which are individually converted to decimal at 80 cycles each for display.
Re: Simulating decimal mode
by on (#106378)
It's a pretty good idea, especially since I already have a $00 to $63 to decimal converter for other numbers in the game.

The only inconvenience, I think, is that adding a constant to the score would be weird - I would probably have to store the units + tens in one byte coded in binary, and the hundreds + thousands in another byte coded in binary. How weird, but if it is efficient I don't care how weird it is to store stuff.

What I do currently is store the constants added to the score (when defeating enemies, etc.) in packed BCD on 2 bytes (so values from 0 to 9999 can be added), unpack them into 4 digit bytes, and add those to the unpacked 6-digit-byte score.
I thought that doing 2 additions of packed numbers and then a dummy add with 0 for the upper 2 digits would be more efficient.

But perhaps doing it like you say is even more efficient...? I guess I'll have to write all 3 possibilities, count bytes, and draw my conclusions.
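
For comparison, here is a minimal sketch of what adding a two-byte base-100 constant to a three-byte base-100 score might look like in ca65 syntax. The names score0 to score2 (units/tens, hundreds/thousands, and the top pair) and addLo/addHi are hypothetical and not taken from any actual game; every byte is assumed to hold 0-99 on entry.

```asm
.segment "ZEROPAGE"
score0: .res 1      ; hypothetical: units and tens (0-99)
score1: .res 1      ; hypothetical: hundreds and thousands (0-99)
score2: .res 1      ; hypothetical: top two digits (0-99)
addLo:  .res 1      ; hypothetical: low pair of the value to add
addHi:  .res 1      ; hypothetical: high pair of the value to add

.segment "CODE"
.proc addScore100
  clc
  lda score0
  adc addLo         ; at most 99 + 99 = 198, never overflows 8 bits
  cmp #100
  bcc :+
  sbc #100          ; carry was set by cmp and stays set: decimal carry
:
  sta score0
  lda score1
  adc addHi         ; includes the carry from the low pair
  cmp #100
  bcc :+
  sbc #100
:
  sta score1
  lda score2
  adc #0            ; only the carry is added to the top pair
  cmp #100
  bcc :+
  sbc #100          ; carry set on exit means the score wrapped past 999999
:
  sta score2
  rts
.endproc
```

Each pair needs only one compare and one conditional subtract, because cmp leaves the carry in exactly the state the next adc needs.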
Re: Simulating decimal mode
by on (#106379)
Bregalad wrote:
The only inconvenience, I think, is that adding a constant to the score would be weird - I would probably have to store the units + tens in one byte coded in binary, and the hundreds + thousands in another byte coded in binary. How weird, but if it is efficient I don't care how weird it is to store stuff.

Especially when a macro can abstract away all the weirdness of constants.
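
As a rough idea of what such a macro could look like in ca65, the sketch below splits a decimal constant into two base-100 bytes at assembly time. The macro name score_const and the high-pair-first byte order are assumptions made for illustration; this is not code from Thwaite.

```asm
; Hypothetical helper: emits a 0-9999 decimal constant as two
; base-100 bytes, high pair first.
.macro score_const value
  .assert (value) >= 0 && (value) <= 9999, error, "score constant out of range"
  .byte (value) / 100, (value) .mod 100
.endmacro

enemyPoints:
  score_const 1234   ; same bytes as .byte 12, 34
```

The source then shows the score value exactly as the player would read it, and the assembler does the split.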

Quote:
I guess I'll have to write all 3 possibilities, count bytes, and draw my conclusions.

In any case, you can start with the relevant code from Thwaite:
Code:
BG_GRASSNUM = $70  ; row of tiles with 0-9 on status bar background

.segment "BSS"
houseXferBuf: .res 64

.segment "CODE"
.macro bcd8bit_iter value
  .local skip
  cmp value
  bcc skip
  sbc value
skip:
  rol highDigits
.endmacro

;;
; Converts an 8-bit binary number to two or three BCD digits
; in no more than 84 cycles.
; @param a the number to change
; @return a: low digit; 0: upper digits as nibbles
.proc bcd8bit
highDigits = 0
  pha
  lda #0
  sta 0
  pla

  ; Each iteration takes 11 if subtraction occurs or 10 if not.
  ; But if 80 is subtracted, 40 and 20 aren't, and if 200 is
  ; subtracted, 80 is not, and at least one of 40 and 20 is not.
  ; So this part takes up to 6*11-2 cycles.
  bcd8bit_iter #200
  bcd8bit_iter #100
  bcd8bit_iter #80
  bcd8bit_iter #40
  bcd8bit_iter #20
  bcd8bit_iter #10
  rts
.endproc

.proc buildStatusBar
  lda bgDirty
  and #~BG_DIRTY_STATUS
  sta bgDirty
  lda #$23
  sta houseXferDstHi
  lda #$40
  sta houseXferDstLo

  ; Omitted: draw the rest of the status bar

  ; Draw the score
  lda #BG_GRASSNUM
  sta houseXferBuf+38
  sta houseXferBuf+39
  lda score100s
  beq noScore100s
  jsr bcd8bit
  ora #BG_GRASSNUM
  sta houseXferBuf+35
  lda 0
  beq noScore100s
  ora #BG_GRASSNUM
  sta houseXferBuf+34
noScore100s:
  lda score1s
  jsr bcd8bit
  ora #BG_GRASSNUM
  sta houseXferBuf+37
  lda 0
  ora score100s
  beq noScore10s
  lda 0
  ora #BG_GRASSNUM
  sta houseXferBuf+36
noScore10s:

  rts
.endproc

;;
; Adds between 1 and 255 points to the score.
; X, Y, and memory (apart from score) are unchanged.
.proc addScore
  clc
  adc score1s
  bcc notOver256
  inc score100s
  inc score100s
  adc #55
notOver256:
  cmp #100
  bcc notOver100
  sbc #100
  inc score100s
  bcs notOver256
notOver100:
  sta score1s
  lda bgDirty
  ora #BG_DIRTY_STATUS
  sta bgDirty
  rts
.endproc
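
To see how the two correction paths in addScore interact, here is a hand trace of the largest addition it accepts, written as a small caller with the steps as comments. The starting values are made up for illustration.

```asm
  ; Assume score100s = 0 and score1s = 99 on entry.
  lda #255
  jsr addScore
  ; clc / adc score1s : 255 + 99 = 354, so A = 98 with carry set
  ; inc score100s x2  : accounts for 200 of the 256 that wrapped
  ; adc #55 (C = 1)   : 98 + 56 = 154, the remaining 56 of the 256
  ; cmp #100 / sbc    : A = 54, score100s becomes 3
  ; second cmp #100   : 54 < 100, so 54 is stored in score1s
  ; Result: score100s = 3, score1s = 54, i.e. 99 + 255 = 354.
```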
Re: Simulating decimal mode
by on (#106381)
If you use base100 to store numbers you can very quickly convert them to standard bcd if it helps:
Code:

bcd_number:
.byte $0, $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25
.byte $26, $27, $28, $29, $30, $31, $32, $33, $34, $35, $36, $37, $38, $39, $40, $41, $42, $43, $44, $45, $46, $47, $48, $49, $50
.byte $51, $52, $53, $54, $55, $56, $57, $58, $59, $60, $61, $62, $63, $64, $65, $66, $67, $68, $69, $70, $71, $72, $73, $74, $75
.byte $76, $77, $78, $79, $80, $81, $82, $83, $84, $85, $86, $87, $88, $89, $90, $91, $92, $93, $94, $95, $96, $97, $98, $99

convertToBCD:
lda bcd_number,x
rts
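
For display, the packed result still has to be split into two tile numbers. A minimal sketch of that step follows, assuming the digit glyphs sit at consecutive tiles starting on a multiple of 16; DIGIT_TILE and digitBuf are hypothetical names, and $D0 is only an example value.

```asm
DIGIT_TILE = $D0        ; hypothetical: tile number of the "0" glyph

.segment "BSS"
digitBuf: .res 2        ; hypothetical two-tile output buffer

.segment "CODE"
; X = base-100 value (0-99). Clobbers A.
.proc drawBase100
  lda bcd_number,x      ; e.g. 42 -> $42
  lsr a
  lsr a
  lsr a
  lsr a                 ; high nibble = tens digit
  ora #DIGIT_TILE
  sta digitBuf
  lda bcd_number,x
  and #$0f              ; low nibble = units digit
  ora #DIGIT_TILE
  sta digitBuf+1
  rts
.endproc
```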
Re: Simulating decimal mode
by on (#106383)
Well tepples, thank you for sharing your code. Just a little thought makes me realize that, of course, counting in base 100 will be more efficient than anything else.

Also, when I think about it, it's not that weird. For example, to add 1234 to the score, I'd have this:
Code:
.db 12, 34


instead of the standard BCD I have currently:
Code:
.db $12, $34


I mean, visually, it's almost the same, just without the $.

@Movax12: Hell no, my game has to fit in 32 KB of ROM, there is no way I'd waste 100 bytes like this! Anyway, thank you for the suggestion, but converting to "standard" BCD won't be needed.
Re: Simulating decimal mode
by on (#106384)
Bregalad wrote:
Hell no, my game has to fit in 32kb of ROM, there is no way I'd waste 100 bytes like this !

You do have to take into account how much the conversion routines can be simplified by the look-up table in order to calculate the actual cost of using it.
Re: Simulating decimal mode
by on (#106385)
True, my binary -> decimal routine takes a total of 23 bytes, and I don't think it's possible to shrink it further (it could be simplified if the 0-9 digits were tiles 0-9 in the pattern table instead of $d0-$d9, but I don't want to do that).
Code:
   ldy #$00
-  cmp #10
   bcc +
   sbc #10
   iny                      ;Y counts tens
   bne -

+  ora #$d0                 ;Convert into tile #
   sta StringBuffer+1.w,X   ;The units are written 1 tile forward
   tya
   ora #$d0                 ;Convert into tile #
   sta StringBuffer.w,X     ;Write the tens
   rts


EDIT : now I implemented it, I must thank you very much tepples, I could save quite a few precious bytes !
Re: Simulating decimal mode
by on (#106405)
I think this works.

Code:
   ldy #$00
   sec
-  sbc #10
   bcc +
   iny                      ;Y counts tens
   bne -

+  adc #$da                 ;Carry is clear here: undoes the last -10 and adds $d0
   sta StringBuffer+1.w,X   ;The units are written 1 tile forward
   tya
   ora #$d0                 ;Convert into tile #
   sta StringBuffer.w,X     ;Write the tens
   rts


It's a bit faster too.
Re: Simulating decimal mode
by on (#106406)
Yeah, you saved me ONE byte !!
Re: Simulating decimal mode
by on (#106536)
Related: Here is the code from Blargg's test framework to print decimal (non-BCD) numbers up to 65535. I no longer remember how much of it was written or changed by me.

Code:
.pushseg
.segment "RODATA"
        ; >= 60000 ? (EA60)
        ; >= 50000 ? (C350)
        ; >= 40000 ? (9C40)
        ; >= 30000 ? (7530)
        ; >= 20000 ? (4E20)
        ; >= 10000 ? (2710)
digit10000_hi: .byte $00,$27,$4E,$75,$9C,$C3,$EA
digit10000_lo: .byte $00,$10,$20,$30,$40,$50,$60
        ; >= 9000 ? (2328 (hex))
        ; >= 8000 ? (1F40 (hex))
        ; >= 7000 ? (1B58 (hex))
        ; >= 6000 ? (1770 (hex))
        ; >= 5000 ? (1388 (hex))
        ; >= 4000 ? (FA0 (hex))
        ; >= 3000 ? (BB8 (hex))
        ; >= 2000 ? (7D0 (hex))
        ; >= 1000 ? (3E8 (hex))
digit1000_hi: .byte $00,$03,$07,$0B,$0F,$13,$17,$1B,$1F,$23
digit1000_lo: .byte $00,$E8,$D0,$B8,$A0,$88,$70,$58,$40,$28
; >= 900 ? (384 (hex))
; >= 800 ? (320 (hex))
; >= 700 ? (2BC (hex))
; >= 600 ? (258 (hex))
; >= 500 ? (1F4 (hex))
; >= 400 ? (190 (hex))
; >= 300 ? (12C (hex))
; >= 200 ? (C8 (hex))
; >= 100 ? (64 (hex))
digit100_hi: .byte $00,$00,$00,$01,$01,$01,$02,$02,$03,$03
digit100_lo: .byte $00,$64,$C8,$2C,$90,$F4,$58,$BC,$20,$84
.popseg

.macro dec16_comparew table_hi, table_lo
        .local @lt
        cmp table_hi,y
        bcc @lt
        bne @lt ; only test the lo-part if hi-part is equal
        pha
         txa
         cmp table_lo,y
        pla
@lt:
.endmacro
.macro do_digit table_hi, table_lo
        pha
         ; print Y as digit; put X in A and do SEC for subtraction
         jsr @print_dec16_helper
         sbc table_lo,y
         tax
        pla
        sbc table_hi,y
.endmacro

; Prints A:X as 2-5 digit decimal value, NO space after.
; A = high 8 bits, X = low 8 bits.
print_dec16:
        ora #0
        beq @less_than_256

        ldy #6
        sty print_temp_

        ; TODO: Use binary search?
:       dec16_comparew digit10000_hi,digit10000_lo
        bcs @got10000
        dey
        bne :-
        ;cpy print_temp_
        ;beq @got10000 
@cont_1000:
        ldy #9
:       dec16_comparew digit1000_hi,digit1000_lo
        bcs @got1000
        dey
        bne :-          ; Y = 0.
        cpy print_temp_ ; zero print_temp_ = print zero-digits
        beq @got1000
@cont_100:
        ldy #9
:       dec16_comparew digit100_hi,digit100_lo
        bcs @got100
        dey
        bne :-
        cpy print_temp_
        beq @got100
@got10000:
        do_digit digit10000_hi,digit10000_lo
        ; value is now 0000..9999
        ldy #0
        sty print_temp_
        beq @cont_1000
@got1000:
        do_digit digit1000_hi,digit1000_lo
        ; value is now 000..999
        ldy #0
        sty print_temp_
        beq @cont_100 
@got100:
        do_digit digit100_hi,digit100_lo
        ; value is now 00..99
        txa
        jmp print_dec_00_99
@less_than_256:
        txa
        jmp print_dec
@print_dec16_helper:
         tya
         jsr print_digit
         txa
         sec
        rts

; Prints A as 2-3 digit decimal value, NO space after.
; Preserved: Y
print_dec:
        ; Hundreds
        cmp #10   
        blt print_digit
        cmp #100
        blt print_dec_00_99
        ldx #'0'-1      ;DTE_CHARMAP
:       inx
        sbc #100
        bge :- 
        adc #100
        jsr print_char_x
       
        ; Tens
print_dec_00_99:
        sec
        ldx #'0'-1      ;DTE_CHARMAP
:       inx
        sbc #10
        bge :-
        adc #10
        jsr print_char_x
        ; Ones
print_digit: 
        ora #'0'        ;DTE_CHARMAP
        jmp print_char
        ; Print a single digit
print_char_x:
        pha 
        txa 
        jsr print_char
        pla
        rts


This code from my CV2 retranslation patch just outputs a two-digit non-BCD decimal number (00-99).
Note that, compared to the code posted earlier, this does one branch instruction per sbc loop rather than two.
Code:
        sec
        ldy #'0'-1
        : iny
          sbc #10
          bcs :-
        adc #10+'0'
        pha   
         tya
         PutChar 1
        pla
        PutChar 2
Re: Simulating decimal mode
by on (#107098)
By the way, sort of relevant, sort of not: seems even actual commercial game programmers were wondering why Nintendo didn't include decimal mode:

http://tcrf.net/Pachi_Com_%28NES%29
Re: Simulating decimal mode
by on (#107102)
Man, that's one angry programmer!
Re: Simulating decimal mode
by on (#107105)
tokumaru wrote:
Man, that's one angry programmer!

No, this is one angry programmer. And apparently the translation is more or less accurate. *chuckles* Japanese, having to stomach all their anger because it's not polite to be angry...
Re: Simulating decimal mode
by on (#107127)
I suppose it helps if you have a decimal limit on what your value can be, rather than a hex limit. This fixes the number of decimal digits you have to account for. My method uses 3 fixed routines: one for 8 bits, one for 16, and one for 24. It uses tables holding preconverted values for $0-$F, $00-$F0 (counting $10s), $000-$F00 (counting $100s), and $0000-$F000 (counting $1000s). Using these preconverted values, you can add them all together by breaking a value like $E834 into $E000 + $800 + $30 + $4 and simulating pen-and-paper addition (it ends up adding 57344 + 2048 + 48 + 4, handling each decimal digit in the addition separately). That might make sense only to me; it's kind of hard to explain.

I'm pretty happy with the performance I got out of it, and with the fact that it takes a fixed amount of time (118 cycles for 8 bits, 263 for 16 bits, and 475 for 24). However, it probably uses about 500-600 bytes of ROM for the whole thing. I seem to recall that Blargg's solution was smaller and faster. I just never took the time to understand how it works.

If you were just doing 8 bit conversion, you'd want to go with a smaller solution. It still takes about 100 bytes for just 8 bits. But the tables from that routine are re-used in larger hex conversions.
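
As a rough illustration of the table idea for the 8-bit case only, the sketch below splits the byte into two nibbles, looks up the decimal digits each nibble contributes, and adds them column by column with a decimal carry. This is a reconstruction from the description above, not the actual routine; all names are hypothetical, and it makes no attempt to match the fixed timing or sizes quoted above.

```asm
.segment "ZEROPAGE"
ones:     .res 1
tens:     .res 1
hundreds: .res 1

.segment "RODATA"
; Decimal digits of $0-$F
loOnes: .byte 0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5
loTens: .byte 0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1
; Decimal digits of $00,$10,...,$F0 (0,16,...,240)
hiOnes: .byte 0,6,2,8,4,0,6,2,8,4,0,6,2,8,4,0
hiTens: .byte 0,1,3,4,6,8,9,1,2,4,6,7,9,0,2,4
hiHund: .byte 0,0,0,0,0,0,0,1,1,1,1,1,1,2,2,2

.segment "CODE"
; A = value to convert. Clobbers A, X, Y.
.proc bin8ToDigits
  pha
  and #$0f
  tax               ; X = low nibble
  pla
  lsr a
  lsr a
  lsr a
  lsr a
  tay               ; Y = high nibble
  clc
  lda loOnes,x
  adc hiOnes,y      ; ones column, 0-17
  cmp #10
  bcc :+
  sbc #10           ; carry set by cmp, stays set as the decimal carry
:
  sta ones
  lda loTens,x
  adc hiTens,y      ; tens column plus carry, 0-11
  cmp #10
  bcc :+
  sbc #10
:
  sta tens
  lda #0
  adc hiHund,y      ; hundreds column plus carry
  sta hundreds
  rts
.endproc
```

For example, $9C splits into $C (digits 1, 2) and $90 (digits 1, 4, 4); the columns add up to 1, 5, 6, which is 156.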
Re: Simulating decimal mode
by on (#107355)
In my case, adding scores and such in binary and then converting to decimal once a frame for display takes a lot less time than doing the calculations in BCD just so that, in the end, they can be displayed more easily...
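
A minimal sketch of that arrangement, assuming a 16-bit binary score and a dirty flag that tells the per-frame code when to reconvert (all names here are hypothetical):

```asm
.segment "ZEROPAGE"
scoreLo:    .res 1
scoreHi:    .res 1
scoreDirty: .res 1   ; nonzero = status bar needs redrawing

.segment "CODE"
; A = points to add (0-255). Clobbers A.
.proc addPoints
  clc
  adc scoreLo
  sta scoreLo
  bcc :+
  inc scoreHi        ; propagate the carry into the high byte
:
  lda #1
  sta scoreDirty     ; ask the per-frame code to reconvert and redraw
  rts
.endproc
```

The per-frame code would then check scoreDirty and run a 16-bit binary to decimal conversion, such as the print_dec16 routine posted earlier in this thread, only when the score has actually changed.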
Re: Simulating decimal mode
by on (#129789)
Bregalad wrote:
True, my binary -> decimal routine takes a total of 23 bytes, and I don't think it's possible to shrink it further (it could be simplified if the 0-9 digits were tiles 0-9 in the pattern table instead of $d0-$d9, but I don't want to do that).
Code:
   ldy #$00
-  cmp #10
   bcc +
   sbc #10
   iny                      ;Y counts tens
   bne -

+  ora #$d0                 ;Convert into tile #
   sta StringBuffer+1.w,X   ;The units are written 1 tile forward
   tya
   ora #$d0                 ;Convert into tile #
   sta StringBuffer.w,X     ;Write the tens
   rts



Hello Bregalad,

I was looking at your code, and I believe you can reduce it further like this:

Code:
    ldy    #$D0-1
    sec
.loopDivTen:
    iny
    sbc    #10
    bcs    .loopDivTen
    adc    #$DA
    sta    StringBuffer+1,X
    sty    StringBuffer,X
    rts



I am new to the NES, but no stranger to 6502 assembly. I've programmed a lot of 2600 code... not too sure if there is anyone else on these boards from AtariAge, but they may recognize my username from there. :)

I have always been interested in math routines. I have written a lot of unsigned integer division routines, and lately have become more interested in hex to decimal routines. All of these routines are posted on my blog here:

http://atariage.com/forums/blog/563-omegamatrixs-blog/

Early in this thread, limiting each byte to $00-$63 and then converting each byte as needed was mentioned. This seems like a reasonable approach to me. Is this what most NES programmers do? It would keep the addition pretty simple. On my blog I have tackled a hex to decimal (0-65535) conversion, and while my routine does the job in 156-168 cycles, it still requires 263 bytes.
Re: Simulating decimal mode
by on (#129790)
Oh! Hey, Omegamatrix. I do recognize the name from this thread: http://atariage.com/forums/topic/113254 ... ven/page-5

Off topic: I'd planned to ask a few people eventually, but I've generally operated under the assumption routines posted in threads like that are public domain. Is this true for yours? I'm currently using your divide by 5/10 routine. More on topic, how about this decimal routine that probably beats whatever I'm using now? Was gonna PM you on atariage to ask how/if you'd like credit, but now you're here.
Re: Simulating decimal mode
by on (#129791)
Kasumi wrote:
Oh! Hey, Omegamatrix. I do recognize the name from this thread: http://atariage.com/forums/topic/113254 ... ven/page-5

Off topic: I'd planned to ask a few people eventually, but I've generally operated under the assumption routines posted in threads like that are public domain. Is this true for yours? I'm currently using your divide by 5/10 routine. More on topic, how about this decimal routine that probably beats whatever I'm using now? Was gonna PM you on atariage to ask how/if you'd like credit, but now you're here.

Hello Kasumi,

All of the math routines I post are public domain. I want people to use them. :) Credit is always nice if you re-post them later.


If you are using the divide by 5 and divide by 10 routines, then be sure to check my blog for the latest and greatest as I didn't post all of the updated routines in the thread you linked to.


By the way, here is the divide by 5:
Code:
  sta  temp
  lsr
  adc  #13
  adc  temp
  ror
  lsr
  lsr
  adc  temp
  ror
  adc  temp
  ror
  lsr
  lsr


If you are doing a smaller range of 0-129, then you can omit the ADC #13 to save 2 cycles and two bytes. Likewise you would add one more LSR to the end of the routine to have a limited (0-129 start value) unsigned integer divide by 10.
Re: Simulating decimal mode
by on (#129816)
I just realized that you are storing Absolute,X and not Zeropage,X.

In this case a TYA has to be added back in since there is no STY Absolute,X instruction. It now uses 18 bytes which is still not too bad.

Code:
    ldy    #$D0-1
    sec
.loopDivTen:
    iny
    sbc    #10
    bcs    .loopDivTen
    adc    #$DA
    sta    StringBuffer+1,X
    tya
    sta    StringBuffer,X
    rts