This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019 Download

cmp/beq vs. simple branch table advice?

cmp/beq vs. simple branch table advice?
by on (#56095)
I have some places in my game's code where I do a series of cmp and then beq to decide where to branch based on some state (usually a simple enumeration counting from 0). I've seen a couple of hints on here about how to implement a simple branch table, but I was wondering what might be the best way of doing this. My current idea:

Code:
locationlowbyte:
    .byte <branch1 ;state 0
    .byte <branch2 ;state 1
locationhibyte:
   .byte >branch1  ;state 0
   .byte >branch2 ;state 1

routine:
   ldx state
  lda locationlowbyte,x
  sta w0
  lda locationhibyte,x
  sta w0+1
  jmp (w0)

branch1:
  ;do stuff
   jmp switchbreak
branch2:
  ;do different stuff
switchbreak:


My very first thought was to have a table of words with the absolute addresses of the branch labels, but then I thought you'd have to shift your state left by 1 bit (unless you make sure your states are already 0, 2, 4, 6, etc.) before finding the entry in the table.
Re: cmp/beq vs. simple branch table advice?
by on (#56096)
Gradualore wrote:
I've seen a couple of hints on here about how to implement a simple branch table, but I was wondering what might be the best way of doing this.

Didn't you just ask about this?

The following is actual playfield state dispatch code used in one of my projects:
Code:
game_cycle:
  ; ...
  lda cur_state,x
  asl a
  tax
  lda state_jtable+1,x
  pha
  lda state_jtable,x
  pha
straight_rts:
  rts

.align 2
state_jtable:
  .addr wait_for_join-1
  .addr do_new_game_menu-1
  .addr make_new_game-1
  .addr make_new_piece-1
  .addr falling_piece-1
  .addr falling_piece-1
  .addr find_full_lines-1
  .addr post_clear-1
  .addr move_down-1
  .addr move_up-1
  .addr game_over_anim-1

(Ignore the straight_rts; that's used by the handler for wait_for_join.)

As for the left shift, it will probably make the code clearer and easier to maintain when you add new states.

by on (#56097)
*edit* In the last thread I was seeking primarily how to do an indirect jsr, which I am applying for intelligence routines for game objects. Now I am just trying to clean up bits of code where I have a sequence of cmp/beq to decide what branch of code to take. Most places where I have these "switch statements" have so few cases that I don't think splitting the upper and lower bytes of the addresses would cause much of a readability issue. Either way would beat a sequence of cmp/beq, though.

by on (#56098)
If you have a lot of cases, you can either use a jump table, or use a series of conditional branches shaped like a binary tree, to limit the number of comparisons.

by on (#56099)
So I would basically load the state, and then basically say, is this the low half of my states (0 thru 1 for example with 4 states), if so, brach here, but if it is the higher half (2 thru 3), branch here, and then in each of those branches, split it up further? Interesting idea. I like how direct the jump table technique is, though. *edit* I guess I could just AND out bits in descending power. So for four states I'd actually do AND #2 to see whether I'm in the upper or lower half of my states.

by on (#56100)
Gradualore wrote:
So I would basically load the state, and then basically say, is this the low half of my states (0 thru 1 for example with 4 states), if so, brach here, but if it is the higher half (2 thru 3), branch here, and then in each of those branches, split it up further?

That's how binary search works. After every compare you get rid of half of the possibilities. With 256 possibilities you can imagine it would take quite a while to reach the higher ones, but with binary search you are guaranteed to find the number you are looking for after comparing at most 8 numbers. It's quite a difference.

Quote:
I like how direct the jump table technique is, though.

When I need to index words/addresses I usually use only even codes, so that I don't have to shift. Most of the times 128 possibilities is enough, so I don't see any problem. I have done this for program modes. I have a table like this:

Code:
ProgramModes:

   ModeTitleScreen .dw ShowTitle
   ModeGameplay .dw PlayLevel
   ModeGameOver .dw ShowGameOver
   (...)

And then when I want to use one of these codes I do:

Code:
ldx #ModeGameOver-ProgramModes

so I don't even have to worry about what the codes actually are, I just use their names.

by on (#56110)
Unless I'm missing something or you're wedded to fake jumps using
an rts for some reason, isn't the straight forward thing to use the jmp
indirect (directly, so to speak) with your jump table?

something like:

Code:

 lda state
 asl
 sta JUMP+1
JUMP
 jmp (xx00)  ;xx being the page address for the jump table



of course you're limited to 128 addresses and you have to be
mindful of the jump indirect bug and be able to write the indirect
address

edit: oops, corrected the indirect address to be less little endian and
more like the usual four digit number

by on (#56118)
Self modifying code is neat and all, but to make that practical you'd probably have to jsr to ram and rts back anyway.

Unless ofcourse you were already executing from ram for some reason, then that trick is very neat.

EDIT:
Since you only have 3 posts you might be new to the NES platform. The NES doesn't load programs to ram in order to execute them, it executes them straight from the rom.

by on (#56119)
Anders_A wrote:
Unless ofcourse you were already executing from ram for some reason, then that trick is very neat.

If the goal is to simulate an indirect JSR, you' need to use JSR anyway to save the return address, so you might as well have that subroutine bogax posted in RAM for the fastest code possible, without needing the stack or variables (although the space used by the code is more than a variable would occupy). Even if you don't want to save the return address, the 3 cycles needed to JMP to that code in RAM aren't a big deal (I haven't counted all the cycles to know if it's worth it though).
Re: cmp/beq vs. simple branch table advice?
by on (#56330)
I found a convenient way to build a branch table like in my original example above but with cleaner syntax. I just stumbled upon this in the CA65 documentation, having examined the ".global" directive miau mentioned in another post.

Code:
.HIBYTES and .LOBYTES
Define byte sized data by extracting only the high byte (that is, bits 8-15) from each expression. This is equivalent to .BYTE with the operator '>' prepended to each expression in its list.

Example:

            .lobytes         $1234, $2345, $3456, $4567
            .hibytes         $fedc, $edcb, $dcba, $cba9
     

which is equivalent to

            .byte            $34, $45, $56, $67
            .byte            $fe, $ed, $dc, $cb
     

Example:

            .define MyTable TableItem0, TableItem1, TableItem2, TableItem3

            TableLookupLo:   .lobytes MyTable
            TableLookupHi:   .hibytes MyTable
     

which is equivalent to

            TableLookupLo:   .byte <TableItem0, <TableItem1, <TableItem2, <TableItem3
            TableLookupHi:   .byte >TableItem0, >TableItem1, >TableItem2, >TableItem3
     


So, this would allow you to have up to 256 entries in your branch table (extreme, yes I agree), avoid the extra step of a left shift, but keep the code readable/maintainable. Kinda neat!