This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Best practices for instancing?

Best practices for instancing?
by on (#227916)
Hey! I'm curious about different techniques to tackle instanced entities on the 6502.

Using asm6 where there are no structs, I've come up with my own solution where I define constants for the index of the properties of my data types:

Code:
PROJECTILE_STATE      = 0
PROJECTILE_POS_X_LO   = 1
PROJECTILE_POS_X_HI   = 2
PROJECTILE_POS_Y_LO   = 3
PROJECTILE_POS_Y_HI   = 4
PROJECTILE_...


I then have a constant defining what page my projectiles will be on and use that to loop through all of them.

Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x
; do some game logic

lda #PROJECTILES_RAM+#PROJECTILE_POS_X_LO, x
; do some other game logic


This approach works for me, but when I want to deal with data types that don't have a whole page reserved for them and might also cross pages, I end up writing a lot of code indexing with indirect addresses.

Code:
lda #<(player1)
ldx #>(player1)

jsr CheckSomeCollision

...

CheckSomeCollision:

sta curr_pointer_lo
stx curr_pointer_hi

ldy #0

...

lda curr_pointer_lo
clc
adc #PLAYER_POS_X_LO
sta curr_pointer_lo
lda curr_pointer_hi
adc #0
sta curr_pointer_hi

lda (curr_pointer), y
; do some stuff

...

rts


Depending on how complex the logic is, I can end up adding to and subtracting from those curr_pointer values a lot, and I can't help but wonder if I'm doing it right. This approach must be quite expensive performance-wise?

So another approach I've tried is to have variables dedicated to be parameters or temp values:

Code:
lda player1_state
sta curr_player_state
lda player1_posx_lo
sta curr_player_posx_lo
lda player1_posx_hi
sta curr_player_posx_hi
; I have quite a few variables to do this with

jsr UpdatePlayer

; Now I have to pass the values back to the correct player variables

lda curr_player_state
sta player1_state
lda curr_player_posx_lo
sta player1_posx_lo
lda curr_player_posx_hi
sta player1_posx_hi
; and so on...


This approach works but I end up basically having to reserve RAM for three players when there are actually only two playable ones in the game. One tradeoff though, is that I can reuse my "curr_player_" variables for other datatypes before or after I update my players.

My approaches to this work, but having no previous experience in assembly and also failing to find anything covering this subject on Google, I'm really curious about how other people here on the forum tackle this on the 6502.

Cheers!
Re: Best practices for instancing?
by on (#227917)
The usual solution on 6502 is to make parallel arrays, one for each byte of properties of an entity. Then you can set X to, say, 3 to access all properties of entity ID 3.
Code:
NUM_ACTORS equ 16

actor_xsub: dsb NUM_ACTORS  ; low byte of 24-bit coordinate
actor_x: dsb NUM_ACTORS
actor_xscr: dsb NUM_ACTORS  ; high byte
actor_ysub: dsb NUM_ACTORS
actor_y: dsb NUM_ACTORS
actor_yscr: dsb NUM_ACTORS
actor_facing: dsb NUM_ACTORS
actor_frame: dsb NUM_ACTORS
actor_frame_sub: dsb NUM_ACTORS
actor_health: dsb NUM_ACTORS
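
A minimal sketch of how a per-frame loop over these arrays might look, with X counting down from the last slot. The routine name update_actors, the convention that actor_health = 0 marks an unused slot, and the placeholder logic are assumptions made for illustration, not part of the declarations above.

Code:
update_actors:
  ldx #NUM_ACTORS-1       ; start at the last slot
@loop:
  lda actor_health,x      ; assumption: 0 health means the slot is unused
  beq @next
  inc actor_frame_sub,x   ; stand-in for the real per-actor logic
@next:
  dex
  bpl @loop               ; exits when X goes from 0 to $ff; fine while NUM_ACTORS <= 128
  rts

Counting down lets the dex set the flags for the branch, so no cpx is needed at the end of the loop.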
Re: Best practices for instancing?
by on (#227918)
One trick that's quite common is to store your entities' live values in "parallel" rather than as blobs. So instead of having projectile 1's state, x_hi, x_lo, y_hi, y_lo.... and then projectile 2's state, x_hi, x_lo, y_hi, y_lo.... you store it like this:

Code:
    proj_state:               DSB 30
    proj_x_hi:                DSB 30
    proj_x_lo:                DSB 30
    proj_y_hi:                DSB 30
    proj_y_lo:                DSB 30


So here I've allocated for up to 30 projectiles at the same time in my RAM. But instead of having each projectile state bunched together into a blob, they are all mixed up in parallel. To read out one specific projectile's values you use the X register to offset:

Code:
    LDX #5             ; We want to look at the 6th projectile
    LDA proj_x_lo,X    ; Load the 6th projectile's low x position


Thus, when you want to call JSR UpdatePlayer, you just make sure that the X register points to the "slot" of the player you want to update. A huge advantage is that you have much more control over where and how your memory is located, you can make sure you always stay within one page, and you can have up to 256 projectiles before you run into trouble.
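
As a rough sketch of a routine that takes its "slot" in X, here is a 16-bit position update using the arrays above. MoveProjectileRight and PROJ_SPEED are hypothetical names chosen for this example.

Code:
PROJ_SPEED = 2            ; hypothetical: pixels moved per frame

; On entry: X = projectile slot (0-29). X is preserved.
MoveProjectileRight:
  lda proj_x_lo,x
  clc
  adc #PROJ_SPEED
  sta proj_x_lo,x
  lda proj_x_hi,x
  adc #0                  ; adds the carry from the low byte
  sta proj_x_hi,x
  rts

  ; caller: move the 6th projectile
  ldx #5
  jsr MoveProjectileRight

The same routine serves every slot with no pointer setup, because the slot number is the only thing that changes.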

Edit: tepples beat me to it :beer:
Re: Best practices for instancing?
by on (#227919)
Drakim wrote:
One trick that's quite common is to store your entities' live values in "parallel" rather than as blobs. So instead of having projectile 1's state, x_hi, x_lo, y_hi, y_lo.... and then projectile 2's state, x_hi, x_lo, y_hi, y_lo.... you store it like this:

Code:
    proj_state:               DSB 30
    proj_x_hi:                DSB 30
    proj_x_lo:                DSB 30
    proj_y_hi:                DSB 30
    proj_y_lo:                DSB 30


So here I've allocated for up to 30 projectiles at the same time in my RAM. But instead of having each projectile state bunched together into a blob, they are all mixed up in parallel. To read out one specific projectile's values you use the X register to offset:

Code:
    LDX #5             ; We want to look at the 6th projectile
    LDA proj_x_lo,X    ; Load the 6th projectile's low x position


Thus, when you want to call JSR UpdatePlayer, you just make sure that the X register points to the "slot" of the player you want to update. A huge advantage is that you have much more control over where and how your memory is located, you can make sure you always stay within one page, and you can have up to 256 projectiles before you run into trouble.

Edit: tepples beat me to it :beer:


I think I follow what you're saying, and I'm looking forward to what improvements I can make to my code by storing values in parallel. One thing I guess I have to make sure is that none of my properties cross a page. For that I guess I would have to use indirect addressing with the y register.

The approach you guys described feels very similar to what I'm doing in my first example, but I totally get how it will make things easier to be able to address an entity by its ID rather than something like ldx PLAYER_1_ENTITY_ID*PLAYER_ENTITY_SIZE.

EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.

Cheers!
Re: Best practices for instancing?
by on (#227921)
pwnskar wrote:
You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.


I don't think so, but it shouldn't be too hard to write an ASM6 macro that does this for you, something like this maybe:

Code:
MACRO CHECKSAMEPAGE InputLabel
  IF >InputLabel != >$
    ERROR "This is not the same page"
  ENDIF
ENDM


Usable like this:

Code:
    MyArray:      DSB 50
    CHECKSAMEPAGE MyArray
Re: Best practices for instancing?
by on (#227923)
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

Why? An indexed read with page crossing costs one extra cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.
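
For reference, a small sketch of the cycle counts involved. actor_x is the array from the earlier post; zp_actor_x is a hypothetical copy of it placed in zero page.

Code:
  lda actor_x,x       ; absolute,X read: 4 cycles, +1 if adding X crosses a page
  sta actor_x,x       ; absolute,X write: always 5 cycles
  lda zp_actor_x,x    ; zero page,X read: 4 cycles, no crossing penalty
  sta zp_actor_x,x    ; zero page,X write: 4 cycles

Each zero page instruction is also one byte shorter (2 bytes instead of 3). Keep in mind that zero page,X wraps within page $00 instead of crossing into page $01.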
Re: Best practices for instancing?
by on (#227924)
tepples wrote:
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

Why? An indexed read with page crossing costs one extra cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.


Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?

Code:
ldx #$10
lda $00ff, x    ; = lda $000f


But I guess the case is actually:

Code:
ldx #$10
lda $00ff, x    ; = lda $010f


But that doesn't work with indirect addressing? I remember trying something like this

Code:
; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y


and that would result in me getting the value of $000f rather than $010f.
Re: Best practices for instancing?
by on (#227925)
Wrapping within a page happens in three cases:

  1. Zero page indexed modes wrap within page $00: ldx #$10 lda $fc,x will access $000C. So does the rarely used indexed indirect mode (dd,X).
  2. Stack operations wrap within page $01.
  3. JMP indirect with the address at $xxFF retrieves the high byte from the same page instead of the next.

Both absolute indexed aaaa,X and indirect indexed (dd),Y will cross pages with a 1-cycle penalty on read. This penalty is applied to all absolute indexed and indirect indexed writes, whether or not they cross a page.

Cycle-by-cycle operations for lda aaaa,X that does not cross pages

  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address

Cycle-by-cycle operations for lda aaaa,X that crosses pages

  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address, such as $03FC + $08 = $0304, while realizing "oh sh-- there's a carry"
  5. Read carry-adjusted address

Cycle-by-cycle operations for lda (dd),Y that does not cross pages

  1. Read lda (dd),Y opcode
  2. Read address of pointer from the operand byte
  3. Read low byte of address from zero page
  4. Read high byte of address from zero page while adding Y to the low byte
  5. Read uncorrected address

Cycle-by-cycle operations for lda (dd),Y that crosses pages

  1. Read lda (dd),Y opcode
  2. Read address of pointer from the operand byte
  3. Read low byte of address from zero page
  4. Read high byte of address from zero page while adding Y to the low byte
  5. Read uncorrected address while realizing "oh sh-- there's a carry"
  6. Read carry-adjusted address
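
The three wrapping cases and the page-crossing case can be sketched as the fragments below. They are separate illustrations, not a sequence meant to run as-is, and the addresses are arbitrary examples.

Code:
  ; 1. zero page,X wraps within page $00
  ldx #$10
  lda $fc,x        ; $fc + $10 wraps: reads $000C

  ; for comparison, absolute,X crosses into the next page
  lda $03fc,x      ; reads $040C, with the 1-cycle penalty

  ; 2. stack operations wrap within page $01
  ldx #$ff
  txs              ; S = $ff
  pla              ; S wraps to $00: pulls the byte at $0100

  ; 3. JMP indirect with the pointer at $xxFF
  jmp ($02ff)      ; low byte from $02FF, high byte from $0200 (not $0300)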
Re: Best practices for instancing?
by on (#227926)
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

The penalty for crossing a page is just one cycle, not a big deal, but if you absolutely need to avoid page crossing at all costs, the good thing about this method is that different properties don't need to be contiguous in RAM, so you can freely mix the property arrays with other variables in whatever order you want.

Quote:
For that I guess I would have to use indirect addressing with the y register.

Why? If you start using indirect addressing and pointer manipulation you're basically killing off all the advantages of using parallel arrays.

Quote:
EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page?

Nothing built-in, but you can probably make a macro. Something like this:

Code:
.macro CheckPage Address
  ;if the high byte of the specified address doesn't match the high byte of the current address
  .if >Address != >$
    .error "Page crossed!"
  .endif
.endm

Which you can use like this:

Code:
MyArray .dsb 30
CheckPage MyArray

If you don't want to "CheckPage" after every declaration you can create another macro that will reserve the bytes AND check for page crossing in one go:

Code:
.macro FastArray Size
  Start:  .dsb Size
  CheckPage Start
.endm

Which you use like this:

Code:
MyArray FastArray 30


pwnskar wrote:
Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?

Only with ZP indexed addressing, but you normally wouldn't have big arrays in ZP anyway.

Quote:
Code:
; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y


and that would result in me getting the value of $000f rather than $010f.

That's not how it works, you should be getting the value at $010f. Wrapping only occurs in ZP indexed addressing.
Re: Best practices for instancing?
by on (#227928)
Thank you so much for the replies, guys! You're straightening out so much of this that I had totally misunderstood. I'm having high hopes of making my code smaller, more efficient and more human readable with this knowledge.

Cheers!
Re: Best practices for instancing?
by on (#227930)
I just realized that my page crossing macro would be off by one when testing arrays, because it's testing the address immediately after the array, but you get the idea.
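
A possible fix, as an untested sketch: compare the page of the first byte with the page of the last byte reserved, which is $-1 when the macro is invoked right after the .dsb. This assumes asm6 accepts the parenthesized expression >($-1).

Code:
.macro CheckPage Address
  ;compare the page of the array's first byte with the page of its last byte
  .if >Address != >($-1)
    .error "Page crossed!"
  .endif
.endm

With this version an array that ends exactly on the last byte of a page no longer triggers the error.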
Re: Best practices for instancing?
by on (#228060)
pwnskar wrote:
Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x

OT, but what's up with that second "#". Does this code compile in your assembler of choice?
Re: Best practices for instancing?
by on (#228082)
thefox wrote:
pwnskar wrote:
Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x

OT, but what's up with that second "#". Does this code compile in your assembler of choice?


It might not. It's not a copy paste from my actual project, just an example I wrote here. Still being quite green, I sometimes forget when the # is needed or not. :)

Cheers!
Re: Best practices for instancing?
by on (#228085)
The # is needed when you want to tell the assembler/6502 to use an immediate value (think "literal number") in the operand itself, rather than "an address (in RAM or ROM) to read/write to/from". It's not a prefix used to represent base of numbers (e.g. $ for hexadecimal or % for binary), or "types" of numbers on a per-number or per-variable basis. Better explained in code with comments:

Code:
FOO   = $01
BAR   = $02
UGH   = $0a
DERP  = %11111110
HOBER = 26

lda FOO           ; equivalent to lda $01 -- reads value from RAM location $01 and puts it into the accumulator -- assembles to a5 01
lda #FOO          ; equivalent to lda #$01 -- puts the literal value 1 ($01) into the accumulator -- assembles to a9 01

lda FOO+BAR       ; equivalent to lda $01+$02, i.e. lda $03 -- assembles to a5 03
lda #FOO+BAR      ; equivalent to lda #$01+$02, i.e. lda #$03 -- assembles to a9 03

lda DERP          ; equivalent to lda %11111110 or lda $fe -- assembles to a5 fe
lda #DERP         ; equivalent to lda #%11111110 or lda #$fe -- assembles to a9 fe

lda HOBER         ; equivalent to lda 26 or lda $1a -- assembles to a5 1a
lda #HOBER        ; equivalent to lda #26 or lda #$1a -- assembles to a9 1a

lda HOBER*2       ; equivalent to lda 26*2 or lda 52 or lda $1a*2 or lda $34 -- assembles to a5 34
lda HOBER*100     ; equivalent to lda 26*100 or lda 2600 or lda $1a*100 or lda $0a28 -- assembles to ad 28 0a
lda #HOBER*100    ; equivalent to lda #26*100 or lda #2600 -- will fail to assemble because 2600 is too large; values can only be 0-255 (8-bit)
lda $HOBER        ; undetermined -- behaviour may vary per assembler depending on parser; might do lda $26 or might do lda $1a or might throw an error
lda #$HOBER       ; undetermined -- behaviour may vary per assembler depending on parser; might do lda #$26 or might do lda #$1a or might throw an error

lda #(UGH+16)*8   ; let's expand variables, use decimal rather than hexadecimal, and break the math down:
                  ; ($0a+16)*8 == (10+16)*8 == 26*8 == 208
                  ; thus this is the equivalent to lda #208 or lda #$d0 -- assembles to a9 d0

lda ((UGH+3)*$1000)+10   ; let's expand variables and break the math down piece by piece, in order of operation, and do some base conversion:
                         ; ($0a+3) == (10+3) == 13 == $0d
                         ; ($0d*$1000) == $d000
                         ; $d000+10 == $d00a
                         ; thus this is the equivalent to lda $d00a -- assembles to ad 0a d0


It's important to understand two things here (#1 may help you):

1. The CPU uses completely different instructions/opcodes depending on what addressing mode you're using. For example, lda #$05 uses immediate addressing, which assembles to bytes a9 05. But lda $05 would use zero page addressing and assemble to a5 05 -- note the difference of the first byte! Same goes for absolute addressing, where lda $1234 would assemble to ad 34 12.

2. The mathematics you see above is being done at assemble-time, calculated by the assembler and NOT done at run-time by the CPU. There's a world of difference between the two. Doing mathematics on the 6502 at run-time is a substantially more involved process (simple addition/subtraction is easy, multiplying/dividing by 2/4/8/16/32/64/128 is easy, anything else is much more advanced).

Does this help?

One thing that will confuse you: you'll see parentheses () used for mathematical order-of-operation, but you'll also see them in instructions like lda ($16),y. The latter isn't assemble-time mathematics being done by the assembler -- it's real/actual 6502 code using a form of indirect addressing (already discussed). This can add to some confusion as I'm sure you can imagine.

"So how does the assembler know when to use () for instructions and when to use it for math?" It varies per assembler, and you have to read the assembler's documentation to get a feel for it. Some assemblers like NESASM actually use brackets [] to represent the instruction-level addressing mode, and use parentheses purely for assemble-time mathematics. Other assemblers are smart/intelligent enough to know what you want.

If this paragraph is confusing, then I'll make it simple: in most assemblers you'd say lda ($16),y or lda ($16,x) or jmp ($1234) to do indirect addressing, while in NESASM you'd need to write lda [$16],y or lda [$16,x] or jmp [$1234]. NESASM tends to be "the odd man out" in this respect, and this major difference can cause a lot of problems for both newbies and experienced programmers.
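
Applied to the line thefox asked about, a sketch of the indexed form without any # might look like this. The base address $0300 is a made-up value for illustration; the offsets are the constants from the first post.

Code:
PROJECTILES_RAM     = $0300   ; hypothetical base address
PROJECTILE_STATE    = 0
PROJECTILE_POS_X_LO = 1

  lda PROJECTILES_RAM+PROJECTILE_STATE,x      ; absolute,X: reads memory at $0300+X
  lda PROJECTILES_RAM+PROJECTILE_POS_X_LO,x   ; absolute,X: reads memory at $0301+X
  lda #PROJECTILE_POS_X_LO                    ; immediate: loads the number 1 itself

The sum is computed by the assembler, so each indexed load is a single instruction. There is no indexed form of immediate addressing, which is why the # has to go.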
Re: Best practices for instancing?
by on (#228086)
Thank you for that detailed breakdown, koitsu! Much appreciated! Yeah, for now I'm not doing too much complicated compile time math, mostly just simple adds or subtracts.

BTW, I've started converting my player variables to be stored in parallel and it's gone well so far, just a lot of code to refactor. But I've nearly converted everything concerning the players and will do the same for projectiles and platforms.

I have some routines that do a lot of indirect addressing with pointers that I've not started to convert yet. What I'm doing is:

Code:
ldy #0
lda (pointer_lo), y
sta val1

inc pointer_lo
bne @dont_inc_hi_ptr
inc pointer_hi
@dont_inc_hi_ptr:

lda (pointer_lo), y
sta val2

;... and so on


Whereas from what I can tell I should be able to just increment y, as long as y never exceeds 255?

Code:
ldy #0
lda (pointer_lo), y
sta val1

iny
lda (pointer_lo), y
sta val2

;... and so on


I'll do a little test next time I sit down with my project.

Code:
ldy #10
lda ($90ff), y


If I've understood correctly, this would give me the same result as:

Code:
lda $910f


If so, this should improve the speed of a lot of my OAM buffering and nametable updates.
Re: Best practices for instancing?
by on (#228095)
pwnskar wrote:
Code:
ldy #10
lda ($90ff), y

If I've understood correctly, this would give me the same result as:
Code:
lda $910f

Not exactly, but close -- and a common stumbling point, so don't feel bad. There are a few forms of indirect addressing on the 6502, but WRT what you just described, there are mainly two:

1. Indexed indirect (sometimes called "pre-indexed") e.g. lda ($12,x) -- you can only use the X register for this
2. Indirect indexed (sometimes called "post-indexed"), e.g. lda ($12),y -- you can only use the Y register for this

The links there explain the differences with code. The difference is that one adds the index register *before* indirection, the other adds the index register *after* indirection. I should also note that you can't use absolute 16-bit addresses with either of these modes, only zero page.

Don't confuse either of those with simple absolute indexed addressing, e.g. lda $1234,x or lda $1234,y, which I think is what you were thinking of with your above code (i.e. no indirection used).

I think this 6502 opcode chart might have mistakes in it, I forget, but you can see what available addressing modes there are per-opcode. Look up LDA for example. Once you see the available options, it should become a bit more clear.
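
To make the distinction concrete, here is a sketch of two ways to read $910f with an index of $10. The zero page location $10 for the pointer is an arbitrary choice for this example.

Code:
pointer = $10        ; hypothetical ZP location; uses $10 (low) and $11 (high)

  ; indirect indexed: the 16-bit address lives in zero page
  lda #$ff
  sta pointer        ; low byte of $90ff
  lda #$90
  sta pointer+1      ; high byte of $90ff
  ldy #$10
  lda (pointer),y    ; reads $90ff + $10 = $910f

  ; absolute indexed: the 16-bit address is part of the instruction
  ldx #$10
  lda $90ff,x        ; also reads $910f, no pointer needed

The absolute indexed form is the better fit when the base address is known at assemble time. The pointer form earns its setup cost when the base address is only known at run time.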
Re: Best practices for instancing?
by on (#228096)
I hope that chart doesn't have mistakes -- I've referenced it a lot. The only thing I know it's missing is that both PLA and TSX set the S and Z flags.
Re: Best practices for instancing?
by on (#228105)
One of the common go-to 6502 resources/charts/opcode descriptions is known to have problems/bugs/issues. I just can't be bothered to remember which one.
Re: Best practices for instancing?
by on (#228106)
Hmmm.. I think I might be getting contradicting explanations from you guys about indirect indexed addressing?

koitsu wrote:
1. Indexed indirect (sometimes called "pre-indexed") e.g. lda ($12,x) -- you can only use the X register for this
2. Indirect indexed (sometimes called "post-indexed"), e.g. lda ($12),y -- you can only use the Y register for this

The links there explain the differences with code. The difference is that one adds the index register *before* indirection, the other adds the index register *after* indirection. I should also note that you can't use absolute 16-bit addresses with either of these modes, only zero page.


tokumaru wrote:

pwnskar wrote:
Code:
; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y


and that would result in me getting the value of $000f rather than $010f.

That's not how it works, you should be getting the value at $010f. Wrapping only occurs in ZP indexed addressing.


So koitsu, from your explanation I actually should loop around to get the value from $000f while tokumaru says I should be getting the value from $010f. I'm going to try this out and see what I get. :P
Re: Best practices for instancing?
by on (#228114)
Koitsu is saying that the *pointers* themselves have to be in ZP, but they can point to anywhere from $0000 to $FFFF.
Re: Best practices for instancing?
by on (#228116)
Both his and my explanations are correct. Look very carefully at the locations of the ZP variables being used in his examples vs. my examples.

I really hate code quotes on this forum, so I'm just going to do it this way:

Code:
Memory contents:

$0000 = $ee
$00ff = $34
$0100 = $12
$1234 = $aa
$ee39 = $bb

Code:

pointer_lo = $ff

ldy #5
lda (pointer_lo),y

You know what's going on with ldy #5 so no need to explain that.

The first thing the CPU is going to do is read a byte (the "low byte") from ZP location $ff. That value is $34.

The next thing the CPU is going to do is read a byte (the "high byte") from ZP location $ff+1. What address do you think the CPU is going to read from (for the "high byte" of the effective address it's trying to calculate), and what value do you think it's going to get?

If you answered $100 (thus a value of $12 for the "high byte"), you'd be incorrect. The CPU will actually read the 2nd byte from ZP location $00, thus value $ee.

This is because of page wrapping; ZP is called zero page for a reason: it represents memory $0000-$00FF. With ZP addressing, you cannot "wrap" from $ff ($00ff) to $100 ($0100) -- the addressing has to stay within the $00xx range. This concept applies to all ZP addressing modes.

So this is what actually happens with the above code and those values in memory:

The CPU reads the low byte of the effective address from ZP location $ff. It gets a value of $34.
The CPU reads the high byte of the effective address from ZP location $00. It gets a value of $ee.
The CPU then adds Y to that effective address; $ee34 + 5 = $ee39.
The CPU then reads the byte located at memory location $ee39 (value $bb) and puts that in the accumulator.

If you want me to explain indexed indirect ("pre-indexed", e.g. lda ($ff,x) with an example, I can do that too. The situation is more or less the same, just that the addition of X to the effective address is being done at a different time (earlier in the process, rather than later). There was a discussion on the forum recently about how the "pre-indexed" mode on 6502 is substantially less useful than the "post-indexed" mode, but I can't be bothered to find it.

All of this is entirely different if you're using absolute addressing. But, sadly, there is no indirect indexed or indexed indirect addressing mode that uses absolute addresses on the 6502 -- i.e. there is no lda ($1234),y or lda ($1234,x). You're stuck using ZP.
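
For comparison, a minimal sketch of the indexed indirect ("pre-indexed") mode. It assumes a hypothetical table of 2-byte pointers starting at ZP location $20.

Code:
ptr_table = $20        ; hypothetical: $20/$21 = pointer 0, $22/$23 = pointer 1, ...

  ldx #2               ; select pointer 1 (each pointer is 2 bytes)
  lda (ptr_table,x)    ; reads the pointer stored at $22/$23, then loads the byte it points to

Here X chooses which pointer to use, and nothing is added to the address that pointer holds. The sum ptr_table + X also wraps within zero page.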

Likewise, understanding page wrapping as a concept is equally important. A "page wrap" is when the low byte of an address causes the upper byte to have to be incremented. A great example is this simple code:

Code:
ldx #2
lda $80ff,x

Here, the CPU will calculate an effective address of $80ff + 2 ($8101), thus reading from memory location $8101. This caused a "page wrap" because adding 2 to $ff overflowed the low byte, so the upper byte of the effective address had to be incremented ($80 had to become $81). This costs 1 extra CPU cycle, too.

This type of math is actually something that you'll have to do yourself when you get into more advanced situations, like if you want to add 2 to an unsigned 16-bit value (2 bytes) in memory and ensure that the upper byte of the 16-bit value gets incremented properly. There's a really cute/clever way to do this on 6502 that surprises people when they see it. I consider this "advanced" material that isn't immediately necessary for understanding CPU basics, but it works like this:

Code:
Memory contents:

$0014 = $fe
$0015 = $20

Subroutine:

add_a_to_14:
  clc
  adc $14
  sta $14
  lda $15
  adc #0
  sta $15
  rts

Main program:

  lda #4
  jsr add_a_to_14

Before the jsr, the 16-bit value in $0014/0015 is $20fe ($0014 = $fe, $0015 = $20)

After the jsr, the 16-bit value in $0014/0015 is $2102 ($0014 = $02, $0015 = $21)

If you want to know how this trick works, just ask. You might think the adc #0 serves no purpose, but it's incredibly important. Hint: it involves use of the carry flag and what adc does both WITH it and TO it. Despite having done 65xxx for a lot of my life (though I'm quite rusty), I still find stuff like this awesome/cool/clever.

You'll find things like this incredibly useful when needing to do things like math on 16-bit values that are used for PPU RAM addressing (e.g. what ends up in $2006).

Edit: forgot the important lda $15 before the adc #0. Yikes!
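
As one possible use of the same pattern for PPU addressing, this sketch moves a 16-bit nametable address down one tile row (32 bytes) and hands it to $2006. The names ppu_addr_lo and ppu_addr_hi and their ZP locations are assumptions made for this example.

Code:
ppu_addr_lo = $14      ; hypothetical ZP variables
ppu_addr_hi = $15

NextRow:
  lda ppu_addr_lo
  clc
  adc #32              ; one nametable row is 32 tiles
  sta ppu_addr_lo
  lda ppu_addr_hi
  adc #0               ; picks up the carry from the low byte
  sta ppu_addr_hi

  bit $2002            ; reset the $2005/$2006 write latch
  lda ppu_addr_hi
  sta $2006            ; high byte first
  lda ppu_addr_lo
  sta $2006
  rts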
Re: Best practices for instancing?
by on (#228130)
koitsu wrote:
Before the jsr, the 16-bit value in $0014/0015 is $20fe ($0014 = $fe, $0015 = $20)

After the jsr, the 16-bit value in $0014/0015 is $2102 ($0014 = $02, $0015 = $21)

If you want to know how this trick works, just ask. You might think the adc #0 serves no purpose, but it's incredibly important. Hint: it involves use of the carry flag and what adc does both WITH it and TO it. Despite having done 65xxx for a lot of my life (though I'm quite rusty), I still find stuff like this awesome/cool/clever.

You'll find things like this incredibly useful when needing to do things like math on 16-bit values that are used for PPU RAM addressing (e.g. what ends up in $2006).


That part I think I might understand. I should hope so, because I'm doing a lot of pointer increments that way, as well as some 16-bit collision detection that I've decided not to use, as my game has no scrolling. :)

Am I right to believe that the carry from the first adc gets added onto the second one and then cleared?

I've been doing a lot of refactoring today and I'm soon onto testing replacing all my:

Code:
ldy #0
lda (some_pointer_lo), y
... do stuff with accumulator

lda some_pointer_lo    ; increment pointers
clc
adc #1
sta some_pointer_lo
lda some_pointer_hi
adc #0
sta some_pointer_hi

lda (some_pointer_lo), y
... do stuff again with accumulator


with:

Code:
ldy #0
lda (some_pointer_lo), y
... do stuff with accumulator

iny
lda (some_pointer_lo), y
... do stuff again with accumulator


If I've understood correctly, this should be possible as long as both pointer variables (in this case some_pointer_lo and some_pointer_hi) are on ZP and y never goes above 255?

Again, thank you so much for all the detailed explanations, they are very much appreciated!

Cheers!
Re: Best practices for instancing?
by on (#228136)
pwnskar wrote:
That part I think I might understand. I should hope so, because I'm doing a lot of pointer increments that way, as well as some 16-bit collision detection that I've decided not to use, as my game has no scrolling. :)

Am I right to believe that the carry from the first adc gets added onto the second one and then cleared?

Yup, correct! But it seems I made a catastrophic mistake in my previous code: I forgot an lda $15 before the adc #0. I've edited my post to fix that. Something easily overlooked when we get "used" to seeing a particular routine. Quite a major mistake though. :-)

Detailed explanation for future readers who aren't sure:

clc clears the carry. Next, the adc $14 gets executed. Here, a couple things happen. This is how I mentally envision it:

1. The CPU does: $fe (value in $14) + 4 (accumulator) + 0 (carry) == result of $102
2. Value $102 is too large for an 8-bit register, which is known as "unsigned overflow" in this particular case. Because of this, the carry flag c gets set
3. Likewise, in this situation, the two's complement math done does not contain an error, so the overflow flag v is clear. (We aren't using the overflow flag in this operation, so it's irrelevant, but I wanted to note that it does get cleared here -- two's complement math is one of the biggest struggling points there is on the 65xxx architecture, discussed heavily over the years on this forum)
4. The final result: the accumulator holds the value $02, and the carry flag is set

Next, sta $14 writes $02 to $14. Following that, we have lda $15, so the accumulator now holds $20. Next, we have adc #0. The same process as described above happens, except this is the resulting math:

1. The CPU does: 0 (value in operand) + $20 (value in accumulator) + 1 (carry) == result $21
2. Value $21 fits into an 8-bit register, so carry is clear. Likewise, overflow is also clear
3. The final result: accumulator holds the value $21, carry flag is clear, overflow flag is clear

Finally we do sta $15, which writes $21 to $15.

Thus, our 16-bit pointer at $14/$15 now contains the value $2102, which is exactly what we wanted: $20fe + 4 = $2102. In essence, we use the carry flag as a way to handle the "page wrap" (of our math) for us.

pwnskar wrote:
If I've understood correctly, this should be possible as long as both pointer variables (in this case some_pointer_lo and some_pointer_hi) are on ZP and y never goes above 255?

The former part of your sentence is correct: basically, ensure that some_pointer_lo is never located at ZP address $ff, otherwise this would cause the CPU to read the high byte of the effective address from $00, not $0100 like your brain might think.

The latter part of your sentence is incorrect: Y can safely be any value (0-255). The CPU, when adding Y to the effective address (calculated from reading the low byte of the address from some_pointer_lo and the high byte of the address from some_pointer_lo+1), can handle wrapping (ex. $20ff->$2100) just fine.

For example: in my previous post's first code block, if you changed ldy #5 to ldy #$ff, it would still work fine (the final effective address would be $ee34 + $ff == $ef33).

The short of it is: when working with ZP, always remember that ZP addressing stays within page 0 (the $00xx region, or $0000-00FF) and can never "wrap" into page 1 ($01xx, or $0100-01FF). Absolute addressing can/will page wrap, for nice/clean/linear 16-bit addressing, but there's 1 CPU cycle penalty when a page wrap happens.
Re: Best practices for instancing?
by on (#228157)
koitsu wrote:
pwnskar wrote:
If I've understood correctly, this should be possible as long as both pointer variables (in this case some_pointer_lo and some_pointer_hi) are on ZP and y never goes above 255?

The former part of your sentence is correct: basically, ensure that some_pointer_lo is never located at ZP address $ff, otherwise this would cause the CPU to read the high byte of the effective address from $00, not $0100 like your brain might think.

The latter part of your sentence is incorrect: Y can safely be any value (0-255). The CPU, when adding Y to the effective address (calculated from reading the low byte of the address from some_pointer_lo and the high byte of the address from some_pointer_lo+1), can handle wrapping (ex. $20ff->$2100) just fine.

For example: in my previous post's first code block, if you changed ldy #5 to ldy #$ff, it would still work fine (the final effective address would be $ee34 + $ff == $ef33).

The short of it is: when working with ZP, always remember that ZP addressing stays within page 0 (the $00xx region, or $0000-00FF) and can never "wrap" into page 1 ($01xx, or $0100-01FF). Absolute addressing can/will page wrap, for nice/clean/linear 16-bit addressing, but there's 1 CPU cycle penalty when a page wrap happens.


Yes, what I meant was it should work as long as I don't increment y past 255, because that would have it wrap back to a lower number. So if I've already incremented y to 255 and my pointers point to say $9000, the next time I do iny it would wrap around to 0 and I would end up reading the value of $9000 instead of $9100? Or would the carry from my iny be used when I do lda (some_pointer_lo), y ?
Re: Best practices for instancing?
by on (#228167)
pwnskar wrote:
Yes, what I meant was it should work as long as I don't increment y past 255, because that would have it wrap back to a lower number. So if I've already incremented y to 255 and my pointers point to say $9000, the next time I do iny it would wrap around to 0 and I would end up reading the value of $9000 instead of $9100? Or would the carry from my iny be used when I do lda (some_pointer_lo), y ?

First question: yes.

Second question: no. There's no way in that situation the effective address could ever be $9100 because registers on the 6502 are only 8-bit; Y can only range from $00 to $ff (0-255), thus $9000+Y can only range from $9000-90FF. Here's some code/memory to make it more clear. I made two separate pointers to show you what happens:

Code:
Memory contents:

$000a = $00
$000b = $90
$0010 = $50
$0011 = $90

$9000 = $aa
$9050 = $bb
$90ff = $cc
$914f = $dd

Code:

pointer1 = $0a
pointer2 = $10

ldy #$ff            ; Y = $ff (255)
lda (pointer1),y    ; Effective address is $9000+Y, hence $90ff, thus a value of $cc
iny                 ; Y = $00 (0)
lda (pointer1),y    ; Effective address is $9000+Y, hence $9000, thus a value of $aa

ldy #$ff            ; Y = $ff (255)
lda (pointer2),y    ; Effective address is $9050+Y, hence $914f, thus a value of $dd
iny                 ; Y = $00 (0)
lda (pointer2),y    ; Effective address is $9050+Y, hence $9050, thus a value of $bb

I don't think this is what you're asking, but: if you're asking if the iny opcode affects the carry flag, the answer is no -- it only affects the n (negative) and z (zero) CPU flags (a.k.a. P).

This opcode chart reports that "S" and "Z" get changed; for whatever reason, they call the negative flag (bit 7 of P) "S" for "sign bit", which is confusing because S usually refers to the stack pointer. *cringing*
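
If more than 256 bytes need to be read through one pointer, a common pattern is to bump the pointer's high byte whenever Y wraps back to 0. The sketch below assumes a hypothetical routine CopyPages, a ZP pointer src that has already been set up, and uses the PPU data port $2007 only as an example destination.

Code:
src = $0a              ; hypothetical ZP pointer (same location as pointer1 above)

; On entry: X = number of 256-byte pages to copy (1-255)
CopyPages:
  ldy #0
@loop:
  lda (src),y
  sta $2007            ; example destination
  iny
  bne @loop            ; Z is set only when Y wraps from $ff to $00
  inc src+1            ; advance the pointer to the next page
  dex
  bne @loop
  rts

Since iny leaves the carry alone, the wrap is detected through the zero flag. Note that the routine modifies the high byte of src.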