This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019 Download

Accessing an array of pointers in 6502

Accessing an array of pointers in 6502
by on (#238720)
I'm in the process of optimizing some of my C code with inline Assembly. And I'm struggling with the proper way of assigning a pointer variable by taking a value from an array that contains pointers.

Here is the relevant data:
Code:
typedef unsigned char byte;

const byte Data1[] = { 1, 2, 3 };
const byte Data2[] = { 4, 5, 6, 7 };
const byte Data3[] = { 8, 9, 10, 11, 12 };

const byte *const AllData[1000] = { Data1, Data2, Data3, /* ... */ };

const byte *CurrentData;
unsigned int CurrentDataIndex;


And I want to do this:
Code:
CurrentData = AllData[CurrentDataIndex];


So, not only does the array that I try to access have more than 256 (or 128) entries, so the access doesn't work with a simple
Code:
LDY Index
LDA (Array), Y
since the index can definitely be in the range of a two bytes value.

It's also the case that the values inside the array aren't simple single-byte values, but pointers, i.e. also two bytes big.


The code that is generated by cc65 doesn't help that much since it uses some internal functions like aslax1 and ldaxi.
If I replace these functions with their actual contents, the generated code is 22 lines long.
I assume the "canonical" manually-written way of doing this kind of array access is shorter.
Re: Accessing an array of pointers in 6502
by on (#238721)
I don't see many solutions besides creating a pointer using the base address of the array plus the offset, and using that to copy the final pointer to ZP, where you can use it:

Code:
clc
lda #<AllData
adc Offset+0
sta ArrayPointer+0
lda #>AllData
adc Offset+1
sta ArrayPointer+1
ldy #$0
lda (ArrayPointer), y
sta FinalPointer+0
iny
lda (ArrayPointer), y
sta FinalPointer+1

Not a quick procedure by any means...

I don't know what you need this for, but if speed is important here, maybe you should consider a more efficient data structure, or think of a different way to implement the functionality this is related to.

How frequently do you think you'll be running this code?
Re: Accessing an array of pointers in 6502
by on (#238722)
tokumaru wrote:
Not a quick procedure by any means...

Well, at least it looks quicker than the one generated by cc65.
I'll try it out.

tokumaru wrote:
I don't know what you need this for, but if speed is important here, maybe you should consider a more efficient data structure, or think of a different way to implement the functionality this is related to.

How frequently do you think you'll be running this code?

Speed isn't really that relevant here since this code is used rarely.

Although this is supposed to be a general purpose function for this kind of array access, so I'm of course looking for the best method regardless.


The reason why I need this:

My game has screen by screen scrolling.
Every screen is an array of byte values that are the data values for this screen.
So far, so good. A single screen can be accessed easily because it only has byte values and no screen will have more than 256 bytes.
So, as soon as my pointer to a single screen is set, I can do the simple [tt]LDA (Screen), Y[tt] stuff to get single items within the screen, like background or enemy placement.

But of course I have to be able to get the correct screen in the first place.
So, I need an array that stores the address to every screen, i.e. an array with two-byte pointer values.
And since I'll have more than 256 screens, the index variable for this array is also a two byte value.

Same situation for in-game texts: I have a whole bunch of arrays with the actual dialogs. But to be able to access dialog number 375, I need an array that has the addresses of all the dialog arrays.

So, what other data structure would you suggest?
Re: Accessing an array of pointers in 6502
by on (#238723)
tokumaru wrote:
Code:
clc
lda #<AllData
adc Offset+0
sta ArrayPointer+0
lda #>AllData
adc Offset+1
sta ArrayPointer+1
ldy #$0
lda (ArrayPointer), y
sta FinalPointer+0
iny
lda (ArrayPointer), y
sta FinalPointer+1

Are you sure this is correct? You take the value from AllData, add the offset and store it to ArrayPointer.
However, since AllData contains pointers, i.e. two-byte values, the offset has to be doubled first:

AllData[25] in C is actually AllData + 50 in assembly because each entry in AllData is two bytes.

Or am I overlooking something here?
Re: Accessing an array of pointers in 6502
by on (#238724)
Wait... You're using a 2-byte index to look-up a 2-byte value? Why not store the 2-byte value (pointer) directly and get rid of the array of pointers altogether? I assume that you're not using all 16 bits to index the array, is that it?
Re: Accessing an array of pointers in 6502
by on (#238725)
If you have a good way of doing this, sure.

Here's the data again:
Code:
const byte Screen1[] = { 1, 2, 3 };
const byte Screen2[] = { 4, 5, 6, 7 };
const byte Screen3[] = { 8, 9, 10, 11, 12 };
/* ... (more screens) */

How do I get the address of any of these arrays into a pointer without using this:
Code:
const byte *const AllScreens[1000] = { Screen1, Screen2, Screen3, /* ... */ };
?

How do I manage to get my const byte *CurrentScreen pointer to point to the start address of, for example, the 87th screen?
Re: Accessing an array of pointers in 6502
by on (#238727)
DRW wrote:
However, since AllData contains pointers, i.e. two-byte values, the offset has to be doubled first

Yeah, I assumed it'd be doubled in advance, or already stored doubled. If you have to double it on the fly, you'll need more code.

If this index comes from a place where it's mixed with other bit fields, you could make it so it starts from bit 1 up, with some other kind of information using bit 0, so a simple AND operation could be used to mask off the irrelevant bits and result in an already doubled offset. It it isn't mixed with bit fields and uses the whole 16 bits, you can store the value however you want.
Re: Accessing an array of pointers in 6502
by on (#238728)
Where are your (16-bit?) screen numbers coming from? The only way to avoid an array to go from Screen Number to Address of Screen is to use a direct pointer everywhere.

But how often are you converting from screen number to screen address anyway? If there's anything I learned from optimization, it's that there is Fast Code and Slow Code. Fast code is code that absolutely needs to be fast, and is found inside a loop that executes thousands of times. Slow Code is code that executes less frequently, such as only once, or once per level, or something like that. I'd just eat the cost of a lookup if it's not something done at least once per frame.
Re: Accessing an array of pointers in 6502
by on (#238729)
DRW wrote:
How do I manage to get my const byte *CurrentScreen pointer to point to the start address of, for example, the 87th screen?

How are you piecing these 1000 screens together? Do you have a game map made of indices between 0 and 999? What I'm suggesting is, instead of building the map with 16-bit screen indices, build it with 16-bit screen pointers, and get rid of the AllScreens array altogether. There's no need to order the screens if you're referencing them directly.
Re: Accessing an array of pointers in 6502
by on (#238730)
Dwedit wrote:
I'd just eat the cost of a lookup if it's not something done at least once per frame.

If you're not using all 16 bits for indexing, sure, I agree, but if you are, it makes absolutely no sense to index something that's as big as the index itself! You'd just be wasting time (with the unnecessary indirection) and space (with the unnecessary list of pointers).
Re: Accessing an array of pointers in 6502
by on (#238731)
tokumaru wrote:
Yeah, I assumed it'd be doubled in advance, or already stored doubled.

Doubling the index is one of the things that makes the code bigger. Since the index is an uint, it's not just a simple ASL.

That's why I posted this single line of C code:
Code:
CurrentData = AllData[CurrentDataIndex];

I need to do whatever this code is accomplishing. This includes doubling the value of CurrentDataIndex because that's what C does here.

If the code doesn't become much smaller than the C implementation, I can just leave it in C.
But I was thinking that the cc65 compiler maybe bloats the code and a handwritten version is much smaller.
(Just like the compiler needlessly copies your pointer to ptr1 whenever you try to access an array through a pointer, even if your own pointer is already in the zeropage.
Or if you access two arrays one after another with the same index variable: The compiler will not notice that it doesn't need to do another LDY Index again if Index still has the same value as before.)

tokumaru wrote:
If you have to double it on the fly, you'll need more code.

Yeah, it has to be doubled on the fly. After all, this is the screen ID, a dedicated piece of information that controls the bank switching and which little square on the map is blinking etc.
I wouldn't want it doubled in advance just to acommodate to some array storage internals.
The ID belongs to the actual game logic: "This is screen number 439."
That the start address for this screen is stored in AllScreens + 878 should only come into play when AllScreens is actually used, but CurrentScreenIndex shouldn't hold the value 878 natively.

Dwedit wrote:
Where are your (16-bit?) screen numbers coming from?

The screen numbers come from their position in the pointer array:

Code:
const byte Screen0[] = { 1, 2, 3 };
const byte Screen1[] = { 4, 5, 6, 7 };
const byte Screen2[] = { 8, 9, 10, 11, 12 };
const byte Screen3[] = { /* ... */ };
const byte Screen4[] = { /* ... */ };
const byte Screen5[] = { /* ... */ };
/* ... (more screens) */

const byte *const AllScreens[1000] = { Screen0, Screen1, Screen2, Screen3, Screen4, Screen5, /* ... */ };

In this example, if the index is 4, then AllScreens[Index] will load Screen4 simply because that's the address stored at position 4.

How do I know which screen to load when the hero leaves the current screen? Simple:
If he goes right, then CurrentScreenIndex = CurrentScreenIndex + 1.
Left: -1.
Down: +20.
Up: -20.

Dwedit wrote:
The only way to avoid an array to go from Screen Number to Address of Screen is to use a direct pointer everywhere.

I don't know what you mean. I cannot write CurrentScreen = Screen487 (or CurrentScreen = ScreenKingsCastle) unless I make a switch statement with a few hundred cases.

Dwedit wrote:
But how often are you converting from screen number to screen address anyway?

It's not that important in my current case. But it's a general purpose question, so I'm curious about it anyway.

If you have issues with the pointers, just imagine that I have an array of unsigned integers and the array can be larger than 256 bytes:
How would I access the array in an efficient way?

Here, this example is basically the same thing from an assembly code point of view:
Code:
enum EnemyType { Slime, Spider, Bat, Golem, Medusa ... /* 300 more enemy types */ };

const uint EnemyMaxEnergies[] =
{
    50, /* Slime */
    100, /* Spider */
    120, /* Bat */
    500, /* Golem */
    2000, /* Medusa */
    ... /* 300 more */
};

void ProcessEnemy(byte enemiesSlot)
{
    static uint enemyType;
    static uint maxEnergy;

    enemyType = AllActiveEnemies[enemiesSlot].Type;
    maxEnergy = EnemyMaxEnergies[enemyType];
}


How would one implement
Code:
maxEnergy = EnemyMaxEnergies[enemyType]

in Assembly if maxEnergy is a two byte unsigned integer value, but the number of enemies is also larger than 256, so the array index is also unsigned int?

tokumaru wrote:
What I'm suggesting is, instead of building the map with 16-bit screen indices, build it with 16-bit screen pointers, and get rid of the AllScreens array altogether. There's no need to order the screens if you're referencing them directly.

I'm still curious how you would actually accomplish this. How shall I access the pointer directly? Let's say I have a screen named ScreenTown. Then I have a screen named ScreenTownEntrance.
When I'm in ScreenTownEntrance and I walk right, how exactly shall the program know that it has to load the ScreenTown pointer now? Where is this information stored?

In my case: It is stored in an AllScreens array. The screens in this array are laid out like in a rectangle. The top left screen is the first entry, the bottom right screen is the last entry.
When I move up, down, left right along the screens, I simply change my index to -20, +20, -1 or +1 respectively.


As far as I see, accessing pointers directly only works in two cases:

1. All screen data has the same size. Then you can add and subtract the screen size to the pointer. But my screens have a variable size.
2. If each screen stores the addresses of the four neighboring screens. Which is a total waste of space since the screens are laid out in a rectangle anyway.
Re: Accessing an array of pointers in 6502
by on (#238736)
Quote:
The screen numbers come from their position in the pointer array

I know what the numbers *mean*, my question was "where are you using them"? Where you get these numbers from right before having to use them as offsets to look screen pointers up. Do they come from a level/world map? My suggestion is to change the level/world map so it doesn't use these indices, but the screen pointers directly. I'm not very good with C, so pardon me if something is really wrong, but I mean something like this:

Code:
//instead of this:
const unsigned int WorldMap[] = {0, 4, 33, 999, 48, 17 /*...*/};
//do this:
const byte *const WorldMap[] = {Screen0, Screen4, Screen33, Screen999, Screen48, Screen17 /*...*/}
Re: Accessing an array of pointers in 6502
by on (#238737)
tokumaru wrote:
Code:
//instead of this:
const unsigned int WorldMap[] = {0, 4, 33, 999, 48, 17 /*...*/};
//do this:
const byte *const WorldMap[] = {Screen0, Screen4, Screen33, Screen999, Screen48, Screen17 /*...*/}

Erm, that's exactly what I'm doing, right from the very first post:
Code:
const byte *const AllData[1000] = { Data1, Data2, Data3, /* ... */ };

You see that I do not have an unsigned int array? I do have an array of pointers (const byte *const), just like you suggested.

Also, my screen numbers aren't arbitrary like in your first example. In my case, screen number 25 means the 25th entry in the const byte *const AllData[] array.
Re: Accessing an array of pointers in 6502
by on (#238738)
Yeah, I first thought that AllScreens was just a means of ordering the screens for indexing in the world map, but I read your post again and it turns out that AllScreens *IS* your world map (the array name didn't make this very clear)... In that case, my suggestion is you don't calculate the pointer from scratch each time. Calculate the value of the pointer when the level starts, then, as the player moves, just add -2 (moving left), 2 (moving right), -40 (moving up) or 40 (moving down). If that's not possible, due to having to mimic exactly what the C code would do, then yeah, there's not much to gain if you have to calculate the whole thing every time. Since this is not done often, i'd just use C.

Sometimes, the speed of assembly doesn't come from the fact that you can do the exact same calculations using less instructions, but from the fact that you can cache and reuse partial results due to not being constrained by the constructs of a high level language.
Re: Accessing an array of pointers in 6502
by on (#238740)
tokumaru wrote:
In that case, my suggestion is you don't calculate the pointer from scratch each time. Calculate the value of the pointer when the level starts, then, as the player moves, just add -2 (moving left), 2 (moving right), -40 (moving up) or 40 (moving down).

I don't think this will gain a lot.

When I increment my CurrentScreen pointer, this wouldn't move the pointer from Screen25 to Screen26, it would simply move the pointer from Screen25 to two bytes after Screen25.

Therefore, I'd have to use an in-between pointer that points to the address of AllScreens + an offset.

And then I'd have to let CurrentScreen point to the value of this in-between pointer.

I doubt that this would really be much less code. And even if it was, that's too much of a hassle.


But let's forget about the pointers.

Since the whole pointer stuff seems to be a bit confusing, let's do the example again that sounds a bit simpler, but that would be exactly the same code in assembly:


I have an array of unsigned integers.
(In assembly: I have an array of words.)

This array has, let's say, 300 entries, i.e. it's 600 bytes large:
Code:
In C:
const unsigned int Data[300] = { 25, 375, 89, 500, 1099 ... };

In Assembly:
Data: .word 25, 375, 89, 500, 1099 ...


I have two variables:
Code:
In C:
unsigned int Value;
unsigned int Index;

In Assembly:
Value: .res 2
Index: .res 2


How do I do this in Assembly?
Code:
Value = Data[Index];

I assume this shouldn't be too much of an unusual situation, right? Reading a number from an array and, unfortunately, the number itself as well as the array offset can be larger than 255.

So, what's the way to do this?

(And no, you cannot pre-buffer the current (array + offset) address and then simply increment and decrement from this position. The index variable can be a new arbitrary value each time the array is accessed.)
Re: Accessing an array of pointers in 6502
by on (#238750)
Unfortunately, in the situation you proposed, the answer is the same, since the list has over 256 elements and each one is 2 bytes. The 6502 is an 8-bit CPU after all, so it's not surprising that it's not particularly swift when dealing with more than 256 of anything.

What we do in assembly is we often try to avoid these problematic situations. You may have over 1000 screens, but since you don't need random access to all of them at any given time, you can resort to some sort of "banking", and change which set of 256 elements or less is visible at any given time as you go (totally not worth the trouble for something that only happens once every several seconds, but you get the point). It's true that by breaking up large amounts of data into smaller chunks you increase the path needed to reach it (there's more indirection), but as long as you don't need full random access, you don't need to reconstruct the entire path on each access, just the final step, which's ideally comprised of 8-bit operations. Whenever a big change happens (new screen, new level, etc.), you might need to recalculate the earlier sections of these paths, but that will happen way less frequently.

And even if you need total random access to a large data set, you can still optimize the most common cases, maybe by placing the 128 most commonly accessed elements first, and only doing the slower type of look-up when the index is larger than 127.

Also, when dealing with 16-bit data, in assembly it's common for the data to be split into two arrays of 8-bit values, one with the low bytes and one with the high bytes, so that you don't lose any range (an 8-bit index can still still access 256 elements) or time doubling the index.

The thing is that in assembly, we usually have more control over how we store or stuff, so we try to pick the most suitable schemes for each type of data, as opposed to always using linear arrays for everything. I don't know if the same approach is feasible in C, or if a C programmer would even want to spend time on that kind of optimization...
Re: Accessing an array of pointers in 6502
by on (#238753)
Well anyway, the generic problem of 16-bit base address and 16-bit word offset looks like this:

Double the offset (because the data type inside is 16-bit and not 8-bit)
Add to base address
Load a 16-bit word

In 6502, it's something like this:
Code:
lda indexLow
asl A
sta tempLow
lda indexHigh
rol A
sta tempHigh
clc
lda tempLow
adc #baseLow
sta tempLow
lda tempHigh
adc #baseHigh
sta tempHigh
;now tempLow/tempHigh contains the address of the pointer to load.
ldy #0
lda (tempLow),y
sta pointerLow
iny
lda (tempLow),y
sta pointerHigh

If you are able to double the number ahead of time, you can do this instead:
Code:
clc
lda indexLow
adc #baseLow
sta tempLow
lda indexHigh
adc #baseHigh
sta tempHigh
ldy #0
lda (tempLow),y
sta pointerLow
iny
lda (tempLow),y
sta pointerHigh
Re: Accessing an array of pointers in 6502
by on (#238759)
Yeah, maybe splitting it into two or more arrays would be better. However, indeed, stuff like this is never time-critical since I only have two of these arrays anyway: Screens and dialogs. Every other item in the game (for example character types, palette data etc.) should come in units of less than 128.

But nevertheless, it's still a good thing to optimize whatever the compiler produces here.

@Dwedit: Thanks for the code. I'll have a look at it later.
Re: Accessing an array of pointers in 6502
by on (#238762)
The compiler's code for this particular case is not bad. It can only be optimized by the things you listed, the redundant Y load and using your own pointer directly.
Re: Accessing an array of pointers in 6502
by on (#238768)
If CurrentDataIndex cannot exceed 127:
Code:
  lda _CurrentDataIndex
  asl a
  tax
  lda _AllData+0,x
  sta _CurrentData+0
  lda _AllData+1,x
  sta _CurrentData+1


Otherwise, this is how I'd write it:
Code:
  ; cc65 makes 18 bytes of ZP variables available for callees' use:
  ; ptr1-ptr4 (2 bytes each), tmp1-tmp4 (1 byte each),
  ; regsave (4 bytes), and sreg (2 bytes)
  ; https://github.com/cc65/wiki/wiki/Using-runtime-zeropage-locations-in-assembly-language
  .importzp ptr1

  ; Multiply map position (20*Y+X) by
  ; sizeof(const byte *) which equals 2
  lda _CurrentDataIndex+0
  asl a
  tay         ; low byte goes to Y
  lda _CurrentDataIndex+1
  rol a
  sta ptr1+1  ; high byte goes here

  ; Add the base address of world map array
  lda #<AllData
  sta ptr1+0
  clc
  lda ptr1+1
  adc #>AllData
  sta ptr1+1

  ; Read the pointer from the world map
  lda (ptr1),y
  sta _CurrentData+0
  iny
  lda (ptr1),y
  sta _CurrentData+1
Re: Accessing an array of pointers in 6502
by on (#238769)
Yeah, you can save a little bit of time by putting the low byte of the doubled offset in Y and letting the CPU's indexing do the addition for you, instead of calculating the compete pointer and loading Y with 0, but that's probably as far as you can go when optimizing the generic case.

You can probably get rid of that CLC too, since the index can only go up to 32767, so the carry would always be clear after shifting that left. Unless you're using that bit for something else, of course.
Re: Accessing an array of pointers in 6502
by on (#239125)
Thanks for the generic function and the optimization.

It turns out that cc65 produces even smaller code than the handwritten Assembly version due to using some internal subroutines (ldaxi, aslax).
Since speed is not an issue (I only use this kind of pointer array access with interger index during the screen buildup) and therefore, I can swallow two additional JSR/RTS, I will simply keep the C version for less ROM space usage after all.

But it was still very informative. (For example, I now know how to double an unsigned integer variable in Assembly.)