This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Sprite data caching or reuse?
by on (#211684)
So, after some profiling I've realized that a lot of my precious CPU time is being spent on calculating sprites. Not just the reads and writes, but also all the meta stuff, picking the right palette, doing offsets to read out animation data, flipping the sprite if it's turned around, etc. I got a neat animation system that I'm very happy with, but it's a little costly.

I've been optimizing all of this to be as fast and clever as possible, but the one thing that really irks me is that most objects' sprites end up being exactly the same each frame, so I'm constantly recalculating the same results over and over.

So I've been trying to implement some sort of caching mechanism, so that I don't have to recalculate everything if nothing has changed. I usually know when things have changed (changed object state, scrolled the background, etc) so I know exactly when to reuse the cache and when to refresh it.

But building an efficient and lightweight cache has proved difficult.

The simplest and fastest way would be to simply reuse my DMA's Sprite RAM, and not clear it every frame. Maybe adjusting some x and y positions if the game has scrolled. But, all the sprite flickering techniques I know involve scrambling the order of the sprites every frame, which means no object ever gets the same Sprite RAM position twice in a row. This ruins everything.

Next up I tried allocating some more RAM as a temporary buffer, so that objects could put their sprite data there, and then it could be copy-scrambled over to the real Sprite RAM right before the DMA. But, to allow for all 64 sprites that's 256 bytes of RAM down the drain. Ouch. Not sure I want to spend that much memory.

According to the wiki, a "simple OAM cycling technique" can be implemented by using a write to OAMADDR before the DMA transfer. However, due to OAMADDR writes also having a "corruption" effect this technique is not recommended. Also, if the technique works like how I think it does, the OAM cycling would be very crude and might leave objects invisible for several frames.

So, is my quest impossible, or are there some other ideas or techniques? :D
Re: Sprite data caching or reuse?
by on (#211686)
Post some code? It's surprising to me that sprites would be so expensive.

Quote:
But, all the sprite flickering techniques I know involve scrambling the order of the sprites every frame

I've never implemented flickering, but I thought you could write OAMADDR before OAMDMA to do the shuffle on the DMA. I don't think you have to reorder the sprites in CPU RAM, but maybe I'm wrong.
Re: Sprite data caching or reuse?
by on (#211687)
Metasprite rendering continues to be the bottleneck for me, too. There are some improvements I could make, but the biggest I got so far was to simply move to 8x16 sprites, halving the number of iterations the meta sprite drawing routines must do. That's been good enough for now and has given me the performance I want for the game I'm building.
Re: Sprite data caching or reuse?
by on (#211688)
pubby wrote:
Post some code? It's surprising to me that sprites would be so expensive.

I'll write up an explanation, it's a tad complex so it might take some minutes.

Quote:
I've never implemented flickering, but I thought you could write OAMADDR before OAMDMA to do the shuffle on the DMA. I don't think you have to reorder the sprites in CPU RAM, but maybe I'm wrong.


The wiki recommends against this technique, but maybe it's too conservative? Does anybody know the ups and downs in more detail?
Re: Sprite data caching or reuse?
by on (#211689)
Quote:
But, all the sprite flickering techniques I know involve scrambling the order of the sprites every frame, which means no object ever gets the same Sprite RAM position twice in a row. This ruins everything.

Well, as usual in computing and in particular in retro-computing, you have to sacrifice things in order to get the desired features. You should just have two OAM pages, one where the sprites are not shuffled, which is your cache, and one where you shuffle the sprites from the cache so they're re-ordered and flicker properly when there are more than 8 per line instead of disappearing. That sounds rather simple to do.
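
A minimal sketch of the two-page idea, assuming a cache page at $0300, a DMA page at $0200, and a zero-page counter called shuffle_start. None of these names or addresses come from the thread. Objects keep fixed slots in the cache, and the copy drops the cache into the DMA page at a start offset that advances every frame, so sprite priority rotates:

```asm
; Hypothetical layout, not taken from the thread.
oam_cache     = $0300   ; unshuffled sprites; each object keeps a fixed slot
oam_dma       = $0200   ; shuffled copy, the page sent through $4014
shuffle_start = $10     ; zero page: where cache slot 0 lands this frame
SHUFFLE_STEP  = 68      ; 4 * 17: a multiple of 4 keeps entries aligned,
                        ; and 17 is odd so all 64 start slots get visited

shuffle_copy:
   lda <shuffle_start
   clc
   adc #SHUFFLE_STEP
   sta <shuffle_start
   tay                  ; Y = destination index, wraps around at 256
   ldx #0               ; X = source index
shuffle_copy_loop:
   lda oam_cache,x
   sta oam_dma,y
   iny
   inx                  ; inx comes last so bne tests X
   bne shuffle_copy_loop
   rts
```

A plain rotation like this only changes which sprite comes first, so it is the crudest form of shuffle. In this rolled form the loop costs about 16 cycles per byte, so a real version would probably be unrolled or would copy only the entries in use.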

Quote:
Ouch. Not sure I want to spend that much memory.

Well, that's the price for your sprite caching system. You can save memory by caching only some of the 4 parameters if RAM usage is really this much of a problem.
Re: Sprite data caching or reuse?
by on (#211690)
Sprite updates are indeed pretty expensive. Some switch to 8x16 sprites purely so less time is spent rendering an object.

If your game has larger objects, having a separate render routine when you know the object is entirely onscreen can skip a lot of extra logic for checking offscreen per sprite. (Alternatively... don't check for offscreen per sprite.)

Code:
.macro DMSNORMALBODY
   ;35 bytes
   iny
   
   lda [reserved4],y;Y position
   clc
   adc <reserved2
   sta OAM,x

   iny
   
   lda [reserved4],y;X Position
   ;clc
   adc <reserved0
   sta OAM+3,x
   
   iny

   
   lda [reserved4],y
   ;clc
   adc <reserved7;This should guarantee a clear carry
   sta OAM+1,x
   iny
   
   lda [reserved4],y
   sta OAM+2,x
   
   txa
   ;clc;guaranteed clear above
   adc #4
   tax;Carry not guaranteed anything after that add since oampos wraps
   .endm

Reserved0 is low X, reserved2 is low Y, reserved7 is a tile offset. You can totally get rid of the tile offset add if the tiles used to render your object don't "move" in CHR.

Checking offscreen is not much harder, but you lose the guarantee of the clear carry.
Code:
dms.partial.o.loop:;{
   iny
   
   lda [reserved4],y;Y position
   clc
   adc <reserved2
   sta OAM,x
   
   lda <reserved3
   adc #$00
   bne dms.partial.o.yoffscreen

   iny
   
   lda [reserved4],y;X Position
   clc
   adc <reserved0
   sta OAM+3,x
   
   lda <reserved1
   adc #$00
   bne dms.partial.o.skipsprite.twoiny
   
   iny

   
   lda [reserved4],y
   clc
   adc <reserved7
   sta OAM+1,x
   iny
   
   lda [reserved4],y
   and #%11111100;Clear out the palette
   ora <reserved8
   sta OAM+2,x
   
   txa
   ;clc;guaranteed clear above
   adc #4
   tax;Carry not guaranteed anything after that add since oampos wraps

   dec <reserved6
   bne dms.partial.o.loop
dms.partial.o.end:
   rts
dms.partial.o.yoffscreen:
   iny;move to next OAM entry
dms.partial.o.skipsprite.twoiny:
   iny
   iny
   
   lda #$FF
   sta OAM,x;Set it offscreen

   ;Just repeat what we'd branch to and save a branch
   dec <reserved6
   bne dms.partial.o.loop
   rts;}

Reserved1 is high X, reserved3 is high Y. Reserved6 is how many sprites are left.

But I made this note as an optimization (untested, so I'm not including it with the code I know works)
Quote:
Indivisible's rolled drawmetasprite loop can probably be made faster. They end with

dec <reserved6
bne dms.o.loop

But:
cpy <reserved6; (or some other zero page variable, since reserved6 can't really be changed
bne dms.o.loop; without making the other code slower)

cpy is 2 cycles faster than dec, but it also ensures a clear carry when the loop begins again.

Basically the setup code should just add <reserved6 *4 to y and store it somewhere. I imagine the reason I didn't is because reserved6 is technically variable (due to the greater than 64 sprite stuff), but it wouldn't really affect this if the loop were set up properly.
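
As a rough illustration of the note above, here is how the fast on-screen loop might look with the cpy ending. It is untested, like the note itself. dms_end and the zero-page addresses are assumptions made for this sketch; the other names follow the routines earlier in the post. It assumes at least one sprite, a fully on-screen object (so the position adds never carry), and a data index that does not wrap past 255.

```asm
; Hypothetical addresses; only the names reserved0-7 and OAM come from the post.
reserved0 = $00   ; low X of the object
reserved2 = $02   ; low Y of the object
reserved4 = $04   ; 16-bit pointer to the metasprite data
reserved6 = $06   ; number of sprites to draw
reserved7 = $07   ; tile offset
dms_end   = $09   ; hypothetical: value Y holds after the last sprite
OAM       = $0200

dms_cpy_setup:
   lda <reserved6
   asl a
   asl a               ; sprite count * 4 bytes per sprite
   sty <dms_end
   clc
   adc <dms_end        ; + starting data index
   sta <dms_end        ; carry is clear here unless the index wrapped
dms_cpy_loop:
   iny
   lda [reserved4],y   ; Y position
   adc <reserved2      ; no clc: carry was cleared by setup or by cpy
   sta OAM,x
   iny
   lda [reserved4],y   ; X position
   adc <reserved0
   sta OAM+3,x
   iny
   lda [reserved4],y   ; tile
   adc <reserved7
   sta OAM+1,x
   iny
   lda [reserved4],y   ; attributes
   sta OAM+2,x
   txa
   adc #4
   tax                 ; carry may be set here if the OAM index wrapped
   cpy <dms_end        ; Y < dms_end clears carry, Y = dms_end ends the loop
   bne dms_cpy_loop
   rts
```

Compared with the dec/bne ending, this drops the clc at the top of the loop and replaces a 5-cycle dec with a 3-cycle cpy.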


I also have a separate subroutine when I want to do "versatile" things like dynamically changing the palette of every sprite in the object.
Edit: Oh wait, no it's just the one above. That's what the reserved8 thing is. So basically I have a fast and a slow one.

Basically I recommend having different routines for every case. Usually you don't want to do anything advanced, so at least have one for the fastest possible case. (Guaranteed on screen, no dynamic anything.)

But post some of your advanced code, maybe we can improve it.
Re: Sprite data caching or reuse?
by on (#211691)
The use of OAMADDR with values other than zero is heavily discouraged, since that can result in sprite corruption.

The best method for caching sprites I can think of is indeed using another 256 bytes for a second OAM shadow, so you can alternate between them every frame and copy data from one to the other if the sprites are known to not have changed.

What kind of sprite cycling method are you currently using? Are you willing to change that to accommodate the sprite caching? Maybe you can come up with a solution that swaps individual OAM entries when they need to be kept, and simply overwrites the ones that don't.
Re: Sprite data caching or reuse?
by on (#211692)
Bregalad wrote:
Quote:
Ouch. Not sure I want to spend that much memory.

Well, that's the price for your sprite caching system. You can save memory by caching only some of the 4 parameters if RAM usage is really this much of a problem.


That's fair enough, and it's probably what I will fall back to if no other secret technique pops up. I've been playing around with doing an "in place" shuffle of the original Sprite RAM so that objects have a new position in the buffer every frame, yet still retain their old values. The shuffling process is fairly expensive though, since you gotta shuffle 256 different values.
Re: Sprite data caching or reuse?
by on (#211693)
pubby wrote:
Post some code? It's surprising to me that sprites would be so expensive.


Now, I haven't posted any code in this post. It's not like my code is secret, but I'd much rather explain what it does (and why it's slow) instead of posting a big blob of asm and forcing everybody to decrypt what's going on. Also, I haven't commented it yet :oops:

Some games like Super Mario Bros 3 have a lot of restrictions on what game objects can exist where, doing tricks like hardcoding the available palettes and CHR banks to the level. So if this is a Goomba+Koopa Troopa level, you simply can't use the Boo or Thwomp enemies or they will look strange and miscolored, and vice versa.

I've been working on a system to defeat such restrictions, by dynamically loading and unloading CHR and palettes as they are needed. The way things work, when an ingame object is created, it attempts to grab an 8k sprite CHR page and a palette for itself. I use a lot of techniques to maximize their potential like reusing as much as possible, having optional alternative graphics and color schemes, and even splitting palettes in two (and if it's just utterly impossible to fit in, the object simply despawns before it's seen).

But all of this only happens when the object is created, not every frame, so it's not the expensive part. But the point is, any object might end up with any of the CHR pages or any of the palettes. So, while SMB3 can optimize its Koopa Troopa drawing routine by always referring to palette 2, my system has to do a lookup to see which of the four palettes my object was assigned.

Then there is animation data, and some extra goodies I've baked in there like allowing small x/y offsets on the sprites or vflip/hflip flags on individual sprites in the meta-sprite. Despite having so much more stuff than SMB3, my system is still faster due to better coding.

Still, I can see the potential for massive gains by reusing those values rather than having to recalculate them every frame.
Re: Sprite data caching or reuse?
by on (#211694)
Kasumi wrote:
Sprite updates are indeed pretty expensive...


Thanks for the routines, I'll compare them to my own and see if there are any places I can shave off some cycles.

I do indeed have different drawing routines, with varying levels of functionality. Objects, in their initialization routine, pick one that they know will be enough for them.
Re: Sprite data caching or reuse?
by on (#211695)
Quote:
Then there is animation data, and some extra goodies I've baked in there like allowing small x/y offsets on the sprites or vflip/hflip flags on individual sprites in the meta-sprite.

That shouldn't really affect rendering at all. Whether an individual sprite is flipped or not doesn't matter to the block copy, whether an individual sprite is offset a little doesn't matter to the block copy.

Is the issue that you're also trying to save space? I stored every frame twice, once flipped and once not, rather than flipping it at runtime and I don't feel bad about it.
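
To make the "store every frame twice" idea concrete, here is a small sketch with made-up data. obj_facing, the table names, the tile numbers and the addresses are all hypothetical. The byte order (Y, X, tile, attributes) follows the drawing routines posted earlier, and any header bytes a real format carries are left out.

```asm
; Hypothetical names and data, for illustration only.
reserved4  = $04           ; pointer the draw routine reads through
obj_facing = $20           ; hypothetical: 0 = facing right, 1 = facing left

select_frame:
   lda <obj_facing
   asl a                   ; 2 bytes per table entry
   tax
   lda walk0_table,x
   sta <reserved4
   lda walk0_table+1,x
   sta <reserved4+1
   rts

walk0_table:
   .dw walk0_right, walk0_left

walk0_right:               ; two 8x8 sprites side by side
   .db $00, $00, $10, $00  ; left half
   .db $00, $08, $11, $00  ; right half
walk0_left:                ; the same frame, mirrored when the data is built
   .db $00, $08, $10, $40  ; X offsets swapped, bit 6 = horizontal flip
   .db $00, $00, $11, $40
```

The draw loop then copies either version without touching the flip bits, at the cost of storing each frame's sprite list twice in ROM.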
Re: Sprite data caching or reuse?
by on (#211696)
Kasumi wrote:
That shouldn't really affect rendering at all. Whether an individual sprite is flipped or not doesn't matter to the block copy, whether an individual sprite is offset a little doesn't matter to the block copy.


The thing is, I don't block copy my animation data right to the sprite buffer, I do stuff like XOR the global flip flags with the individual flip flags, and add the global x/y coordinate with the sprite's local x/y offset. I also have to fish out the correct palette since it's not hardcoded.

Quote:
Is the issue that you're also trying to save space? I stored every frame twice, once flipped and once not, rather than flipping it at runtime and I don't feel bad about it.


Huh....I hadn't thought about that. That's genius! It totally saves me the XOR of the flip flags. I could even use a macro, and potentially do it for other things too. Thanks mate!
Re: Sprite data caching or reuse?
by on (#211698)
Quote:
I also have to fish out the correct palette since it's not hardcoded.

The palette thing is easy, if the object only uses one palette. (Which, if you're dynamically allocating palettes is probably the common case) It's one instruction:
Code:
lda [reserved4],y
ora SPRpalette
sta OAM+2,x

You store the palette the object wants to SPRpalette before rendering and lose 3 cycles per sprite, oh well.

In a case where palette 0 is like... a shared palette (reserved for player one or something, that enemies can also use)... you could maybe get cute with the data and store it shifted right one bit.

Now the highest bit is free to use as a flag for that.
Essentially:
Bit 7: Use Palette 0
Bit 6: Flip Sprite Vertically
Bit 5: Flip Sprite Horizontally
etc.
Code:
lda [reserved4],y
asl a;Whether to use palette 0 is now in the carry, flip sprite vertically is in bit 7 where OAM expects it, etc.
bcs storepalette;If the high bit was set, we use palette zero
ora SPRpalette
storepalette:
sta OAM+2,x

Admittedly that's still a bit heavy 64 times, but well...
Edit: Just to say it, I'm not sure how much help caching sprite data will be in a game that scrolls. But I'll think about it.
Re: Sprite data caching or reuse?
by on (#211699)
Kasumi wrote:
The palette thing is easy, if the object only uses one palette. (Which, if you're dynamically allocating palettes is probably the common case) It's one instruction.


I am saving the palette per sprite per frame, for the entire metasprite, but in hindsight it's just as you say, maybe 90% of all metasprites use only one palette. I'll make a faster drawing routine they can use instead of the full one, that doesn't need to look up the palette byte each time. :beer:

Quote:
Edit: Just to say it, I'm not sure how much help caching sprite data will be in a game that scrolls. But I'll think about it.


That's a good point, but I think it could be worked around since scrolling most of the time only happens on one axis even for games with multi-directional scrolling like SMB3. You'd simply have to loop over every 4th byte and do an addition :D The carry flag will tell you if the sprite is now off-screen.
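
A minimal sketch of that "loop over every 4th byte" pass, covering one case only: the camera moved right by scroll_dx pixels, so every sprite moves left. scroll_dx and its address are assumptions, and a subtraction is used here, so the borrow (carry clear) is what signals a sprite crossing the left edge.

```asm
; Hypothetical names; handles a scroll to the right only.
OAM       = $0200
scroll_dx = $11            ; zero page: pixels scrolled this frame

scroll_sprites_left:
   ldx #0
scroll_left_loop:
   lda OAM+3,x             ; X position is byte 3 of each OAM entry
   sec
   sbc <scroll_dx
   bcs scroll_left_keep    ; carry still set: no borrow, still on screen
   lda #$FF
   sta OAM,x               ; borrow: crossed the left edge, hide with Y = $FF
   bne scroll_left_next    ; always taken, A is $FF
scroll_left_keep:
   sta OAM+3,x
scroll_left_next:
   inx
   inx
   inx
   inx
   bne scroll_left_loop    ; X wraps to 0 after all 64 entries
   rts
```

Hiding a sprite this way overwrites its Y position, so it cannot be brought back by a scroll in the opposite direction without redrawing the object.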

But even if it's not viable, just being able to reuse the pattern and attribute bytes would still be a boon.
Re: Sprite data caching or reuse?
by on (#211702)
For each object, do you have a register holding the palette bits?
Re: Sprite data caching or reuse?
by on (#211703)
psycopathicteen wrote:
For each object, do you have a register holding the palette bits?


Yup! It's assigned when the object spawns. An object can actually "claim" up to three palettes.

Furthermore, in the animation data for each object, for each frame, each individual sprite has 2 bits to tell what palette that sprite uses. However, because of my dynamic palette allocation it becomes a bit tricky, because even if the animation data says that the first four sprites all use palette 0, it might be that the game scene was pretty crowded when this object spawned so it was assigned palette 3.

That means I have to translate the "frame palette" to "actual palette" on the fly as the sprites are being plotted out.
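
One way such a translation could look, as a sketch only. This is not the poster's actual code, and every name and address here is an assumption: a 4-byte zero-page table per object maps the frame's palette number to the palette that was assigned at spawn.

```asm
; Hypothetical names and layout.
OAM             = $0200
anim_ptr        = $04      ; 16-bit pointer to the animation data
obj_palette_map = $30      ; 4 bytes: frame palette 0-3 -> assigned palette
tmp_attr        = $34
oam_index       = $35      ; where this sprite goes in the OAM buffer

plot_attr:
   lda [anim_ptr],y        ; attribute byte from the frame data
   sta <tmp_attr
   and #%00000011          ; the frame's own palette number
   tax
   lda <tmp_attr
   and #%11111100          ; keep the flip and priority bits
   ora <obj_palette_map,x  ; swap in the palette assigned at spawn
   ldx <oam_index
   sta OAM+2,x
   rts
```

The extra store, mask, table read and index reload per sprite are the overhead that a single-palette routine avoids.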

But as Kasumi pointed out, for objects that only use one palette (which is MOST objects), that lookup logic can be greatly simplified. I don't mind if bosses and the like are a little more expensive to draw.
Re: Sprite data caching or reuse?
by on (#211704)
I actually think that "data shifted right" thing I posted is a great idea. Maybe not for your game, but I'm happy with it even an hour later. :lol:

One other thing: If an object does use say two palettes, you could split it into two metasprites. It adds... two bytes for the extra address? And probably not that much overhead. May only save if your objects are biggish, though.

Edit: To be more clear, split it into two metasprites and change SPRpalette after the first is rendered. You could even create a routine that handles anything that'd need to switch to avoid the hit of the second subroutine call/return.
Re: Sprite data caching or reuse?
by on (#211706)
Kasumi wrote:
I actually think that "data shifted right" thing I posted is a great idea. Maybe not for your game, but I'm happy with it even an hour later. :lol:


I already use some similar techniques for packing some of my animation data pretty tight. Who needs the background priority flag anyways?

Quote:
One other thing: If an object does use say two palettes, you could split it into two metasprites. It adds... two bytes for the extra address? And probably not that much overhead. May only save if your objects are biggish, though.


You could, but it would complicate a lot of internal logic, so I'm trying to avoid it. It's already possible by writing a custom drawing routine for the object, but I really don't wanna bake support for it into the standard drawing path.
Re: Sprite data caching or reuse?
by on (#211715)
You can organize the metasprite data into color groups.
Re: Sprite data caching or reuse?
by on (#211745)
Alright, I've been hard at work incorporating some of the suggestions I've gotten, which got me about 10% faster rendering. I also managed to implement caching, which cut the rendering time in half again.

My caching technique is this:

Every frame the Sprite RAM is shuffled, but it's shuffled so that metasprites still have all their child sprites grouped together. The shuffling process also updates sprite pointers for all objects to their new locations. I use some macros and loop unrolling to ensure that this shuffling is fast.

When it's time to draw an object, the object knows that what it plotted down in the Sprite RAM the previous frame is still there, and it can skip on a lot of things unless they have changed. If nothing except the object's position has changed since last frame, then a fast drawing replacement routine is invoked that adjusts all child sprites x/y and then calls it a day. Even if the metasprite is a non-rectangular form it's maintained perfectly. If any of the sprites' positions produces a carry it's removed as it likely went off-screen.
Re: Sprite data caching or reuse?
by on (#211752)
One potential issue. (Rare case, and probably an acceptable loss but still.)

What if the screen scrolls right, then scrolls left? Scroll right, some sprites move offscreen due to carry change, scroll left, those sprites should be back on screen. Is that case covered?

I'm not clear, is this two pages of RAM (one for cache, one for not?) or just one?

Can objects have a variable number of sprites? (Some frames take 12, some take 8) Edit: To ask a more specific question. For the frames that take 8 sprites, are 12 child slots still needed for the object?

Does the halved time include the added time for the actual shuffle? I assume half on average, is your worst case much worse? (Though I guess it wouldn't be.)

Sorry for all the questions, but sounds brilliant. If it uses one page of sprite RAM, I'm all in. if it's two pages of sprite RAM, I'm half in. :D
Re: Sprite data caching or reuse?
by on (#211753)
What happens when an object that has moved partially off screen and had some of its sprites removed moves back on screen? Does it still trigger the "fast mode" where the existing sprites are moved (causing parts of the object to be missing) or does it know it has to generate the missing sprites again?
Re: Sprite data caching or reuse?
by on (#211755)
Kasumi wrote:
What if the screen scrolls right, then scrolls left? Scroll right, some sprites move offscreen due to carry change, scroll left, those sprites should be back on screen. Is that case covered?


tokumaru wrote:
What happens when an object that has moved partially off screen and had some of its sprites removed moves back on screen? Does it still trigger the "fast mode" where the existing sprites are moved (causing parts of the object to be missing) or does it know it has to generate the missing sprites again?


I was thinking about the exact same issue, trying to come up with a way that doesn't overly complicate things, but, I think I'll solve it by never allowing the use of the fast drawing replacement routine for objects that are either being clipped by the edge of the screen or that were clipped by (or were outside) the screen on the previous frame. Too much special logic going on there to account for in the caching mechanism.

I might refine it later to narrow down, since in theory there is nothing wrong with "clipping away" sprites just as long as no sprites need to reappear.

Kasumi wrote:
I'm not clear, is this two pages of RAM (one for cache, one for not?) or just one?


I'm just using one page of RAM, it's being shuffled "in place". If I used two pages of RAM I'd have to waste time copying over the values I wanted to reuse, and I'd be spending a lot more memory, so that solution would be subpar.

By cleverly using LDA/STA and LDX/STX in combination we can shuffle around the RAM "in place" without any performance penalties for not having a shadow copy.
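
One possible reading of the LDA/STA plus LDX/STX remark, as a sketch: two bytes are held in A and X at the same time, so two slots can trade contents without a temporary. The slot positions and the obj_a_oam / obj_b_oam bytes are hypothetical. The code is written out with absolute addresses because STX has no absolute indexed mode, which is also why this kind of swap lends itself to unrolling.

```asm
; Hypothetical names; swaps two fixed 4-byte OAM entries in place.
OAM       = $0200
SLOT_A    = $00            ; byte offset of the first entry
SLOT_B    = $20            ; byte offset of the second entry
obj_a_oam = $40            ; zero page: OAM offset remembered by object A
obj_b_oam = $41            ; zero page: OAM offset remembered by object B

swap_slots:
   lda OAM+SLOT_A+0        ; Y position
   ldx OAM+SLOT_B+0
   sta OAM+SLOT_B+0
   stx OAM+SLOT_A+0
   lda OAM+SLOT_A+1        ; tile
   ldx OAM+SLOT_B+1
   sta OAM+SLOT_B+1
   stx OAM+SLOT_A+1
   lda OAM+SLOT_A+2        ; attributes
   ldx OAM+SLOT_B+2
   sta OAM+SLOT_B+2
   stx OAM+SLOT_A+2
   lda OAM+SLOT_A+3        ; X position
   ldx OAM+SLOT_B+3
   sta OAM+SLOT_B+3
   stx OAM+SLOT_A+3
   lda <obj_a_oam          ; the owners trade their stored offsets too
   ldx <obj_b_oam
   sta <obj_b_oam
   stx <obj_a_oam
   rts
```

Each four-instruction group costs 16 cycles and moves two bytes, which works out to the same 8 cycles per byte as a plain absolute load and store.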

Kasumi wrote:
Can objects have a variable number of sprites? (Some frames take 12, some take 8) Edit: To ask a more specific question. For the frames that take 8 sprites, are 12 child slots still needed for the object?


Right now I'm just giving 8 sprites to each object, I was really curious about how much CPU time the caching would save me so I cheated a bit on sprite assignment. However, I have plans to change that in favor of just using the sprites it exactly needs that frame, by making a routine that gives out sprites whenever it's called, divided in such a way that my shuffling mechanism is still fast.

Kasumi wrote:
Does the halved time include the added time for the actual shuffle? I assume half on average, is your worst case much worse? (Though I guess it wouldn't be.)


Well, my claim was perhaps a bit simplistic. The shuffling adds a static cost, but the caching makes every object cheaper. That means it's actually slower if you only have one object on the scene, but a lot faster if you have eight objects on the scene. The cost per object was more than cut in half, but I eyeballed it a bit since I was just measuring by tinting the screen with colors as the code was executing.

This is a very good kind of tradeoff though, as we don't really care about saving that much CPU when there is only one object on the scene. We want to optimize for the worst case scenario after all.

Kasumi wrote:
Sorry for all the questions, but sounds brilliant. If it uses one page of sprite RAM, I'm all in. if it's two pages of sprite RAM, I'm half in.


It's definitely just one page. I have big plans and it's gonna need a lot of spare RAM.
Re: Sprite data caching or reuse?
by on (#211769)
Drakim wrote:
Kasumi wrote:
I'm not clear, is this two pages of RAM (one for cache, one for not?) or just one?


I'm just using one page of RAM, it's being shuffled "in place". If I used two pages of RAM I'd have to waste time copying over the values I wanted to reuse, and I'd be spending a lot more memory, so that solution would be subpar.

You'd waste no more time than you do by shuffling in place. Using a second page would make the logic more straightforward, since you'd be essentially filling a new OAM page from scratch every time, generating new data when necessary and copying from the previous table when possible.

Quote:
By cleverly using LDA/STA and LDX/STX in combination we can shuffle around the RAM "in place" without any performance penalties for not having a shadow copy.

Doing things in place may save memory, but it can be significantly more complex, and somewhat slower, since you have to deal with fragmentation due to shuffling groups of sprites of different lengths and clipping.

Quote:
Right now I'm just giving 8 sprites to each object, I was really curious on how much CPU time the caching would save me so I cheated a bit on sprite assignment.

So that's why fragmentation isn't a problem for you... I'm not a fan of this solution, since being limited to using multiples of 8 sprites can be very wasteful.

Kasumi wrote:
Sorry for all the questions, but sounds brilliant. If it uses one page of sprite RAM, I'm all in. if it's two pages of sprite RAM, I'm half in.

Using two pages might not be so bad if you can reuse those pages for more than just the OAM shadow. For example, you could use pages $0200 and $0300, and alternate which page is used for the OAM shadow and which is used for the VRAM update buffer every frame. As long as you handle all the sprites before buffering NT/AT/PT/etc. updates, you can copy OAM entries that were used last time, and then you can overwrite the old OAM completely with buffered VRAM updates so the space doesn't go to waste.
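
A small sketch of the page alternation, assuming a zero-page byte called oam_page that is set to $02 at reset. The name, its address and the routine names are made up for illustration:

```asm
; Hypothetical names. Pages $0200 and $0300 trade roles every frame.
oam_page = $12             ; zero page: high byte of this frame's OAM shadow
                           ; (assumed to be initialized to $02 at reset)

begin_frame:
   lda <oam_page
   eor #$01                ; $02 <-> $03
   sta <oam_page           ; the other page still holds last frame's OAM
   rts

nmi_sprite_dma:
   lda #$00
   sta $2003               ; OAMADDR = 0 before the transfer
   lda <oam_page
   sta $4014               ; OAMDMA copies the whole page to the PPU
   rts
```

Sprite code would copy reusable entries out of the old page first. Once that is done, the old page is free to be overwritten with buffered VRAM updates for the rest of the frame.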
Re: Sprite data caching or reuse?
by on (#211774)
You are right tokumaru, without a shadow copy, fragmentation becomes an issue. I've been twisting my brain at the problem, and while it's possible to write some clever loops to defrag things while shuffling (as long as all "groups" are 2, 4, 8 or 16), the loops would be massive and not prone to loop unrolling which makes them a lot slower. I guess one could live with the 8 sprite tradeoff, but I'm starting to realize the shadow copy is probably worth it and could bring other benefits as well.

Edit: Another possible setup if you really wanna save that memory is to have groups of 4 sprites in the shuffling process and then use more than one metasprite for objects that are bigger than 4 sprites.
Re: Sprite data caching or reuse?
by on (#211777)
I did a lot of math on this. As far as fragmentation, I just assumed a totally unrolled 2048 cycle 256 load 256 store shuffle before any rendering. But any changes would have a duplicate load store later in the frame.

Even with the duplicate load and store, it actually still beat my unrolled thing so long as most sprites didn't need more than one byte changed. (But I made a lot of assumptions, so take that with a grain of salt.)

I didn't think too deeply about it, but I think with two pages you start to really win. I might play around with it for a non scrolling game I'm thinking about.

If anyone wants to check some stuff themselves:
64 fast sprites (my method) is 4160 cycles (Well... not always because page cross stuff).
32 fast sprites (my method) is 2080 + 607= 2687 cycles. (607 is for moving the remaining sprites offscreen)
I assumed always fastest method for both, all sprites in one go. Obviously there'd also be overhead in places but the overhead (deciding whether to use the fast function, navigating to the next object) would be a bit similar for either method.
Edit: Oh. I timed with adding the tile offset, so 3 cycles per sprite could be taken off the above counts for some games. Also the 607 could be made a cycle faster for every sprite. But probably take the counts as they are, because obviously there's still a check to skip the offscreen loop when there are 64 sprites and that's not counted. And there are similar things for the 32 sprite one.
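
For anyone checking the numbers, the 4160 figure matches a tally of base 6502 cycle counts over the fast macro body posted earlier, assuming no page-cross penalties and no loop overhead (the body repeated once per sprite). The zero-page addresses below are placeholders:

```asm
; Placeholder addresses; the instruction sequence is the fast body from above.
reserved0 = $00
reserved2 = $02
reserved4 = $04
reserved7 = $07
OAM       = $0200

   iny                 ; 2
   lda [reserved4],y   ; 5 (+1 if the read crosses a page)
   clc                 ; 2
   adc <reserved2      ; 3
   sta OAM,x           ; 5
   iny                 ; 2
   lda [reserved4],y   ; 5
   adc <reserved0      ; 3
   sta OAM+3,x         ; 5
   iny                 ; 2
   lda [reserved4],y   ; 5
   adc <reserved7      ; 3 (the tile offset add mentioned in the edit)
   sta OAM+1,x         ; 5
   iny                 ; 2
   lda [reserved4],y   ; 5
   sta OAM+2,x         ; 5
   txa                 ; 2
   adc #4              ; 2
   tax                 ; 2
                       ; total: 65 cycles per sprite
; 64 sprites: 64 * 65 = 4160 cycles
; 32 sprites: 32 * 65 = 2080 cycles, plus 607 to hide the rest = 2687
```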
tokumaru wrote:
Using two pages might not be so bad if you can reuse those pages for more than just the OAM shadow. For example, you could use pages $0200 and $0300, and alternate which page is used for the OAM shadow and which is used for the VRAM update buffer every frame.

True, but I'd usually prefer pla sta $2007 VRAM updates. There are definitely games where I'd be fine just dedicating the second page, though.
Re: Sprite data caching or reuse?
by on (#211780)
Has anybody thought about using a mapper with RAM bankswitching and putting the two OAM pages at the exact same address but in two different banks? Instead of writing the address high byte to $4014 to select between your two OAM pages you'd simply switch banks instead.

The advantage would be that all the drawing code that refers to your Sprite OAM address would just magically work on both copies, without the need for indirect addressing, duplicate drawing methods, or copying from one page to the other.
Re: Sprite data caching or reuse?
by on (#211788)
Drakim wrote:
Has anybody thought about using a mapper with RAM bankswitching and putting the two OAM pages at the exact same address but in two different banks? Instead of writing the address high byte to $4014 to select between your two OAM pages you'd simply switch banks instead.

That's overkill! The difference between LDA #IMM; STA $4014 and LDA $ZP; STA $4014 is just 1 cycle... that's hardly worth the trouble.

Quote:
The advantage would be that all the drawing code that refers to your Sprite OAM address would just magically work on both copies, without the need for indirect addressing, duplicate drawing methods, or copying from one page to the other.

Yes, but copying from one set to the other becomes slower, because you have to constantly switch back and forth between the banks.

If you can spare a little ROM space, duplicating the sprite drawing routine is probably the best choice to avoid indirection, since each copy of the routine will know which buffer is the primary one.
Re: Sprite data caching or reuse?
by on (#211800)
tokumaru wrote:
That's overkill! The difference between LDA #IMM; STA $4014 and LDA $ZP; STA $4014 is just 1 cycle... that's hardly worth the trouble.


Hehe, I didn't mean that particular aspect as a cost saving measure. I was just introing my explanation so people would know what I was talking about.

Quote:
Yes, but copying from one set to the other becomes slower, because you have to constantly switch back and forth between the banks.


I realized that for the MMC5 mapper at least, you can mount the RAM banks in several places, so if you need to work on both of them at once (copying back and forth) you could have them neatly side by side for the operation.

Quote:
If you can spare a little ROM space, duplicating the sprite drawing routine is probably the best choice to avoid indirection, since each copy of the routine will know which buffer is the primary one.


I guess it's not so ugly if you are using a macro to duplicate everything, but still, I like my way better :D

The advantage would be that you aren't using up your "always available RAM" but instead using the paged RAM which can be a little more messy to access for global variables.

You could theoretically even have more than one shadow copy if you have some tricks in mind (splitscreen? I dunno).