This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Roughly how many cycles can I budget for nametable updates?

by on (#39989)
Okay, so writing to VRAM during rendering causes Bad Things™ to happen. That leaves us with our good old friend, Mr. Vincent Blank.

According to the wiki:

Quote:
The NTSC video signal is made up of 262 scanlines, and 20 of those are spent in vblank state. After the program has received an NMI, it has about 2270 cycles to update the palette, sprites, and nametables as necessary before rendering begins.


Right now, my nametable updating routine (used for vertical parallax) updates a region of 4 x 8 tiles and takes around 258 cycles. I'd like to increase the size of the update region, but doing so will obviously increase the cycle count as well.

Assuming I perform sprite DMA and overwrite the palette with a 32-byte zero page buffer each frame, approximately how many cycles can I safely budget for nametable updates?

by on (#39992)
Vertical blanking on NTSC systems is 2200-odd cycles, and OAM DMA takes 513 of those.

by on (#39997)
2200-513 = 1687 clocks for doing VRAM writes. Copying one byte from zero-page to VRAM can be done in 7 clocks. So at most, you can copy 241 bytes from zero-page to VRAM during standard VBL. BTW, I thought Blank was Mr. L's middle name; care to cite an authority? :)
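
The arithmetic above can be sketched as a quick calculation. Treat it as a rough upper bound, not a measured figure: the 2200-cycle vblank length, the 513-cycle OAM DMA, and the 7 cycles per byte come from the posts above, while the 12 cycles assumed for pointing the PPU address at the palette (two LDA #imm / STA $2006 pairs) are an illustrative assumption, and NMI entry overhead is ignored.

```python
# Rough vblank budget, using the figures quoted in this thread.
VBLANK_CYCLES = 2200      # "2200-odd" CPU cycles of NTSC vblank
OAM_DMA_CYCLES = 513      # sprite DMA
CYCLES_PER_BYTE = 7       # LDA zp (3 cycles) + STA $2007 (4 cycles)

# Assumption: setting the VRAM address costs two LDA #imm / STA $2006 pairs.
ADDRESS_SETUP_CYCLES = 2 * (2 + 4)


def bytes_that_fit(cycles, cycles_per_byte=CYCLES_PER_BYTE):
    """Whole bytes that can be copied to VRAM in the given cycle count."""
    return cycles // cycles_per_byte


after_dma = VBLANK_CYCLES - OAM_DMA_CYCLES
palette_cycles = 32 * CYCLES_PER_BYTE + ADDRESS_SETUP_CYCLES
after_palette = after_dma - palette_cycles

print("cycles left after OAM DMA:", after_dma)
print("bytes that fit after OAM DMA:", bytes_that_fit(after_dma))
print("cycles left after the 32-byte palette copy:", after_palette)
print("bytes that fit after the palette copy:", bytes_that_fit(after_palette))
```

Each separate nametable region also needs its own address setup, so the real byte count will be somewhat lower than the last figure.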

by on (#39999)
I'm writing fairly repetitive data (repeating 2-byte sequences) to VRAM, so I'm able to speed things up by caching the bytes in the A/X registers and removing unnecessary LDA/LDX statements. I get about 4 STA's/STX's per LDA/LDX on average.
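
A rough sketch of what that register caching buys, assuming each load is a 3-cycle zero-page LDA/LDX and each store is a 4-cycle STA/STX to $2007. The function name and the 1687-cycle budget (vblank minus OAM DMA, from the post above) are only illustrative.

```python
def bytes_with_caching(budget, stores_per_load, load_cycles=3, store_cycles=4):
    """Bytes written when every load is followed by `stores_per_load` stores.

    Only whole load+stores groups are counted, so this slightly
    underestimates what fits in the budget.
    """
    group_cycles = load_cycles + stores_per_load * store_cycles
    return (budget // group_cycles) * stores_per_load


def average_cycles_per_byte(stores_per_load, load_cycles=3, store_cycles=4):
    """Average cost of one VRAM byte, with the load amortized over its stores."""
    return (load_cycles + stores_per_load * store_cycles) / stores_per_load


BUDGET = 1687  # assumed: vblank cycles left after OAM DMA

print("one store per load:", bytes_with_caching(BUDGET, 1), "bytes")
print("four stores per load:", bytes_with_caching(BUDGET, 4), "bytes")
print("average cycles per byte at 4:1:", average_cycles_per_byte(4))
```

With four stores per load the average drops from 7 cycles per byte to under 5, which is why repetitive tile data is so much cheaper to upload.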

By the way, what's the best way to guarantee that your VRAM updates haven't exceeded the vblank interval? Can you simply check bit 7 of PPUSTATUS and call it good if it's still set to 1?


blargg wrote:
BTW, I thought Blank was Mr. L's middle name; care to cite an authority? :)

I've got no real authority, just a penchant for nonsense.




EDIT: I just read the NMI section in the wiki and it appears that reading bit 7 of PPUSTATUS isn't the proper way to detect the end of vblank. However, it does say the following:

Quote:
The right way to wait for the end of vblank involves the sprite 0 hit flag.

What does it involve, exactly? Waiting until the sprite 0 hit flag gets set back to 0?

by on (#40003)
Quote:
By the way, what's the best way to guarantee that your VRAM updates haven't exceeded the vblank interval? Can you simply check bit 7 of PPUSTATUS and call it good if it's still set to 1?


Yes if you use Nesticle :wink:

The proper way would be to check your timing under Nintendulator. Also, the sprite zero hit flag is always reset at the end of VBlank, but if I'm not mistaken it's reset after the last possible $2007 write, and it's not very useful because you have to guarantee a hit occurred on the previous frame (otherwise the flag would be '0' for the whole frame, including VBlank).

by on (#40006)
Yeah, Nintendulator's debugger will give you information about PPU timing, so you can set a breakpoint at the end of your VRAM updates and check how much VBlank time is left.

IMO, this is something you worry about when designing the game, at which point you make sure combinations of VRAM updates that exceed the VBlank time will never be executed. What good is it knowing, at runtime, that you blew it and went over the time limit? No matter what you do from that point on, the display is already corrupt! So there is no real point in detecting the end of VBlank, other than for timing reasons.

by on (#40010)
tokumaru wrote:
IMO, this is something you worry about when designing the game

True, but if you're using only a small fraction of vblank time, you could probably add a feature.

Quote:
at which point you make sure combinations of VRAM updates that exceed the VBlank time will never be executed. What good is it knowing, at runtime, that you blew it and went over the time limit?

If the play testers can reproduce a failure to update the screen in time, they can report it back to the developers.

by on (#40012)
tepples wrote:
If the play testers can reproduce a failure to update the screen in time, they can report it back to the developers.

Sorry, when I said "design" I meant "development". So yeah, you could even have the program itself detect this error, but it might be difficult to know exactly what combination of VRAM updates caused the problem without analyzing it further with a debugger at the exact moment it happens.

by on (#40020)
Why not end the display early? Do you need all 240 scanlines? Clip it around 232 and gain another 900+ cycles for vblank area.

by on (#40022)
If you have good enough timing, you can actually start Vblank at rendered scanline 232, and keep the screen shut off until the 8th scanline from the top of the screen. This would give you 16 extra scanlines, and 1800+ cycles. Though this would be kind of difficult to pull off if you didn't have fixed-length routines during the blanking period or a scanline counter. You might be able to get away with having all fixed-length routines in your Vblank code if you need the extra 8 on top. By that, I mean having fixed-length routines doesn't sound unreasonable.

But having 8 blanked on the bottom sounds like a better plan if you don't need so much more time.
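
For a rough idea of where the "900+" and "1800+" figures come from: an NTSC scanline is 341 PPU dots and the CPU runs one cycle per 3 dots, so each blanked scanline is worth a little under 114 CPU cycles. The sketch below only does that arithmetic; the function name is made up, and the cycles spent turning rendering off and back on are not counted.

```python
PPU_DOTS_PER_SCANLINE = 341   # NTSC
PPU_DOTS_PER_CPU_CYCLE = 3    # NTSC


def extra_cycles(blanked_scanlines):
    """Whole CPU cycles gained by keeping rendering off for extra scanlines."""
    return blanked_scanlines * PPU_DOTS_PER_SCANLINE // PPU_DOTS_PER_CPU_CYCLE


print("8 scanlines blanked at the bottom:", extra_cycles(8), "cycles")
print("8 at the bottom plus 8 at the top:", extra_cycles(16), "cycles")
```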

by on (#40023)
I don't think he needs any more time; he was just trying to figure out how much time he does have, in order to make good use of it.

If the available time proves to not be enough, then he'll have to think about ways to get more VBlank time.

by on (#40024)
I think the king of updates during VBlank is Battletoads, so whatever that game does, imitate it.

I think Battletoads reserved some of the stack just for nametable updates, so it could use a series of Pops and Stores to populate the nametables.

Battletoads also ran its updating code ALWAYS, no matter what (except maybe if the screen is supposed to be off). That ensures that it always takes the same amount of time to finish.

by on (#40028)
Dwedit wrote:
I think the king of updates during VBlank is Battletoads, so whatever that game does, imitate it.

Dwedit, I believe we disagreed recently on the idea that whatever was done by programmers back then is better than whatever we can come up with now. I guess I'm disagreeing with you again! =) I'm not saying this is a bad game, it's one of my NES favorites and certainly very impressive, but it has its flaws like everything else.

Quote:
I think Battletoads reserved some of the stack just for nametable updates, so it could use a series of Pops and Stores to populate the nametables.

Which is still 1 cycle slower than loading the values from ZP, so it really isn't the optimal choice.
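
To put a number on that 1-cycle difference, here is a small comparison assuming the standard 6502 timings (PLA 4 cycles, LDA zero page 3 cycles, STA absolute 4 cycles) and the 1687 cycles left after OAM DMA mentioned earlier in the thread. The names are illustrative only.

```python
STA_ABS = 4   # STA $2007
PLA = 4       # pull a byte from the stack
LDA_ZP = 3    # load a byte from zero page

BUDGET = 1687  # assumed: vblank cycles left after OAM DMA


def bytes_in_budget(load_cycles, budget=BUDGET, store_cycles=STA_ABS):
    """Bytes copied to VRAM when each byte costs one load and one store."""
    return budget // (load_cycles + store_cycles)


stack_bytes = bytes_in_budget(PLA)
zp_bytes = bytes_in_budget(LDA_ZP)

print("PLA / STA $2007:   ", stack_bytes, "bytes")
print("LDA zp / STA $2007:", zp_bytes, "bytes")
print("difference:        ", zp_bytes - stack_bytes, "bytes")
```

So over a full vblank the zero-page version moves roughly 30 more bytes, at the cost of unrolled code that names every source address.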

Quote:
Battletoads also ran its updating code ALWAYS, no matter what (except maybe if the screen is supposed to be off). That ensures that it always takes the same amount of time to finish.

This was probably great for Battletoads, not only because it is easier to code, but also because the game relied on good timing for its status bar at the top of the screen (right after VRAM updates) and all. However, the time taken to perform unnecessary VRAM updates could be the difference between fluid animation and lost frames in cases where there is a lot of action going on all at once, so, performance-wise, this isn't optimal at all.

by on (#40030)
I think it would be wise to at least look at what Battletoads does. That game is really the king of blanking. Maybe don't do exactly what it does, but get ideas from it.

I actually suggest if you're going to extend Vblank, do it from the top. Though if you don't have a scanline counter, you'll have to figure out some way to know when the blanking is supposed to stop so you can appropriately set the Y scroll.

Also, I think having the fixed-length PPU updating code always run is a good idea. This eliminates the need for scanline counting. If you're in the NMI, running fixed-length code from the beginning of the frame, you'll be able to reset the scroll appropriately depending on how far it spills out of Vblank.

I think most of the time with Battletoads, it isn't doing useless PPU updates. I'm pretty sure the player is made of 16 fixed tiles, and those tiles are updated with CHR RAM instead of the player being assigned to different tiles.

by on (#40050)
tokumaru wrote:
Dwedit wrote:
I think the king of updates during VBlank is Battletoads, so whatever that game does, imitate it.

Dwedit, I believe we disagreed recently on the idea that whatever was done by programmers back then is better than whatever we can come up with now.

I think deference to techniques used in older games might have something to do with lot check, so that something works on all known (and unknown) revisions of the PPU. It's likely that some of the corner cases didn't get exploited because Nintendo said they are subject to change in a future NES revision. Remember how some games for the original PlayStation glitch on a PS2, even more PS1 games don't work on a slim PS2, and some PS2 games don't work even on the (discontinued) PS3 with backward compatibility.

Quote:
Quote:
I think Battletoads reserved some of the stack just for nametable updates, so it could use a series of Pops and Stores to populate the nametables.

Which is still 1 cycle slower than loading the values from ZP, so it really isn't the optimal choice.

Heck, copying from $0100 using PLA isn't much faster than copying from $0100 using a plain old unrolled loop. It just saves ROM bytes and opens up the X register.