This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Correct usage of PPUSTATUS

by on (#201428)
When writing data to PPUADDR, we need to reset the high/low latch by reading PPUSTATUS first, e.g. with LDA PPUSTATUS.

Now, I've also seen the version BIT PPUSTATUS.

So, I have three questions:

1. Is there any advantage of using BIT over using LDA?

2. And can both versions be used for both, PPUADDR and PPUSCROLL?

3. Are there other PPU addresses where I have to reset the high/low latch by using PPUSTATUS? Or are PPUADDR and PPUSCROLL the only ones?
Re: Correct usage of PPUSTATUS
by on (#201429)
DRW wrote:
When writing data to PPUADDR, we need to reset the high/low latch by reading PPUSTATUS first, e.g. with LDA PPUSTATUS.

Now, I've also seen the version BIT PPUSTATUS.

So, I have three questions:

1. Is there any advantage of using BIT over using LDA?


Using BIT preserves the previous contents of A.

Quote:
2. And can both versions be used for both, PPUADDR and PPUSCROLL?


Yes, the effect on the PPU is identical with either instruction. Both instructions do a read from memory, the difference is whether they put the data into A or not.

Quote:
3. Are there other PPU addresses where I have to reset the high/low latch by using PPUSTATUS? Or are PPUADDR and PPUSCROLL the only ones?


PPUADDR and PPUSCROLL are the only write-twice registers that have a high/low latch.
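
To make the latch behaviour concrete, here is a minimal Python sketch of the write toggle. It is a toy model, not PPU emulation: the names `PpuLatch`, `read_status`, `write_addr` and `set_addr` are made up for illustration, and it assumes the usual description in which a $2002 read resets the toggle and the first PPUADDR write carries the high byte.

```python
class PpuLatch:
    """Toy model of the first/second write toggle behind PPUADDR ($2006)."""

    def __init__(self):
        self.second_write = False   # False: the next write is the first of a pair
        self.vram_addr = 0

    def read_status(self):
        # Any read of PPUSTATUS ($2002), whether by LDA or BIT, resets the toggle.
        self.second_write = False

    def write_addr(self, value):
        # First write sets the high byte (VRAM addresses are 14 bits wide),
        # second write sets the low byte. Simplified: no temporary register.
        if not self.second_write:
            self.vram_addr = (self.vram_addr & 0x00FF) | ((value & 0x3F) << 8)
        else:
            self.vram_addr = (self.vram_addr & 0xFF00) | (value & 0xFF)
        self.second_write = not self.second_write


def set_addr(ppu, addr, reset_first):
    """Write a 16-bit VRAM address, optionally reading PPUSTATUS first."""
    if reset_first:
        ppu.read_status()
    ppu.write_addr(addr >> 8)
    ppu.write_addr(addr & 0xFF)
    return ppu.vram_addr
```

If a stray single write has already flipped the toggle, `set_addr(ppu, 0x23C0, reset_first=False)` ends up at a scrambled address, while `reset_first=True` lands on 0x23C0 regardless.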
Re: Correct usage of PPUSTATUS
by on (#201431)
Thanks for the information.
Re: Correct usage of PPUSTATUS
by on (#201437)
The other thing BIT does that LDA doesn't is it reads bit 6 (sprite 0 hit) into the V (overflow) flag, so for sprite 0 hit testing it's usually the best choice.

For bit 7 (vblank) both LDA and BIT put it in the N (sign) flag, so both are equally good unless you want to preserve A.

For bit 5 (sprite overflow) there's no convenient flag load. The best choice here is BIT, but after loading A with %00100000. (Because BIT does an AND operation without destroying A you can keep testing it in a loop.)


BIT's non-destructive AND is also useful for combination tests, e.g. if you want to test for sprite 0 and vblank together (for backup in case the hit fails unexpectedly?), you can load A with %11000000 and do a BIT test loop. (Take note: BIT's AND only updates the Z flag, the N/V flags are loaded directly from $2002 and not part of the AND operation.)


I just have a habit of using BIT with that register even if I don't need to preserve A. There's just such a strong correlation in practice between BIT and $2002 on the NES, it feels like the "natural" way to read it. (BIT does have some other uses, but they're relatively rare compared to seeing it as a $2002 read.)
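
The flag behaviour described above can be sketched in a few lines of Python. This only models the three flags BIT touches; the function name `bit` and the constant names are illustrative and not taken from any assembler or emulator.

```python
VBLANK = 0x80           # bit 7 of PPUSTATUS
SPRITE0_HIT = 0x40      # bit 6
SPRITE_OVERFLOW = 0x20  # bit 5


def bit(a, m):
    """Return the (N, V, Z) flags BIT would leave for accumulator a and memory byte m."""
    n = bool(m & 0x80)   # N is copied straight from bit 7 of memory
    v = bool(m & 0x40)   # V is copied straight from bit 6 of memory
    z = (a & m) == 0     # Z comes from the AND; A itself is not modified
    return n, v, z


# Sprite 0 hit needs no mask at all: V is set whatever A holds.
sprite0 = bit(0x00, SPRITE0_HIT)

# Sprite overflow has no dedicated flag, so A carries the mask and Z is tested.
overflow = bit(SPRITE_OVERFLOW, SPRITE_OVERFLOW)

# Combined test: Z is clear as soon as either vblank or sprite 0 hit is set.
either = bit(VBLANK | SPRITE0_HIT, VBLANK)
```

In the combined case only Z reflects the mask; N and V still mirror bits 7 and 6 of the status byte on their own.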
Re: Correct usage of PPUSTATUS
by on (#201447)
Yes, BIT was literally designed for situations like this.

-Thom
Re: Correct usage of PPUSTATUS
by on (#201449)
I see why BIT can be used to get bit 6 and 7, but I don't get why LDA misses bit 5 and 6?
Re: Correct usage of PPUSTATUS
by on (#201450)
Pokun wrote:
I see why BIT can be used to get bit 6 and 7, but I don't get why LDA misses bit 5 and 6?

tschak909 sort of answered that:
tschak909 wrote:
BIT was literally designed for situations like this.

The purpose of BIT is to use A as an AND mask to test an arbitrary bit (or any 8-bit mask) from a memory location. As an added bonus, it also fetches bits 6 and 7 directly so that they can be used even more easily without having to prepare A. (I wonder whether, if they had been greedy enough, they could have used the carry flag as well?)

LDA on the other hand is a general purpose load instruction. It only sets a minimum number of flags (zero flag and sign flag) because it needs to be unobtrusive. LDA gets interleaved with other instructions; think about how a 16-bit addition needs to have LDA in between two ADC instructions, etc. They weren't going to cram in extra "free bit test" stuff into assorted flags, that's very specific to BIT.
Re: Correct usage of PPUSTATUS
by on (#201451)
Just wanted to point out, from my own experience, that you don't *have* to reset the $2005/6 latch all the time, as long as your code is well behaved and never does any stray $2005/6 accesses. Clearing the latch can be seen as a safety measure, so if you feel more comfortable doing it, that's fine, but you don't *need* to do it. I never do it and never had any problems, and a lot of commercial games I debugged don't do it either.
Re: Correct usage of PPUSTATUS
by on (#201475)
tokumaru wrote:
Just wanted to point out, from my own experience, that you don't *have* to reset the $2005/6 latch all the time, as long as your code is well behaved and never does any stray $2005/6 accesses.

That's exactly the reason why "City Trouble" still works despite me not using it for scrolling.

But here's a question: What if you don't use the latch and someone presses the reset button exactly between two STA PPUADDR? Wouldn't the following game be misaligned until you power it off?
Re: Correct usage of PPUSTATUS
by on (#201476)
DRW wrote:
But here's a question: What if you don't use the latch and someone presses the reset button exactly between two STA PPUADDR? Wouldn't the following game be misaligned until you power it off?

Yes, you must clear it once on startup, at the very least.

Though standard startup also polls $2002 for two frames to wait for the PPU to warm up anyway, so you've probably already cleared the latch doing that.
Re: Correct usage of PPUSTATUS
by on (#201477)
DRW wrote:
What if you don't use the latch and someone presses the reset button exactly between two STA PPUADDR? Wouldn't the following game be misaligned until you power it off?

The correct reset procedure requires you to wait a few frames for the PPU to initialize, and during this time we usually count frames by reading $2002 repeatedly and checking the vblank flag, so the latch is guaranteed to be cleared after resets.

EDIT: Like rainwarrior said.
Re: Correct usage of PPUSTATUS
by on (#201478)
Oh, right: BIT PPUSTATUS in the wait for vblank function. I forgot about that.

Does anybody know of any other conceivable situation where things can somehow get misaligned, even if I always read or write PPUADDR and PPUSCROLL in pairs?
Re: Correct usage of PPUSTATUS
by on (#201480)
I think it's perfectly good practice to clear the latch at the start of an NMI handler's upload session, or when you're about to load a nametable, etc.

You don't strictly have to, but a few bytes/cycles of redundant latch clears in your game could make the difference between a latent bug that causes a hard fail vs. one that the program can recover from. I actually think they're worth putting in, personally.

There are a lot of ways bugs can arise that end up being completely benign with just a little bit of redundant code. You look back later at some code you shipped, think "that shouldn't have worked?", and then realize that the extra little bit of protection you'd written, which you didn't expect to matter, was actually doing its job. I've seen that enough times to think it's worth doing.
Re: Correct usage of PPUSTATUS
by on (#201481)
O.k., I'll keep it then.
Re: Correct usage of PPUSTATUS
by on (#201484)
BTW, about a year ago I found out I could slightly abuse the way the high/low latch and $2005/6 writes work in order to speed up the updating of columns in the attribute tables in one of my engines: "Change only low byte of VRAM address (good for attributes)"

rainwarrior wrote:
I think it's perfectly good practice to clear the latch at the start of an NMI handler's upload session, or when you're about to load a nametable, etc.

I'm not strongly against clearing the latch as a safety measure, just like I'm not against clearing all RAM during reset as a safety measure, but when done during development, these procedures could end up hiding bugs that'd be easy to catch otherwise. You should NOT have stray writes to $2005/6 in your game, ever, but how will you detect these bugs if you're clearing the latch left and right?

Quote:
You don't strictly have to, but a few bytes/cycles of redundant latch clears in your game could make the difference between a latent bug that causes a hard fail vs. one that the program can recover from. I actually think they're worth putting in, personally.

My advice is, by all means, cover your ass by clearing whatever you can as much and as often as it takes for you to feel comfortable, but save such safety measures for the end of development, because while having bugs manifest themselves in the final product is a BAD thing, having them manifest themselves during development is a GOOD thing.

Anyway, the reason I don't personally clear the latch is because I always find myself in situations where every cycle matters. I try to code my systems to be as dynamic as possible, and after the overhead of switching banks and deciding it does indeed have to process VRAM updates, my vblank handlers only have so much time left for the actual updates. The updates are scheduled dynamically according to the available time, and there are a few of them that I absolutely need to happen in the same frame, such as a row and a column of metatiles when the screen scrolls diagonally. I kid you not, the 4 cycles taken by a BIT $2002 were preventing me from fitting those 2 updates together. Between refactoring the more complex systems in hopes of shaving those cycles off from somewhere else or just removing the BIT $2002, the choice was easy. Ever since that happened, it has always made more sense to me to claim those 4 cycles as useful vram update time than using them for a safety measure I probably won't ever need. I guess I could clear the latch *after* the vram updates, but... meh.
Re: Correct usage of PPUSTATUS
by on (#201485)
Unrelated, but it always annoys me how none of the easily googlable 6502 sources tell how BIT works with a partial match.

0xf & 0x1 = 0x1
But what is the Z flag? Zero or one? I.e., for which case is it useful: testing everything, or testing "is at least one bit set"?
Re: Correct usage of PPUSTATUS
by on (#201487)
I think this 6502 reference page is the best one. It clearly explains that the Zero flag is set if A & M is 0 (and cleared otherwise), and that the Overflow and Negative flags always get a copy of bits 6 and 7 of M, respectively.
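
As a worked answer to the partial-match question, here is a small Python sketch. The helper names `bit_z` and `all_mask_bits_set` are hypothetical; the second one shows a test that BIT alone does not perform.

```python
def bit_z(a, m):
    """Z flag after BIT: set only when A AND M is zero."""
    return (a & m) == 0


def all_mask_bits_set(a, m):
    """The "testing everything" case: every bit of mask a is set in m.

    BIT does not answer this; on a 6502 it would take something like
    an AND with the mask followed by a CMP against the same mask.
    """
    return (a & m) == a


# 0x0F & 0x01 = 0x01 is non-zero, so Z is clear (0): a partial match
# counts as "at least one masked bit is set".
partial = bit_z(0x0F, 0x01)

# No overlap at all: 0x0F & 0x10 = 0x00, so Z is set (1).
no_overlap = bit_z(0x0F, 0x10)
```

So Z clear after BIT means "at least one of the masked bits is set", which is the case BNE branches on.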

rainwarrior wrote:
Pokun wrote:
I see why BIT can be used to get bit 6 and 7, but I don't get why LDA misses bit 5 and 6?

tschak909 sort of answered that:
tschak909 wrote:
BIT was literally designed for situations like this.

The purpose of BIT is to use A as an AND mask to test an arbitrary bit (or any 8-bit mask) from a memory location. As an added bonus, it also fetches bits 6 and 7 directly so that they can be used even more easily without having to prepare A. (I wonder whether, if they had been greedy enough, they could have used the carry flag as well?)

LDA on the other hand is a general purpose load instruction. It only sets a minimum number of flags (zero flag and sign flag) because it needs to be unobtrusive. LDA gets interleaved with other instructions; think about how a 16-bit addition needs to have LDA in between two ADC instructions, etc. They weren't going to cram in extra "free bit test" stuff into assorted flags, that's very specific to BIT.

I'm probably misunderstanding something but I still don't understand. I understand that BIT makes things easier, but I don't understand why LDA technically wouldn't work. If all bits in $2002 are fully readable, you should get a copy of them in the accumulator after LDA $2002 no? Then you could test the flags in the accumulator after doing that.
Or is it just that LDA would technically work but not practically in a game? I've never tested any of these flags in my NES programs yet, save for the vblank flag tests in the init code.
Re: Correct usage of PPUSTATUS
by on (#201495)
Pokun wrote:
I understand that BIT makes things easier, but I don't understand why LDA technically wouldn't work. If all bits in $2002 are fully readable, you should get a copy of them in the accumulator after LDA $2002 no?

Correct. It's just a lot easier to test bit 6 in a tight-as-possible loop through BIT than through LDA.
Re: Correct usage of PPUSTATUS
by on (#201497)
I see, I was just overthinking it. Thanks!
Re: Correct usage of PPUSTATUS
by on (#201498)
Okay stupid question...
There's also undocumented instruction 0C, NOP abs. Does that do a memory read then discard it with no changes to flags or register A?
Re: Correct usage of PPUSTATUS
by on (#201499)
Yes, as far as I'm aware. (Source)

But I'd use it and other unofficial opcodes only for reading MMIO ports or some other NES-specific task, not game logic that you expect to port to a machine using one of the 6502's successors. On the 65C02, HuC6280, and 65816, $0C is TSB, a read-modify-write instruction that's like ORA but writes the result back to memory rather than to A. (Source)
Re: Correct usage of PPUSTATUS
by on (#201524)
tokumaru wrote:
My advice is, by all means, cover your ass by clearing whatever you can as much and as often as it takes for you to feel comfortable, but save such safety measures for the end of development, because while having bugs manifest themselves in the final product is a BAD thing, having them manifest themselves during development is a GOOD thing.

Sorry to tread back over our old argument, but no, I would definitely not want to suddenly add in extra safety code at the END of development. That has far too much potential to add NEW bugs, and invalidates all of the testing you were doing up until that point. I think doing this would be worse than no safeties at all. If you've gone without them up until that point, just stay the course and ship that way.

Like I said in the other thread, I think "safeties off" tests are a great thing to do, just not as something you alter so late in the project. E.g. every Sunday night turn off all the redundant checks (helps if you put 'em in a .define) and do some smoke testing, then after this testing session turn them back on and get back to normal. Spend as much time as is practical testing the way you will deploy. That by itself is much more important than having a redundant safety feature.

tokumaru wrote:
how will you detect these bugs if you're clearing the latch left and right?

I would once again suggest the debugger can do a much better job than working safeties-off. The game failing is neither the only way, nor the best way to detect a bug. Using appropriate debug tools you can detect this with complete reliability without even removing the safety code.

If you wrote a Lua script to catch any $2002 reads where the latch isn't already at 0, you would catch a mismatch error whether or not it resulted in crashing the game or gave any easy visual indications, and it wouldn't require any modification of the game's code. Set up breakpoints on unexpected reads/writes, fabricate some sort of assert mechanism, use thefox's Nintendulator that automatically catches uninitialized memory use, etc. There are lots of very good tools for this kind of thing!
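
The checking logic such a script would need is small. The sketch below is written in Python rather than Lua and runs over a pre-recorded list of accesses instead of hooking an emulator; the trace format and the function name `find_latch_mismatches` are assumptions made for illustration only.

```python
def find_latch_mismatches(trace):
    """Report $2002 reads that arrive while a $2005/$2006 pair is half written.

    trace: list of (op, address) tuples with op being "read" or "write",
    a hypothetical log format. Register mirrors are ignored in this sketch.
    Returns the indexes of the offending $2002 reads.
    """
    second_write = False    # True after the first write of a pair
    mismatches = []
    for index, (op, addr) in enumerate(trace):
        if op == "read" and addr == 0x2002:
            if second_write:
                mismatches.append(index)   # the latch was not already at 0
            second_write = False           # the read clears it either way
        elif op == "write" and addr in (0x2005, 0x2006):
            second_write = not second_write
    return mismatches
```

A clean trace returns an empty list; a trace with one unpaired write returns the index of the next $2002 read, even if that read happened to hide the problem on screen.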


tokumaru wrote:
situations where every cycle matters...the 4 cycles taken by a BIT $2002 were preventing me from fitting those 2 updates together

Of course, yes. If you need 4 cycles or 3 bytes, it's something you can cut.

There's also a compromise between "never do it" and "always do it", like nobody is advocating clearing $2002 more than once during an NMI handler (pointless), but even if you only put a redundant $2002 read between screens or levels (i.e. timing not an issue, rendering off), that by itself is still a recovery point. Maybe the game goes visually haywire for whatever reason, but if they can just make it to the next screen... they're back in business!

Even if your code is perfect (it isn't) there is such a thing as hardware failures, bad cartridge connector, vibrations in the room, crummy power supply, etc. You could write these cases off as "well it should crash and I don't care what happens", but if someone's been playing your game for an hour they'd probably really appreciate being able to recover if it's possible. Such a miracle has happened to me a lot of times while playing NES and it's always been a moment of extreme relief.


Overall I don't think $2002 latch mismatch in particular is a case with high potential for error, really. It's just one of many things I think usually worth doing redundantly as a matter of practice.
Re: Correct usage of PPUSTATUS
by on (#201529)
rainwarrior wrote:
I would definitely not want to suddenly add in extra safety code at the END of development. That has far too much potential to add NEW bugs, and invalidates all of the testing you were doing up until that point.

You have a good point, and I did in fact think of this. I didn't really mean "put the safeties in, assemble and ship", but rather "put the safeties in when the core engine is mostly stable, so there's still plenty of testing ahead as the rest of the game is developed", but I see your point.

Quote:
I think doing this would be worse than no safeties at all. If you've gone without them up until that point, just stay the course and ship that way.

That's probably the reason I don't put the safeties in at all: I think they hide bugs during development, and putting them in late may change things too much. You're right, that was bad advice on my part.

Quote:
every Sunday night turn off all the redundant checks (helps if you put 'em in a .define) and do some smoke testing, then after this testing session turn them back on and get back to normal.

Sounds like a good compromise.
Re: Correct usage of PPUSTATUS
by on (#201531)
I have never done a read of $2002 except for the two (recommended) vblank waits when the ppu starts up on reset and have never had a problem...should I be concerned?
Re: Correct usage of PPUSTATUS
by on (#201532)
Or limit haywire operation to one frame by doing the BIT $2002 right after enabling rendering. This way your Lua breakpoint catches the unpaired write even sooner.

GradualGames: If you haven't experienced an unpaired write, there shouldn't be a problem.
Re: Correct usage of PPUSTATUS
by on (#201534)
tepples wrote:
Or limit haywire operation to one frame by doing the BIT $2002 right after enabling rendering. This way your Lua breakpoint catches the unpaired write even sooner.

GradualGames: If you haven't experienced an unpaired write, there shouldn't be a problem.

I only do writes to $2006 and $2005 either within nmi or, outside of nmi with graphics disabled (and the portion of nmi which uses these registers locked out by a condition)...so I don't think I can get the latch into the opposite state from what I expect. What could cause an unpaired write except for improper locking of nmi vs. main thread?

tokumaru wrote:
You should NOT have stray writes to $2005/6 in your game, ever, but how will you detect these bugs if you're clearing the latch left and right?

This is comforting. (since I don't do this) :)
Re: Correct usage of PPUSTATUS
by on (#201537)
If you had unpaired $2005/6 writes, you'd probably have noticed some sort of graphical corruption by now. Like we've been saying, clearing the high/low latch is an extra layer of protection you may want to include in your programs, so they can recover from software bugs or possible hardware malfunctions, but it isn't mandatory, and several commercial games don't do it.
Re: Correct usage of PPUSTATUS
by on (#201539)
GradualGames wrote:
What could cause an unpaired write except for improper locking of nmi vs. main thread?

I can't think of many things... Even improper locking would still execute an even number of $2005/6 writes, so the latch would remain consistent afterwards, even if those specific writes were botched.

One thing I can think of is mid-screen horizontal scroll changes... I seem to remember some games not bothering to do a second $2005 write since they only needed to change the X scroll, so this could end up resulting in an inconsistent state of the latch.
Re: Correct usage of PPUSTATUS
by on (#201541)
tokumaru wrote:
GradualGames wrote:
What could cause an unpaired write except for improper locking of nmi vs. main thread?

I can't think of many things... Even improper locking would still execute an even number of $2005/6 writes, so the latch would remain consistent afterwards, even if those specific writes were botched.

One thing I can think of is mid-screen horizontal scroll changes... I seem to remember some games not bothering to do a second $2005 write since they only needed to change the X scroll, so this could end up resulting in an inconsistent state of the latch.


Hmm...so in my experiments with Split X/Y scrolling using an irq, sometimes I've seen an odd little dotted pattern in the last 8 pixel wide column of a given screen split (I'm not talking about the flickery/bouncy scanline artifact at the very left of the screen that some games have, say smb3 status bar. I've found it's pretty easy to hide that particular artifact by adjusting timing. This is a more consistent looking pattern). With enough fiddling with the waits that push code into hblank I was able to get it to go away. But I'm wondering if there's any possibility that reading $2002 first in these irqs would also help with that problem or if that is more likely to be purely a timing problem where I may have written something just outside of hblank, perhaps. I'm guessing it may be the latter because it's just this tiny tiny little section of the topmost row of pixels of the last column of a split in some cases, the rest of the split scrolling area looks as expected.
Re: Correct usage of PPUSTATUS
by on (#201542)
With noise and hardware malfunctions anything could happen that is supposed to be fail-proof in software. NESes and Famicoms often have old and dirty cartridge connectors nowadays which can make the game hang just by a slight bump. My SMB3 cart's pins have very high oxidation, so too much vibration in the floor is enough to make the game hang, and this is a large game without backup RAM. Last time I beat the game, I did it without warping ...or breathing for that matter.

GradualGames wrote:
tepples wrote:
Or limit haywire operation to one frame by doing the BIT $2002 right after enabling rendering. This way your Lua breakpoint catches the unpaired write even sooner.

GradualGames: If you haven't experienced an unpaired write, there shouldn't be a problem.

I only do writes to $2006 and $2005 either within nmi or, outside of nmi with graphics disabled (and the portion of nmi which uses these registers locked out by a condition)

Exactly like I do it. I also BIT $2002 in my NMI but within the same condition that allows rendering that frame (as per advice).
Re: Correct usage of PPUSTATUS
by on (#201543)
tokumaru wrote:
GradualGames wrote:
What could cause an unpaired write except for improper locking of nmi vs. main thread?

I can't think of many things...

If you restrict the possibilities to things you intend your code to do, there's no accounting for bugs at all. ;P

STA (pointer), Y can write anywhere depending on what's in the pointer. Accidental execution of data or RAM as code can end up writing anywhere as well. Don't forget the PPU registers aren't just at $2000-2007, they are mirrored, so the target for a stray write is fairly wide.
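
To show how wide that target is, here is a short Python sketch of the register decoding. It assumes the commonly documented layout in which the eight PPU registers repeat every 8 bytes from $2000 through $3FFF; the function names are illustrative.

```python
PPU_REGISTER_NAMES = ("PPUCTRL", "PPUMASK", "PPUSTATUS", "OAMADDR",
                      "OAMDATA", "PPUSCROLL", "PPUADDR", "PPUDATA")


def ppu_register_hit(addr):
    """Return the PPU register name a CPU address decodes to, or None."""
    if 0x2000 <= addr <= 0x3FFF:
        return PPU_REGISTER_NAMES[addr & 7]   # only the low 3 bits select the register
    return None


def toggles_latch(addr):
    """True if a write to this address would flip the $2005/$2006 write toggle."""
    return ppu_register_hit(addr) in ("PPUSCROLL", "PPUADDR")


# Count how many CPU addresses a stray write could hit to disturb the latch.
latch_targets = sum(toggles_latch(a) for a in range(0x10000))
```

Under that assumption, a quarter of the $2000-$3FFF range (2048 addresses) flips the toggle, not just $2005 and $2006 themselves.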


It's true that a lot of bugs will simply crash the game or otherwise leave it totally failed, but I find (maybe surprisingly?) that quite a lot are relatively benign and often totally recoverable.