This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Changing the scroll every 2 scanlines

by on (#139122)
In another thread I considered drawing a raycaster's viewport using tiles that contain only 1 software pixel in the Y axis, and vertically compressing 2 name tables (60 tiles) into a 120-pixel tall area by changing the scroll every 2 scanlines. Changing the scroll isn't much of a problem, since the fine Y scroll isn't used at all and 2 $2006 writes can take care of it, but doing it without wasting massive amounts of CPU time is looking like a challenge.
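
A rough sketch of what those two $2006 writes would carry, assuming the usual layout of the PPU address register (fine Y in bits 12-14, nametable in bits 10-11, coarse Y in bits 5-9, coarse X in bits 0-4). The function names are made up for illustration, and the choice of nametables 0 and 2 as the vertically stacked pair is an assumption about the 4-screen layout:

```python
def ppu_addr_bytes(nametable, coarse_y, coarse_x=0, fine_y=0):
    """Return the two bytes to write to $2006 (high byte first, then low)."""
    if not (0 <= nametable <= 3 and 0 <= coarse_y <= 29 and 0 <= coarse_x <= 31):
        raise ValueError("scroll position out of range")
    if not 0 <= fine_y <= 3:
        # The first $2006 write only carries 6 bits, so the top bit of
        # fine Y always ends up cleared.
        raise ValueError("fine Y above 3 cannot be set through $2006 alone")
    v = (fine_y << 12) | (nametable << 10) | (coarse_y << 5) | coarse_x
    return (v >> 8) & 0x3F, v & 0xFF


def viewport_splits(nametables=(0, 2), rows_per_table=30):
    """One (high, low) pair per 2-scanline strip, top to bottom."""
    return [ppu_addr_bytes(nt, row)
            for nt in nametables
            for row in range(rows_per_table)]
```

With fine Y left at 0, the 60 pairs could simply be precomputed into a table and read back one per split.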

Without a mapper, 100% of the time while the viewport is rendered would be spent on this, since I can't think of any constant-timed task that could be done between scroll changes. MMC3 IRQs fire kinda late in the scanline, enough that the scroll can't be reliably changed right away. Waiting for the next scanline would mean a little over 50% of the time spent squeezing the viewport, which is much better but still fairly expensive.

What would the alternative be? A cycle-based IRQ counter? With that I could time the IRQs so there'd be no waiting at all, and all the stolen time would go towards actually changing the scroll, which I imagine will be around 30% of the time, which sounds more reasonable. Problem is these mappers aren't as easy to come by as the MMC3, and I don't know if they'll play nice with 4-screen mirroring across different emulators and flash carts.
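
To put rough numbers on those percentages, here is a small sketch assuming NTSC timing (341 PPU dots per scanline, 3 dots per CPU cycle). The helper names are hypothetical, and the cost plugged into `stolen_fraction` is whatever a given approach loses per split:

```python
from fractions import Fraction

PPU_DOTS_PER_SCANLINE = 341
PPU_DOTS_PER_CPU_CYCLE = 3  # NTSC assumption


def cpu_cycles(scanlines):
    """Exact number of CPU cycles spanned by the given number of scanlines."""
    return Fraction(scanlines * PPU_DOTS_PER_SCANLINE, PPU_DOTS_PER_CPU_CYCLE)


def stolen_fraction(cycles_lost_per_split, scanlines_per_split=2):
    """Share of CPU time lost if each split costs the given number of cycles."""
    return float(cycles_lost_per_split / cpu_cycles(scanlines_per_split))
```

Two scanlines are about 227 CPU cycles, so waiting out a full scanline per split costs half the time, and a 30% budget leaves roughly 68 cycles per split for entering the IRQ, writing the scroll and returning.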

Can anyone think of other ways to vertically squeeze the screen without sacrificing too much CPU time?
Re: Changing the scroll every 2 scanlines
by on (#139123)
It'd be fairly easy to trigger an IRQ when the PPU fetches a specific tile or location from the nametable... although the exact phase you'd need for this application might require some experimentation.
Re: Changing the scroll every 2 scanlines
by on (#139128)
I was thinking you could use a looping DPCM IRQ that fires at a fixed rate. It can recur at ~4 scanlines, I think? Might be more realistic to do 3 scanlines per scroll instead of 2.

1. Wait a special number of cycles (you'll need to create a table mapping the IRQ timings during the frame to their scanline positions, since each IRQ is going to fire in a different but specific horizontal position).
2. Set a scroll for this scanline.
3. CPU wait for 2 (or 3) scanlines, set scroll a second time.
4. Return from interrupt to resume executing arbitrary code.

You'd have to do a lot of CPU waiting, but at least you could have maybe 30% free time outside the IRQ response.
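
The table from step 1 could be generated offline with something like the sketch below. It assumes NTSC timing and a perfectly regular IRQ period; the 432-cycle figure in the usage note is itself an assumption (a 1-byte sample at 54 CPU cycles per bit), and DMA stalls, IRQ latency and the odd-frame skipped dot are all ignored:

```python
DOTS_PER_SCANLINE = 341
DOTS_PER_CPU_CYCLE = 3  # NTSC assumption


def irq_positions(period_cpu_cycles, count, first_dot=0):
    """Return (scanline, dot) for each of `count` evenly spaced IRQs."""
    positions = []
    for n in range(count):
        dot = first_dot + n * period_cpu_cycles * DOTS_PER_CPU_CYCLE
        positions.append(divmod(dot, DOTS_PER_SCANLINE))
    return positions


def wait_cycles(dot, target_dot=0):
    """CPU cycles to burn (rounded up) until `target_dot` comes around again."""
    dots = (target_dot - dot) % DOTS_PER_SCANLINE
    return -(-dots // DOTS_PER_CPU_CYCLE)
```

For example, `irq_positions(432, 4)` lands at dots 0, 273, 205 and 137 of scanlines 0, 3, 7 and 11, which is the "different but specific horizontal position" each handler would have to compensate for.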

Actually... does IRQ still work on looping DPCM? If not, accumulated jitter would probably kill this idea.
Re: Changing the scroll every 2 scanlines
by on (#139133)
Quote:
Without a mapper, 100% of the time while the viewport is rendered would be spent on this, since I can't think of any constant-timed task that could be done between scroll changes.
You could run your main thread in a virtual machine where each VM instruction takes a constant number of cycles, and after every 2 or 3 instructions you do a scroll change (or any other timing sensitive operation such as a $4011 write).

Not that it'd allow your main thread to execute very fast, but at least it's better than nothing.

Quote:
MMC3 IRQs fire kinda late in the scanline, enough that the scroll can't be reliably changed right away.

I'm no specialist, but $2006 writes, as well as the lower bits of the $2005 write, take effect immediately. Of course because of the jitter it will still be a major problem, as visual glitches will appear.
Re: Changing the scroll every 2 scanlines
by on (#139137)
lidnariq wrote:
It'd be fairly easy to trigger an IRQ when the PPU fetches a specific tile or location from the nametable...

Custom hardware is out of the question for me. Too much work before the first line of code can be written, and the whole chicken/egg thing sucks.

rainwarrior wrote:
I was thinking you could use a looping DPCM IRQ that fires at a fixed rate.

While this is an interesting suggestion, my experience with DPCM IRQs has been nothing but painful. I could never get a steady effect from that thing.

Bregalad wrote:
You could run your main thread in a virtual machine where each VM instruction takes a constant number of cycles, and after every 2 or 3 instructions you do a scroll change (or any other timing sensitive operation such as a $4011 write).

That's a very different approach, I like it. It would be a hell of a slow VM, but at least you'd be doing something instead of just waiting! I wouldn't do this in this particular case though, because delayed scroll changes with the MMC3 still sound faster.

Quote:
I'm no specialist, but $2006 writes, as well as the lower bits of the $2005 write, take effect immediately. Of course because of the jitter it will still be a major problem, as visual glitches will appear.

I think you're right, and yes, the jittering is the problem.

One thing I just realized is that I could probably do the first $2006 write at the end of the previous IRQ, which would allow me to finish setting the scroll sooner (save A, load second $2006 byte, write it). That's 10 CPU cycles, plus 7 to enter the IRQ, plus any leftover cycles from the instruction that's running when the IRQ fires, so we're looking at up to 72 PPU cycles (NTSC). Add the 260 at which MMC3 IRQs fire and that's 332, well into the first fetches for the next scanline.
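
The arithmetic above can be replayed with a few lines, using only the figures from this post (NTSC, IRQ at dot 260, 7 cycles of IRQ entry, 10 cycles of handler). The up-to-7 leftover cycles are inferred from the 72-dot total, and the function name is just for illustration:

```python
DOTS_PER_CPU_CYCLE = 3   # NTSC
MMC3_IRQ_DOT = 260       # figure quoted in the post


def last_write_dot(handler_cycles=10, irq_entry_cycles=7, leftover_cycles=0):
    """PPU dot at which the second $2006 write would land."""
    cpu = handler_cycles + irq_entry_cycles + leftover_cycles
    return MMC3_IRQ_DOT + cpu * DOTS_PER_CPU_CYCLE


best = last_write_dot(leftover_cycles=0)    # IRQ arrives right at an instruction boundary
worst = last_write_dot(leftover_cycles=7)   # IRQ arrives at the start of a long instruction
```

Under these assumptions the write lands anywhere between dot 311 and dot 332, so the spread alone is 21 dots and the late end is already past the point where the fetches for the next scanline begin.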

I don't care if the fetched data is wrong, because the first 2 tiles are always blank, but the IRQ latency will prevent me from resetting the scroll at a constant spot every time, so there will probably be a lot of jittering.
Re: Changing the scroll every 2 scanlines
by on (#139138)
Here's another crazy thought: while the viewport is rendering, only run logic that doesn't use the X register, so it can contain the second $2006 byte always ready to be written as soon as the IRQ fires. That should guarantee that the scroll is always changed before PPU cycle 320, even on PAL, right?

Losing X for a while every frame sucks, but I think I can cast rays using only A and Y, for example.
Re: Changing the scroll every 2 scanlines
by on (#139139)
Quote:
That's a very different approach, I like it. It would be a hell of a slow VM, but at least you'd be doing something instead of just waiting! I wouldn't do this in this particular case though, because delayed scroll changes with the MMC3 still sound faster.

You are right, but my idea doesn't require the MMC3. Also I thought you hated the MMC3's IRQ (and personally I agree it's weird/inconvenient as opposed to the MMC5, the FDS or VRC series for example).

Quote:
Here's another crazy thought: while the viewport is rendering, only run logic that doesn't use the X register, so it can contain the second $2006 byte always ready to be written as soon as the IRQ fires. That should guarantee that the scroll is always changed before PPU cycle 320, even on PAL, right?

Definitely possible, but it sounds like the main thread would be extremely painful to code. The time lost because you lose a register, which implies more memory loads and stores, will probably be on par with the time lost by waiting in each IRQ.

Also I didn't mention it, but I think the MMC3 can fire IRQs in two different positions in the scanline depending on which pattern table (left or right) is used for BG and sprites. Have you considered both possibilities?
Re: Changing the scroll every 2 scanlines
by on (#139143)
I do dislike how the MMC3's scanline counter kills the versatility of 8x16 sprites, but I won't be needing to use sprites from both pattern tables this time, so it's fine.

Not using X would only affect part of the main thread, which should probably be split into 2 threads because of this. I can think of a few tasks that will work decently using only A and Y. It's certainly faster than a VM with constant-timed instructions.

Yes, I have considered the alternate MMC3 IRQ timing, which fires later than in the normal setup, so it wouldn't help me set the scroll sooner, but would result in less wasted time in case I decided to wait for the next HBlank to change the scroll.
Re: Changing the scroll every 2 scanlines
by on (#139145)
I just had the random thought of making those lines blank, hiding any glitches if you change scroll mid-line (...in theory). The problem is that it pretty much creates a scanlines effect, which may come across as annoying.
Re: Changing the scroll every 2 scanlines
by on (#139147)
Sik wrote:
I just had the random thought of making those lines blank, hiding any glitches if you change scroll mid-line (...in theory). The problem is that it pretty much creates a scanlines effect, which may come across as annoying.

I don't see how that would help save time, seeing as I'd still have to wait for the next HBlank in order to turn rendering back on... Also, in this particular case, scanlines would compromise the dithering method I plan on using to create more colors.

I'll probably try setting the scroll at the start of the next HBlank (at this point we can be sure there'll be no glitches), effectively wasting 1 scanline every 2 scanlines. Yes, it sucks, but the alternative of having 2 main threads (one of them unable to use X) seems like hell to manage.
Re: Changing the scroll every 2 scanlines
by on (#139149)
I don't quite understand what you're trying to do, perhaps post a mock screenshot?
Re: Changing the scroll every 2 scanlines
by on (#139152)
Someone is using the top 2 rows of each tile as a constant CHR table and rendering to nametables, with each map entry representing an 8x2 pixel area of the screen. This requires 4-screen mirroring (TVROM) and requires changing the scroll every 2 scanlines. See FMV on the NES for more on this technique. And if you don't want to hog the CPU for the entire picture, you need to have a mapper-generated interrupt trigger the writes to the scroll register that change the scroll.
Re: Changing the scroll every 2 scanlines
by on (#139155)
tepples is better with words than I am, but here's my explanation anyway:

I'm considering rendering the graphics for a raycaster this way. Each tile contains 2 software pixels, one on the left and one on the right, so these pixels are really tall. I don't want them to be this tall, but this is the only way I can store all possible combinations of colors in under 256 tiles, so that I don't have to update the pattern tables during gameplay. In order to have more acceptable pixels I want to display only 2 rows of each tile and skip the other 6.

The ultimate goal is to resize a 256x480-pixel area to 256x120, and for that I need to change the scroll every 2 scanlines. I can't spend a lot of CPU time on this though, because raycasting is already a very CPU-intensive task.

And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.
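
Here is a sketch of how that schedule could be laid out, assuming 60 tile rows spread over two vertically stacked nametables (taken to be nametables 0 and 2 here) and tiles whose 8 pixel rows are all identical. Each split starts at fine Y 6 of an even tile row and lets the PPU roll into the row below on its own. The function names are hypothetical:

```python
def split_schedule(total_rows=60, rows_per_table=30, nametables=(0, 2)):
    """Return one (nametable, coarse_y, fine_y) scroll target per 4 scanlines."""
    schedule = []
    for row in range(0, total_rows, 2):
        table, coarse_y = divmod(row, rows_per_table)
        schedule.append((nametables[table], coarse_y, 6))
    return schedule


def visible_lines(schedule, lines_per_split=4):
    """Expand the schedule into what each displayed scanline reads from."""
    shown = []
    for nametable, coarse_y, fine_y in schedule:
        for _ in range(lines_per_split):
            shown.append((nametable, coarse_y, fine_y))
            fine_y += 1
            if fine_y == 8:  # the PPU moves on to the next tile row by itself
                fine_y, coarse_y = 0, coarse_y + 1
    return shown
```

That gives 30 splits for 120 scanlines, with every tile row still shown for exactly 2 scanlines. Since each pair starts on an even row, the automatic step to the next row never has to cross from row 29 into the other nametable.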
Re: Changing the scroll every 2 scanlines
by on (#139159)
Dwedit wrote:
I don't quite understand what you're trying to do, perhaps post a mock screenshot?

Another explanation: he's trying to basically scale a really tall image vertically by changing the Y scroll every 2 scanlines. Unaltered, the image would span 2 nametables and would appear to be stretched really tall. Scrunching the image makes it appear normal.

tokumaru wrote:
And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.

Genius!
Re: Changing the scroll every 2 scanlines
by on (#139162)
Quote:
And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.

Very clever indeed. I can just see how you got excited :)

The only issue is that scrolling to line #6 of fine scroll is more annoying than scrolling to line #0, #1, #2 or #3 because it can't be done with $2006 alone, but I guess this is a minor issue in your case.
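
For reference, the commonly described way around this is a $2006, $2005, $2005, $2006 write sequence, which lets the first $2005 write supply all three fine Y bits. Below is a sketch of the values involved, assuming that sequence behaves as usually documented; the function name is made up:

```python
def scroll_writes(nametable, x, y):
    """Return (register, value) pairs for a mid-frame scroll to any X/Y."""
    if not (0 <= nametable <= 3 and 0 <= x <= 255 and 0 <= y <= 239):
        raise ValueError("scroll position out of range")
    return [
        (0x2006, nametable << 2),                       # nametable select bits
        (0x2005, y),                                    # fine Y and coarse Y
        (0x2005, x),                                    # fine X (and coarse X)
        (0x2006, (((y & 0xF8) << 2) | (x >> 3)) & 0xFF) # low coarse Y bits, coarse X
    ]
```

For fine Y 6 on tile row 0 of nametable 0 this is `scroll_writes(0, 0, 6)`. Only the last write needs to land in HBlank, so the first three can be issued a little earlier.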

Sounds like a raycaster with decent graphics and framerate is on its way :beer: :D

Quote:
See FMV on the NES for more on this technique.

I can't believe I completely missed this topic back then. The problem is that while the technique is impressive, the FMV demos themselves are very unimpressive. Probably it would take real handcrafted artistic work to make this meaningful, and this requires, well, a good artist who has lots of time.
Re: Changing the scroll every 2 scanlines
by on (#139163)
If your pixels are 4 scanlines high, you could interlace your nametables to split only every 8 scanlines.

0-3 > table 0 y = 0 (row 0)
scroll split
4-7 > table 1 y = 4 (row 0)
8-11 > table 1 y = 8 (row 1)
scroll split
12-15 > table 0 y = 12 (row 1)
16-19 > table 0 y = 16 (row 2)
scroll split
20-23 > table 1 y = 20 (row 2)
etc...
Re: Changing the scroll every 2 scanlines
by on (#139166)
Bregalad wrote:
The only issue is that scrolling to line #6 of fine scroll is more annoying than scrolling to line #0, #1, #2 or #3 because it can't be done with $2006 alone, but I guess this is a minor issue in your case.

Yeah, this isn't a problem if I have a whole scanline to prepare everything. I'll probably get the scroll values from a table anyway, to make things simpler.

Quote:
Sounds like a raycaster with decent graphics and framerate is on its way :beer: :D

Wait, weren't you the one saying that NES raycasters could not be good? Well, it's too soon to get excited anyway... the resolution and colors are better than what I had before, but there's still a lot to figure out before I can consider these ideas feasible.

rainwarrior wrote:
If your pixels are 4 scanlines high, you could interlace your nametables to split only every 8 scanlines.

4 scanlines is quite chunky for a raycaster, distances become hard to judge. Plus I want the resolution to be better than my previous attempt, and that would be a downgrade. But what you're proposing is exactly what I figured can be done for pixels that are 2 scanlines tall.
Re: Changing the scroll every 2 scanlines
by on (#139172)
Ah, I misread. You already had the idea! :)
Re: Changing the scroll every 2 scanlines
by on (#139174)
tokumaru wrote:
Sik wrote:
I just had the random thought of making those lines blank, hiding any glitches if you change scroll mid-line (...in theory). The problem is that it pretty much creates a scanlines effect, which may come across as annoying.

I don't see how that would help save time, seeing as I'd still have to wait for the next HBlank in order to turn rendering back on...

I mean the tiles themselves are blank at those lines (no need to disable rendering). The idea is that you can't see what's going on because everything looks the same in those lines anyway.
Re: Changing the scroll every 2 scanlines
by on (#139182)
Sik wrote:
I mean the tiles themselves are blank at those lines

Ah, I see what you mean now. Yeah, that'd help, because I could just set the scroll right away and wait for it to be corrected by the PPU for the next scanline, not worrying whether it's messed up this scanline. Good idea!

But now that I figured I can get by with changing the scroll every 4 scanlines, I guess I'll stick to that.
Re: Changing the scroll every 2 scanlines
by on (#139435)
Bregalad wrote:
Quote:
Without a mapper, 100% of the time while the viewport is rendered would be spent on this, since I can't think of any constant-timed task that could be done between scroll changes.
You could run your main thread in a virtual machine where each VM instruction takes a constant number of cycles, and after every 2 or 3 instructions you do a scroll change (or any other timing sensitive operation such as a $4011 write).

Not that it'd allow your main thread to execute very fast, but at least it's better than nothing.

Quote:
MMC3 IRQs fire kinda late in the scanline, enough that the scroll can't be reliably changed right away.

I'm no specialist, but $2006 writes, as well as the lower bits of the $2005 write, take effect immediately. Of course because of the jitter it will still be a major problem, as visual glitches will appear.


x2. I tried a demo effect like this and there was jitter all over the place. I fixed it by setting one of the non-transparent colours to the same colour as the background, but still, the jitter is a major problem.
Re: Changing the scroll every 2 scanlines
by on (#139472)
Is this of any help? blargg wrote this wiki doc some time ago. I'll reference the forum thread about it as well, including the code/demo he wrote associated with it:

http://wiki.nesdev.com/w/index.php/Cons ... ronization
viewtopic.php?f=2&t=6589

There's also this:

http://wiki.nesdev.com/w/index.php/Full_palette_demo
viewtopic.php?f=2&t=6484
Re: Changing the scroll every 2 scanlines
by on (#139487)
The jittering only happens if you set the scroll too late, when the PPU itself is already updating the scroll values and fetching tiles for the next scanline. There's a sufficiently large slice of time when it's safe to finish updating the scroll (i.e. the last $2006 write), but MMC3 IRQs fire too late in the scanline for you to make use of that time, so you have to wait until the next scanline to do it right.

blargg's stuff is much more hardcore, targeting full synchronization between the CPU and PPU (like on the 2600, for example), and doesn't have much use in an actual game, because achieving the synchronization takes several frames and you have to use timed code to remain synchronized, IIRC.