This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Updating APU Regs Multiple Times Per Frame

by on (#26926)
How practical would it be to write a sound engine that wrote to the APU registers 2 times per frame instead of one time per frame? I think it would be interesting with an engine like this, maybe more depth could be added to the APU channels, like for frequency changes, or duty cycle changes for the square channels, or volume changes. For 2 times per frame, maybe a game fires a mapper IRQ near the middle of the frame, or as close as it can get if there's split-screen effects going on there. Maybe a game could use a large status bar like Kirby's Adventure and do mid-frame APU updating there... Could 3 or 4 times per frame also be practical?

What if a game did its own manual sample mixing for $4011 and mixes two sound channels together once per frame? I think such sound channels would need to be split up in one frame segments (50 or 60 HZ segments). When mapper IRQs are being fired at about every 2-3 scanlines or lower frequency, a game can just pop a value from RAM and write to $4011. A problem would be that sample bytes would need to be skipped in split-screen effects and VBlank, or using a pseudo-extra channel timer during split-screen IRQs to still write to $4011 would complicate the IRQs and lose time for calculating the split-screen data. Using conventional loops and indexes, mixing two 80-120 byte segments together would also be very time consuming and take up a lot of the frame, leaving less time to actually update $4011. Even if completely unrolled code was used, a lot of PRG code would need to be used and it would still be pretty time consuming, but it wouldn't be as bad. What are better ways to incorporate pseudo-extra channels with $4011?

I was just wondering, but did games ever use similar methods as these - for sound effects or music? Even though it's not a good game (it's an LJN game), WWF King of the Ring seems to mix audience cheers and wrestler grunts together during gameplay, but the two sound effects seem to drown out each other.

by on (#26927)
Quote:
How practical would it be to write a sound engine that wrote to the APU registers 2 times per frame instead of one time per frame?


I've already thought a bit about this. It can range from very simple to almost impossible, depending on the mapper used and the CPU usage of the rest of the game (not the sound code).
First, the problem is that to call the sound program 2 times (or more) a frame, you need a second reliable time base (the first being the VBlank NMI). I thought of using the APU frame IRQ for this (by waiting half a frame while booting, then starting the APU frame counter from there), but it would be slightly faster (or slower, I cannot remember) and would eventually drift out of sync, and there is no way of keeping it in sync, except if you want to regularly sacrifice a frame to re-sync it (it could cause gaps in the gameplay every 16 frames or so; that could be tolerable in an RPG, maybe not in an action game). You could of course use a music engine that switches between normal mode and dual mode automatically: for example, in an RPG the music would be normal on the field, but fast in battle (where the sound code is called twice a frame) to have more detailed sound effects, and then skip a frame occasionally to re-sync the APU frame IRQ used to trigger the additional call of the sound code. Another problem is that the PAL NES has the frame IRQ that is not a little faster or a little slower than the VBlank like the NTSC NES, but it is just a whole lot faster than the VBlank, so it would need to re-synch very often (maybe each 4-5 frames or so).

This is for a mapper that has no particular timebase IRQ. If you use the MMC3 (but do not use split-screen effects) it would be a lot easier to reliably trigger the music code two or more times per frame. The only real downside is that you will eat a lot more CPU time, as the sound program runs a couple of times per frame, so it must not be so slow that it screws up the rest of the program. Finally, it would be harder to use the IRQs for something else at the same time. Again, a game could use the IRQ for graphical stuff in one place and for sound in another, depending on the game's needs. In a very standard action game with a status bar, it wouldn't be hard to use 2 IRQs per frame (at approximately 1/3 and 2/3 of the frame) and split the screen to the status bar on the second IRQ too (before calling the sound routine).

Finally, on the MMC5 it would be even simpler than on the MMC3, as IRQs are absolute (and not relative), so it would help a lot if you want to merge IRQs for graphics and for sound (you would need to write a small IRQ handler that triggers the first one needed, then sets up the second when the first happens, etc., for the whole frame).

So in conclusion this is possible, with or without a special mapper, but it sure does complicate things, so unless you really want very detailed sound effects or special effects in your music it's simpler to just do things the regular way. Besides, the hardware sweep registers can change the pitch faster than once per frame (combining this with changing the pitch every frame can create interesting effects).

Quote:
What if a game did its own manual sample mixing for $4011 and mixes two sound channels together once per frame? I think such sound channels would need to be split up in one frame segments (50 or 60 HZ segments)

It depends on what you call "mixing". I mean, even one single channel cannot be played unless the whole program is completely frozen. You *could* come up with a program that cuts all its tasks into very small pieces of code with a fixed CPU length, then manages to call those small tasks in a good order and write to $4011 between them, but this would be a real headache to handle (though not technically impossible).
If you want to mix 2 channels, in theory you would have to mix the equivalent of one sample in software, write to $4011 AND call a small piece of the rest of the code regularly, which is even more of a headache. If you just add the values of two samples together before feeding $4011 I don't think it changes much, but if you want volume mixing, resampling and such things I'd say forget about that unless you seriously overclock the NES.

With extra hardware, of course, things are different. I guess the Squeedo cart has a microcontroller that mixes audio and then IRQs the main program, directly sending it data that the NES just has to copy to $4011 regularly. This works, but eats considerable CPU %.

by on (#26933)
Bregalad wrote:
With extra hardware, of course, things are different

Would "extra hardware" include mappers with their own IRQs? Gauntlet II and King of the Ring used the MMC3's IRQs for driving $4011 during rendering only but they don't slow down. Gauntlet II updated $4011 every 3 scanlines, King of the Ring updated $4011 every 2 scanlines. King of the Ring seems to use pre-mixed sound samples for playing both cheers and grunts at the same time, as it just loads data from the PRG ROM.

Empire Strikes Back actually drives $4011 nearly every rendered scanline during gameplay without pausing or slowing down the game, but that might be because there are very few enemies. Ultimate Stuntman seems to drop the sample bytes it misses while the game engine is busy and plays drum samples during free time, as the drums' quality worsens when there are more enemies. That approach, however, would not be good if too much is going on, but there's not much intense action in the game.

Would $4011 sound channel mixing be more practical in puzzle or text/graphic adventure games where there's usually not as much action?

EDIT: Changed "drives $4011 every scanline" to "drives $4011 nearly every rendered scanline" to be more accurate. Also added "during rendering" to "used the MMC3's IRQs for driving $4011" to reflect this.

by on (#26936)
The DMC can be used to generate a mid-frame interrupt or generate many interrupts per frame (up to around 4 kHz, 66 per frame). By using a sample made up of $55 or $AA, you can have just a quiet square wave at high pitch.
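
As a rough check on the "around 4 kHz" figure, here is a minimal Python sketch. It assumes the commonly documented NTSC DMC rate table (CPU cycles per output bit), a 1-byte sample that is restarted on every IRQ, and approximate NTSC clock and frame rates; the function name is only illustrative.

```python
# NTSC DMC rate table: CPU cycles per output bit for rate indices $0-$F.
DMC_PERIODS_NTSC = [428, 380, 340, 320, 286, 254, 226, 214,
                    190, 160, 142, 128, 106, 84, 72, 54]
CPU_HZ_NTSC = 1_789_773      # approximate NTSC CPU clock (assumption)
FRAME_HZ_NTSC = 60.0988      # approximate NTSC frame rate (assumption)

def dmc_irq_rate(rate_index, sample_bytes=1):
    """IRQs per second when a sample of `sample_bytes` bytes restarts on each IRQ."""
    cycles_per_irq = DMC_PERIODS_NTSC[rate_index] * 8 * sample_bytes
    return CPU_HZ_NTSC / cycles_per_irq

fastest = dmc_irq_rate(0xF)              # 54 cycles/bit, 432 cycles per byte
per_frame = fastest / FRAME_HZ_NTSC
print(f"{fastest:.0f} IRQs/s, about {per_frame:.0f} per frame")
```

Under these assumptions the fastest setting lands slightly above the rounded figures quoted in the post; slower rate indices or longer samples scale the interrupt rate down proportionally.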

About the only use for sample mixing is playing more than one drum sample at once. Trying to play notes would probably result in crappy music like on the Game Boy Advance.

by on (#26937)
Bregalad wrote:
Another problem is that the PAL NES has the frame IRQ that is not a little faster or a little slower than the VBlank like the NTSC NES, but it is just a whole lot faster than the VBlank, so it would need to re-synch very often (maybe each 4-5 frames or so).

This isn't true. On both NTSC and PAL, frame IRQ's occur at (approximately) the same rate that NMI's occur (60Hz NTSC, 50Hz PAL). Blargg verified this some time ago.

That said, I strongly discourage the use of frame IRQs (for any reason) because they aren't in sync with PPU timing.

by on (#26945)
strangenesfreak wrote:
Empire Strikes Back actually drives $4011 every scanline during gameplay without pausing or slowing down the game, but that might be because there are very few enemies.

Which mapper is that? And how would anything drive $4011 during OAM DMA?

Quote:
Would $4011 sound channel mixing be more practical in puzzle or text/graphic adventure games where there's usually not as much action?

Possibly, but would $4011 mixing keep up with this kind of puzzle game or this kind of puzzle game?

by on (#26946)
tepples wrote:
Which mapper is that? And how would anything drive $4011 during OAM DMA?

That game uses MMC3, but it doesn't use its IRQs for $4011. Actually, it doesn't update $4011 during VBlank, sorry that I forgot to mention that. None of the games I mentioned update $4011 during VBlank, only when the screen's rendering.

Quote:
Possibly, but would $4011 mixing keep up with this kind of puzzle game or this kind of puzzle game?

I don't think $4011 mixing would work easily with complex puzzle games like Lumines, but it could work with simpler games like Tetris. A bit off topic, but speaking of that Tetris video, woah, that guy is FAST. :shock:

by on (#26948)
Quote:
Trying to play notes would probably result in crappy music like on the Game Boy Advance.

Except that the NES CPU is about 20 times less powerful than the GBA one. In fact I doubt it would be possible to play samples on the NES with resampling and the like without a co-processor on the cartridge, even if all action and screen stuff is completely paused.

And of course it's possible to play samples AND keep action on the screen, HOWEVER the main program has to write to $4011 at mostly regular intervals, which is hard to do and would complicate the whole game engine. I guess fighting games need very little CPU time, leaving the rest of the CPU time for such things.

by on (#26949)
For $4011 mixing, assuming the gameplay can allow for it without too much difficulty, would it be plausible to mix drum beats (realistic or using sound waves) with simple wave channels (squares, triangles, saws, etc.) instead of realistic non-drum sound samples (guitar sounds, etc.)? Would simple waves sound better instead of realistic samples here?

by on (#26951)
Drum beats should be doable, since it won't sound too bad if their pitch is slightly modulated. Simple wave channels or looped samples would sound bad anyway, because the pitch would be slightly modulated and sound fuzzy. Unless of course you can write to $4011 at EXACT intervals, which is almost impossible (unless you sacrifice a lot of VBlank time for this).

by on (#26955)
It's probably safer just to use DPCM for drums (and possibly bass if you're feeling especially Sunsoftish) and the 2A03 tone generators for anything pitched.

by on (#26966)
If one wanted to play DPCM samples during intense split-screen effect IRQs, would it be practical to read from the same samples converted from DPCM to RAW and update $4011, then determine where to continue the DPCM sample? With this method, if it needs to be updated, DPCM would always be updated at the same time every frame. Would it be alright if DPCM was temporarily disabled and then the sample is continued?
Re: Updating APU Regs Multiple Times Per Frame
by on (#239271)
A very-much-necro post here because this thread is high on Google and I am currently doing more or less exactly what the OP is asking (with the original aim of a sound engine for EXCLUSIVELY music-demo use; no need to count precisely mid-frame or leave time for game code).

Tl;dr: *death-metal dial-up modem noises*

Using a fairly simple testbed that gives every register of every channel a looping queue of up to 8 values held, sub-frame, for literally a number of main-loop iterations rather than anything e.g. NMI-clocked,
- 1 "tick" is basically every 16th iteration of the main program loop, as its ongoing register index hits a given value
- anything less than about 64 "ticks" between updates on the square or triangle channels sounds like... death. (although since Triangle doesn't reset its step clock on register updates, you can achieve some really funky largely-uncontrolled modulation artifacts...)
- 64 "ticks" between register updates produces an end result acoustically close-enough to 1 update per frame at NMI-59.9Hz
-- I'm counting some 0x380 iterations of the loop per frame, so each of the 16 registers should be clocking through at about 0x38 (decimal 56) "ticks" per frame
-- indeed, the same Final Fantasy VI 6-tone block chord I'm testing with now was part of a frame-timed arpeggiation demo I did years ago, and I can definitely hear the top note of that arpeggio going by at about the same speed as the frame-clocked demo when holding for 64 ticks of the loop-clocked demo, barely-maybe going by at all when holding for 32 ticks of the loop-clocked demo, and getting utterly lost in the noise with any shorter wait times.

Results are consistent between Nestopia and a PowerPak in a vintage toploader.

Best guesses so far:
- the pAPU internally uses perhaps the Frame Counter to only "latch" values from the control registers on a 60Hz-or-so interval, so if you are literally writing mid-latch you get undefined garbage, otherwise you get whatever is stable once per frame or so.
AND/OR
- it takes CPU-clock-nontrivial time for the pAPU to adjust the line levels after noticing a register change, and interrupting this process leaves the frequency generators in very unhappy middle-ground states

Either way, I am disappoint.
I'll probably keep banging on it a bit more this weekend, but I suspect that the closest thing possible to CPU-driven mixing/multiplexing on NES is going to be on-the-fly DMC sample building, which is outside my current scope of experimentation.

Update:
Values closer to a clean factor of the de-facto number of ticks per frame do seem to work better.
I can push the Triangle and Square down to 42-tick holds (3/4 frame) with good results when I'm getting ~56 ticks/frame.
28 is dicey, 14 I thought I had working markedly better than 16, exclusively on square, but then I listened more closely in headphones and 14 is still garbage.
And of course, changing the hold time changes the number of times per frame that the loop has to do more heavy lifting, which in turn changes the average loop iterations per frame, which then throws the math off.
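
To put the hold times above on a common scale, here is a small Python sketch that converts "ticks" into frames per update and updates per second. It takes the 0x380 iterations per frame and the 16 registers from the post at face value and assumes an NTSC frame rate of roughly 60.1 Hz; the names are illustrative only.

```python
LOOP_ITERATIONS_PER_FRAME = 0x380   # figure quoted in the post
REGISTERS = 16                      # one register serviced per loop iteration
FRAME_HZ_NTSC = 60.0988             # approximate NTSC frame rate (assumption)

TICKS_PER_FRAME = LOOP_ITERATIONS_PER_FRAME / REGISTERS

def hold_to_update_rate(hold_ticks):
    """Return (frames per update, updates per second) for a given hold time."""
    frames = hold_ticks / TICKS_PER_FRAME
    return frames, FRAME_HZ_NTSC / frames

for hold in (64, 42, 28, 16, 14):
    frames, hz = hold_to_update_rate(hold)
    print(f"hold {hold:2d} ticks: {frames:.2f} frames/update, {hz:.0f} updates/s")
```

Even the shortest hold tried here works out to only a few hundred register updates per second, well inside the audible range.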

Using the same trick to try to mock-volume the triangle channel with a sub-frame-timed on/off duty cycle also does something passable (definitely not perceived as a drop in "volume" but at least a drop in sound intensity through the quite-obvious on/off pattern) at 42-tick holds. Much lower and you're back into the wonderful world of frequency aliasing as you keep interrupting the continuous step-waveform at odd points.

Experiments were generally intended to try to get on the NES what this guy has gotten out of hardware with only a single-channel square beep. Sadly, it seems this won't happen. https://soundcloud.com/mister_beep
Re: Updating APU Regs Multiple Times Per Frame
by on (#239285)
LoneKiltedNinja wrote:
Best guesses so far:
- the pAPU internally uses perhaps the Frame Counter to only "latch" values from the control registers on a 60Hz-or-so interval, so if you are literally writing mid-latch you get undefined garbage, otherwise you get whatever is stable once per frame or so.
AND/OR
- it takes CPU-clock-nontrivial time for the pAPU to adjust the line levels after noticing a register change, and interrupting this process leaves the frequency generators in very unhappy middle-ground states
Both of those are definitely untrue. Multiplexing of audio channels will produce unpleasant FM sounds, like you seem to be hearing, unless you multiplex at ultrasonic rates. (e.g. the N163 emits a new sample every 15 CPU cycles, for a net mixing rate around 15-30kHz).
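
A quick Python sketch of where the 15-30 kHz range could come from, assuming the N163 services one channel every 15 CPU cycles in round-robin order, so that each channel is revisited once per pass over the enabled channels. The NTSC clock value and the function name are assumptions for illustration.

```python
CPU_HZ_NTSC = 1_789_773   # approximate NTSC CPU clock (assumption)

def n163_mix_rate(channels, cycles_per_update=15):
    """Rate at which each channel is revisited when `channels` are enabled."""
    return CPU_HZ_NTSC / (cycles_per_update * channels)

for channels in (8, 4):
    print(f"{channels} channels: {n163_mix_rate(channels) / 1000:.1f} kHz")
```

With eight channels enabled the per-channel rate is near the low end of the quoted range, and with four it is near the high end.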

The various ZX spectrum channel multiplexing examples I've been able to find need sample rates closer to 8kHz, perhaps higher.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239293)
lidnariq wrote:
The various ZX spectrum channel multiplexing examples I've been able to find need sample rates closer to 8kHz, perhaps higher.


The BTP2 music player I used in Stella's Stocking for the 2600 used a rate of 15.75 kHz (one sample per scan line). If I were designing a CPLD-based mapper chip, I'd be inclined to include an option to generate a scan line interrupt timed in master-clock cycles, with a "bump cycle" strobe that code could use to establish initial sync, and could then hit once every other frame on NTSC to maintain it. If I were trying to do a "period correct" discrete hardware version on the cheap and didn't need any raster splits or complex graphics but wanted nice audio playback, I might use a 555 timer simply wired to the IRQ.

Having an IRQ which was reliably triggered every scan line wouldn't be as nice for raster splits as one that was programmable, but a minimal-length IRQ:
Code:
    dec ctr
    beq ready
    rti

would add 20 (7 for the IRQ, plus 5+2+6) cycles overhead every 113 cycles when it wasn't doing anything. Annoying, but hardly a showstopper. Code in vblank could run with interrupts disabled if it kept track of how many lines would be skipped, or it could enable interrupts if it could deal with the extra delays.
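
The overhead figure can be checked with a short Python sketch. It assumes standard 6502 timings (7 cycles of interrupt entry, 5 for a zero-page dec, 2 for a branch not taken, 6 for rti) and an NTSC scan line of 341 PPU dots at 3 dots per CPU cycle.

```python
CYCLES_PER_SCANLINE_NTSC = 341 / 3   # about 113.67 CPU cycles (assumption: NTSC)

IRQ_ENTRY = 7        # interrupt sequence
DEC_ZP = 5           # dec ctr (zero page)
BEQ_NOT_TAKEN = 2    # beq ready, branch not taken
RTI = 6

def idle_irq_overhead():
    """Cycles and fraction of a scan line spent when the IRQ has nothing to do."""
    cycles = IRQ_ENTRY + DEC_ZP + BEQ_NOT_TAKEN + RTI
    return cycles, cycles / CYCLES_PER_SCANLINE_NTSC

cycles, fraction = idle_irq_overhead()
print(f"{cycles} cycles per line, {fraction:.1%} of the CPU")
```

That is a little under a fifth of the CPU spent on interrupts that do no work, which matches the "annoying, but hardly a showstopper" assessment.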

A minimal audio-playback IRQ for audio that was buffered in the mainline twice per frame would be something like (running from ZP RAM)
Code:
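    sta irq_reload_a+1   ; save A by patching the operand of the lda below
irq_load_data:
    lda buff             ; operand low byte is self-modified to walk the buffer
    sta $4011            ; output the sample to the DMC DAC
    inc irq_load_data+1  ; advance to the next buffered sample
irq_reload_a:
    lda #00              ; operand patched on entry: restores A
    rti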

From zero-page RAM, that would cost 7+3+4+4+5+2+6, i.e. 10+13+8 or 31 cycles. Not too bad, save for the necessity of putting the data into the buffer first. Adding an extra 2 cycles to the common case (and 8 to rare cases) would allow for buffers that go beyond 256 bytes.

A four-voice audio-playback IRQ which generated samples individually might be something like:
Code:
irq:
    jmp irq_handler ; Patchable JMP instruction in ZP RAM
typicalHandler:
    sta IRQ_A
    sty IRQ_Y
    lda #<nextHandler
    sta irq+1
    ; clc -- skip if all amplitudes are even, and anti-distort table compensates
    ldy phase0
    lda (phase0l),y
    sta phase0
    lda (phase0a),y
    ldy phase1
    adc (phase1b),y
    ldy phase2
    adc (phase2c),y
    ldy phase3
    adc (phase3d),y
    tay
    lda antidistort,y
    sta $4011
    ldy IRQ_Y
    lda IRQ_A
    rti

The BTP2 music driver used 46 cycles every 76 to generate data then and there for four-voice audio; here we have more cycles/scan line, but add in IRQ support, so we end up with:
Code:
25 : 7+6+6+6 -- interrupt enter/return and register save/restore
 8 : 3+2+3 -- The jmp and then the update of the next jump
40 : Five pairs of a 3-cycle ldy (or in one case sta) with a 5-cycle (zp),y
10 : 2+4+4 -- One "weenie" instruction (TAY), distortion correction, and final store (add 2 if the CLC is needed)

A total of 83 cycles/scan line out of 113. Rather high loading, but still practical for a low-action game. The code would also have a significant zero-page burden of 20 pointers, five phases, and the jump vector.
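
The budget above can be tallied in a few lines of Python. This is only a sketch: it assumes the CLC is skipped as in the listing, standard 6502 cycle counts with no page-crossing penalties, an NTSC scan line of 341/3 CPU cycles, and an IRQ on every line.

```python
CYCLES_PER_SCANLINE_NTSC = 341 / 3   # about 113.67 CPU cycles (assumption: NTSC)

BUDGET = {
    "interrupt entry/exit and register save/restore": 7 + 6 + 6 + 6,
    "jmp plus patching the next handler address": 3 + 2 + 3,
    "five (3-cycle, 5-cycle) instruction pairs": 5 * (3 + 5),
    "tay, table lookup, store to $4011": 2 + 4 + 4,
}

total = sum(BUDGET.values())
load = total / CYCLES_PER_SCANLINE_NTSC
left_per_line = CYCLES_PER_SCANLINE_NTSC - total

print(f"{total} cycles per line, {load:.0%} load, "
      f"{left_per_line:.1f} cycles per line left for the game")
```

Roughly 30 cycles per scan line remain for everything else, which is why this only looks practical for a low-action game.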

It might be possible to improve efficiency by having the IRQ handler output a previously-generated sample, generate data for the next sample, store that, start generating data for the next sample, taking a break at just the right time to output the one it had just generated, finish generating that, and store it in the spot needed for the handler to output it next time. This would add an extra pair of store/reload operations since samples would be stored in advance of need, but would eliminate 25 cycles worth of interrupt entry/exit, for a net win of 13 cycles every two scan lines, which is pretty huge at this level of CPU loading.

That sort of approach might even make such a driver practical using DMC interrupts alone if one tolerates a little extra noise from the DMC going up and down at its own speed. The most practical way to do that would probably be to set the DMC to its fastest rate (one bit every 54 cycles, so one byte and one interrupt every 432 cycles) and handle three samples every interrupt, with time-padding code between (average of 144 cycles/sample). A sample rate a bit below the BTP2 music driver, but probably still decent. If the IRQ handler starts and ends with (instructions in ZP RAM)
Code:
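    sta irq_reload_a+1      ; save A by patching the operand of the final lda
    lda #outputValue        ; immediate operand holds a pre-calculated sample
    sta $4011
    jmp irq_service_main    ; Get out of ZP RAM
irq_out_and_reload_a:
    sta $4011               ; final output for this interrupt
irq_reload_a:
    lda #savedVal           ; operand patched on entry: restores A
    rti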

the last store to $4011 should happen 288 cycles after the first, leaving a minimum IRQ time of 7+3+2+288+4+2+6 = 12+288+12 = 312 cycles every 432. In addition, the DMC would steal around 20 cycles within the IRQ handler and 12 afterward. The 288 cycles in the middle would need to generate data for three samples (which would be doable for three voices if not for four). My big concern would be that the DMC can't be silenced by using all-zero data if code wants to feed audio to $4011.
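
For reference, the same arithmetic as a Python sketch. It assumes the fastest NTSC DMC rate (54 cycles per bit), three output samples per interrupt, and an approximate NTSC CPU clock; DMC cycle stealing is left out, so the real load would be somewhat higher.

```python
CPU_HZ_NTSC = 1_789_773                  # approximate NTSC CPU clock (assumption)
CYCLES_PER_IRQ = 54 * 8                  # fastest DMC rate: one byte per 432 cycles
SAMPLES_PER_IRQ = 3

entry_cycles = 7 + 3 + 2                 # IRQ entry, sta zp, lda #imm
between_stores = 288                     # first to last store, two 144-cycle gaps
exit_cycles = 4 + 2 + 6                  # final sta $4011, lda #imm, rti

handler_cycles = entry_cycles + between_stores + exit_cycles
load = handler_cycles / CYCLES_PER_IRQ
sample_rate = CPU_HZ_NTSC / (CYCLES_PER_IRQ / SAMPLES_PER_IRQ)

print(f"{handler_cycles} of {CYCLES_PER_IRQ} cycles ({load:.0%}), "
      f"{sample_rate / 1000:.1f} kHz output rate")
```

The resulting output rate of about 12.4 kHz is indeed a bit below the 15.75 kHz of a one-sample-per-scan-line driver.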
Re: Updating APU Regs Multiple Times Per Frame
by on (#239298)
supercat wrote:
I might use a 555 timer simply wired to the IRQ.
The ordinary cheap resistors and capacitors needed for the 555 will drift in value enough to become audibly out of tune, unfortunately. (Threshold is 0.4%). The jitter might even be bad enough to be noticeable during operation.

One could buy higher-precision parts, but I don't know if a 555 with those better RCs is still cheaper than various higher-precision options. (e.g. 74'393 generating an IRQ every 128cy)
Re: Updating APU Regs Multiple Times Per Frame
by on (#239308)
lidnariq wrote:
supercat wrote:
I might use a 555 timer simply wired to the IRQ.
The ordinary cheap resistors and capacitors needed for the 555 will drift in value enough to become audibly out of tune, unfortunately. (Threshold is 0.4%). The jitter might even be bad enough to be noticeable during operation.

One could buy higher-precision parts, but I don't know if a 555 with those better RCs is still cheaper than various higher-precision options. (e.g. 74'393 generating an IRQ every 128cy)


Being +/-5% would be horrible if anything was trying to play along with the NES music, but if all music uses the same pitch reference it would probably be tolerable. The biggest issue I see with any kind of interrupt-driven music is interrupt timing jitter, which would require a feature that could be incorporated in a CPLD but so far as I know hasn't been yet: make the mapping of $FFFE vary based upon the three LSBs of a cycle counter, so depending upon when the interrupt arrives code could execute:
Code:
   ; If IRQ fires with minimum delay
   stx IRQ_X
   sty IRQ_Y
   sta IRQ_A
   lda next_sample
   sta $4011

or
Code:
   ; If IRQ fires after one cycle
   stx IRQ_X
   sta IRQ_A
   nop
   lda next_sample
   sta $4011
   sty IRQ_Y

etc. up to
Code:
   ; If IRQ fires after six cycles
   sta IRQ_A
   lda next_sample
   sta $4011
   stx IRQ_X
   sty IRQ_Y

Having the $FFFE mapping select one of eight locations based upon the three bottom bits of the counter would allow code to eliminate the timing jitter. I've never seen that done, though.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239354)
My two cents' worth is that you don't need to run the ENTIRE sound engine multiple times per frame to achieve interesting effects, just put some subroutines in an IRQ that modulate pitch, duty cycle or volume. Also, referring to tips about using the DMC IRQ: If you don't want to use samples otherwise, you can also use $00 dummy samples for the IRQ timing to produce no unintentional audio. If you do this you can still write to $4011 to produce popping sounds, and then the $00 samples drive the level back to 0. Or, you can alternate between playing a 1-byte $00 or $FF sample to make use of the DMC level's attenuation of the triangle-noise-dmc output to gain limited control over the amplitude of the triangle channel (but be careful, this affects all three channels on the second output pin).
Re: Updating APU Regs Multiple Times Per Frame
by on (#239362)
lidnariq wrote:
One could buy higher-precision parts, but I don't know if a 555 with those better RCs is still cheaper than various higher-precision options. (e.g. 74'393 generating an IRQ every 128cy)


Using the top bit of a simple counter would make it awkward to reduce CPU utilization below 50%. Most of the one-chip counter-based solutions I know of would either go low for only one cycle or for half the period, when what's needed is for the timer to go low for something between 8 and 20 cycles--something the 555 can easily achieve. If one wants a sample rate of one sample per 128 cycles, the best way to achieve that would probably be to use bit 7 of the counter, and have the ISR output two samples 128 cycles apart (output a pre-calculated value, then calculate two samples, store the second for use with the next ISR, and output the first).

Note that NMI would interfere with the audio. Given the problems with polling $2002 to find the start of vblank, the best workaround may be to have code add 1536 to a 3-byte counter on each interrupt and, when it overflows, subtract 199,485 (for NTSC), 212,784 (PAL), or 178,684 (Dendy) and run the program's vblank handler.

Since the NES has more RAM than the Atari 2600, my four-voice audio driver could be quite efficient if I had a good IRQ source. Using DMC interrupts, however, a rough prototype would spend about 200 cycles every 432 figuring audio for four samples but then have to spend 142 cycles killing time between output samples. Perhaps when using DMC interrupts a six-voice or eight-voice driver would make more sense. While a four-voice music player that imposes a 50% CPU load might be preferable to an eight-voice player that poses an 80% load, if the minimum CPU load is going to be 80% one may as well get the maximum music out of it.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239384)
Having quickly prototyped up a simulation in PureData of multiplexing at various audio rates, any variation in multiplex rate produces painfully obvious sounds of heterodyning. Some frequencies of multiplex just produce objectionable nonharmonic content, regardless of anything else. I really don't think a 555 could possibly produce acceptable results. It's just not stable enough over short periods of time, and the aging characteristics are awful.

Also, since the pulse channels reset phase when you update the upper byte of the period, multiplexing the pulse channels would only work in the topmost ~5 octaves (A440 and up). Triangle and noise and looped DPCM should be ok, albeit limited.
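
The "A440 and up" limit follows from the pulse period formula. A minimal Python sketch, assuming the usual relation f = CPU / (16 * (t + 1)) for an 11-bit timer value t, an approximate NTSC CPU clock, and that the pulse channels are muted for t below 8:

```python
import math

CPU_HZ_NTSC = 1_789_773   # approximate NTSC CPU clock (assumption)

def pulse_freq(timer):
    """Output frequency of a pulse channel for an 11-bit timer value."""
    return CPU_HZ_NTSC / (16 * (timer + 1))

lowest = pulse_freq(0xFF)   # largest period that fits in the low byte alone
highest = pulse_freq(8)     # smallest period that still produces sound
octaves = math.log2(highest / lowest)

print(f"{lowest:.1f} Hz to {highest:.0f} Hz, about {octaves:.1f} octaves")
```

Any note below roughly 437 Hz needs a nonzero upper period byte, so switching to or from such a note means a write that resets the phase.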
Re: Updating APU Regs Multiple Times Per Frame
by on (#239385)
Even if the NMI handler allows IRQs to get through, distortion from OAM DMA is still very hard, if not impossible to mitigate completely. Only the DMC DMA can interrupt it, so I suppose one solution is to find a good "interpolation" DMC sample to play before starting OAM DMA, or to adjust the sample pointer(s) after the DMA to skip the samples that should've played during the DMA. Either way, sprite usage can definitely mess with anything pitched here.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239394)
za909 wrote:
Even if the NMI handler allows IRQs to get through, distortion from OAM DMA is still very hard, if not impossible to mitigate completely. Only the DMC DMA can interrupt it, so I suppose one solution is to find a good "interpolation" DMC sample to play before starting OAM DMA, or to adjust the sample pointer(s) after the DMA to skip the samples that should've played during the DMA. Either way, sprite usage can definitely mess with anything pitched here.


Alternatively, if one isn't using too many sprites, one may be able to get by without OAM DMA. From what I understand of the DMA corruption issue (and experimentation with my own NES is consistent with this), if e.g. code only needs sprite zero and the last 16 sprites, one could set OAMADDR to $C0 (which might corrupt sprites 0, 1, 48, and 49), then feed 68 bytes (reloading the data for sprites 48-63 and 0) and end by storing an $FF for the sprite 1 position. The remaining bytes of sprite 1 would be randomly corrupted, but that wouldn't matter since it wouldn't be displayed anyhow, and since OAMADDR would be pointing at a partially-written first row, the hardware-forced reload of address zero wouldn't cause any further corruption.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239396)
lidnariq wrote:
I really don't think a 555 could possibly produce acceptable results. It's just not stable enough over short periods of time, and the aging characteristics are awful.

I think a 555 free running at about 16 kHz should be reasonably stable over the very-short term (<100ms) aside from a certain amount of random jitter which could actually be a good thing. If the main CPU is running almost entirely 2-3 cycle instructions for some parts of a frame, but is running many 4-8 cycle instructions during other parts, and if there's no IRQ timing jitter caused by anything other than the instruction mix, that could create a noticeable 60Hz modulation in the audio. Adding a little random jitter on top of that could make the effect less noticeable. Note that the 555 wouldn't be used to modulate other tone generators, but instead generate audio samples directly.

The idea I find most intriguing, though, would be a mapper that could vary the IRQ vector based upon the timing of when the interrupt is taken. Such a mapper could also have an "IRQ service mode" latch which gets set on any access to $FFxx, cleared on any read of $01xx, and force the cart to behave as though most regions are banked to the last address regardless of the state of any banking registers. This would allow the IRQ to use address space that's banked differently from the main-line code without having to save/set/restore the banking configuration. Audio quality in the presence of timing jitter may be tolerable, but removing the jitter or limiting it to an average of one cycle every three lines in a somewhat-randomized pattern would make the sound much cleaner.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239397)
supercat wrote:
one could set OAMADDR to $C0 (which might corrupt sprites 0, 1, 48, and 49),
On at least one alignment, any write to OAMADDR causes something that looks like DRAM decay.

supercat wrote:
Adding a little random jitter on top of that could make the effect less noticeable.
Not enough. Adding sample-to-sample jitter will degrade what sound quality remains, but in a way that sounds entirely different from the missing sample updates during OAMDMA.
Quote:
Note that the 555 wouldn't be used to modulate other tone generators, but instead generate audio samples directly.
Not being able to use any of the NES's built-in five voices in exchange for a softsynth is going to be a very hard sell.

Even if it's comparatively stable in the 1/10th of a second range, the temperature coefficient of the various parts will cause obvious problems to anyone who has even the tiniest bit of a sense of pitch. It will drift audibly during operation. The threshold of 0.4% is audible, even if it doesn't sound like a lot, and is the upper bound of what you'd get from using a PZT resonator. RCs, especially at slower speeds, are an order of magnitude worse.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239398)
lidnariq wrote:
Not being able to use any of the NES's built-in five voices in exchange for a softsynth is going to be a very hard sell.


Pitfall II on the Atari 2600 used a free-running oscillator to generate the pitches for the DPC which had no relation whatsoever to the pitches on the TIA's own sound generators, one of which was used for game sound effects. I don't think there was any particular effort to calibrate the oscillator, but it sounds fine.

Quote:
On at least one alignment, any write to OAMADDR causes something that looks like DRAM decay.


An inability to use sprites would be a bigger impediment to using such a player on anything other than a title or intermission screen, but many cool title and intermission screens don't need animated sprites. I've not observed the DRAM decay hitting anything other than the "old" or "new" rows, though I can imagine it might hit any row whose address has a mix of old and new bits. If that occurs, that would mean that the only quasi-safe change to the OAM row address would be from 00000 to 10000 [which could corrupt row 0 or 16, but nothing else]. Having to write 32 sprites/frame would be annoying, but depending upon what the OAM does with read-modify-write operations, zeroing out sprites would take either 10 or 16 cycles each, so 32 would be 1024 cycles. Annoying, but not impossible.

Quote:
Even if it's comparatively stable in the 1/10th of a second range, the temperature coefficient of the various parts will cause obvious problems to anyone who has even the tiniest bit of a sense of pitch. It will drift audibly during operation. The threshold of 0.4% is audible, even if it doesn't sound like a lot, and is the upper bound of what you'd get from using a PZT resonator. RCs, especially at slower speeds, are an order of magnitude worse.


Cheap caps are +/-10%, but 0.1uF caps rated +/-5% are still pretty cheap (under $0.02). If one wanted game music to be in tune with anything else, a semitone variation would be intolerable, but an RC using 5% caps isn't going to wobble by anything near 5% in a reasonable time frame. Wobble at rates below 10Hz or so turns into vibrato, which can mask a host of other ills.
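
For a sense of scale, the percentages being thrown around here can be converted to musical cents (100 cents per equal-tempered semitone). This is just the standard conversion. Note that a capacitor tolerance describes how far off the part may be to begin with, not how much it drifts while running.

```python
import math

def percent_to_cents(percent):
    """Convert a frequency deviation in percent to musical cents."""
    return 1200 * math.log2(1 + percent / 100)

for pct in (0.4, 5.0, 10.0):
    print(f"{pct:4.1f}% -> {percent_to_cents(pct):6.1f} cents")

# One equal-tempered semitone, expressed as a percentage change in frequency.
semitone_percent = 100 * (2 ** (1 / 12) - 1)
print(f"one semitone = {semitone_percent:.2f}%")
```

So the 0.4% figure is about 7 cents, while a 5% error is most of a semitone.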
Re: Updating APU Regs Multiple Times Per Frame
by on (#239407)
supercat wrote:
lidnariq wrote:
Not being able to use any of the NES's built-in five voices in exchange for a softsynth is going to be a very hard sell.
Pitfall II on the Atari 2600 used a free-running oscillator to generate the pitches for the DPC which had no relation whatsoever to the pitches on the TIA's own sound generators, one of which was used for game sound effects. I don't think there was any particular effort to calibrate the oscillator, but it sounds fine.
Emphasis mine. That's vitally important. Sound effects usually aren't in tune and usually don't need to be. The loss of one sound generator is a lot easier to chew than the loss of three.

The TIA's native sound generators start out of tune with themselves and just get worse from there. Composing music for them is an academic exercise, comparable to writing 12-tone music. It's technically doable, but it's horribly constrained, and requires accepting that some notes are either just not available, and/or are going to sound sour, and/or are going to inscrutably change tone color.

A large capacitor is less stable over time, and a large resistor is more sensitive to crosstalk, while an RC that runs at a couple megahertz is comparatively stable. But if the RC has a high operating frequency one has to add a divider, and at that point one may as well start with the NES's M2 signal.

Vibrato doesn't sound good when it's a function of what's being displayed on screen, or what routine the CPU is currently executing, or anything with irregular period.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239415)
lidnariq wrote:
That's vitally important. Sound effects usually aren't in tune and usually don't need to be. The loss of one sound generator is a lot easier to chew than the loss of three.


I'd look at it differently: if one could swallow the loss of CPU time, especially in any vblank where it's necessary to perform sprite updates, one would gain the ability to use all of the built-in voices for sound effects. I doubt a game would be very practical unless one used an IRQ that could be synchronized with horizontal sync, but a Frogger/Preppie style game with excellent music might be practical given such hardware.

Quote:
The TIA's native sound generators start out of tune with themselves and just get worse from there. Composing music for them is an academic exercise, comparable to writing 12-tone music. It's technically doable, but it's horribly constrained, and requires accepting that some notes are either just not available, and/or are going to sound sour, and/or are going to inscrutably change tone color.


I've done two programs that played music on the TIA. Toyshop Trouble was in C major and was mostly in tune except for the Gs which were absolutely horrid, but for which I compensated a little by using two different "B" notes--a slightly sharp one for use when the bass note was C or F, and a flat one for use when the bass note was G. The Stella's Stocking menu used my BTP2 (Better Than Pitfall II) music player so I was able to compose it for four voices with a five octave chromatic range. There are ten pieces of music in the menu program, in nine different keys. If I remember right, C major was the most solidly in tune, and C#/Db was the worst.

Quote:
A large capacitor is less stable over time, and a large resistor is more sensitive to crosstalk, while an RC that runs at a couple megahertz is comparatively stable. But if the RC has a high operating frequency one has to add a divider, and at that point one may as well start with the NES's M2 signal.


If I were going to use a 555, I would set its pitch to 15-16kHz (middle C would be 262Hz when the oscillator frequency is 15,720Hz), probably using a 0.1uF cap (which is why I mentioned that value).
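
A quick sketch of how note pitch would track the oscillator. It assumes each voice is a 16-bit phase accumulator stepped once per 555 tick; the accumulator width and the helper names are assumptions for illustration, not details taken from the driver.

```python
NOMINAL_HZ = 15_720       # oscillator rate named above
ACC_BITS = 16             # assumed phase-accumulator width

def phase_step(note_hz, sample_hz=NOMINAL_HZ):
    """Per-tick increment that makes the accumulator wrap note_hz times/sec."""
    return round(note_hz * (1 << ACC_BITS) / sample_hz)

def actual_pitch(step, sample_hz):
    """Pitch produced when the oscillator really runs at sample_hz."""
    return step * sample_hz / (1 << ACC_BITS)

step = phase_step(262.0)   # middle C, tuned for the nominal rate
for osc in (15_000, 15_720, 16_000):
    print(f"osc {osc} Hz -> note {actual_pitch(step, osc):.1f} Hz")
```

Because the step values are fixed in the note table, every pitch scales by the same ratio as the oscillator, so the music stays in tune with itself even when it is not in tune with anything else.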

Quote:
Vibrato doesn't sound good when it's a function of what's being displayed on screen, or what routine the CPU is currently executing, or anything with irregular period.


Such things shouldn't affect the oscillation rate of the 555. They might affect the level of jitter in the IRQ timing, but as I said, I think the multi-voice music driver would probably be most useful for title screens.

Here's a very rough test/demo of a 4-voice driver. It plays ascending chromatic scales at four different rates on the four channels using the DPC for timing. I should use less-harmonically-rich waveforms for the upper notes since aliasing becomes noticeable in the top octave, but I think the quality is adequate to show that with some tweaking the effect could be listenable. As I mentioned earlier, the ISR ends up with an absurd number of cycles to burn when doing "only" four voices (between the third and fourth "lda linearityLookup / sta $4011" pairs it does nothing useful except reload the calling program's X and Y register values), so there's lots of room to add more audio fanciness.
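
For readers who haven't seen this kind of driver, here is a rough Python model of the general technique: step four phase accumulators, sum one wavetable sample per voice, and pass the sum through a correction table before it goes to $4011. This is not the demo's code. The 16-bit accumulators, the sine table, the step values, and the use of the commonly cited nesdev approximation for the DAC curve (with triangle and noise silent) are all assumptions, and it models one $4011 write per tick rather than the several lda/sta pairs the ISR evidently performs.

```python
import math

def dmc_out(d):
    """Approximate DAC level for a $4011 value, triangle and noise silent.
    Uses the commonly cited nesdev mixer approximation (an assumption here)."""
    return 0.0 if d == 0 else 159.79 / (22638.0 / d + 100.0)

VOICES = 4
PEAK = 31                       # each voice contributes 0..31
MIX_MAX = VOICES * PEAK         # largest possible sum: 124

def build_linearity_lookup():
    """Hypothetical stand-in for 'linearityLookup': for each mixed level,
    the $4011 value whose DAC output is closest to an evenly spaced target."""
    full = dmc_out(127)
    table = []
    for level in range(MIX_MAX + 1):
        target = full * level / MIX_MAX
        table.append(min(range(128), key=lambda d: abs(dmc_out(d) - target)))
    return table

linearity_lookup = build_linearity_lookup()

# One shared 256-entry sine table scaled to 0..31 (illustrative waveform).
wave = [round(15.5 + 15.5 * math.sin(2 * math.pi * i / 256)) for i in range(256)]

def render(steps, n_samples):
    """Return the $4011 writes for n_samples ticks of a four-voice mix."""
    phases = [0] * VOICES        # 16-bit accumulators (assumed width)
    writes = []
    for _ in range(n_samples):
        total = 0
        for v in range(VOICES):
            phases[v] = (phases[v] + steps[v]) & 0xFFFF
            total += wave[phases[v] >> 8]   # high byte indexes the table
        writes.append(linearity_lookup[total])
    return writes

# Step values are arbitrary examples, one per voice.
out = render(steps=[1092, 1157, 1226, 1299], n_samples=64)
print(out[:8])
```

Doing the correction as a single table lookup on the summed value keeps the per-sample cost small, at the price of a table with one entry per possible sum.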

PS--The demo sounds lousy on Mesen, but much better on my actual NES console.
Re: Updating APU Regs Multiple Times Per Frame
by on (#239738)
This is a couple levels above anything I've thought to try, but I'm following over half of it, I think. Kind of a crazy snapshot that in a bit under a decade hobbyist-level NES audio tech has gone from hypothetical discourse on how to pack more tones on the tonal channels to a practical demonstration of softsynth on DMC.

It would help if I had any backing whatsoever in the official theory being accomplished by supercat's 4-phase lookup+composition sample last page, or familiarity with the actual code behind the referenced Atari demos, but I think I get the gist of canning up 4 sample buffers, stacking the values, and passing the final result through a cleanup-lookup. Will take a bit more staring at it to see what the phase0 and phase0l values are doing, tho. And, being tremendously late to the evident party, wow putting code in zeropage and just updating the load/store/jump absolutes in-situ is brilliant.

I am, however, confused and the internet is not helping (and is indeed putting this thread as the top hit however I phrase the query): what is location $d011? I see 0 relevant NES references, and 1 Apple II reference to key input, but nothing there or in my on-hand cheat-sheets notes it as a control register for anything. Dummy write to pad time / update flags? Also, I assume you're purely driving the DMC output level via $4011 here, not expecting it to ever engage in its own process of sample playback?
Re: Updating APU Regs Multiple Times Per Frame
by on (#239739)
$D011 is a video control register on the Commodore 64. It was probably a typo for $4011.