This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

4 Channel wavetable on the NES

by on (#33780)
Well, b00d's been buggin' me awhile on this. I wrote it a few days ago as a quick draft.

It's a possible way to perform 4 channel wavetable synthesis on an NES using its raw DAC channel and some code that runs at Hblank rate or thereabouts.

Code:

init:      lda #<trackdata  ;point to track data
      sta track+0
      lda #>trackdata
      sta track+1
      lda #000h
      sta pointer
      sta phase0
      sta phase1
      sta phase2
      sta phase3
      lda #0ffh
      sta timer
      rts


;perform channel update and rendering
update:      clc
      ldx phase0   ;index our sample table
      lda table,x   ;and add 4 channels worth together
      ldx phase1
      adc table,x
      ldx phase2
      adc table,x
      ldx phase3
      adc table,x
      sta 04011h    ;store in PCM reg  30 cycles to here
      
      clc      ;update phase accumulators
      lda rate0
      adc phase0
      sta phase0
      clc
      lda rate1
      adc phase1
      sta phase1
      clc
      lda rate2
      adc phase2
      sta phase2
      clc
      lda rate3
      adc phase3
      sta phase3      ;74 cycles to here

      jsr waithblank

      inc timer
      bne +           ;update approx 31x a second

      ldy pointer
      lda (track),y
      iny
      sta rate0
      lda (track),y
      iny
      sta rate1
      lda (track),y
      iny
      sta rate2
      lda (track),y
      iny
      sta rate3   
      sty pointer   ;46 cycles
      bne +
      inc track+1
      ;add other fun things here, like changing the lookup table address (self-mod the LDA high byte)
      ;check for track end (look for 0 or something in the note data maybe)


         + jsr waithblank
      jmp update

waithblank:   ;do hblank wait here
      rts

trackdata:   ;.db <blah>  music data goes here, notes in groups of 4, updated about 30x a second

table:      ;.db 256 bytes of waveform data go here
      ;.db another 256 byte waveform table can go here for a second waveform


      

;zeropage variables:

;phase accumulators
phase0:      .dsb 1
phase1:      .dsb 1
phase2:      .dsb 1
phase3:      .dsb 1

;update adders
rate0:      .dsb 1
rate1:      .dsb 1
rate2:      .dsb 1
rate3:      .dsb 1

timer:       .dsb 1
pointer:   .dsb 1



Here's how it works. First you call init to initialize everything, then you jump to update, and it will stay in an endless loop playing the music.

I kinda cheated on it since it's just a rough draft. The whole hblank waiting thing isn't fleshed out, but you could use an IRQ from a VRCx mapper. That would give an IRQ every 114/113 cycles, which is fine.

There's plenty of time to perform the calculations and all that.

Music is just stored note/note/note/note for the 4 channels, and the values get shoved into the adder variables every time "timer" expires.
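To make the data flow concrete, here is a small Python model of the update loop above. It is only a sketch of the logic, not the 6502 code itself. The function name `render` is hypothetical, and starting the rate variables at zero is an assumption (the posted init leaves them unset until the first timer expiry). Stopping at the end of the track data is also an assumption, since the draft leaves track-end handling open.

```python
def render(table, track, n_samples):
    """Return the values the loop would write to $4011 (one per pass)."""
    phase = [0, 0, 0, 0]
    rate = [0, 0, 0, 0]      # assumption: zeroed; init does not set these
    timer = 0xFF             # as in init, so the first pass loads a row
    pointer = 0
    out = []
    for _ in range(n_samples):
        # Sum one table entry per channel; $4011 only keeps 7 bits.
        out.append(sum(table[p] for p in phase) & 0x7F)
        # 8-bit phase accumulators wrap around the 256-entry table.
        phase = [(p + r) & 0xFF for p, r in zip(phase, rate)]
        # The 8-bit timer expires every 256 passes and loads 4 new rates.
        timer = (timer + 1) & 0xFF
        if timer == 0 and pointer + 4 <= len(track):
            rate = list(track[pointer:pointer + 4])
            pointer += 4
    return out
```

With a 5-bit sawtooth table (`[i >> 3 for i in range(256)]`) and a track of `[8, 16, 0, 0]`, the first four outputs are 0, 0, 3, 6: the rates are loaded at the end of the first pass and take effect from the third sample on.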

You can get fancy and add volume control and different wavetables and all that with some self modifying code. The code is so small that you can easily run it out of RAM without using too much up.

The "table" wavetable holds 1 cycle of your waveform, be it a sawtooth, triangle, square, sine, or other kind (electric guitar?). Samples should be 5 bits in size, so that when 4 of them are added, the result (at most 4 x 31 = 124) will not exceed the range of the NES' 7 bit DAC. This will provide maximum volume without clipping.
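A table like this is easiest to generate on the host. The following Python sketch builds one cycle of a 5-bit sine and formats it as `.db` lines. The helper names `make_sine_table` and `to_db_lines` are made up for illustration, and the `.db` output format is an assumption about your assembler.

```python
import math

def make_sine_table(size=256, bits=5):
    """One cycle of a sine wave, scaled to unsigned `bits`-bit samples."""
    peak = (1 << bits) - 1          # 31 for 5-bit samples
    return [round((math.sin(2 * math.pi * i / size) + 1) / 2 * peak)
            for i in range(size)]

def to_db_lines(table, per_line=16):
    """Format the table as assembler .db lines, 16 values per line."""
    return ["      .db " + ",".join(str(v) for v in table[i:i + per_line])
            for i in range(0, len(table), per_line)]

table = make_sine_table()
# Four channels at full scale must still fit the 7-bit DAC register.
assert 4 * max(table) <= 127
```

Any other waveform works the same way as long as every entry stays in the 0 to 31 range.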

Well that's about it. I don't have time to play with the code so I thought I'd throw it out here to see what people think.

Oh yeah, if your adder value is 00h, this will effectively silence that channel 'cause it will no longer update its wave position. Using 8 bit adders like this, you should be able to get several octaves of range without too much trouble. If you wish to get more range, store 2 or more cycles of your waveform in the wavetable.
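For a rough idea of which pitches the 8-bit adders give, the Python sketch below converts an adder value to a frequency. It assumes NTSC timing (341/3 CPU cycles per scanline) and one output sample per two hblank waits, since the loop calls waithblank twice per pass. The function names are hypothetical.

```python
NTSC_CPU_HZ = 1789773
CYCLES_PER_SCANLINE = 341 / 3       # about 113.67 CPU cycles (NTSC)

def sample_rate_hz(hblanks_per_sample=2):
    """Output sample rate if one sample is written every N hblanks."""
    return NTSC_CPU_HZ / CYCLES_PER_SCANLINE / hblanks_per_sample

def note_hz(rate, fs, table_len=256, cycles_in_table=1):
    """Pitch produced by an 8-bit adder value `rate` at sample rate `fs`."""
    return rate * fs * cycles_in_table / table_len

fs = sample_rate_hz()               # roughly 7.9 kHz under these assumptions
lowest = note_hz(1, fs)             # roughly 31 Hz
```

Under these assumptions an adder of 1 comes out near 31 Hz and every doubling of the adder is one octave, so the steps near the bottom of the range are very coarse. Storing two cycles in the table doubles the pitch for the same adder value.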


by on (#33793)
Oh my, if this could run at a decent rate this would be AWESOME!!
Too bad you have to freeze everything else to use this, unless you use fancy mappers. Maybe if you timed every bit of your whole code you could actually run this alongside another program, but that would be a big headache. I've already had a big headache writing code that performs an optimised matrix multiply and timed $2000 updates at the same time for my mode7 demo; my god, I had a horrible time writing this.
However, timing is slightly less critical with $4011 updates: if you are one or two cycles off it will sound slightly noisy, but that's acceptable, as opposed to badly timed PPU register updates, which produce flickering.

by on (#33796)
This is quite possible IMO, and the idea has been discussed for a very long time. Last time here:
http://nesdev.com/bbs/viewtopic.php?t=1090&start=15

To use every cycle efficiently, I think you need to run the code in zeropage and write/read from an immediate value contained in the code like this:

zp_lbl:
lda #xx
clc
adc #yy
sta zp_lbl+1

Also, I think you will really need a fractional position for the phase to make it sound decent. On the other hand, you might be able to skip the CLC if you add a fractional part, as a 1/256 random sample shift should not be very noticeable.

And actually, you could extend the idea of self-modifying code even further to make the code dynamically generated by the player. For melodic instruments, you'll probably want to have volume control, but for drum instruments you'll probably prefer to have samples longer than 256 bytes. By generating the appropriate code, you could use the available cycles for one feature or the other.

But the real problem is probably integrating this sample loop with a player in a decent way that doesn't glitch. :)

by on (#33798)
I tried the simple integral step per sample and I couldn't get enough intermediate frequencies, so I added a fraction to the phase and it works well. It seems that the code could just all be cycle-timed, with a JSR make_sample sprinkled in the code at appropriate places. make_sample doesn't modify the X and Y registers, so calling code won't be too interrupted by calls to it. Volume can be handled by having multiple wave tables.

wavetable.nes.zip

Another approach would be to have NMI restart the code each frame, and have it cycle timed well enough that it always generates the proper number of samples each frame, just the last one's time varies by a few clocks.
Code:
clocks_per_sample = 20+94  ; Number of clocks between samples

; These are stored in zero-page inside the make_sample code (self-modifying)
chfrac0 ; fraction of phase
chrate0 ; fraction of rate
chstep0 ; whole of rate
chwave0 ; pointer to wave, 256-byte aligned in memory, low byte = phase
; ... same for channels 1-3

loop:
    ; Delay
    ldx #200
:   jsr chloop
    delay clocks_per_sample-5
    dex
    bne :-

    jsr make_sample
    ; ... some calculation that takes clocks_per_sample
    jsr make_sample
    ; ... more calculation that takes clocks_per_sample
    jsr make_sample
    delay clocks_per_sample+1-5
    jmp loop

; in zero-page
make_sample:        ; 94 clocks (including JSR)
    lda #0 ; chfrac0    15 clocks
    adc #0 ; chrate0
    sta chfrac0
    lda chwave0
    adc #0 ; chstep0
    sta chwave0
    ; Carry isn't cleared, so there can be slight spill-over
    ; ... same for channels 1-3
    clc                 ; 22 clocks
    lda $1234 ; chwave0
    adc $1234 ; chwave1
    adc $1234 ; chwave2
    adc $1234 ; chwave3
    sta $4011
    rts                 ; 6 clocks
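
The effect of the fractional phase can be checked on the host. This Python sketch converts a target pitch into the whole and fractional parts of the per-sample step (the values that would go into chstep and chrate). It assumes one sample every 114 CPU clocks, which is one reading of clocks_per_sample above, and the helper names are hypothetical.

```python
NTSC_CPU_HZ = 1789773

def phase_increment(freq_hz, fs, table_len=256):
    """Return (whole, fraction) of the 8.8 fixed-point step per sample."""
    inc = round(freq_hz * table_len * 256 / fs)
    return inc >> 8, inc & 0xFF

def actual_hz(step, frac, fs, table_len=256):
    """Pitch actually produced by a given 8.8 step."""
    return (step * 256 + frac) * fs / (table_len * 256)

fs = NTSC_CPU_HZ / 114              # assumption: 114 clocks per sample
step, frac = phase_increment(440.0, fs)
error_hz = abs(actual_hz(step, frac, fs) - 440.0)
```

With whole steps only, the nearest pitches to 440 Hz under this assumption are about 429 Hz (step 7) and 491 Hz (step 8). The extra 8 fraction bits shrink the pitch resolution to fs/65536, roughly a quarter of a hertz.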

by on (#33800)
It's working well. There's not much limit to how complex sequence data can be, since it can be processed like in any player, just with the inserted calls to make_sample. The NMI technique also works well, allowing time to be broken into samples, and not having to worry about how many samples some part of sequence playback takes (as long as it fits within the budget of around 60 samples per channel per frame).

For this demo I tried to put lots of sprites on screen, but there isn't enough VBL time to write them, and sprite DMA makes the audio sound bad. There's some issue with how I'm writing sprite data so they sometimes don't show on my NES, so ignore that (not interested in solving that issue, since it's unrelated to sound generation).

wavetable6.nes.zip

by on (#33801)
Heh, since my CopyNES is in pieces at the moment I haven't tried it. But it looks promising from what I see in Nintendulator. Cool!

(the song should be redone though as it sounds very out of tune)

EDIT: I just realised it might be because of inaccurate timing in Nintendulator that it sounds out of tune, so ignore that if it sounds good on a real machine.

by on (#33802)
Heh, I only had a 2-part song, so I have it playing two instances of it, one started later. I'll have to see if I can convert some other 4-part song that sounds better.

by on (#33810)
As far as simplicity goes, running it out of IRQ on some CPU-timed mapper would be the most transparent. You'd lose 13 cycles to the interrupt sequence and RTI, and another 6 to saving and restoring A, though. Might not be workable, as your main code would then get 12 cycles out of every 113, but it would let your main code jump occasionally without egregiously horrible cycle counting.

Whatever your music kernel was though, it would have to run in about 3000 cycles or less to keep running at 60 Hz.
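The budget can be worked out as follows. This is a back-of-the-envelope Python sketch using standard 6502 timings (7 cycles for interrupt entry, 6 for RTI) and blargg's 94-clock make_sample with its JSR and RTS removed. Saving A with a 3-cycle zero-page store and reload is an assumption, as is the rounded 113-cycle scanline.

```python
SCANLINE_CYCLES = 113               # rounded NTSC scanline, as in the post
MAKE_SAMPLE_BODY = 94 - 6 - 6       # listing total minus JSR and RTS
IRQ_ENTRY = 7                       # 6502 interrupt sequence
RTI = 6
SAVE_RESTORE_A = 3 + 3              # assumption: STA zp, then LDA zp

def main_code_budget(scanlines_per_frame=262):
    """Cycles left for the main program per scanline and per frame."""
    handler = MAKE_SAMPLE_BODY + IRQ_ENTRY + RTI + SAVE_RESTORE_A
    per_line = SCANLINE_CYCLES - handler
    return per_line, per_line * scanlines_per_frame

per_line, per_frame = main_code_budget()
```

That gives 12 cycles per scanline and a little over 3000 cycles per frame, which is where the figures above come from.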

Time-domain multiplexing them would get some of that back, at the expense of sample rate.

by on (#33819)
DMC IRQs? Can you use those to drive the timing?
Maybe just play some up-down-up-down sample so it doesn't affect much.

by on (#33821)
The DPCM IRQ has a maximum rate of about 4.2 kHz. I'm pretty sure we need 10 kHz or more for a decent mixer. Even Big Bird's Hide and Speak runs at roughly 8 kHz.
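The 4.2 kHz ceiling follows from the DMC timing. As a sketch, assuming NTSC, the fastest DMC rate setting (54 CPU cycles per output bit) and a one-byte sample so the IRQ fires after every 8 bits:

```python
NTSC_CPU_HZ = 1789773
FASTEST_DMC_PERIOD = 54             # CPU cycles per bit at rate $F (NTSC)

def dmc_irq_rate_hz(sample_bytes=1):
    """IRQ rate when a sample of `sample_bytes` bytes is replayed nonstop."""
    cycles_per_irq = FASTEST_DMC_PERIOD * 8 * sample_bytes
    return NTSC_CPU_HZ / cycles_per_irq

rate = dmc_irq_rate_hz()            # a bit over 4.1 kHz
```

Longer samples only lower the IRQ rate, so this is the best case.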

by on (#33827)
blargg/kev:

This is totally sex-awesome. I'm gonna have to make a musical jizz-fest when I get my hands on some form of GUI or MML-based interface engine to get this working.

Don't keep me blue-balled for too long, guys. :(

by on (#41192)
blargg wrote:
It's working well. There's not much limit to how complex sequence data can be, since it can be processed like in any player, just with the inserted calls to make_sample. The NMI technique also works well, allowing time to be broken into samples, and not having to worry about how many samples some part of sequence playback takes (as long as it fits within the budget of around 60 samples per channel per frame).

For this demo I tried to put lots of sprites on screen, but there isn't enough VBL time to write them, and sprite DMA makes the audio sound bad. There's some issue with how I'm writing sprite data so they sometimes don't show on my NES, so ignore that (not interested in solving that issue, since it's unrelated to sound generation).

wavetable6.nes.zip


Full sources please :)

by on (#41846)
ca65 source + ROM: wavetable6.zip

assemble with

ca65 -I common -o rom.o wavetable6.s
ld65 -C nes.cfg rom.o -o rom.nes

Sorry for the lack of documentation, but others urged me to post it anyway.

by on (#41939)
Thanks a lot, blargg. We'll see what we can do with this. ;D

by on (#41942)
How about one where it doesn't simultaneously play two identical songs separated by a couple measures?

by on (#41947)
Someone just needs to convert a four-voice song to its format. All I had handy was a two-voice one. :(