This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Help to optimize an "slide in" code

Help to optimize an "slide in" code
by on (#236264)
Hello everyone,
I'm writing code that slides the screen in from left to right, column by column, as you can see in this GIF:

Image

That works OK when I draw only up to the 16th row, but if I try to draw all 30 rows I get this:

Image

Probably because my code is too slow... I'm updating one column every few ticks until all 32 columns are drawn.

Code:
FadeInBackground:
    lda #$01
    cmp FADE_STATE
    beq .CanFade

    jmp .Exit

.CanFade:

    MACRO_INC TIMER_FADE_TICKS, #$01
    lda TIMER_FADE_TICKS
    cmp #$02
    beq .FadeTile

    jmp .Exit

.FadeTile:
    lda #$20
    sta ADDR_HIGH_2
    lda FADE_TILE_COUNT_LEFT
    sta ADDR_LOW_2

    lda #HIGH( TileTitleScreen )
    sta ADDR_HIGH
    lda #LOW( TileTitleScreen )
    sta ADDR_LOW

    MACRO_INC ADDR_LOW, FADE_TILE_COUNT_LEFT
   
  .highloop:
      ldy #$00 ;FADE_TILE_COUNT_LEFT
      lda #$00
      sta COUNT

      .loop:
        lda $2002
        lda ADDR_HIGH_2
        sta $2006
        lda ADDR_LOW_2
        sta $2006

        lda [ ADDR_LOW ], y
        sta $2007

        ;sta TILE
        ;MACRO_BackgroundTile ADDR_HIGH_2, ADDR_LOW_2, TILE
        MACRO_INC ADDR_LOW_2, #$20
       
        MACRO_INC COUNT, #$20 ;use a dey
        ldy COUNT
   
        cpy #$00
      bne .loop

      MACRO_INC ADDR_HIGH_2, #$01
      MACRO_INC ADDR_HIGH,   #$01

      lda ADDR_HIGH_2
      cmp #$24
  bne .highloop   

  lda #$00
  sta TIMER_FADE_TICKS
  MACRO_INC FADE_TILE_COUNT_LEFT, #$01

  lda FADE_TILE_COUNT_LEFT
  cmp #$20
  bne .Exit

  lda #$02
  sta FADE_STATE

.Exit:
  rts


Probably the problem is here:

Code:
        lda $2002
        lda ADDR_HIGH_2
        sta $2006
        lda ADDR_LOW_2
        sta $2006

        lda [ ADDR_LOW ], y
        sta $2007


Can anyone help me understand whether, and how, I can optimize this code?

Thanks a lot!
Re: Help to optimize an "slide in" code
by on (#236266)
The PPU has an "increment PPU address by 32" mode that should help here, so that you don't need to set the PPU address before every write. It doesn't look like you're using it?
Re: Help to optimize an "slide in" code
by on (#236267)
The vertical blank (vblank), the period during which you have free access to VRAM, lasts a very limited amount of time. To make the most out of that time, you're supposed to just blast pre-calculated data to VRAM and not do any complicated data processing. The problem with your code is that you're doing a lot of unnecessary things for every single byte you copy, and that is indeed blowing your vblank time budget. With reasonably optimized code, one can write well over 100 bytes of data to VRAM during vblank, but you're having trouble after only 15!

The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written. You still have to update the source address manually though, which brings us to the second point...
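
A minimal sketch of that idea, in the same NESASM syntax as the original post. It assumes it runs during vblank, that ADDR_LOW/ADDR_HIGH are adjacent zero-page bytes already pointing at the column's first tile in a row-major, 32-byte-wide map (as .FadeTile sets them up), and that clearing the other bits of $2000 is acceptable. DrawColumn is a hypothetical name.

Code:
DrawColumn:
    lda #%00000100          ; bit 2 set: address += 32 after each $2007 access
    sta $2000               ; NOTE: this also clears every other bit of $2000

    lda $2002               ; reset the $2006 write latch
    lda #$20
    sta $2006
    lda FADE_TILE_COUNT_LEFT
    sta $2006               ; target = $2000 + column, set ONCE per column

    ldx #30                 ; 30 visible rows
    ldy #$00
.row:
    lda [ ADDR_LOW ], y
    sta $2007               ; the PPU moves down one row by itself
    tya
    clc
    adc #$20                ; the next source row is 32 bytes further on
    tay
    bne .next
    inc ADDR_HIGH           ; Y wrapped: step the source pointer to the next page
.next:
    dex
    bne .row
    rts

The loop costs roughly 25 CPU cycles per tile, so about 750 cycles for 30 rows. NTSC vblank lasts about 2273 CPU cycles (20 scanlines of 341 PPU dots, at 3 dots per CPU cycle). Because the sketch changes ADDR_HIGH, the source pointer has to be reloaded before each column, as .FadeTile already does.
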

While the change above by itself could be enough to solve your current problem, that "MACRO_INC" thing you're doing looks like a huge time waster, especially when called multiple times for each byte like you're doing. One thing you could consider doing is storing the screen data rotated by 90 degrees in PRG-ROM, so you can simply increment Y to advance to the next source byte and don't need to update the pointer itself as often. If storing the screen sideways is not something you're willing to do, make sure you're incrementing that source address as efficiently as possible, or consider using other addressing modes with (partially or completely) unrolled loops.
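
A sketch of the "stored sideways" variant, under stated assumptions: the screen is kept column-major as 32 columns of 30 bytes each, and COL_PTR is a hypothetical two-byte zero-page pointer that is loaded with the address of the first column before the effect starts. Neither name appears in the original code.

Code:
DrawColumnRotated:
    lda #%00000100          ; +32 increment mode (clears the other $2000 bits)
    sta $2000

    lda $2002               ; reset the $2006 write latch
    lda #$20
    sta $2006
    lda FADE_TILE_COUNT_LEFT
    sta $2006               ; target = $2000 + column

    ldy #$00
.row:
    lda [ COL_PTR ], y      ; source bytes of one column are consecutive
    sta $2007
    iny
    cpy #30
    bne .row

    lda COL_PTR             ; advance the pointer by 30, once per column
    clc
    adc #30
    sta COL_PTR
    bcc .done
    inc COL_PTR+1
.done:
    rts

Here the inner loop is about 16 cycles per tile (roughly 480 cycles per column), and the pointer arithmetic happens once per column instead of once per byte.
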
Re: Help to optimize an "slide in" code
by on (#236318)
tokumaru wrote:
The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written. You still have to update the source address manually though, which brings us to the second point...


Mmm, OK! The first thing I'll try is setting the auto-increment to 32.
Re: Help to optimize an "slide in" code
by on (#236335)
OK, now it works great! I just set the increment to 32 and removed the PPU address reset from every loop iteration, like this:

Code:
lda #%00000100
    sta $2000


And now everything works fine. Thank you as always! :)
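
One caveat with writing a literal to $2000: it also clears the other bits, including bit 7 (NMI enable), bits 3 and 4 (pattern table selection) and bits 0 and 1 (base nametable). If the program relies on any of those, a possible sketch is to keep a RAM copy of the register, since $2000 is write-only. PPU_CTRL_SHADOW is a hypothetical variable, assumed to hold the last value written to $2000.

Code:
    lda PPU_CTRL_SHADOW
    ora #%00000100          ; set bit 2 only: +32 increment for column writes
    sta PPU_CTRL_SHADOW
    sta $2000

    ; ... write the column through $2007 here ...

    lda PPU_CTRL_SHADOW
    and #%11111011          ; clear bit 2 again: back to +1 for row writes
    sta PPU_CTRL_SHADOW
    sta $2000
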
Re: Help to optimize an "slide in" code
by on (#236337)
tokumaru wrote:
The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written.


Just chiming in to say thanks for explaining details like this. I haven't had much time to code recently but I feel like I'm still learning a lot by reading other people's threads. :beer:
Re: Help to optimize an "slide in" code
by on (#236349)
samophlange wrote:
tokumaru wrote:
The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written.


Just chiming in to say thanks for explaining details like this. I haven't had much time to code recently but I feel like I'm still learning a lot by reading other people's threads. :beer:


Yeah, it's amazing how much you can learn on this forum. Great users! :beer: :beer: