This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Strategies for generating Mesen label files
by tokumaru on (#238548)
I'm still slowly moving forward with creating my own assembler, and I decided it'd be really cool to generate Mesen label files (.mlb) automatically, but I'm having trouble figuring out how to handle certain things.

If my understanding is correct, values represent offsets into each kind of memory, which is interesting because it does away with a lot of the complications of banking, but on the assembler's side, we don't normally know what kind of memory each label belongs to, since all labels are created from the same Program Counter.

Since we know we're generating NES ROMs, we can make certain assumptions based on the addresses where things are usually mapped. This is fine for simpler NROM programs, but once we start dealing with bankswitching, there's not much we can do without knowing which bank each label belongs to and how large the banks are.

My assembler has native support for banks, so every label has a bank number assigned to it. I suppose I could add the option of specifying a bank size when defining the bank number, so it'd be possible to calculate the overall offset of each label (BankNumber * BankSize + LabelValue). For this to work correctly, the bank size would have to be consistent throughout all the banks in the same memory type.
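
A minimal Python sketch of that calculation, just to make the arithmetic concrete. It assumes every bank of a memory type has the same power-of-two size and is mapped at an address aligned to that size, so the position inside the bank is the CPU address modulo the bank size. The function name is made up for illustration.

```python
def label_offset(bank_number: int, bank_size: int, label_value: int) -> int:
    """Offset of a label into its memory type (e.g. PRG-ROM).

    Assumption: bank_size is a power of two and banks are mapped at
    addresses aligned to bank_size, so the low bits of the CPU address
    give the label's position inside its bank.
    """
    if bank_size <= 0 or bank_size & (bank_size - 1):
        raise ValueError("bank_size must be a power of two")
    return bank_number * bank_size + (label_value % bank_size)


# Label at CPU address $8123 in 16 KiB bank 5
print(hex(label_offset(5, 0x4000, 0x8123)))
```

If the label value is already relative to the start of its bank, the modulo changes nothing and the formula reduces to BankNumber * BankSize + LabelValue.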

Another thing I'm struggling with is identifying "G" labels (registers). Since they can be anywhere in the NES' addressing space, they'd be recognized as the other kinds of labels (internal RAM, PRG-ROM, etc.) based on their addresses. What can we do to differentiate these labels in the source code? And how to differentiate work RAM labels from save RAM labels?

Does anyone know of ways to deal with these issues in a sane way?
Re: Strategies for generating Mesen label files
by Sour on (#238558)
You're understanding the format correctly - except for the "register" labels, they are all offsets into each kind of memory. This is how Mesen handles labels internally, too.

The register labels are special in that they are only used in some scenarios. They will not display a label in the disassembly view, e.g., this will never happen:
Code:
myRegister:
   STA $00
But they will be used when reading/writing to memory locations:
Code:
STA myRegister
This is what Mesen uses to give labels to the NES' built-in registers (APU, PPU, etc.). In terms of an assembler, unless you want to have a way to define "registers" in the assembly syntax, you can probably ignore them altogether.

In terms of PRG ROM labels, as far as asm6 was concerned, it was pretty straightforward. The code kept track of the current file output position (in the .nes file) as it parsed the code & labels, so it was just a matter of keeping track of the position at which each label was defined. This may be harder depending on your assembler's design, though.

In ASM6f, I just assumed anything between $6000-7FFF was work/save ram (it creates duplicate labels for both, and Mesen just ends up ignoring whichever is not used by the mapper based on the battery flag/etc). In CC65/CA65, the assembler's dbg files keep track of which segments are RW/RO/etc., and I use that info to figure out whether a label is in ROM or RAM and where. It's not perfect and does not work properly if bankswitching is used for work/save ram, but almost no games do this.
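
A rough Python sketch of the "duplicate labels" idea, not ASM6f's actual code. It assumes no banking in the $6000-$7FFF window, that "W" and "S" are the type letters for work RAM and save RAM, and that offsets are written as 4-digit uppercase hex. All of those are assumptions, and the function name is hypothetical.

```python
from typing import List


def ram_window_labels(name: str, cpu_address: int) -> List[str]:
    """Return label lines for an address in the $6000-$7FFF window.

    The label is emitted twice, once as work RAM and once as save RAM,
    leaving it to the emulator to ignore the one the mapper doesn't use.
    The "W"/"S" prefixes and the hex formatting are assumptions.
    """
    if not 0x6000 <= cpu_address <= 0x7FFF:
        return []  # outside the window: nothing to emit here
    offset = cpu_address - 0x6000  # no banking assumed
    return [f"{prefix}:{offset:04X}:{name}" for prefix in ("W", "S")]


for line in ram_window_labels("save_buffer", 0x6010):
    print(line)
```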

If you're feeling very adventurous, you could try generating a .dbg file like CA65's linker does, which has a number of benefits vs .mlb files (source view + scoped symbols in source view). But the format is pretty hard to understand and a lot more complex than .mlb files to generate, too.
Re: Strategies for generating Mesen label files
by on (#238560)
tokumaru wrote:
My assembler has native support for banks, so every label has a bank number assigned to it. I suppose I could add the option of specifying a bank size when defining the bank number, so it'd be possible to calculate the overall offset of each label (BankNumber * BankSize + LabelValue). For this to work correctly, the bank size would have to be consistent throughout all the banks in the same memory type.

Mappers with a mix of window sizes for the same memory type violate this.

VRC6: Small PRG window 8K, large 16K
Setting PRG window $8000 to 12 will switch 8K bank 24 into $8000 and bank 25 into $A000.
Setting PRG window $C000 to 12 will switch 8K bank 12 into $C000 as expected.

Namco 108, MMC3: Small CHR window 1K, large 2K
Setting CHR window 0 to bank 12 or 13 will switch 1K bank 12 into $0000 and bank 13 into $0400.
Setting CHR window 1 to bank 12 or 13 will switch 1K bank 12 into $0800 and bank 13 into $0C00.
Setting CHR window 2, 3, 4, or 5 to 12 will switch 1K bank 12 into $1000, $1400, $1800, or $1C00 as expected.
(On MMC3, this is subject to the C bit of $8000.)
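
The CHR case above can be sketched in Python as follows. This is only an illustration of the window arithmetic, with the MMC3 C bit assumed clear; the table and function names are hypothetical.

```python
from typing import List, Tuple

# window index -> (PPU base address, window size in 1K banks)
# Layout as described above, with the MMC3 C bit assumed clear.
CHR_WINDOWS = {
    0: (0x0000, 2),
    1: (0x0800, 2),
    2: (0x1000, 1),
    3: (0x1400, 1),
    4: (0x1800, 1),
    5: (0x1C00, 1),
}


def mapped_1k_banks(window: int, value: int) -> List[Tuple[int, int]]:
    """Return (PPU address, 1K bank number) pairs for one window write."""
    base, size = CHR_WINDOWS[window]
    first = value & ~(size - 1)  # 2K windows ignore the low bit
    return [(base + i * 0x400, first + i) for i in range(size)]


for address, bank in mapped_1k_banks(0, 13):
    print(f"${address:04X}: 1K bank {bank}")
```

A label's offset into CHR memory then depends on which window it was assembled for, not on a single bank size.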

tokumaru wrote:
And how to differentiate work RAM labels from save RAM labels?

Is this for SOROM, SXROM, ETROM, EWROM, and a modified FME-7 board with >8K SRAM that l_oliveira proved to work in 2015? Or just for SOROM and ETROM that can have two memories, one with battery and one without?
Re: Strategies for generating Mesen label files
by tokumaru on (#238563)
Sour wrote:
In terms of PRG ROM labels, as far as asm6 was concerned, it was pretty straightforward. The code kept track of the current file output position (in the .nes file) as it parsed the code & labels, so it was just a matter of keeping track of the position at which each label was defined. This may be harder depending on your assembler's design, though.

That was my initial idea too, but then I thought of the 16-byte header, which is part of the output but not part of the PRG-ROM... I don't want to assume anything about the presence or absence of a header, so maybe I could have a separate counter that only incremented when the output bytes were in the $8000-$FFFF range... but then there are the mappers that can map ROM to areas below $8000...

Quote:
In ASM6f, I just assumed anything between $6000-7FFF was work/save ram (it creates duplicate labels for both, and Mesen just ends up ignoring whichever is not used by the mapper based on the battery flag/etc).

But what about banking in that region? We don't have the output offset to help out this time...

Quote:
If you're feeling very adventurous, you could try generating a .dbg file like CA65's linker does, which has a number of benefits vs .mlb files (source view + scoped symbols in source view). But the format is pretty hard to understand and a lot more complex than .mlb files to generate, too.

Yeah, source view would be really cool! But since even .dbg files can't fully solve the label problem, I don't really feel like going through all the trouble of generating that. For now I'm more inclined to use directives to help give labels more attributes that can be used to better sort them when generating .mlb files. For example, I already support selecting different labels to act as the Program Counter, so I could maybe add an offset counter to each label used for this and somehow signal which counter corresponds to which kind of memory. I need some time to think about whether that'd work.
Re: Strategies for generating Mesen label files
by on (#238566)
I guess once you give a way to specify all properties you want to associate with a "bank", the result will look almost like ld65 linker configuration files.
Re: Strategies for generating Mesen label files
by on (#238574)
I don't plan on doing anything like link configuration files, but I'm thinking of ways to achieve some of the same functionality via inline directives.

Like I said above, I have a directive that allows you to use any symbol as the Program Counter, so that's a handy way to handle different addressing spaces without them interfering with each other. If I properly pad each bank to the end, the output offsets should remain consistent regardless of the sizes of the banks.

For overlaps, such as when RAM locations are reused for multiple purposes, it should be possible to "roll back" the PC and the output offset.

Then I could decide the type of memory based on the address of each label, but also provide a way for the user to specify that explicitly.
Re: Strategies for generating Mesen label files
by on (#238581)
tokumaru wrote:
That was my initial idea too, but then I thought of the 16-byte header, which is part of the output but not part of the PRG-ROM...
That's what asm6f does (assumes that there is a 16-byte header, that is). Arguably, if you're outputting .mlb files, this will usually be the case. But if you have a more reliable/easier way of doing this without assuming a 16-byte header in your case, even better.
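
For illustration only, the header assumption boils down to something like this Python sketch (not asm6f's code). It assumes a plain 16-byte header with no trainer and that the position really does fall inside the PRG-ROM part of the file; the names are made up.

```python
INES_HEADER_SIZE = 16  # size of the header at the start of a .nes file


def prg_offset_from_file_position(file_position: int,
                                  has_header: bool = True) -> int:
    """Convert an output-file position into a PRG-ROM offset.

    Assumptions: no trainer follows the header, and the caller only
    passes positions that belong to PRG-ROM (not CHR-ROM).
    """
    header = INES_HEADER_SIZE if has_header else 0
    if file_position < header:
        raise ValueError("position falls inside the header")
    return file_position - header


print(hex(prg_offset_from_file_position(0x4010)))
```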

For work/save ram banking, I don't know enough (read: nothing) about how assemblers handle bankswitching to help. For asm6f I just assumed no banking, since it's the typical scenario.
tokumaru wrote:
But since even .dbg files can't fully solve the label problem
In source view mode, though, the labels/symbols are near-perfect. With .dbg integration, the debugger is scope-aware and able to distinguish between identically named labels in different files, or different labels for the same memory address (e.g. temp global variables, etc). This only applies to source view, though; the disassembly view is still limited by both the lack of scope for labels and work/save ram bankswitching limitations, etc. That being said, creating a .dbg file is probably an order of magnitude more effort than just a simple .mlb export.
Re: Strategies for generating Mesen label files
by tokumaru on (#238646)
I've been thinking about this some more, and came to the conclusion that in order to generate accurate (i.e. no guessing or assuming anything) .mlb files I'd need to significantly increase the complexity of the assembler, and keep track of a bunch of information unrelated to the assembly process itself. For this reason, I decided to not generate Mesen label files directly, but instead export a generic label file with all the information I do keep track of and is relevant to the assembly process, and create an external script that'll convert that into Mesen's format (and possibly other emulators too).

Using this tool it'll be possible to specify what kind of memory is mapped where (there will be a default configuration based on the typical NES memory map), as well as banking information (bank sizes, start addresses, etc.) so that the final offsets can be correctly calculated. To keep things simple, I'll try to avoid configuration files and get all the settings from the command line, since I can't see anyone needing different settings for every single bank in a program with hundreds of banks.
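
A very rough Python sketch of what the default configuration could look like, purely as an assumption about a tool that doesn't exist yet. The memory ranges follow the typical NES memory map; the type letters "R" and "W" (alongside "P"), the parameter names, and the idea of a single PRG bank size are all hypothetical.

```python
from typing import Optional, Tuple


def classify(cpu_address: int, prg_bank: int = 0,
             prg_bank_size: int = 0x8000) -> Optional[Tuple[str, int]]:
    """Map a CPU address to (memory type letter, offset into that memory).

    Default map (assumed): $0000-$1FFF internal RAM, $6000-$7FFF work RAM,
    $8000-$FFFF PRG-ROM. Anything else returns None and would need
    explicit information from the user.
    """
    if 0x0000 <= cpu_address <= 0x1FFF:
        return ("R", cpu_address & 0x07FF)  # 2 KiB internal RAM, mirrored
    if 0x6000 <= cpu_address <= 0x7FFF:
        return ("W", cpu_address - 0x6000)  # no banking assumed
    if 0x8000 <= cpu_address <= 0xFFFF:
        # Position inside the bank plus the start of that bank in the ROM
        return ("P", prg_bank * prg_bank_size + cpu_address % prg_bank_size)
    return None


print(classify(0xC000, prg_bank=2, prg_bank_size=0x4000))
```

In a real tool the bank number would come from the generic label file and the bank size from the command line.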
Re: Strategies for generating Mesen label files
by on (#238657)
tokumaru wrote:
For this reason, I decided to not generate Mesen label files directly, but instead export a generic label file with all the information I do keep track of and is relevant to the assembly process, and create an external script that'll convert that into Mesen's format (and possibly other emulators too).


I think it is a better and preferable solution. As long as your format is simple and documented, it will be easy to adapt it to FCEUX and other debuggers if needed in the future.
Re: Strategies for generating Mesen label files
by on (#238667)
Yeah, I figured it was better to keep emulator-specific stuff separate from the assembler itself, for various reasons. In this case in particular, I absolutely need more input from the user in order to generate proper output, and it just wouldn't make sense to require this information on the assembler side, since it's not needed during the actual assembly.

BTW Sour, do .mlb files (and Mesen, actually) support label lengths? Those are quite useful when dealing with arrays and multi-byte values... I plan on giving my labels a length property, which is set when memory-reserving directives are used, so that'd be easy to obtain.
Re: Strategies for generating Mesen label files
by on (#238672)
Yea, outputting your own format that contains as much information as you can give is probably a better choice (esp. since whatever information is not useful for Mesen can be used by some other tool, etc.)

It does support multi-byte labels in the dev builds, but 0.9.7 doesn't (will probably just finish up a couple of pending documentation updates and release the current code as 0.9.8 tomorrow before I have to say the same thing in yet another thread a week from now..)

The mlb format was updated to support multi-byte labels, too, by adding the end address, e.g.:
P:8000-8001:mylabel

But now that I'm looking at the code, I think I accidentally excluded the last byte of the address range, which makes no sense, will need to check & fix that if needed.
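
On the exporting side, a line like the one above could be produced with something along these lines. This is a Python sketch, not Mesen's code: it assumes the end address is meant to be inclusive (which is exactly the part that still needs checking), that single-byte labels omit the range, and that offsets are 4-digit uppercase hex. The function name is made up.

```python
def mlb_line(mem_type: str, offset: int, name: str, length: int = 1) -> str:
    """Format one label line, adding an end address for multi-byte labels.

    Assumption: the end address is inclusive, so a 2-byte label at
    offset $8000 spans $8000-$8001.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if length == 1:
        return f"{mem_type}:{offset:04X}:{name}"
    end = offset + length - 1  # inclusive end (assumed)
    return f"{mem_type}:{offset:04X}-{end:04X}:{name}"


print(mlb_line("P", 0x8000, "mylabel", length=2))
```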
Re: Strategies for generating Mesen label files
by on (#238674)
Cool, I'll keep an eye out for version 0.9.8!