This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Simulating arguments passing to functions with zero page

by on (#43323)
I'm rewriting my library in asm, while still offering an API that can be called from C.
cc65 passes function arguments on a software stack in $500-$5FF, but I figured out a faster (I think) way to do it without the stack.

I declare some zero page variables that stand in for the arguments, let's say _byte_arg_0, etc.
Then I define a macro that takes arguments, which is what the user actually calls in his code. The macro stores the argument values in those zero page vars and then calls the real .proc.
The .proc reads its arguments from the zero page vars and does what it has to do, using fast zero page accesses instead of a stack.
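
The shape of this can be sketched in plain C. This is only an illustration, not the actual library: the statics stand in for reserved zero page bytes, and fake_regs, reg_write_impl and reg_write are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the zero page argument slots. On the NES these would be
   reserved zero page bytes; here they are ordinary statics (assumption). */
static uint8_t byte_arg_0;
static uint8_t byte_arg_1;

/* Hypothetical 8-byte register file standing in for memory-mapped hardware. */
static uint8_t fake_regs[8];

/* The "real" routine: takes no C parameters and reads the fixed slots. */
static void reg_write_impl(void)
{
    fake_regs[byte_arg_0 & 7] = byte_arg_1;
}

/* What the user calls: stores the arguments in the slots, then calls the routine. */
#define reg_write(index, value)            \
    do {                                   \
        byte_arg_0 = (uint8_t)(index);     \
        byte_arg_1 = (uint8_t)(value);     \
        reg_write_impl();                  \
    } while (0)

/* Usage: at the call site it looks like an ordinary two-argument call. */
static uint8_t demo_reg_write(void)
{
    reg_write(1, 0x1E);
    assert(fake_regs[1] == 0x1E);
    return fake_regs[1];
}
```

The macro keeps the call site looking like a normal function call, while the routine itself only ever touches fixed addresses.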

With this method, of course, a routine like this cannot call itself recursively or call another routine that uses the same zero page vars. Anyway, no routine in my library will do either.

Also, if one of these routines gets interrupted by an interrupt handler that in turn calls another of these routines, and both use the same zero page vars, there's a problem too. I'll have to look into that.
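
The interrupt hazard can be simulated in plain C. This is a sketch under assumptions: pending_irq is a made-up hook that plays the part of an interrupt firing between the argument store and the routine reading it, and every name here is hypothetical. One possible fix, saving and restoring the slot inside the handler, is shown next to the broken case.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t byte_arg_0;          /* shared argument slot (assumed name) */
static void (*pending_irq)(void);   /* hypothetical hook simulating an interrupt */

/* Library routine: returns its argument doubled. The simulated interrupt
   fires after the caller stored the argument but before it is read. */
static uint8_t double_it(void)
{
    if (pending_irq) {
        void (*handler)(void) = pending_irq;
        pending_irq = 0;
        handler();
    }
    return (uint8_t)(byte_arg_0 * 2);
}

/* Handler that reuses the slot and does not restore it. */
static void careless_handler(void)
{
    byte_arg_0 = 100;
    (void)double_it();
}

/* Handler that saves the slot first and restores it afterwards. */
static void careful_handler(void)
{
    uint8_t saved = byte_arg_0;
    byte_arg_0 = 100;
    (void)double_it();
    byte_arg_0 = saved;
}

/* Main-line call: store the argument, arm the "interrupt", call the routine. */
static uint8_t call_with_irq(uint8_t arg, void (*handler)(void))
{
    byte_arg_0 = arg;
    pending_irq = handler;
    return double_it();
}

static void demo_irq(void)
{
    assert(call_with_irq(5, careless_handler) == 200); /* argument was clobbered */
    assert(call_with_irq(5, careful_handler) == 10);   /* argument survived */
}
```

Saving the slots costs cycles in the handler, so another option is to give interrupt code its own separate set of argument slots.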

Besides what I mentioned, does anyone think this method is good?
Does anyone detect any other problem?
Would you use this method? Or is the stack method better?
(One of the reasons I don't use cc65's stack, apart from zp being faster, is that I don't fully understand it yet.)

by on (#43325)
Sounds like a solid approach, with the limitations you cover, but is this optimization necessary? What sort of performance are you aiming for? What functions would this be used for? I guess the real question is what general API you're planning. It's a good approach overall to write the core routines in asm, so that more work is done in the library and less in the user's inefficient C code.

by on (#43341)
The functions in my library are mostly writes to PPU, OAM and APU registers, joypad reads, and so on.

The main performance goal is already covered by rewriting everything in asm as opposed to C, which was a quick way to get things done.

I don't know how necessary this optimization is, but I think every optimization is good in NESdev. Also, a stack for arguments is only needed for recursive calls and for functions that call other functions, which won't happen in my library.
This approach is not only faster (maybe unnecessarily) but also simpler (at least for me).

The idea of the library is to cover all the NES specifics, leaving the user to code only the game logic in C.

by on (#43444)
You should take a look at HuC (and its source). It's a small C compiler for the PCE (which uses a 65x variant). All the functions in the Clib (library) are written in ASM. It has a "pragma fastcall" system for setting up how the arguments are passed to the function on the ASM side.

An example on the C side:
 * sgx_vreg( char reg )
 * sgx_vreg( char reg, int data )

#pragma fastcall sgx_vreg( byte acc );
#pragma fastcall sgx_vreg( byte acc, word ax );

 * sgx_load_map(int vaddr, int *bat_data, char w, char h)
#pragma fastcall sgx_load_map(word di, farptr bl:si, byte cl, byte ch)

The asm library of HuC reserves some ZP registers for argument passing and such. Funnily enough, they're named ax, bx, cx, dx, si, di, etc. even though there's no Intel processor (the naming was taken from the official documentation). The first four also have byte halves: al/ah, bl/bh, etc. Acc is the accumulator register, and if you specify a WORD then it's passed in A:X.
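
To picture the register naming, here is a rough C model of one 16-bit pseudo-register split into low and high bytes. It is only an illustration and not HuC's actual definitions: zp_word, set_word and get_word are invented names, and the low-byte-first layout is an assumption based on how a 65x CPU stores words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit pseudo-register kept as two bytes,
   low byte first. For a register like ax, lo plays the role of al and
   hi the role of ah. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} zp_word;

static zp_word ax;   /* name borrowed from the post; layout is assumed */

/* Store a 16-bit value into the two byte halves. */
static void set_word(zp_word *r, uint16_t v)
{
    r->lo = (uint8_t)(v & 0xFF);
    r->hi = (uint8_t)(v >> 8);
}

/* Rebuild the 16-bit value from the two byte halves. */
static uint16_t get_word(const zp_word *r)
{
    return (uint16_t)(r->lo | (r->hi << 8));
}

static void demo_word(void)
{
    set_word(&ax, 0x1234);
    assert(ax.lo == 0x34);            /* "al" */
    assert(ax.hi == 0x12);            /* "ah" */
    assert(get_word(&ax) == 0x1234);
}
```

A byte argument only needs one half, which is presumably why the pragmas above can name cl and ch separately for w and h.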

As you can see above, you can do argument overloading: the same function name with a different number of arguments. On the ASM side, you suffix the label with .1 or .2, like _sgx_vreg.1. No suffix means it is called with no arguments.

Since the entire back-end library is in ASM, tweaks, modifications and optimizations can be made with ease (relatively speaking). It's faster and more optimal than writing "user" functions to create or extend the library. HuC builds a ROM file and also outputs an assembly file and a symbol file.

by on (#43645)
Petruza, I think you're doing some good work here. A simple CRT for NES is a solid idea!
I'm wondering why you think it's really necessary to recode everything in ASM. To be honest, the basic hardware setup and status reads done in the library aren't going to benefit by orders of magnitude from ASM optimization. If your library included arithmetic-intensive algorithms like Bresenham or AABB collision detection, I could see 6502 being a win, but it seems like you might want to focus on usability first, then tear down to the metal if bottlenecks present themselves.

by on (#43649)
Thanks baisoku!
Well, the reason for doing it in assembler is this: whether you gain a lot of efficiency or only a little, assembler code will always be more efficient than C code translated by cc65.

So doing everything in assembler would be the best option for efficiency. But from my point of view, coding the game logic in assembler would be hard to do and hard to maintain in the future, whereas coding simple basic functions, like register access, is easy in assembler.

So in a way, the efficiency gained in the library written in assembler compensates for the efficiency lost in the game logic that the library user codes in C.
Besides, since the NES hardware doesn't evolve, the library will hopefully reach a final version and never have to be touched again.

You're right, usability should be the main focus first, and the first version in C has shown no bottlenecks so far. But to be honest, I felt like learning assembler, and what better way than coding the library with it? :D

by on (#43667)
The reason to not code in asm is that the cost outweighs the benefit in most cases. You spend time doing the asm and you end up with something worse than before. Premature optimization is a root of evil.