This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Make absence of # an asm warning

by on (#22174)
The 65xx series and a few other assembly languages differentiate between an immediate constant and a memory address by the presence/absence of the # character:
Code:
lda #10 ; A = the value 10
lda 10  ; A = byte at address 10

Since the # is not required on constants in other common computer languages or general writing, it is very easy to forget. The value at the unintended address might work most of the time, so the bug rarely shows itself. You may think you never forget this, but there's no way to know since the assembler can't warn you.

My proposal is to prevent this error by adding a warning to assemblers for the second case above. To avoid the warning, the value must have some sort of "this is an address" prefix on it, or be defined symbolically in advance. For the moment I'll use the @ symbol, but any syntax could be used:
Code:
lda 10        ; warning
lda @10       ; OK

addr = @10
lda addr      ; OK

lda #addr     ; OK, sets A to the value 10

not_addr = 10
lda not_addr  ; warning
lda @not_addr ; OK

table: .byte 1,2,3,4

lda table     ; OK


The main aspects of implementation would be:
* Assembler keeps track of the type of a value, either an address if prefixed with @, or a pure value if not
* Assembler warns on use of pure value not prefixed with a #
* Labels are addresses
* In an arithmetic expression, if any value is an address, the result of the expression is an address
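
The rules above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not code from any real assembler; the names Value, address, add and needs_warning are made up for the example, and the label location 0x8000 is arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A number plus a flag saying whether it was marked as an address."""
    number: int
    is_address: bool = False


def address(number):
    """Model the proposed @ prefix (or a label): the result is an address."""
    return Value(number, is_address=True)


def add(a, b):
    """Arithmetic rule: if either operand is an address, so is the result."""
    return Value(a.number + b.number, a.is_address or b.is_address)


def needs_warning(operand, immediate):
    """Warn when a pure value is used as a memory operand without #."""
    return not immediate and not operand.is_address


not_addr = Value(10)        # not_addr = 10
addr = address(10)          # addr = @10
table = address(0x8000)     # a label; the location is an arbitrary example

print(needs_warning(not_addr, immediate=False))  # lda not_addr
print(needs_warning(not_addr, immediate=True))   # lda #not_addr
print(needs_warning(addr, immediate=False))      # lda addr
print(needs_warning(add(table, Value(1)), immediate=False))  # lda table+1
```

Only the first call reports a warning; the other three operands are either immediate or carry the address flag.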

I think the need to use @ would be minimal, only for the absolute addresses of hardware registers, which would be in a common include file anyway. Non-absolute addresses (i.e. variables and constants) would virtually all be labels, which the assembler would already treat as an address. If there's any complaining about this issue, it should be that the annoying # is required everywhere, when the most common case is a numeric constant, not an absolute address. Enabling this warning would involve the addition of a small number of @ symbols (unless you don't use symbolic constants).

by on (#22175)
Your logic is sound. I'll implement this in my fork of nesasm when I get the chance.

by on (#22176)
What would be really cool is something that doesn't render the source incompatible with other assemblers, sort of a hint to a "lint-like" assembler used to catch this bug. For assemblers that accept macros, you could just define an ADDR(value) macro that added @ if the assembler supported the proposed extension. As for choice of syntax, the @ symbol might not work well with its use for anonymous labels in some assemblers. If the macro approach above were taken, it could be something ugly and obscure since it would be hidden behind a macro.

by on (#22177)
my assembler generates an error when the operand is ambiguous and forces either a '#' or '$' to prefix every number.

by on (#22185)
never-obsolete wrote:
my assembler generates an error when the operand is ambiguous and forces either a '#' or '$' to prefix every number.

In 6502, $ is the prefix for a base 16 number. Under your assembler, is it not possible to specify a zero-page or absolute address in base 10, which is a common idiom when using the bottom locations of zero page as extra registers?

by on (#22186)
tepples wrote:
never-obsolete wrote:
...

In 6502, $ is the prefix for a base 16 number. Under your assembler, is it not possible to specify a zero-page or absolute address in base 10, which is a common idiom when using the bottom locations of zero page as extra registers?


if you mean like a...

Code:
 lda #$xx
 sta 255


...then no, it's not. since this isn't part of my coding style (which entails 100% of the user base) there was no need for it. though if anyone really wanted it, i could add support instead of throwing an exception. but i don't see that being the case, so i just add features as i need them. though for completeness i will add it in :D

by on (#22187)
I have enough trouble using assemblers without worrying about this crap. The convention of using # for values was established a long time ago by other 6502 programmers. What goes on in other languages is the business of the programmers of said languages. Also, I would just like to point out that even though the use of # to mean values isn't common outside the 6502 asm realm, neither is the use of $ to mean hexadecimal.

Having said that, however, I do support the ability to let the end user decide whether to use # or @ for either value or address...or whatever.

by on (#22190)
Quote:
Having said that, however, I do support the ability to let the end user decide whether to use # or @ for either value or address...or whatever.

Yeah, it should be an optional warning that's off by default, since a few people I've talked to also seem to prefer convenience over fewer bugs, even though tracking down bugs is far more time-consuming.

by on (#22194)
You could even name it like the GNU tools like to name their warnings: -Wliteral-zero-page-address

by on (#22200)
Quote:
-Wliteral-zero-page-address

That wouldn't catch this error:
Code:
value = 255
lda value   ; warning
lda value+1 ; no warning, treated as lda @value+1

You'd want -Wmissing-#-or-@ or something. Of course GCC uses these verbose names for the warnings since there are so many. It'd be out of place on an assembler with only a few.

Another thing I just realized, this warning would be great for people new to 65xx assembly. Give them a "nes.inc" file with lines like
Code:
PPUSTATUS = @$2002

and they'll never even have to know about @.

by on (#30264)
I finally implemented an experimental version of this in ca65 and made a post about it on 6502.org. In summary:
Code:
const = 10
lda #const       ; OK
lda 10           ; warning
lda const        ; warning

ADDR = 0         ; special constant that makes something an address
PPUSTATUS = ADDR+$2002 ; this is an address, not just a number

lda PPUSTATUS    ; OK
lda $4015+ADDR   ; OK

lda <12          ; OK, often used for quick nameless temporaries
lda $4000,x      ; OK, indexed modes always accept numeric expressions
lda const,y      ; OK
sta const        ; OK, since STA never accepts immediate anyway
sta $2000        ; OK

Note how the special ADDR constant will be treated as a normal constant of zero by an assembler that doesn't support this extension, thus allowing full source compatibility. Score!
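
For readers who want the decision logic spelled out, here is a rough Python sketch of the cases listed above. It is my reading of the summary, not the actual ca65 change; the function should_warn and its parameters are hypothetical, and the mnemonic set covers only the official NMOS 6502 instructions that have an immediate mode.

```python
# Official NMOS 6502 mnemonics that have an immediate addressing mode.
HAS_IMMEDIATE = {
    "adc", "and", "cmp", "cpx", "cpy", "eor",
    "lda", "ldx", "ldy", "ora", "sbc",
}


def should_warn(mnemonic, is_address, immediate=False,
                indexed=False, low_byte_op=False):
    """Return True if a memory operand looks like a forgotten #.

    is_address  -- the expression involved ADDR (or a label)
    immediate   -- the operand was written with #
    indexed     -- the operand uses ,x or ,y
    low_byte_op -- the operand starts with the < operator
    """
    if immediate:
        return False                     # lda #const
    if mnemonic.lower() not in HAS_IMMEDIATE:
        return False                     # sta const: no immediate form exists
    if indexed or low_byte_op:
        return False                     # lda $4000,x and lda <12
    return not is_address                # lda 10 warns, lda PPUSTATUS does not


print(should_warn("lda", is_address=False))                  # lda 10
print(should_warn("lda", is_address=True))                   # lda PPUSTATUS
print(should_warn("sta", is_address=False))                  # sta $2000
```

The ordering of the checks mirrors the list in the post: anything that cannot be a mistyped immediate is accepted before the address flag is consulted.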