This page is a mirror of Tepples' nesdev forum mirror (URL TBD).
Last updated on Oct-18-2019

Thoughts on Higher Level Language Design for 6502 Systems

by on (#81551)
You may have read David A. Wheeler's 6502 Language Implementation Approaches article. I recently read it but was quite underwhelmed. Although David does point out some of the limitations of the 6502 that make high-level language implementation challenging or inefficient, he takes the approach of working around those limitations to implement features which are inefficient on the platform.

I have given considerable thought recently to the design of a language for the 6502. The approach I normally take when designing a language is to start at the bottom and work upwards, making common design patterns and idioms used in the lower-level language easier to implement in the higher-level language.

To this end I have described the function of a language for the 6502 that cleanly maps to efficient machine instruction sequences that are extremely similar to those we write by hand. Below I describe the function of the language in terms of how it differs from C without regard to syntax. I look forward to your comments.

Limitations
* Function calls may not be part of an expression; a call may only appear alone as the right-hand side of a single assignment
* Function calls may not be recursive.
* Arrays are limited to 256 elements; however, those elements may be of any size
* No support for dynamic memory allocation
* No multiplication, division or modulus operators; these are implemented as built-in functions instead
* Pointer and string data types not supported
* Comparison operators are not considered expressions
* No support for arrays as a member of a struct

Features
* Direct segment management
* 16-, 24- and 32-bit integer support
* Read and write "files", or large arrays, using built-in functions for pointer manipulation
* Support for structs, arrays, and arrays of structs
* Inline assembly support for every language feature

Thoughts? I would also be interested in hearing what language you would like the syntax based on. I have developed a minimalist line-oriented syntax I use for my languages that lacks punctuation of almost any kind, but it can be hard for someone to pick up if they didn't write the thing :D

by on (#81556)
Have you seen this thread?
http://nesdev.com/bbs/viewtopic.php?t=7185
Atalan seemed kinda decent.

I've seen some 6502 people say that Forth is good.

I've been doing assembly for years, and C (and Verilog, which is much like C) are the first high-level languages I've started using, fairly recently. I'm not sure yet if I'd seriously use anything besides asm and C on NES.

by on (#81557)
BASIC. It runs pretty quick on the TRS-80 Color Computer, and there's a disassembly out there. I know there's a lot of stuff it can take shortcuts with on the 6809, but I think BASIC on 6502 would be sweet if it worked on the NES. There's also C64 and Apple BASIC, although I don't believe they are anywhere near the quality of the CoCo's.

by on (#81558)
There is Family BASIC. If only the (72-pin) NES had a keyboard, I'd probably have made a BASIC interpreter myself by now.

by on (#81559)
I have seen Atalan and thought it looked pretty fuggly. It also uses the approach of forcing the 6502 to implement higher-level language constructs that it's just not that good at. I aim to make a language that exploits the strengths of the 6502.

FORTH has a lot of overhead on a 6502. It's a stack-based language and the 6502 kinda sucks when it comes to stacks.

A BASIC interpreter has even more overhead. The BASICs of the day were designed to allow home users to write meaningful and useful programs, not to squeeze every last cycle out of a CPU, which is what we tend to do in NES development. BASIC is great for writing a Tetris clone, but you'd quickly hit a limit with it.

Also the 6800 series processors had 16-bit register pairs and 16-bit math instructions, and a whole lot of other features that made implementing higher-level languages easier and more efficient.

Thanks for all the input folks! No remarks about the idea yet. I don't know if I'll actually make this compiler or not. I'm trying to decide between this and writing a game in QBasic.

by on (#81561)
6809 is more comparable to the 68000 than the 6800, check it. And there's also Hitachi's 6309.

Re: Thoughts on Higher Level Language Design for 6502 System
by on (#81592)
qbradq wrote:
You may have read David A. Wheeler's 6502 Language Implementation Approaches article. I recently read it but was quite underwhelmed. Although David does point out some of the limitations of the 6502 that make high-level language implementation challenging or inefficient, he takes the approach of working around those limitations to implement features which are inefficient on the platform.

What features would those be?

qbradq wrote:
The approach I normally take when designing a language is to start at the bottom and work upwards, making common design patterns and idioms used in the lower-level language easier to implement in the higher-level language.

What would those be in the case of the 6502?

qbradq wrote:
To this end I have described the function of a language for the 6502 that cleanly maps to efficient machine instruction sequences that are extremely similar to those we write by hand.

Some (more explicit) examples?

by on (#81605)
High-Level Language Features that are Inefficient on a 6502:

1. Stack-based expression parsing, which allows function calls within expressions.
2. Stack-based local variables, which allows function recursion.
3. Array indices wider than 8 bits require pointer math.
4. True objects (classes, structures with methods, anything with a this pointer) must use pointer math.
5. Pointers in general are problematic due to the requirement to use zero-page locations.


Here are the design patterns I am trying to represent in the language:

1. Mathematical expression solving using up to 32-bit numbers.
2. Procedure calling with local variables and statically addressed parameters.
3. Static arrays.
4. Static data structures.
5. Static arrays of data structures.
6. Memory files (large arrays using pointers for indexing).
7. Jump tables.
8. Procedure pointers (for procedures with no parameters).


And here are the design patterns that I am aware of but not trying to represent:

1. Reuse of RAM locations for multiple variables.
2. Self-modifying code.


So after some thought I am going to implement this language and see how it goes. I am going to use C syntax because it is unambiguous and well-known (Java, JavaScript and C# use the same basic syntax).

I'll start a new thread when I've got something to show. Hopefully by the end of the week I'll have a compiler done. It might take longer though, this is the first language I've tried to write that supports more than one data type.

by on (#81646)
Memblers wrote:
I've seen some 6502 people say that Forth is good.

It's not. I have to deal with it occasionally on FreeBSD since the 3rd-stage bootstrap is written in Forth. What an awful atrocity; debugging is the most impossible thing I have ever witnessed. Don't click here unless you want a seizure.

Let's face it: the 6502 is a small, bare-bones microprocessor. The keyword there is micro. "User-friendly" languages, particularly scripted languages, are usually over-abstracted and do too much to hide the inner-workings from the programmer. This sounds sexy and delicious on paper, but in practise/implementation it fails every time.

This is why you'll find people using assembly a lot of the time, and if not assembly then C -- but we can't really use that on the NES 6502 given the limitations of the architecture/platform. That's just the way the ball bounces. Sometimes it's best to keep it bare-bones, you know?

Though we all know Tepples masturbates at night to the thought of python running on the NES... ;P

by on (#81647)
Godspeed qbradq, I think this is a decent idea. It seems like it would address most of the problems I have with C on 6502. I have been trying to emulate much of the same stuff (local variables, parameter passing) using macros, but there's only so much you can do with them (even though I've come to find the CA65 macro system is amazingly flexible). Also 16-bit math gets annoyingly verbose (i.e. hard to read) even with macros.

qbradq wrote:
2. Procedure calling with local variables and statically addressed parameters.

Quote:
And here are the design patterns that I am aware of but not trying to represent:

1. Reuse of RAM locations for multiple variables.

Does this mean that each procedure will have its own area of memory for local variables/parameters? That does seem a little bit wasteful. Would it not be possible for the compiler to detect procedures' dependencies on each other, and allocate accordingly? For example, the compiler knows proc1 calls proc2, so it knows not to overlap proc2's local variable/parameter area with proc1's area. Of course there are some problems, like proc1 calling proc2, then proc2 calling proc1 again. That could maybe be fixed by making two copies of proc1, each using a different area of memory.

by on (#81649)
I wanted to make a HLL specially for 6502 (Spoiler: It's easier just to write asm macros). Got as far as designing syntax and creating a parser, and maybe some operations work. These were key features:
-No types; everything is an 8 or 16-bit variable
-Any 16-bit variable can be used like a pointer
-Variables declared in loop, function and "scope" keyword have bytes reused outside scope
-ASCII string and const array support
-function params don't use stack

The code basically would have looked like Python with a "*" for memory addressing. "*()" would be Y-indirect.

by on (#81650)
Have you seen NESHLA?

by on (#81652)
Thank you all for the input!

koitsu wrote:
This is why you'll find people using assembly a lot of the time, and if not assembly then C -- but we can't really use that on the NES 6502 given the limitations of the architecture/platform. That's just the way the ball bounces. Sometimes it's best to keep it bare-bones, you know?


I know, that's why this language is keeping it bare-bones :D Once I'm done with this puppy I think you'll find the generated code to be very close to what you'd write by hand.

koitsu wrote:
Though we all know Tepples masturbates at night to the thought of python running on the NES... ;P


I kinda do too :D

thefox wrote:
Does this mean that each procedure will have its own area of memory for local variables/parameters? That does seem a little bit wasteful. Would it not be possible for the compiler to detect procedures' dependencies on each other, and allocate accordingly?


I was going to leave call graphs for a later optimization in the segmenting module. There should not be much CPU improvement from doing this, and I want to do a proof-of-concept to make sure the CPU efficiency works out the way I think it will before I put too much work into this thing.

Wave wrote:
Have you seen NESHLA?


Yes, and that is what inspired me to start thinking about this months ago. NESHLA has some great qualities, but it's still lower-level than what I'm trying for. Any software engineer can tell you that, all things being equal, a higher-level language will allow faster development time, fewer defects and quicker maintenance turn-around than a lower-level language. Well, at least this software engineer will tell you that :D

by on (#81653)
When discussing the language design, you must first decide whether you are going to implement the language on a 6502 system or as a cross-compiler.

While it is possible to implement a decent native compiler (see Action!), I believe it does not make much practical sense today, so let's talk about a cross-compiler.

1. Stack-based expression parsing, which allows function calls within expressions.

This is not inefficient, it just complicates the parser a little bit.
When you do not allow function calls, you force the programmer to rewrite the expression

Code:
r = func(a) + func(b)


to
Code:
r1 = func(a)
r2 = func(b)
r = r1 + r2


But that is exactly what the compiler would do. Also note that unless you want a programming language with RPN, you will need some kind of stack for parsing expressions anyway (to implement operator priority and parentheses).
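
A minimal sketch of that rewrite, written in Python for illustration and not taken from any existing compiler. The tuple-based expression form, the function names `flatten` and `lower_assignment`, and the `r1`, `r2` temporary naming are all assumptions made for the example.

```python
import itertools

def flatten(expr, out, counter, target=None):
    """Lower expr into simple statements appended to out; return its name.

    expr is a variable/constant string, ("call", fname, arg),
    or (op, left, right). The shape is hypothetical.
    """
    if isinstance(expr, str):
        return expr
    if expr[0] == "call":
        _, fname, arg = expr
        rhs = f"{fname}({flatten(arg, out, counter)})"
    else:
        op, left, right = expr
        left_operand = flatten(left, out, counter)
        right_operand = flatten(right, out, counter)
        rhs = f"{left_operand} {op} {right_operand}"
    # The outermost operation writes straight to the assignment target;
    # everything nested inside it gets a numbered temporary.
    dest = target if target is not None else f"r{next(counter)}"
    out.append(f"{dest} = {rhs}")
    return dest

def lower_assignment(target, expr):
    if isinstance(expr, str):
        return [f"{target} = {expr}"]
    out = []
    flatten(expr, out, itertools.count(1), target)
    return out

expr = ("+", ("call", "func", "a"), ("call", "func", "b"))
print("\n".join(lower_assignment("r", expr)))
```

The recursion itself uses a stack, but only inside the compiler; the statements it emits are flat.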

3. Array indices wider than 8 bits require pointer math.

You need this, or 2-d arrays. Otherwise you will not be able to support access to video ram on any architecture with memory mapped video ram.

Note, that you can have efficient small arrays with 0..255 indexes while still supporting large arrays.

4. True objects (classes, structures with methods, anything with a this pointer) must use pointer math.

That is not absolutely true. For example, if you restrict the number of objects of the same class to 256 (a quite reasonable restriction on the 6502), you can easily represent a 'pointer' to an object as an index into an array of objects. You need to design your whole language around this idea though.
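
A minimal sketch of the index-as-pointer idea, in Python for illustration. It assumes each field is stored in its own 256-entry byte array, so that on a 6502 a field access could become an absolute,X load; the `enemy_*` names and `enemy_move` are hypothetical.

```python
MAX_OBJECTS = 256  # a one-byte index can address at most 256 objects

# One byte array per field (structure-of-arrays layout), all hypothetical.
enemy_x = bytearray(MAX_OBJECTS)
enemy_y = bytearray(MAX_OBJECTS)
enemy_hp = bytearray(MAX_OBJECTS)

def enemy_move(this, dx, dy):
    """A 'method': `this` is a one-byte index, not a 16-bit pointer."""
    enemy_x[this] = (enemy_x[this] + dx) & 0xFF  # wraps like 8-bit addition
    enemy_y[this] = (enemy_y[this] + dy) & 0xFF

enemy_x[3] = 250
enemy_y[3] = 16
enemy_move(3, 10, 1)
print(enemy_x[3], enemy_y[3])
```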

5. No multiplication, division or modulus operators, rather they are implemented as built-in functions

This does not bring any extra efficiency. You can just implement multiplication and friends as a call to a built-in procedure even if the syntax is like A * B.

6. Reuse of RAM locations for multiple variables

You absolutely need this. Otherwise you are going to run out of variable space very quickly.


I'm very interested in how far you can go with your language (does it actually have a name?). It can give me some nice ideas for Atalan :-) Also if you need some help, let me know.
I encourage you to take a look at Atalan source code and take whatever parts you need (math routines may be useful, techniques of reusing ram locations etc.) I hope the source code is kind of readable (although it's probably hard reading anyways, as the algorithms are getting kind of complicated when all the compiler features are added).


Rudla
(author of Atalan :-)

by on (#81654)
Oh, I forgot: could you please provide more information on memory files (large arrays using pointers for indexing)?
What exactly does such a structure look like, and what are its operations?

Thanks,

Rudla

by on (#81656)
Rudla,

Thank you very much for your input! You have given me a lot to think about.

Your points reinforce the insight this language is trying to express: that low-level language design patterns should be expressed in a higher-level language. I think it is inappropriate for the design patterns of established languages (such as C) to be back-ported to this architecture. For instance, I don't know of anyone (not to say that there aren't any) who uses a stack-based expression parser in hand-written assembly.

This will be a cross-compiler. No use in trying to write a self-hosting compiler on an NES now is there? :D



Expression parsing is stack based, but only at compile-time. The expression processing will be reduced to standard 6502 mathematics.

Example:
Code:
r = a - (b + 4)


Becomes:
Code:
clc
lda b
adc #4
sta ___temp0
sec
lda a
sbc ___temp0
sta r


Rather than using the hardware stack or a software-controlled dynamic stack to store the intermediate value, a zero-page temporary variable is allocated statically at compile time, which allows this kind of hand-written-equivalent code to be generated.

However this static assignment of temporary variable space means that the entire scope of the expression must be constant, knowable at compile-time and consistent. This is why function calls within expressions are not possible.
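
A minimal sketch of how such a code generator could pick its temporaries at compile time, in Python for illustration. It is an assumption about one possible design, not NICHE's actual implementation: it only handles 8-bit `+` and `-`, expressions are hypothetical `(op, left, right)` tuples, and the `___tempN` names simply follow the example above.

```python
def is_leaf(expr):
    return isinstance(expr, str)  # a variable name or "#immediate"

def gen(expr, depth, asm):
    """Emit 6502 code leaving expr's 8-bit value in A.

    depth is the index of the first free temporary. Returns how many
    temporaries have been used in total by this subexpression.
    """
    if is_leaf(expr):
        asm.append(f"lda {expr}")
        return depth
    op, left, right = expr
    set_carry, instr = {"+": ("clc", "adc"), "-": ("sec", "sbc")}[op]
    if is_leaf(right):
        used = gen(left, depth, asm)
        operand = right
    else:
        # Evaluate the right side first and park it in a fixed temporary,
        # then evaluate the left side using only the temporaries above it.
        operand = f"___temp{depth}"
        used_right = gen(right, depth, asm)
        asm.append(f"sta {operand}")
        used_left = gen(left, depth + 1, asm)
        used = max(used_right, used_left)
    asm.append(set_carry)
    asm.append(f"{instr} {operand}")
    return used

def compile_assignment(target, expr):
    asm = []
    temps = gen(expr, 0, asm)
    asm.append(f"sta {target}")
    return asm, temps

asm, temps = compile_assignment("r", ("-", "a", ("+", "b", "#4")))
print("\n".join(asm))
print("temporaries needed:", temps)
```

Because the number of temporaries is known when compilation finishes, they can be reserved as fixed zero-page locations. No run-time stack pointer is involved.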

Here's a more detailed example:
Code:
byte someFunc(a, b) {
  byte tempVal = a + b;
  tempVal |= %00100101;
  // This expression allocates temporary values on the static expression stack
  return tempVal - a + b;
}

// This expression must also allocate temporary values
byte myVal = c - someFunc(9, 11) + 9;


OK, so maybe that didn't clear things up. Anyhow, the point is stack-based operations on the 6502 are inefficient when compared to the above generated code example, and that's the level of optimization I am going for.



As for larger arrays, I had thought about using two different addressing modes for small and large arrays, but I gave up on this idea as there were things that would get inefficient. That's why I have the notion of "memory files" like you were asking about. Here's a quick example:

Code:
word myPtr = mapDataArray + (mapYPos - 1 << 5/* x32 */) + mapXPos - 1;

// Read the tile we're standing on
//               Pointer       Y Index Value
byte tile = read(mapDataArray, 32 + 1);  // 32 + 1 is optimized down to 33 at compile time

// Read the tiles around us

tile = read(mapDataArray, 0);      // North-West
tile = read(mapDataArray, 1);      // North
tile = read(mapDataArray, 2);      // North-East
tile = read(mapDataArray, 32 + 0);   // West
tile = read(mapDataArray, 32 + 2);   // East
tile = read(mapDataArray, 64);      // South-West
tile = read(mapDataArray, 64 + 1);   // South
tile = read(mapDataArray, 64 + 2);   // South-East


The compiler sees that mapDataArray is being used in one of the file-access built-ins and requires it to be a zero-page variable. The resulting assembly looks something like this:

Code:
ldy #0
lda (mapDataArray),y
sta tile
ldy #1
lda (mapDataArray),y
sta tile


This ends up being a lot more efficient than how I typically write this in assembly, simply because I get lazy :D

Note that the call graph-based variable space reuse would be a great optimization here to conserve zero-page space.



I had never thought of doing objects that way. That's a great idea! I'll have to give that some thought.




As for multiplication operations as functions, that's something I gave a lot of thought to. In the end I decided it was best to force the user to use a function so they understood the performance impact of the operation. I've seen plenty of timing-critical C code with division inside of tight loops on platforms that do not have hardware division, then the programmer trying in vain to understand why it's taking so long. I took the Pythonic approach of "explicit is better than implicit".
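
To make the cost concrete, here is a minimal sketch of a shift-and-add multiply, the usual approach on CPUs without a hardware multiplier. It is written in Python for illustration; the name `mul8` and the returned add count are assumptions, not the language's actual built-in.

```python
def mul8(a, b):
    """Multiply two unsigned 8-bit values by shift-and-add.

    Returns (16-bit product, number of additions performed). The loop
    always runs eight times, once per bit of the multiplier.
    """
    product = 0
    adds = 0
    multiplicand = a
    for _ in range(8):
        if b & 1:
            product = (product + multiplicand) & 0xFFFF
            adds += 1
        multiplicand = (multiplicand << 1) & 0xFFFF
        b >>= 1
    return product, adds

print(mul8(13, 10))
```

Every iteration corresponds to several shift, branch and add instructions on a 6502, which is the cost the explicit function call is meant to keep visible.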



As for memory reuse, I've never used that in my demos and never had an issue with running out of RAM, even on NROM boards. Then again I use a set of sixteen zero-page variables for all of this temporary stuff, so I guess that is a limited form of variable reuse. Anyhow, that's an optimization I am saving for later.



What's in a name? But if you're curious, here's the project page. It's called NICHE, for Niche Instruction Code for Homebrew Enthusiasts. I've always wanted to name something with a recursive acronym :P

by on (#81657)
qbradq wrote:
However this static assignment of temporary variable space means that the entire scope of the expression must be constant, knowable at compile-time and consistent. This is why function calls within expressions are not possible.

Once on EFnet #nesdev, kevtris told me how he allocates local variables in his own programs. I'll try to express it in the language of graph theory so that it can be implemented in a compiler.

Assume the set of functions is a partially ordered set over the relation "f calls g", that is, their call graph is acyclic. (This captures what I think you were trying to say by excluding recursive functions.) The source in the call graph is the reset handler, and the sinks are "leaf functions", or functions that do not call any functions. Add an edge from each function called while NMI is enabled to the NMI handler, and do the same for IRQ.

Once you have a call graph, you can topologically sort it. The "height" of a function then becomes the maximum of (A) the amount of temporary space it uses while not calling any function, and (B), for each function f that it calls, the amount of temporary space that needs to persist across the call to f plus the height of f.
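
A minimal sketch of that allocation in Python (3.9 or later, for the standard `graphlib` module). It simplifies the rule above by assuming that all of a function's locals must persist across every call it makes, and it leaves out the NMI and IRQ edges. The function names and byte counts are made up for the example.

```python
from graphlib import TopologicalSorter

def allocate(calls, sizes):
    """Assign each function a base offset for its statically placed locals.

    calls maps a function to the functions it calls; sizes maps a function
    to the bytes of locals it needs. Returns (base offsets, total bytes).
    A recursive call graph makes static_order() raise graphlib.CycleError.
    """
    base = {}
    height = {}
    # Passing callees as "predecessors" makes static_order() yield
    # leaf functions first and the reset handler last.
    for func in TopologicalSorter(calls).static_order():
        base[func] = max((height[g] for g in calls.get(func, ())), default=0)
        height[func] = base[func] + sizes[func]
    return base, max(height.values(), default=0)

# Hypothetical program: names and sizes are assumptions.
calls = {
    "reset": ["update_player", "draw_sprites"],
    "update_player": ["read_pad", "mul8"],
    "draw_sprites": ["mul8"],
    "read_pad": [],
    "mul8": [],
}
sizes = {"reset": 2, "update_player": 3, "draw_sprites": 1,
         "read_pad": 1, "mul8": 4}

base, total = allocate(calls, sizes)
print(base, total)
```

Here `update_player` and `draw_sprites` share the same bytes because neither can be active while the other is, so the program needs 9 bytes instead of the 11 that one area per function would take.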

Quote:
Here's a more detailed example:
Code:
byte someFunc(a, b) {
  byte tempVal = a + b;
  tempVal |= %00100101;
  // This expression allocates temporary values on the static expression stack
  return tempVal - a + b;
}

// This expression must also allocate temporary values
byte myVal = c - someFunc(9, 11) + 9;

The height of this expression is the space needed by the expression plus the space needed by someFunc().

Quote:
As for larger arrays, I had thought about using two different addressing modes for small and large arrays

Much like "near pointers", "far pointers", and "huge pointers" in 16-bit MS-DOS programs.

Quote:
The resulting assembly looks something like this:

Code:
ldy #0
lda (mapDataArray),y
sta tile
ldy #1
lda (mapDataArray),y
sta tile


This ends up being a lot more efficient than how I typically write this in assembly, simply because I get lazy :D

Which incidentally is a claim that some compiler publishers have made at various points: their products could schedule instructions in compiled code better than all but the most skilled (and highest paid) humans.

Quote:
Note that the call graph-based variable space reuse would be a great optimization here to conserve zero-page space.

Extend this sort of reuse to all variables, and you don't need a data stack unless you're recursive.

Quote:
As for multiplication operations as functions, that's something I gave a lot of thought to. In the end I decided it was best to force the user to use a function so they understood the performance impact of the operation.

In other words, the same complaint leveled against C++ operator overloading: it's "as efficient as C yet still rawther deceptive and easy to misuse because the 'simple' syntax hides how much code is actually being generated".

Quote:
I've seen plenty of timing-critical C code with division inside of tight loops on platforms that do not have hardware division, then the programmer trying in vain to understand why it's taking so long. I took the Pythonic approach of "explicit is better than implicit".

Yet Python has operator overloading :P

Quote:
Then again I use a set of sixteen zero-page variables for all of this temporary stuff, so I guess that is a limited form of variable reuse.

And from the call graph, you can extend this set of sixteen variables to nearly the entire zero page (except for a few things that must persist in zero page, such as open "memory files" used by music engines and the like).

by on (#81658)
Quote:
However this static assignment of temporary variable space means that the entire scope of the expression must be constant, knowable at compile-time and consistent. This is why function calls within expressions are not possible.


That's not really true, you know. Atalan does it. Local variables, function input and output arguments are just normal global (static) variables, you just do not allow programmer to reference them outside the specified scope (function).

Quote:
In the end I decided it was best to force the user to use a function so they understood the performance impact of the operation.


I knew this would be the reason :-) Well, it's a design decision based on taste, so it's hard to argue. However, I do not believe that this is the way to go.
It makes the language harder to use even for competent programmers for the sake of preventing newbies from writing inefficient code. Newbies will not write efficient code even if you force them to use operator '$#&#%' to do the multiplication.



If I understand the idea of memory files, it is basically just a different syntax for C pointer arithmetic?
read(mapDataArray, 32 + 0) is equivalent to C mapDataArray[32+0] ?

In such a case, common subexpression elimination should provide similar functionality.

Atalan provides 2D arrays for this. It generates an index table of array lines to eliminate multiplication and provide fast access to elements. Access to one element is then:

Code:
ldy mapYpos
lda mapIndexLo, y
sta _arr
lda mapIndexHi, y
sta _arr+1
ldy mapXPos
lda (_arr),y


In case of several reads in a row, the line address computation gets removed by common subexpression elimination.
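
A minimal sketch of what such a line index amounts to, in Python for illustration. The table names mirror `mapIndexLo` and `mapIndexHi` from the listing above; the base address, map dimensions and helper names are assumptions, and this is not Atalan's actual generator.

```python
def row_tables(base, width, rows):
    """Precompute the start address of every array line, split into bytes."""
    addresses = [base + row * width for row in range(rows)]
    map_index_lo = [addr & 0xFF for addr in addresses]
    map_index_hi = [addr >> 8 for addr in addresses]
    return map_index_lo, map_index_hi

def read_cell(memory, map_index_lo, map_index_hi, x, y):
    # ldy mapYpos / lda mapIndexLo,y / sta _arr / lda mapIndexHi,y / sta _arr+1
    pointer = map_index_lo[y] | (map_index_hi[y] << 8)
    # ldy mapXPos / lda (_arr),y
    return memory[pointer + x]

# Hypothetical 32x30 map stored at $6000.
BASE, WIDTH, ROWS = 0x6000, 32, 30
memory = bytearray(0x10000)
memory[BASE + 5 * WIDTH + 7] = 0x42

lo, hi = row_tables(BASE, WIDTH, ROWS)
print(hex(read_cell(memory, lo, hi, 7, 5)))
```

The multiplication by the line width happens once, when the tables are built, so each access costs only two table lookups and one indirect load.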

by on (#81659)
Most of the suggestions in the previous two replies are over my head at this point. The features I have described are already stretching my abilities, so I think I'll stick to what I've got for the initial version and start studying these other concepts as I gain a more basic understanding of compiler design.

Just to be clear, I am experienced with compiler implementation, but not the design of a good compiler, if that makes any sense.

by on (#81660)
Then let me help you through it a bit more gently:
  1. Do you understand what a call graph is?
  2. Do you understand what a leaf function is?[1]
  3. Do you understand how to perform a topological sort on a graph?
  4. Can you determine how much memory each leaf function uses for its local variables and temporary values?


[1] Other than to photosynthesize.

by on (#81661)
Local variables are quite easy. I will try to demonstrate the concept:

Code:
p:x, y, z -> q =
   a = x + y
   q = a * z

s = p 13,10,4


can be translated to:

Code:
p: p_x, p_y, p_z -> p_q =
  p_a = p_x + p_y
  p_q = p_a * p_z

p_x = 13
p_y = 10
p_z = 4
call p
s = p_q


Now all the variables are global, only some have the 'p_' prefix.

by on (#81666)
tepples wrote:
Then let me help you through it a bit more gently:
  1. Do you understand what a call graph is?
  2. Do you understand what a leaf function is?[1]
  3. Do you understand how to perform a topological sort on a graph?
  4. Can you determine how much memory each leaf function uses for its local variables and temporary values?

[1] Other than to photosynthesize.


Thanks Tepples! You are always quite helpful!

1: Yes
2: Yes
3: Never done it before, but I think I have a handle on it now that I read that article.
4: Yes

This all makes sense for how I can reuse RAM locations for function parameters and locals. I also see how this can be extended to allow function calls within expressions without using a stack; however, that would be a rather extreme application.

Rudla,

That's basically what I was going to do. I will assign a global RAM location to each parameter and local variable. The symbols are resolved within a local scope.

by on (#81667)
Well, if you implement local variables like this, then nothing prevents you from implementing function calls in expressions. You do not need the stack then.

It may be useful to think about the goals of your language. I understand very well that you want to try to implement your own language, not just use the almost existing one. :-) Learning by doing was one of my motivations for starting the development of Atalan. However it may be useful for future discussions to clearly differentiate two reasons for restricting the NICHE:

1. You do not want to implement some feature because you feel it would be too much work in the current state of development (there is always the possibility of implementing the feature later).

2. You believe the code generated by the feature would be so inefficient that it is not worth doing.


For example in case of nested function calls, I thought the reason was 2., while it is probably 1., right?
Basically, case 2 is always a compromise on complexity. For me this is recursive functions. I have an idea on how to implement them without sacrificing the efficiency of non-recursive functions. However, I believe Atalan will be mostly used for programming games, maybe some simple utilities. This type of application has very little use for recursion. Therefore it is not very high on the todo list.

by on (#81671)
I chose not to implement function calls within expressions because I feared the generated code would be inefficient. The only solution to the problem I could come up with was to use stack-based expression solving, which I know from experience on the 6502 is less efficient than expression solving with statically allocated temporaries.

You and Tepples have given me some insights into compiler design that I lacked previously. I now see how, when taken to the extreme, a topologically sorted call graph could be used to allow function calls within expressions and retain the static nature of expression solving. My only problem now is that it seems like this would require a very large area of RAM to be dedicated to the expression space.

My intuition is telling me it would be better to leave out functions as terms and be able to move more variables to the zero-page.

I won't let complexity of implementation stand in my way. After all you never stop learning :D

Also, about not using Atalan: I did give it a try prior to deciding to make my own compiler. It's just not what I'm looking for. Basically I want a very lean C compiler, which is what I am making. If you look at some of the other projects I've done you'll notice a pattern of re-implementing things "my way" :P

by on (#81722)
Just some thoughts on a high-level language designed to work fast on a certain CPU. Could it actually be useful? Is it useful enough to justify the amount of work needed?

For non-speed-critical parts you don't really need such a speed-optimized language. Fast code is usually large, there are speed/size tradeoffs, and you would probably prefer to have compact code for these parts.

For speed-critical parts, such as an NMI handler, it depends on the language whether it will be fast enough to use there, or whether you will have to write it in assembly anyway. For example, C on 6502 is fast enough for a simple handler, but not fast enough for a complex one. If the new language is just twice as fast, it will not make much difference: you will be able to use it for a slightly more complex handler, but still not for a complex one.

Also, in using C on 6502 so far, speed has not been an issue (with the speed-critical parts in assembly), but the compiled code is pretty large, much larger than assembly written by hand.

by on (#81728)
Shiru,

Thank you for the input! I appreciate your point of view.

What I am trying to do is create a language that produces code as efficient and as small as hand-written assembly. That may not necessarily be possible, but I'm giving it a shot :D The limitations of the language are the key to producing good machine code.

As for the effort versus benefit, that is a pretty moot argument for me. I am doing this because I enjoy it, and because if it works out I can make another game or demo with it. It's a hobby after all, the whole point is to kill time and enjoy yourself :D

by on (#81735)
And even if it isn't quite as fast as ASM, as long as it's more efficient than whatever the Micronics guys used, a language that reduces the entry barrier to NESdev will at least be a positive step in expanding the developer base and thus legitimizing NES homebrew.

by on (#81762)
There is a problem with new languages, they only increase the entry barrier, because they are unique. You can find a lot of books, docs, or people to ask, about 6502 assembly or C. With a new language you will have just one doc written by the author, and a forum thread. Also, it will have bugs, which is a real problem for beginners, because they don't have experience to understand if they do something wrong, or it is a bug.

I can recall only one somewhat successful unique language optimized for micros: CCZ80. A few actual games were written with it, but it has not gained any major popularity in the 3 years it has existed. It has another problem, by the way: the compiler is closed-source.

by on (#81767)
Shiru wrote:
There is a problem with new languages, they only increase the entry barrier, because they are unique.


That's why it's based on C with a few limitations.

Shiru wrote:
Also, it will have bugs, which is a real problem for beginners, because they don't have experience to understand if they do something wrong, or it is a bug.


I can't argue with that. Hopefully my test suite will be comprehensive enough to eliminate bugs most folks are likely to encounter. The compiler will also be open source.

by on (#81773)
One thing from BASIC that should carry over is a relaxed syntax. If it's supposed to be easier for the layman, C-style { } and ; and other typo-inducing requirements should go out the door.

I'd take a look at BatariBASIC for some ideas:
http://bataribasic.com/

Oh, yeah. Someone tried to port ZX Basic to the SMS:
http://www.smspower.org/forums/viewtopic.php?t=12902

It's a Z80 but maybe some of the ideas implemented will help.

I've got a PowerPack ready to rumble when you get to beta :) I'd love to make a real NES game!

by on (#81775)
Then I believe Atalan may be just the right thing for you :-)

Errors are the single most significant problem in a new programming language. The number of combinations of different features is simply unlimited!

However, we are not really talking about newbies. How many of us who still develop for 8-bit platforms are still newbies? :-)

I must say, I was very surprised how immature Atalan still was when the first game in it was developed. Had the guy who developed it bothered to ask me, I would have told him to wait :-)

by on (#81785)
tepples wrote:
...a language that reduces the entry barrier to NESdev will at least be a positive step in expanding the developer base and thus legitimizing NES homebrew.


Sorry I missed this before. That's exactly why I try to do stuff like this. I love the NES, and nothing would please me more than to see our homebrew scene expand. That's why it's the Niche Instruction Code for Homebrew Enthusiasts after all :D

I gave serious thought to using a BASIC syntax. I have written BASIC dialect compilers before, and personally I prefer the syntax to C. Much less wrist strain :D However I think C is more known to the type of folks that would try to write NES homebrew. Perhaps I'm off on that.

Anyway, after I get the C-dialect version complete it would be fairly easy to implement a BASIC-dialect version that could be used interchangeably within the same source stream.

For instance, if I had a file called main.bas in BASIC syntax, and that file included a file lib/ppu.c in C syntax, the parser could switch gears for that file, then switch back. This is kinda how .NET languages work, but instead of an intermediate language, in-memory data structures are used to keep track of scoping, types, variables, functions, etc.

Anyway, I'll keep that feature in mind. If this thing turns out to be useful I may have to implement it.

Also, I really like how the ZX Basic port was implemented: taking the assembly output of a compiler for a different platform and transforming it into equivalent assembly statements for the target platform. That's brilliant! :D

I can think of several languages you might be able to do that with; unfortunately they'd all be fairly inefficient on the poor ol' 6502.

Also, I've added multiplication, division and modulus operators when both operands are constant. It helps a lot with array sizing.
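
As a rough C analogue of the constant-operand operators mentioned above, compile-time arithmetic lets array sizes be derived from hardware constants instead of being hard-coded. This is only a sketch: the names are illustrative assumptions and are not taken from the language being designed.

```c
#include <assert.h>
#include <stdint.h>

/* NES sprite RAM (OAM) is 256 bytes, 4 bytes per sprite. */
#define OAM_BYTES        256
#define BYTES_PER_SPRITE 4

/* Both operands of each operator are constants, so the results are
 * folded at compile time; no multiply or divide routine runs on the CPU. */
enum {
    MAX_SPRITES    = OAM_BYTES / BYTES_PER_SPRITE,  /* 64 */
    VELOCITY_BYTES = MAX_SPRITES * 2                /* 128: two bytes per sprite */
};

/* Hypothetical game-side tables, both within a 256-element limit. */
static uint8_t sprite_x[MAX_SPRITES];
static uint8_t sprite_velocity[VELOCITY_BYTES];

static void set_sprite_x(unsigned index, uint8_t x)
{
    assert(index < MAX_SPRITES);
    sprite_x[index] = x;
}
```

Changing `BYTES_PER_SPRITE` or the per-sprite velocity size resizes both tables without touching any other line.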

by on (#81804)
tepples wrote:
...a language that reduces the entry barrier to NESdev will at least be a positive step in expanding the developer base and thus legitimizing NES homebrew.


Ironically, though, what language can truly replace assembly all around? Thus the entry barrier becomes:

I either:

a. buck up and learn 6502 assembly, or
b. learn some weird new language that is sorta a cross between Python, Lisp, Ruby, FORTRAN, Pascal, and some C but not the best parts of C that I'll have to somehow keep straight in my head with all the rest, *and* learn the pieces of 6502 assembly I'll eventually realize I still need.

So I've come to think that the CC65 toolchain is probably where the most accessible sweet spot is. There's libraries (KNES, others?) available for CC65 to make C work on a NES if I want to try to avoid assembly. Who *doesn't* learn C these days?

I gave up trying to write my own assembler not because I realized it was too hard, but because I realized I was only adding to the mass hysteria. Just off the top of my head: ASM6, CA65, QASM, NESASM3, P65, ... there's probably more. I can't keep track of how many threads start:

"So I picked up a tutorial and NESASM and I have a problem..."
"WTF are you using NESASM for? Use..."

So I guess where I'm going with this is that's why I started working on NESICIDE in the first place...to reduce my entry barrier into NESdev. I know it sounds ridiculous, but that's my way of learning...but in doing so I hoped to build something that would, in turn, reduce the entry barrier for others.

One of the goals of NESICIDE has always been to be rich with tutorials. It's not there yet, but it's already a fairly decent integrated environment in my opinion. Tepples, your russian_roulette makes a pretty darn good "I just want to throw some text up on the screen and move a sprite around" sort of intro package. I am hoping to use it as a basis to generate many many other tutorials.

I know I know...we all have our favorite development environments and none of them honestly actually include anything with the word "Integrated". Perhaps what I'm doing is ultimately a waste of time...but I don't think more languages is the answer!

by on (#81805)
cpow wrote:
Tepples, your russian_roulette makes a pretty darn good "I just want to throw some text up on the screen and move a sprite around" sort of intro package.

Thank you. It's a bit lacking in the "move a sprite around" department, but I'm about to fix that in a few minutes.

by on (#81824)
The problem I have with Assembly is that it drastically increases the development time, which raises the barrier to entry. The problem I have with C (on this platform anyway) is that it is very easy to paint yourself into a corner performance-wise. I am trying to create a happy middle ground.

I had not thought of the proliferation of languages as a barrier to entry. That's a very good point. Something that's "not quite C" could be more confusing than a language with its own syntax. Good food for thought, especially considering...

I am going to have to re-write the parser. This is the first time I have tried to implement a language with user-defined types and arrays of user-defined types, and I had some learnin's to do :P

by on (#81825)
Truthfully, the more I do assembly, the less of a problem I have with it. I keep saving the subroutines I need and reuse them from other programs, so I can start a good game engine for anything within an hour or two just by copying code I've used before and adding the functionality I need from there. You just have to work with it, learn it, and create the libraries/subroutines you need. That takes a while, but after some time it becomes a walk in the park.

by on (#81830)
It is true that over time you get better at assembly. I certainly have. However the advantage of a high-level language is not that you don't have to learn assembly, it's that you can express things much more clearly with less code.

For example, the last project I worked on I needed to calculate a pointer into a map data table. In C it would look like this:

Code:
byte *curTile = &mapData + (yPosition << 5) + (xPosition & 0x1f);


In assembly that took me about 50 lines of code and two attempts before I got it right.

High level languages do not reduce the complexity of the logic, only the complexity of the syntax. Equivalent C code is much easier and faster to write, read and maintain.
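
For readers who want to try the expression above, here is a minimal self-contained sketch. It assumes a 32-tile-wide map stored row by row (which is what the shift by 5 implies) and a 30-row height; `map_data`, `tile_offset` and `tile_ptr` are hypothetical names, not the original project's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_WIDTH  32u   /* assumed: 32 tiles per row, hence the shift by 5 */
#define MAP_HEIGHT 30u   /* assumed: one nametable's worth of rows */

static uint8_t map_data[MAP_WIDTH * MAP_HEIGHT];

/* Byte offset of tile (x, y): y * 32 + (x mod 32). */
static size_t tile_offset(uint8_t x, uint8_t y)
{
    assert(y < MAP_HEIGHT);
    return ((size_t)y << 5) + (x & 0x1fu);
}

/* Pointer to the tile, i.e. map base address plus the offset. */
static uint8_t *tile_ptr(uint8_t x, uint8_t y)
{
    return map_data + tile_offset(x, y);
}
```

Part of the reason the assembly version is long is that `y << 5` no longer fits in 8 bits (29 * 32 = 928), so the 6502 has to do the shift and the add as 16-bit operations across two bytes.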

by on (#81831)
Equivalent code in C or another HLL is a few times shorter than hand-written assembly, and a few times easier to read and maintain. Less source code means less work. If you have enough CPU resources to afford slower code, it is a really good exchange. No one will thank you if you waste a few weeks or a month writing functionally equal code in assembly and end up with 50% of the CPU power unused. So no doubt HLLs are useful.

by on (#81832)
Is there any C generated code that's faster than a hand written program? I've never seen an example, and I doubt it exists. The compiler can't see into the future and tell what you're going to add or plan to add, and I doubt it can make the optimizations I can make just by thinking about it.

by on (#81833)
qbradq wrote:
The problem I have with Assembly is that it drastically increases the development time, which raises the barrier to entry.


I disagree. Using assembly is probably the lowest barrier. Just look at the expertise available on this board regarding assembly tips, tricks, and analyses of assembly-gone-wrong.

Using C is probably the next-highest barrier. If someone posts a question about C in this forum, the assembly gods herein will either know C and be able to help directly, or will try to help by compiling the code and reading the assembly. In my work the most common question I ask when someone has C code that isn't doing what they expect is "did you look at the assembly the compiler generated for you?" If not, or if they cringe at me as if I just asked them to eat their veggies, I suggest they send me their source and I'll take a look at the assembled output if I can't figure out what's wrong immediately at the C level.

Using something other than C/assembly is the highest barrier...

If I write something in <choose-your-favorite-high-level-language-that-isn't-C> and am having troubles, the "barrier to help" is higher unless the language I've chosen happens to also have behind it a community of experts, or is widely available enough that anyone can quickly figure out how to compile it to assembly to be helpful at that level. But then the response to your question is, if you're lucky, of the form:

"Well, I don't know <language> at all, but I can see why you're having problems because the assembly generated by its compiler contains a <anomaly> which doesn't make any sense. I can't help you figure out why because I don't know <language>."

With something like C you don't have to rely on the original author of the language being around to be able to answer syntax/semantic questions regarding its use. You don't [usually] have to rely on the original author of the compiler being around to be able to answer questions regarding its seemingly errant translation of a chunk of code. This is perhaps not the case with CC65 because it is still in development, its author is still around. But with GCC I often am heard laughing uproariously whenever I hear someone saying "it must have been the compiler...I didn't make a mistake in my code."

by on (#81835)
3gengames wrote:
Is there any C generated code that's faster than a hand written program?

As I touched on before, that depends on the skill of the hand writer of the program, and modern compilers for 32-bit and bigger architectures routinely beat all but the most skilled (and highest-paid) assembly experts. The consensus from what I have read on the Internet is that on 32-bit and bigger architectures, there is more return on investment in hiring somebody to optimize the algorithms or in buying a fancy compiler than in hiring somebody to take the existing algorithm and try to write an improvement on what GCC or Clang produces with options like GCC's -S -fverbose-asm. See: Superoptimization on Wikipedia; top answers to this question on Stack Overflow.

And of course, a program written in assembly language for one architecture and then emulated on another architecture is going to run slowly. With assembly, you limit your audience to one architecture unless you know that the vast majority of your audience will have a machine dramatically faster than the machine you target, which is the case when targeting NES.

by on (#81836)
3gengames wrote:
Is there any C generated code that's faster than a hand written program?

Not on 6502 for sure, but you can easily find cases where hand-written assembly code is no faster than (much larger and less efficient) compiler-generated code on some architectures with slow RAM and algorithms that use RAM heavily, like data processing. For example, ARM-based Pocket PCs and mobile phones. That's because memory speed is the limiting factor.

by on (#81837)
Shiru: In those cases, it might be true that the programmer forgot to tell the compiler's scheduler how slow the memory is, and the compiler just assumed zero wait state.

Another interesting article on whether hand-coded assembly beats compilers: Great Debate I

by on (#81840)
Certainly hand-coded programs can be far more efficient (if you are good at it). I have made some suggestions for LLVM which could improve the optimizations (I have many suggestions), as far as I know none of them are implemented yet. There is also the story of Mel, who did strange things that no compiler does, such as use no timing loops but instead set up everything so that the timing of the computer is exactly how the program wants it, and other things. The real advantage to compilers, though, is portability (although this does not apply to the NES, obviously).

by on (#81841)
Assembly is the lowest barrier to entry? I disagree. NES development took me way longer to become proficient with in assembly than did DS and PSP development in C, and XNA development in C#. All of these platforms are as complex as the NES, yet I was already familiar with the languages, which reduced my learning curve.

Granted a new language will not benefit from prior knowledge, however it should be easier to learn for someone already familiar with a similar language (such as C).

Also, the comment about most people learning C these days is way off. 90% of the programmers I have at work have no C experience, and usually end up with a segfault in the program they have been asked to modify.

by on (#82008)
After some thought I realized that I have no experience with the CC65 compiler. The C compiler I use at work is an old HP compiler from 1995 (don't ask) that has only the most rudimentary optimizations. I have spent some quality time with CC65, and I am very impressed at the efficiency of the code it produces.

Many of the optimizations I had in mind for this language are already present in CC65 (such as static location evaluation). The only real difference is that my language would try to prevent you from shooting yourself in the foot (with regard to performance), whereas CC65 ensures that what you express in C will (eventually) happen on the CPU :D

For now I am putting the brakes on this language idea and seeing how far I can go with CC65.

Thanks for all of the input!

by on (#82010)
qbradq wrote:
Many of the optimizations I had in mind for this language are already present in CC65 (such as static location evaluation).

What do you mean by "static location evaluation"?

by on (#82011)
Not always using the stack for expression evaluation.

by on (#82012)
qbradq wrote:
Not always using the stack for expression evaluation.

Ah OK. I kinda wish it knew how to 1) use global/static memory for local variables and function parameters whenever possible (instead of stack) 2) to re-use the said memory areas for different functions. Currently it's possible to specify local variables as static with a compiler switch, but even then it'll not reuse the same memory areas.

by on (#82013)
I believe it uses the stack under the assumption of the possibility of (non-tail) recursive calls. If there are recursive calls, automatic variables have to be put on the stack. The compiler has to decide whether to use the stack or static allocation for automatic variables at compile time, and it doesn't know whether or not there will be recursive calls until link time (unless all functions that a given function calls are static).
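
A small C sketch of why recursion forces the stack here. Both functions are hypothetical; the second one mimics what a compiler would produce if it placed a function's locals at fixed addresses, which is only safe when the function is never re-entered.

```c
#include <assert.h>

/* Automatic locals: every active call has its own copy of `saved`. */
static unsigned sum_to_auto(unsigned n)
{
    unsigned saved = n;
    unsigned rest;
    assert(n < 256);                 /* keep the demo's call depth small */
    if (n == 0)
        return 0;
    rest = sum_to_auto(n - 1);
    return rest + saved;
}

/* Static locals: one shared copy at a fixed address. The inner calls
 * overwrite `saved` before the outer calls read it back. */
static unsigned sum_to_static(unsigned n)
{
    static unsigned saved;
    static unsigned rest;
    assert(n < 256);
    saved = n;
    if (n == 0)
        return 0;
    rest = sum_to_static(n - 1);
    return rest + saved;             /* `saved` is 0 by now, not n */
}
```

With these definitions `sum_to_auto(3)` yields 6, while `sum_to_static(3)` yields 0 because the innermost call leaves `saved` at 0. For a function that is never called recursively, the two versions behave identically, which is the case the posts above would like the compiler to detect.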

by on (#82015)
tepples wrote:
I believe it uses the stack under the assumption of the possibility of (non-tail) recursive calls. If there are recursive calls, automatic variables have to be put on the stack. The compiler has to decide whether to use the stack or static allocation for automatic variables at compile time, and it doesn't know whether or not there will be recursive calls until link time (unless all functions that a given function calls are static).

Yeah I get that, it's completely reasonable from a compiler design point of view (given it's the "C way"). It's just unfortunate that the vast majority of the time recursion isn't used. Of course this problem is solvable (link time code generation) but it probably won't happen in CC65 any time soon.

I'll have to give Atalan a serious try some day.

by on (#82019)
I just wish Atalan's syntax was not so esoteric.

by on (#82021)
thefox wrote:
tepples wrote:
The compiler [...] doesn't know whether or not there will be recursive calls until link time (unless all functions that a given function calls are static).

Yeah I get that, it's completely reasonable from a compiler design point of view (given it's the "C way").

I guess a compiler-specific extension to the C language might be useful for marking a function as not callable in a recursive manner other than tail-recursive. Compare to things such as __attribute__ ((pure)) that GCC implements.

by on (#82233)
qbradq wrote:
I just wish Atalan's syntax was not so esoteric.


Someone is developing a Scratch module for Atalan:
http://work.playpower.org/w/page/350757 ... n-Compiler

Hopefully this is a start of a robust, newbie friendly development platform for the NES. Alas, no downloads yet so it may end up vaporware. Would LOVE to be a tester for it!

by on (#82234)
slobu wrote:
Someone is developing a Scratch module for Atalan:
http://work.playpower.org/w/page/350757 ... n-Compiler

Hopefully this is a start of a robust, newbie friendly development platform for the NES. Alas, no downloads yet so it may end up vaporware. Would LOVE to be a tester for it!


My experiences with PlayPower end up being vaporware. Several times over the past several years they've expressed interest in helping with specific pieces of NESICIDE so that their developers can use it. [When I say specific I mean "help you add support for the Famicom keyboard" or "support for mapper nnn"] Then there was the GSOC thing that fell through -- not their fault but, whatever.

Your mileage may vary...

by on (#84709)
Hi, I am Marcel from Germany.
Rudla and I are working hard on ATALAN & Scratchalan..

Be sure :) it will not be vaporware.

My part of the project is my bachelor's thesis..
if this is vaporware.. then I am DOOOMED :D

We would be very happy to get people to help us..
We have to stay realistic, it's a big program..
and at the start only simple games will be possible with it.

Currently I am searching for good background & sprite editors... if anybody can give me some tips... it would be cool.
Tnx ;)
Greetings, Rudla & Marcel

by on (#84711)
mrm78 wrote:
Currently I am searching for good background & sprite editors... if anybody can give me some tips... it would be cool.
Tnx ;)

For still backgrounds, you could try the Python-based image converter and nametable editor that I've included with all my latest projects (Concentration Room, Thwaite, Zap Ruder).

by on (#85055)
Hey Marcel!

Have you looked at these places for sprite tools?
http://bobrost.com/nes/resources.php#devtools
http://www.zophar.net/utilities/nesgraph.html
http://www.romhacking.net/?category=10& ... tle=&desc=

It may be better to make your own editor and then use a tile converter to go from .BMP to .CHR

Can't wait to test out Scratchalan :)

mrm78 wrote:
Hi, I am Marcel from Germany.
Rudla and I are working hard on ATALAN & Scratchalan..

Be sure :) it will not be vaporware.

My part of the project is my bachelor's thesis..
if this is vaporware.. then I am DOOOMED :D

We would be very happy to get people to help us..
We have to stay realistic, it's a big program..
and at the start only simple games will be possible with it.

Currently I am searching for good background & sprite editors... if anybody can give me some tips... it would be cool.
Tnx ;)
Greetings, Rudla & Marcel

by on (#88348)
Hi, there is a new beta version of the visual programming system "Scratchalan".

still lots of work...
it uses Scratch from MIT + the Atalan compiler + a mod of the NESST tool

http://playpower.pbworks.com/w/file/fet ... /beta2.rar

Run Scratch.exe then open project "firebreather" in the example folder
then press green flag...

the game is a rip of an Action 52 game... sry, I can't pixel :(
its just a test..

would be nice, if i could get some feedback.
tnx

by on (#88365)
It wasn't easy to find the flag. Something compiles and runs, but honestly, to me it looks like a Space Shuttle control desk. I don't have the slightest idea how to use it or why it is easier than conventional programming.

by on (#88373)
Hmm... sure, I need to make a video tutorial for it.
It's easier for making simple games for people that don't know the NES system. It's a starting point for bloody beginners.
Not useful for coders that know how to code for this system.

by on (#88391)
It looks impressive, so far.

I do agree with Shiru: from an experienced programmer's perspective, this coding environment looks... well... inefficient. On the other hand, most people aren't experienced programmers, and Scratch was designed not for productivity, but ease of use; even a child can code on it (it was designed for that).

In other words, while it won't make much of a difference for elite coders, a NES-targeted version of Scratch has the potential to introduce a whole slew of people to NES homebrew development.

Again, it's a very interesting project, and a cool achievement. Keep up the good work.

by on (#88392)
mrm78 wrote:
It's a starting point for bloody beginners.


When I see the UI I don't think "this is a great tool for a beginner", I think "this reminds me of the Lego Mindstorms my seven year old plays with". If he were to express an interest in creating a NES game, I sure as heck wouldn't tell him "Well, first you'll have to learn this weird language called Atalan, then you'll have to learn this weird pseudo-programming environment called Scratch which hides so much from you you'll never be able to create anything more polished than something you'd find on Action52. Good luck with that!"

Also, it seems something's missing from your package? NESST runs fine by itself for me, but not when I try to "Paint" a background from your tool:

Image

mrm78 wrote:
It's easier for making simple games for people that don't know the NES system.


No. A good IDE with a set of easy-to-follow-and-modify tutorial projects could be that. Even I'm not there yet...probably never will be.

mrm78 wrote:
Not useful for coders that know how to code for this system.


I don't see how it'd be useful for a coder that doesn't know how to code for this system? What am I missing? How would I know what "when I receive Update" means? NMI? IRQ?

by on (#88396)
@haroldo-ok
tnx... maybe it was the wrong place to post this..


@cpow
tnx to you too for the feedback & good luck with your IDE.

The tool auto-generates an Atalan file.
So why must the user learn Atalan then?
We just wanted to help playpower.org with coding some simple learning games for it..
and I think it's possible to make games that are better than Action 52.

I know that this forum seems to be the wrong place to talk about it.

I had much fun coding for the NES :D
and doing a little bit of 6502 stuff.

For example, the sprite flickering seems better than in the original game.
The game needs 6k cycles to run..

see u

by on (#88402)
mrm78 wrote:
@cpow
The tool auto-generates an Atalan file.
So why must the user learn Atalan then?

I've never had good experience with autogenerated code. The firebreather example has a lot of warnings when compiling...are those from the autogenerated code? If so, the autogenerator is degenerated to the point of uselessness because I can't tell whether a warning is my fault or not.

To Shiru's point in another post, if there's no debugger available that can step through code at the Atalan level [or at the Scratch level?] then it's not much help. How easy is it to get from a 6502 machine address to a line of Atalan code or to a block in Scratch? If your IDE supports that it isn't obvious.

by on (#88413)
hi, at the moment there are some problems with the Atalan compiler.
but Rudla is on it. After this the warnings should go away.

I know, autogenerated code bears problems.
Scratch doesn't allow stepwise debugging, but it will be possible to add variable watching to FCEUX with some Lua scripting.

by on (#88418)
If the code is always autogenerated and the user never works with it, what's the point of using Atalan then? Why not generate 6502 code directly?

by on (#88454)
Well, I can see one advantage of generating Atalan instead of plain 6502 code: The Atalan developers seem to be porting the language to the Z80 processor (There's even a small ZX-Spectrum demo on their SVN repository); they even changed their description to "Programming language compiler for 8-bit processors"; just imagine where it could go from there.

by on (#88455)
To very limited and not really useful portability? Game logic could be portable, but other differences would force you to redo everything else for every platform. For example, NES HW 8x8 three-color sprites and typically 60 FPS update vs. ZX software 1.5-color sprites (monochrome with masks) of arbitrary sizes and update rates more like 16 FPS, which also require more memory than NES sprites due to the need for pre-shifting and pre-flipping.

by on (#88459)
haroldo-ok wrote:
Well, I can see one advantage of generating Atalan instead of plain 6502 code: The Atalan developers seem to be porting the language to the Z80 processor

In other words, for the same reason that early C++ compilers emitted C.

Shiru wrote:
Game logic could be portable, but other differences would force you to redo everything else for every platform.

In other words, something like a model-view-controller paradigm. This works only when all platforms can be programmed in the same language. I guess that's one of the reasons Microsoft chose C# for XNA: games can't be easily ported from any non-Microsoft platform to Xbox Live Indie Games, encouraging developers to make their games exclusive to Microsoft platforms. C and C++ need P/Invoke, and Python and friends need Emit, neither of which is present in XNA on Xbox 360 or Windows Phone 7.

by on (#88474)
Just as Tepples noted advantages using Atalan over bare metal ASM I also see Scratch having similar merits.

If the GUI based development environment can be separated from the underlying language it wouldn't matter what platform used what compiler. I could think of:

Scratch -> BasiEgaXorz -> Genesis binary
Scratch -> BatariBASIC -> 2600 binary
Scratch -> DragonBASIC -> GBA binary and so on..

by on (#88481)
slobu wrote:
Scratch -> BasiEgaXorz -> Genesis binary
Scratch -> BatariBASIC -> 2600 binary
Scratch -> DragonBASIC -> GBA binary and so on..


CC65 seems to be already there...so all you have to do is learn C. I *guarantee* you that is more rewarding and resume-enhancing.

CC65 comes with libraries to build binaries that will run on:

NES
C=16
C=64
C=128
PET
Vic-20
Apple][
Atari
...

by on (#88565)
I guess the one other factor in choosing a high/low language is support and/or community. All but one of the BASIC compilers has a pretty dead community. Atalan has a *very* nice developer but who knows when an active development crowd will form around it.

C is looking better every day.

by on (#88569)
cpow wrote:
CC65 seems to be already there...so all you have to do is learn C. I *guarantee* you that is more rewarding and resume-enhancing.


Problem is that NES C wants you to do weird things to help optimize your code, like using global/static variables instead of local variables.

by on (#88571)
Dwedit wrote:
Problem is that NES C wants you to do weird things to help optimize your code, like using global/static variables instead of local variables.

This is largely an artifact of the 6502's 8-bit stack pointer and its lack of the 65816's d,s and (d,s),y address mode, which makes stack-based local variables practical on the 65816 but not on the 6502. (C has problems throughout the 6502 family, due in part to no hardware multiply for indexing into big arrays.) It's also an artifact of the compiler's assumption that all subroutines are potentially recursive. Otherwise, the compiler could allocate local variables on zero page based on call depth, as kevtris explained to me: leaf functions' local variables start at $00, and an outer function's local variables start after those of the functions it calls.
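
The call-depth scheme described above can be sketched as a small host-side routine. The `struct func` layout, the example call graph and all names are assumptions for illustration; this is not how cc65 or any existing compiler does it, and it only works when the call graph contains no recursion.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CALLEES 8

struct func {
    uint8_t local_bytes;           /* zero-page bytes this function needs */
    int     callees[MAX_CALLEES];  /* indices of called functions, -1 ends the list */
    int     base;                  /* assigned start address, -1 = not assigned yet */
};

/* Assigns base addresses to f and everything it calls.
 * Returns the first address past f's locals.
 * Assumes the call graph is acyclic (no recursion). */
static int assign_base(struct func *funcs, int f)
{
    struct func *fn = &funcs[f];
    if (fn->base < 0) {
        int base = 0;              /* leaf functions start at $00 */
        for (int i = 0; i < MAX_CALLEES && fn->callees[i] >= 0; i++) {
            int end = assign_base(funcs, fn->callees[i]);
            if (end > base)
                base = end;        /* start after the deepest callee */
        }
        assert(base + fn->local_bytes <= 256);  /* must fit in zero page */
        fn->base = base;
    }
    return fn->base + fn->local_bytes;
}

/* Hypothetical program: update() calls read_pad() and move_sprite(). */
static struct func example[] = {
    /* 0: read_pad    */ { 2, { -1 },       -1 },
    /* 1: move_sprite */ { 3, { -1 },       -1 },
    /* 2: update      */ { 4, { 0, 1, -1 }, -1 },
};
```

Calling `assign_base(example, 2)` places both leaf functions at offset 0, since they are never active at the same time, and places `update` at offset 3, for 7 bytes of zero page in total instead of 9.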

by on (#88575)
I've heard global vars are bad.. but, in the context of constrained devices, isn't speed a better tradeoff than tidiness? In my limited experience the only drawback is my own potential for errors (stepping on an already used var).

I think BatariBASIC has a great way to deal with multi-bank sub-routines. You can call "GOSUB FOO" for local routines and "GOSUB FOO BANK3" to tell the compiler it's located on another page.

by on (#88580)
slobu wrote:
I've heard global vars are bad.. but, in the context of constrained devices, isn't speed a better tradeoff than tidiness?

It depends... if you have a lot of global variables you'll run out of RAM faster in systems that don't have much of it, like the NES, and that's a huge problem.

Quote:
You can call "GOSUB FOO" for local routines and "GOSUB FOO BANK3" to tell the compiler it's located on another page.

IMO it's not a matter of how to call them... I mean, the compiler is the one generating the ASM/binaries, so it KNOWS what routines are in different banks, it doesn't need the programmer to explicitly say it.

by on (#88582)
tokumaru wrote:
I mean, the compiler is the one generating the ASM/binaries, so it KNOWS what routines are in different banks, it doesn't need the programmer to explicitly say it.

With cc65 it's the linker that actually knows the final addresses of everything, and by then it's probably too late to insert bankswitching, at least as it is implemented. (afaik)
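
One way to picture the explicit approach (the "GOSUB FOO BANK3" style, where the programmer names the bank) is a trampoline that lives in the fixed bank. The sketch below only simulates this on a host machine: `current_bank` stands in for a mapper register, and `select_bank`, `far_call` and `foo_in_bank3` are hypothetical names, not part of cc65 or BatariBASIC.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_COUNT 4u

static uint8_t current_bank;   /* stand-in for the mapper's bank register */

static void select_bank(uint8_t bank)
{
    assert(bank < BANK_COUNT);
    current_bank = bank;       /* on hardware: a write to a mapper register */
}

typedef uint8_t (*banked_fn)(uint8_t arg);

/* Trampoline: would have to live in the fixed (always mapped) bank.
 * Switches to the callee's bank, calls it, then restores the caller's bank. */
static uint8_t far_call(uint8_t bank, banked_fn fn, uint8_t arg)
{
    uint8_t caller_bank = current_bank;
    uint8_t result;
    select_bank(bank);
    result = fn(arg);
    select_bank(caller_bank);
    return result;
}

/* Hypothetical routine assumed to be located in bank 3. */
static uint8_t foo_in_bank3(uint8_t arg)
{
    assert(current_bank == 3); /* only valid while its bank is mapped in */
    return (uint8_t)(arg + 1);
}
```

A call such as `far_call(3, foo_in_bank3, 41)` returns 42 and leaves the caller's bank selected afterwards. Because the bank number is passed explicitly, nothing here depends on the compiler or linker knowing final addresses, which is the trade-off discussed above.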