This page is a mirror of Tepples' nesdev forum mirror (URL TBD).

# Handy collision detection

A while back, I was throwing around ideas to make my collision detection routines more efficient, especially on the NES, where every cycle counts. I came up with a pretty cool idea that I've never heard before, but I wouldn't be surprised if someone else came up with and implemented this before I did. It just means it's an extra good idea if more than one person came up with it all on their own.

So first and foremost, I'm doing bounding-box collision detection between all of my actors, and I'm iterating through all possible combinations of actors all at once.

If I have actors A, B, C, and D, then I would check AB, AC, AD, BC, BD, CD. It's worthless to check things like AA, BB, etc, and it's redundant to check BA, DB, etc.

Let's say that in my game, there's a player actor, several collectable/powerup actors, and a few enemy actors. Let's also say that enemies don't care about powerups, and enemies don't care about each other (they don't bump into each other).

So maybe in one level, I'll have the player, 10 keys, and 3 enemies. For the player actor, the logical thing would be to check its bounding box with the bounding boxes of each of those 10 keys, and those 3 enemies to see if it's colliding with any of them. No problem!

However, why would I need to check each individual key for collisions against the other keys, and against the enemies, if I've established that they don't do anything when they collide? I'd be performing coordinate transformation calculations for a lot of actors that it doesn't even matter for.

To solve this problem, my system is to assign each type of actor to a team, and have each actor specifiy which teams they collide with. Let's say I have a Player 1 actor, a Key actor, a Health actor, and several different types of enemies.

Each actor has two bytes, which are bitmasks, which determine the team(s) they belong to (the team byte), and which team(s) they collide with (the mask byte). I'll put Player 1 on team 00000001, Key and Health can both go onto team 00000010, and all of the different kinds of enemies can go into team 00000100.

Players need to collide with the collectables and the enemies, so I'll want it to look for collisions with both teams 00000010 and 00000100, so I'll give the player actors a mask byte of 00000110.

Collectables also need to collide with players, since players can collide with collectables, but the collectables don't need to collide with each other, nor the enemies, so I'll give the collectables a mask byte of 00000001. It's important to set both teams up to collide with each other (players collide with collectables and collectables collide with players), otherwise you'll have problems depending on which order you check the actors in.

Enemies also need to collide with players, since players can collide with enemies. They don't need to collide with the collectables, and they don't need to collide with each other, so they get a mask byte of 00000001. Later, if I *do* want the enemies to collide with each other (maybe so they turn around when they bump into each other), I can change their mask byte to 00000101. That way, they'll check for collisions with the players, and members of their own team.

So in summary:
Player1 -> Team 001, Mask 110
Key -> Team 010, Mask 001
Health -> Team 010, Mask 001
EnemyA -> Team 100, Mask 001
EnemyB -> Team 100, Mask 001
EnemyC -> Team 100, Mask 001

So now we're in the level, and the collision routine needs to check each actor combination for collisions. Let's say Actor 1 is a Player1, Actors 2-10 are Keys and Health, Actors 11-14 are various EnemyAs, EnemyBs, and EnemyCs.

The routine first checks (1,2). Actor 1 is a Player1 with a mask 110. Actor 2 is some collectable on team 010.

Mask 110 & Team 010 = nonzero, so this is a valid combination of actors, let's compare 1's collision box with 2's, see if they collide, run the appropriate code if necessary, etc.

Let's say we've done that, and now we check (1,3). Same thing happens, Actor 1 is still Player1 with the same mask 110, Actor 3 is another collectable on team 010. 110 & 010 is nonzero, check these actors for a collision. (There's actually an optimization opportunity here, load actor 1's collision box, transform it, etc, and then just keep that stuff in memory while you compare it with all of the other actors, Until it's time to check Actor 2 against everyone, in which case you forget Actor 1's box, transform Actor 2's box, keep it in memory, etc)

So fast forward a bit, now we're checking (2,3). Actor 2 is a collectable with mask 001. Actor 3 is another collectable on team 010.

Mask 001 & Team 010 = zero, which means we don't even need to bother computing anything else, these two types of actors don't care about colliding with each other.

(2,4) now, same thing, collectable vs. collectable, Mask 001 & Team 010 = zero, so skip and go to the next combination.

So keep going and going until everyone's been checked. The end result is that we've saved ourselves from a lot of worthless computations trying to compare certain combinations of actors who don't care about colliding with each other. Again, on modern hardware, who really cares? However, on a system like the NES, every cycle you save is important.

Now, it's time to touch on something. Remember how it was important to say both that players collide with collectables, and that collectables collide with players?

In the example above, I checked (Player1, Key); Player1's Mask is 110, and Key's Team is 010. 110 & 010 = nonzero.

What if we were checking (Key, Player1)? Maybe in some other situation, Actor 1 was a key, and Actor 2 was a Player1. In this new case, Key's Mask is 001, and Player1's Team is 001. 001 & 001 = nonzero, so the same collision would happen, which is what we want.

If we go back to the enemies, let's say I want them to check each other for collisions, so they can turn around if they bump into each other. I said that I needed to change the masks for the enemies from 001 to 101 to do this.

With this new definition, let's check two random Enemy actors, (EnemyA, EnemyC). EnemyA has a Mask of 101, EnemyC is on Team 100. 101 & 100 = nonzero. So now, everyone on the enemy team will check for collisions with other actors on the enemy team.

This is the method I came up with, and so far, it's worked pretty nicely, and I am at sound mind knowing my code isn't doing stuff it doesn't need to be doing.
I had seen the Collision by Object Group stuff before when I used to use Games Factory, it let you assign objects to group types. But I would just make separate lists instead, so you just iterate through them, instead of using bitmasks. So you'd have a list of which objects are collectables, which are enemies, etc.

Another way to speed up collision detection is to sort the objects by X, and sort the objects by Y, then you can do in-range checks for X quickly, and in-range checks for Y quickly. Never actually tried this, but it sounds cool. Of course, this has the cost of sorting the objects. (when sorting the objects, you actually sort pointers/array index numbers, not move around the objects in memory)
Sounds awfully complicated to me... In my game, I don't have a "master" collision routine that checks for all possible collisions, I have each object check just for the collisions that are relevant to them in their AI routines. If an object doesn't collide with anything, just don't call any collision routines in its AI, which wastes 0 CPU cycles.

This is great for my game, where most objects are not aware of each other and only collide with the player and the level map, but if I had to check for other types of collisions I would just create the different groups using liked lists. I'd have a list of projectiles, a list of items, and so on. So if I wanted my projectiles to only hit enemies, in the projectile AI I'd call a "CheckForEnemyCollisions" function, which would scan the list of enemies looking for a hit with the projectile that called the function. There would be no need for bitmasks or things like that, since all projectiles are already checking all the enemies, there's no need for the enemies to check the projectiles, so there will be no repeated checks.

After having rewritten my main game engine more times than I can count, I have come to the conclusion that it is much easier to write simpler functions that each object can call as necessary than to write huge "master" managers that reign over all of the game world at all times. First because of the reduced complexity, because performing a task for a single object is easier than managing them all at once, and second because of CPU use, since complex managers will always steal some CPU time even if it's just to come to the conclusion that they don't have to do anything, but when the responsibility of executing object-related tasks is delegated to the objects themselves, they only do what's absolutely necessary, and don't waste a lot of time just checking what needs to be done.
Interesting idea. All the collision detection routines I've done are divided into separate functions. Allows me to optimize for cycles easier that way. But I guess it's really game dependent.

The way I look at it, you have player objects, player projectiles (set shared for all players), collectible items, enemies, enemy projectiles.

Compare player(s) against all enemies. Then compare player(s) against all enemy project tiles. Compare player(s) against all collectible items. Compare player projectiles against enemies.

Then any additional detection routines for objects to BG map collision if needed.
tokumaru wrote:
This is great for my game, where most objects are not aware of each other and only collide with the player and the level map, but if I had to check for other types of collisions I would just create the different groups using liked lists. I'd have a list of projectiles, a list of items, and so on. So if I wanted my projectiles to only hit enemies, in the projectile AI I'd call a "CheckForEnemyCollisions" function, which would scan the list of enemies looking for a hit with the projectile that called the function. There would be no need for bitmasks or things like that, since all projectiles are already checking all the enemies, there's no need for the enemies to check the projectiles, so there will be no repeated checks.

I guess I'm not understanding the advantage of having projectiles check enemies. In my game, the same thing doesn't always happen when an enemy collides with a projectile, and having to figure out what happens for a certain enemy within the projectile code seems really complex and inefficient. For example, a fireball enemy may not even need to detect for collision with bullets. Having bullets check for this collision wastes cycles. Also, a fire-based enemy may respond differently to ice-based weapons than an ice-based enemy. Enemies handle these collisions themselves. If an enemy detects collision with a projectile, it also decide to self-destruct if it runs out of health. Could you elaborate a little on why checking enemies is more efficient? Perhaps it also depends on the project.

I basically have it doing this: Enemies/items check with player and player bullets. Enemy and item AI responds more uniquely to collision with the player or player weapons than the player does on a collision. This may seem backwards or partially true. Item AI checks for collision with the player, and then alternates player variables directly rather than having the player handler handle any of the logic upon colliding with items. Enemies respond more uniquely to collision with player weapons than the player does to enemy collisions. Enemies always just damage players, and weapons have unique effects on enemies. So I think the entity that responds uniquely needs to be the one checking for collision, most of the time.

Dwedit wrote:
Another way to speed up collision detection is to sort the objects by X, and sort the objects by Y, then you can do in-range checks for X quickly, and in-range checks for Y quickly. Never actually tried this, but it sounds cool. Of course, this has the cost of sorting the objects. (when sorting the objects, you actually sort pointers/array index numbers, not move around the objects in memory)

I really wanted to be able to do this in my game, but when I wrote a sorting method, even the simplest method for about 8 values can take 1 to 2 thousand cycles. You may as well just forget wasting space with the sorting method and go with checking for collision straight-up. I can't even remember how long my collision detection routine is; I think its 1-2 hundred cycles.
Celius wrote:
I guess I'm not understanding the advantage of having projectiles check enemies. In my game, the same thing doesn't always happen when an enemy collides with a projectile, and having to figure out what happens for a certain enemy within the projectile code seems really complex and inefficient.

You could of course do the other way around and have enemies check for projectiles instead, it doesn't really matter. But then, if you have different types of projectiles, you might still have to do some checking to select the appropriate response. I think the only way to avoid such checks would be to build very specific lists, like fire projectiles, ice projectiles, etc., so that an ice enemy could simply ignore all ice projectiles. But checking for the type of projectile doesn't sound so bad to me.

Quote:
Enemies respond more uniquely to collision with player weapons than the player does to enemy collisions. Enemies always just damage players, and weapons have unique effects on enemies. So I think the entity that responds uniquely needs to be the one checking for collision, most of the time.

Yeah, this is OK. Which objects check which depends on the design of the game, but the technique of using lists for the different types of objects can remain the same.

Dwedit wrote:
Another way to speed up collision detection is to sort the objects by X, and sort the objects by Y, then you can do in-range checks for X quickly, and in-range checks for Y quickly.

I don't know if there's any real advantage in doing that... it might even make things slower, I think. To sort the objects you have to check their X and Y coordinates, just like you do in the bounding box collision check, with the difference that the collision check stops as soon as a collision is found to be impossible. Not only that, but there's still the overhead of manipulating the pointers to the objects in the sorted list(s).
tokumaru wrote:
Sounds awfully complicated to me...

That may be my fault, I'm not the best at explaining things in a simple way. All this method does is take two actors, AND two of their bytes together, and use that to determine whether or not they need to be checked for a collision with each other. It does this with every pair of unique actors.

tokumaru wrote:
In my game, I don't have a "master" collision routine that checks for all possible collisions [...]

I did it that way because I wanted to have all of the collisions happen simultaneously, and in one centralized location. Different strokes for different folks.

Both of our methods are still doing (basically) the same thing. The only difference is that you're using linked lists, and I'm using actor properties. Either way, we're still grouping things together and checking other groups.

tokumaru wrote:
After having rewritten my main game engine more times than I can count, I have come to the conclusion that it is much easier to write simpler functions that each object can call as necessary than to write huge "master" managers that reign over all of the game world at all times.

haha, you make it sound like it's this big horrible thing. The manager is isolated from the AI routines, so that if I ever needed to completely redo it, I'd only have to replace one section of code, and I wouldn't need to alter any AI code to compensate for the new routine. In fact, I could completely delete that section of the code, and the AI wouldn't even know the difference.

This is just how I structured my code. By all means, you should definitely stick with your own ways of doing things if it's more comfortable to you. I'm only posting about my ways in case it may be helpful to someone. I know I would've loved a post like this a few months ago.

tokumaru wrote:
First because of the reduced complexity, because performing a task for a single object is easier than managing them all at once, and second because of CPU use, since complex managers will always steal some CPU time even if it's just to come to the conclusion that they don't have to do anything, but when the responsibility of executing object-related tasks is delegated to the objects themselves, they only do what's absolutely necessary, and don't waste a lot of time just checking what needs to be done.

Yes, simplicity is better than complexity, but my method isn't doing anything complicated here:
Iterating through all unique pairs of actors (without redundancy) is trivial:
A B C D E
Check A against B, C, D, E
Check B against C, D, E
Check C against D, E
Check D against E

Each actor is compared with every other actor exactly once. The actual comparison is even simpler:
Take a byte from A, AND it with a byte from B.
From this result, we'll either compare their hit-boxes, or we'll jump straight to the next pair.

This doesn't take any more or less time to perform than iterating through linked lists repeatedly, it's just simply a different way to do things.

Dwedit wrote:
Another way to speed up collision detection is to sort the objects by X, and sort the objects by Y, then you can do in-range checks for X quickly, and in-range checks for Y quickly. Never actually tried this, but it sounds cool. Of course, this has the cost of sorting the objects. (when sorting the objects, you actually sort pointers/array index numbers, not move around the objects in memory)

I thought of that too, actually, but I eventually came to the conclusion that it would take too much effort to do the sorting, and I wouldn't really be that much better off. It's not a bad idea though, but I think that's more of a modern method to do things, which unfortunately doesn't translate well to archaic hardware like the NES.
To explain Drag's ancient vs. modern concept further: If you have enough objects to need sort-by-X or sort-by-Y collision, then you probably have too many objects for one screen or for one scanline. For example, the NES (without a really complicated coprocessor) is not the appropriate platform for a bullet hell shooter, where you really have to use sorting to prune the set of rectangles to test.
Quote:
Another way to speed up collision detection is to sort the objects by X, and sort the objects by Y, then you can do in-range checks for X quickly, and in-range checks for Y quickly. Never actually tried this, but it sounds cool. Of course, this has the cost of sorting the objects. (when sorting the objects, you actually sort pointers/array index numbers, not move around the objects in memory)

If you're talking about a two pass system, then you won't gain any additional speed for doing that on a simple rectangle/square collision box (anything other than square/rectangle, I could see doing that). Actually, I can't see it doing anything but increasing over all cpu resource needed.

You can prioritize which should be the first check, X or Y. Depending on the game design one might occur more often than the other.

My current routines directly checks to see if the object is outside the bounds of another box, not inside. If box1 X position is too far left of box2 or is too far right of box 2 then skip rest of Y checking, else boxes overlay on X position - now check the same way Y. I assume this is pretty much the standard way of doing it, right?

Box A (Xa1,Ya1)(Xa2,Ya2) and box B (Xb1,Yb1)(Xb2,Yb2). Xn1,Yn1 is the upper left hand coords of the box and Xn2,Yn2 the lower right.

Code:
lda Xa2,y
cmp Xb1,x
bcc .no_collision
lda Xb2,x           ;reverse the compare order since there's no "greater than' only flag.
cmp Xa1,y
bcc .no_collision
;else X is a match, check Y
Drag wrote:
Either way, we're still grouping things together and checking other groups.

Yeah, the basic difference is when and where in the code those checks happen, but final effect is pretty much the same.

Quote:
This is just how I structured my code. By all means, you should definitely stick with your own ways of doing things if it's more comfortable to you.

Sure, the same to you. I know how difficult it can be to find solutions for problems like these, and when we finally do we grow very fond of our conclusions.

Quote:
I'm only posting about my ways in case it may be helpful to someone.

Yeah, same here. Maybe your method sounded more complicated than it actually is, so I kinda felt the need to present an alternative solution, even if only to let people know that there are various possible ways to solve this issue.

tokumaru wrote:
Iterating through all unique pairs of actors (without redundancy) is trivial:
A B C D E
Check A against B, C, D, E
Check B against C, D, E
Check C against D, E
Check D against E

Sure, you found a simple way to make sure the collision checks aren't redundant, but you still waste some CPU time ANDing the masks of objects that will never collide. It might not seem like much with just 5 objects, but it might add up to something significant when there are, say, 20 active objects. Let's see what happens with 10 objects:
Code:
A B C D E F G H I J

check A against B C D E F G H I J
check B against C D E F G H I J
check C against D E F G H I J
check D against E F G H I J
check E against F G H I J
check F against G H I J
check G against H I J
check H against I J
check I against J

That's 45 checks you have to do just to figure out what collisions to check for. I'll assume your loop looks something like this:

Code:
ldy #FIRST_OBJECT ;2 cycles
OuterLoop:
tya ;2 cycles
tax ;2 cycles
inx ;2 cycles
InnerLoop:
beq SkipCollision ;3 cycles
;;;COLLISION CHECK GOES HERE;;;
SkipCollision:
inx ;2 cycles
cpx #LAST_OBJECT+1 ;2 cycles
bne InnerLoop ;3 cycles
iny ;2 cycles
cpy #LAST_OBJECT ;2 cycles
bne OuterLoop ;2 cycles

If this is really the case, just the inner loop, repeating 45 times, will use nearly 8 scanlines. Considering the rest of the code, that doesn't run 45 times, it goes a little past 8 scanlines.

Now say that 4 of these objects are enemies and 6 are projectiles. This means you'll need 24 collision checks:

Code:
enemy A against projectiles E F G H I J
enemy B against projectiles E F G H I J
enemy C against projectiles E F G H I J
enemy D against projectiles E F G H I J

The collision checks will obviously take more time, but that doesn't matter because the same number of collision checks would be performed either way. The point is that while with collision masks you'd need to spend 8 scanlines just figuring out who can collide with who, with the linked lists you can skip straight to the collision checks.

Quote:
Take a byte from A, AND it with a byte from B.

This might not be so versatile if 8 bits aren't enough for all the group combinations you want to make.

Quote:
This doesn't take any more or less time to perform than iterating through linked lists repeatedly

I think there is a difference... it might be negligible to you, but it is there. A liked list tells you right away which objects you need to collide with, there is no need to waste CPU time figuring out whether one type of object should collide with another... if an objected requested collision checks against the objects in a linked list, you just know that ALL the objects in the list represent possible collisions.

I'm not saying your solution sucks, it's probably perfect for your game, or you wouldn't be using it. But I did notice some possible problems with your method, so I thought I should bring them up. Maybe they are not really problems for you, but I still wanted to show another way to handle collisions, and I'm trying to explain the difference. I hope you don't take this as an attack to you or as a claim of superiority from my part, I just want to get what I think is a good solution out there, so other people can hopefully use our methods as inspiration for making their own, rather than copying one or the other directly.
Don't worry, I didn't take any offense to anything. Part of this thread was to get critiques, after all. Plus, it's always a good thing to present multiple ways to solve the same problem. Programming would be very boring if we all had to do it one way!

Yeah, you do have a good point; you perform less checks if you group the actors together with those lists, but a shortcoming with that is the fact that you're storing several lists, one list for each group. If you have a lot of objects and/or groups, that could potentially use up a lot of memory, even if it's just lists of pointers. If your linked lists are very dynamic (for example, projectiles being added and removed frequently), then you may end up with a heap of memory with lots of holes, and every time you want to add an object to a list, you'd need to scan through the whole heap to locate a free spot.

The only advantage my method would have over this is the fact that the memory is more static; the only actor related stuff in the memory are the actors themselves. Each actor has one byte (defined by the actor type, with that table in ROM) which can assign up to 8 teams simultaneously (and a byte to 'watch' 8 teams simultaneously), where the linked list method would require duplicate list entries (in ram) to achieve the same thing.

If you have the memory to spare though, the efficiency of the checks may be worth the expense of extensive memory management, because arguably, the checks would be occuring a lot more frequently than the addition/removal of objects.

So I guess it comes down to CPU vs Memory usage.
Drag wrote:
So I guess it comes down to CPU vs Memory usage. :P

It's almost ALWAYS a CPU vs Mem issue
Drag wrote:
If you have a lot of objects and/or groups, that could potentially use up a lot of memory, even if it's just lists of pointers.

I admit that there might be some memory overhead depending on how you implement the lists, but my implementation just requires one byte to point to the first element and one byte per object (it's in a constant place in the object's memory) to indicate the next element in the list.

Quote:
If your linked lists are very dynamic (for example, projectiles being added and removed frequently), then you may end up with a heap of memory with lots of holes, and every time you want to add an object to a list, you'd need to scan through the whole heap to locate a free spot.

You can use the same solution again, and have a linked list of empty slots, so there will be no need to scan. When you need a free slot, just grab the first one and make the second one the new first, and when you free a slot you can do the opposite. Memory use will just increase by 1 byte, the one used to indicate the first free slot, because objects can't be in both lists, so you can still get away with only one byte per object indicating the next one.

Quote:
Each actor has one byte (defined by the actor type, with that table in ROM) which can assign up to 8 teams simultaneously (and a byte to 'watch' 8 teams simultaneously), where the linked list method would require duplicate list entries (in ram) to achieve the same thing.

Not really, as explained above. If your collision masks are stored in ROM (if they were in RAM, the memory use would be practically the same as the linked lists way), that inner loop that checks the flags will take even more CPU time.
tokumaru wrote:
Quote:
If your linked lists are very dynamic (for example, projectiles being added and removed frequently), then you may end up with a heap of memory with lots of holes, and every time you want to add an object to a list, you'd need to scan through the whole heap to locate a free spot.

You can use the same solution again, and have a linked list of empty slots, so there will be no need to scan. When you need a free slot, just grab the first one and make the second one the new first, and when you free a slot you can do the opposite. Memory use will just increase by 1 byte, the one used to indicate the first free slot, because objects can't be in both lists, so you can still get away with only one byte per object indicating the next one.

Wow, never thought of this trick. That's a great idea!
Wikipedia has heard of free lists.
Dwedit wrote:
tokumaru wrote:
You can use the same solution again, and have a linked list of empty slots, so there will be no need to scan. When you need a free slot, just grab the first one and make the second one the new first, and when you free a slot you can do the opposite. Memory use will just increase by 1 byte, the one used to indicate the first free slot, because objects can't be in both lists, so you can still get away with only one byte per object indicating the next one.

Wow, never thought of this trick. That's a great idea!

Yeah, that's phenomenal!
This has been a very timely discussion as I am implementing collision detection this week Here is my approach, but it is taking 42 scan lines for 8 enemies, 5 bullets and 1 player. I'd love some advice on how I can improve this.

I am basically implementing the collision grouping idea of the OP but in a different way. I have very narrowly defined criteria for what will collide with what.

1. Player projectiles with Enemy ships
2. Enemy projectiles with the Player's ship

As such I did not use the bitmask approach. Here's what I do:

Code:
1. Iterate through all objects
a. Calculate the bounds of each object in RAM (top, left, bottom, right)
b. Place the object in a collision array based on it's type (player projectile, enemy projectile, or enemy)
c. Make note of the player's object number when we see it
2. Iterate through all player projectiles
a. Iterate through all enemy ships and compare bounds
b. If there is a collision, handle it
3. Iterate through all enemy projectiles
a. Compare bounds with the player's ship
b. If there is a collision, handle it

This is about as efficient as I can think to make it. I am only calculating bounds once per each object and I am only comparing those bounds for unique sets of objects. My problem comes with scaling. With 8 enemies and 5 player bullets that's 40 comparisons.

I know it is possible to do very large scale object collision on the NES. Take LifeForce for example. The player can have up to 10 projectiles on the screen at any one time along with up to 30 enemies (see level 2, falling rocks segment) without slowdown.

I tried limiting my game to 30 FPS to give me more time for collision detection, but this looks simply awful. Should I keep looking at multiplexing or am I just doing collision detection poorly?

[EDIT]
Trying not to hijack the thread here, I just wanted to let others that read this later know how I resolved my problems. In the above code block I was using a subroutine to check the bounding rects. In-lining this saved about 14% of the execution time. I then multiplexed the collision detection, only processing even-numbered bullets on even-numbered frames and vice versa.
[/EDIT]
Here is my approach, but it is taking 42 scan lines for 8 enemies, 5 bullets and 1 player.

Is that just for the collisions? That time might not be so bad depending on how much you spend on the AI and background collisions (something you might not even have in a space shooter, for example).
OK, I'm verging on thread-jacking at this point I'll turn the conversation more generic.

Are you all using the entire sprite area as the bounding rectangle or are you using X and Y offsets with a width and height? I am doing the latter for more accurate collision detection, but it eats CPU time due to the multiple adds.

Also, is everyone using 16-bit values for X and Y? I am using 8-bit values with two bits in the flags byte for the object that act as a ninth bit. I skip any object that is off-screen in either direction (as in I don't even generate bounding rects for them) and then clip the bounding rect to the screen. This way the collision detection code is a LOT simpler.

I am using a JSR / RTS for the collision test. I wounder if in-lining this would help much. I'll test that next.

What I would like to do is put together a best practices document for collision detection and response on the NES. Of course it would not apply for every genre, but I see this as a big barrier to entry for folks wanting to develop more complex NES homebrew.
There's another trick you can use to speed up bounds-checking.

If you only check the enemy's X range against the bullet's X range, and you find that they're not overlapping, then there's no way a collision can be occuring, and you don't need to check the Y range.

Same thing for Y, if the two actors' Y ranges are not overlapping, then there's no way they can be overlapping, and you don't need to check X.

Furthermore, you can write some pretty fast code by being careful with how you perform your checks.

Two actors, A and B, are colliding if both their x ranges and y ranges are overlapping:

Code:
A = Some actor
Calculate A's hitbox here.
B = A
LOOP:
B++
IF (B > last actor) THEN exit
Calculate the X coordinates of the left and right edges of B's hitbox here.
IF (A.left > B.right) THEN GOTO LOOP
IF (A.right < B.left) THEN GOTO LOOP
; If you reach this point, A and B are overlapping on the X axis.
Calculate the Y coordinates of the top and bottom edges of B's hitbox here.
IF (A.top > B.bottom) THEN GOTO LOOP
IF (A.bottom < B.top) THEN GOTO LOOP
; If you reach this point, a collision is occuring between A and B.
Handle the collision.
GOTO LOOP

This way, you can eliminate half of your calculations most of the time. If you're doing a vertical shmup, then I'd recommend doing the Y check first, then X.

This is exactly how I'm doing collision detection. Additionally, I "reworded" the conditional statements so they're always "less than" comparisons, since you only need one branch for those, versus "greater than", which requires two (remember, 6502 only has >=, not >).
Also, is everyone using 16-bit values for X and Y? I am using 8-bit values with two bits in the flags byte for the object that act as a ninth bit. I skip any object that is off-screen in either direction (as in I don't even generate bounding rects for them) and then clip the bounding rect to the screen.

You know, I was thinking about this the other day. I currently use 16-bits for all coordinates, but it might be a good idea to only check collisions for objects that are on screen and use 8 bits instead... The problem is that this works well for sprite collisions, but for background collisions you still need 16-bit coordinates...

Quote:
I am using a JSR / RTS for the collision test. I wounder if in-lining this would help much.

JSR/RTS inside a loop could add to alot of extra time. I tend to inline everything that is only done in a single place... Hell, I'll even duplicate a routine if using it inlined results in significant speed increase.
Quote:
Quote:
I am using a JSR / RTS for the collision test. I wounder if in-lining this would help much.

JSR/RTS inside a loop could add to alot of extra time. I tend to inline everything that is only done in a single place... Hell, I'll even duplicate a routine if using it inlined results in significant speed increase.

In-lining that routine reduced the execution time by 15%! Yay!

@Drag,

Don't know why I did not think of that before. Due to my very narrow requirements this will fit very nicely. I will implement this soon.

I also implemented collision multiplexing, only doing 50% of collision detections per frame. I only consider odd-numbered bullets on odd-numbered frames and vice versa. By managing the maximum speed of the projectiles with the the minimum width of the enemies I shouldn't miss collisions.
By managing the maximum speed of the projectiles with the the minimum width of the enemies I shouldn't miss collisions.

Another way not to miss collisions is to extend a projectile's hitbox backward by how far it has traveled over the past frame or two.
As for multiplexing collision detection I get some issues with diagonal movement of bullets. This can sometimes cause a false miss. I kinda like it though, it's sort of like grazing in modern SHMUPS.

Quote:
Another way not to miss collisions is to extend a projectile's hitbox backward by how far it has traveled over the past frame or two.

Thanks for the hint Tepples, I'll implement that if I ever have issues with "shooting through a wall" as it were.

Quote:
@Drag,

Don't know why I did not think of that before. Due to my very narrow requirements this will fit very nicely. I will implement this soon.

I don't know what I was thinking here. This reduces performance I still have to consider every object's bounding rect more than once, therefore caching their bounding rect in memory and running comparisons from there is more efficient.