This page is a mirror of Tepples' nesdev forum mirror (URL TBD).

# CRC routines as PRNGs

Tepples mentioned it, which made me curious.

Traditional CRC routines look a lot like LFSR-based PRNGs. Basically, you start with a variable, SEED, and you shift it left once, and if you shift out a 1, you XOR SEED with a constant. If you shift out a 0, you do nothing else. You typically want to shift these 8 times each time you want a random number, to ensure that you get 8 random bits in your byte.

CRC generators work the exact same way, except SEED is called CRC, your XOR constant is called the "polynomial", and the routine typically performs 8 shifts at a time. A CRC generator XORs with a byte of data before it does any shifting, so to use a CRC routine as a pseudorandom number generator, you either pass a constant to it, or you remove that data-byte XOR altogether (equivalent to passing a constant of \$00 to it). With an 8-bit generator, you're going to want to keep the data XOR in and just pass it a constant, because if the generator ever outputs zero, it'll get stuck outputting zeros indefinitely.
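As a minimal C sketch of the bit-at-a-time scheme just described (the \$A5 data constant is an arbitrary nonzero choice of mine, and \$07 is one of the CRC-8 polynomials mentioned later in the thread):

```c
#include <stdint.h>

/* One PRNG step built from a bit-at-a-time CRC-8 update.
   The constant data byte keeps the generator out of the stuck-at-zero state. */
static uint8_t crc8_prng(uint8_t seed)
{
    seed ^= 0xA5;                    /* data-byte XOR: any nonzero constant */
    for (int i = 0; i < 8; i++) {
        uint8_t carry = seed & 0x80; /* bit that would be shifted out */
        seed <<= 1;
        if (carry)
            seed ^= 0x07;            /* the "polynomial" */
    }
    return seed;
}
```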

Now more to the point of my research, Greg Cook somehow managed to simplify the entire routine to run in (near)-constant time, without needing to individually shift each bit. I figured out his shortcut for CRC-8 (which I'll put at the end of my post), but I'm having trouble figuring out how he did the CRC-16 routine. Ultimately, I would like a CRC-32 routine that is superefficient like his CRC-16 routine, so that's why I want to figure this out. The best advantage to these routines is that they're the equivalent of doing 8 shifts, so you get 8 random bits with each call, but it takes significantly less time than actually doing 8 shifts.

Code:
== Shortcut method for CRC-8 ==
Your polynomial must be an odd number. (all CRC-8 polynomials I've seen are odd anyway).
Your polynomial cannot be \$01. (This would result in your new byte being the exact same as your starting byte)

Start with the MOST SIGNIFICANT 1 BIT of the polynomial, and with each shift, move to the next bit right.
Shift.
If there's an overflow, xor with the polynomial.
If the current bit of the polynomial is 1, xor with your starting value.
Repeat until you're out of bits in the polynomial.

So, if your polynomial is \$07 (which is actually in one of the CRC-8 specs), you only need to perform two shifts.
===============================
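Here is that shortcut next to the plain 8-shift version in C so the two can be compared (POLY \$07 as in the example; the search for the top set bit is written out for generality):

```c
#include <stdint.h>

#define POLY 0x07   /* CRC-8 polynomial; must be odd, and not 0x01 */

/* Reference: 8 shifts of a bit-at-a-time Galois-style CRC update. */
static uint8_t crc8_slow(uint8_t v)
{
    for (int i = 0; i < 8; i++)
        v = (v & 0x80) ? (uint8_t)((v << 1) ^ POLY) : (uint8_t)(v << 1);
    return v;
}

/* The shortcut: one shift per polynomial bit below the most significant 1. */
static uint8_t crc8_fast(uint8_t start)
{
    uint8_t acc = start;
    int top = 7;
    while (!((POLY >> top) & 1))
        top--;                        /* find the most significant 1 bit */
    for (int bit = top - 1; bit >= 0; bit--) {
        uint8_t carry = acc & 0x80;
        acc <<= 1;                    /* shift */
        if (carry)
            acc ^= POLY;              /* overflow: xor with the polynomial */
        if ((POLY >> bit) & 1)
            acc ^= start;             /* current poly bit is 1: xor start value */
    }
    return acc;
}
```

For POLY \$07 the loop runs exactly twice, matching the "two shifts" claim above.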
Any data on how varied the distribution is on these modified CRC routines?
No data whatsoever!

I have reason to believe they should be acceptable though, just given the way CRC polynomials need to be created.
It's pretty easy to test a PRNG. Just run it through its cycle and record the results.
Not having a good day (health-wise), but I wanted to chime in and say this is quite clever. I never would have thought of it (but that's not saying much, honestly).
So I looked into the CRC-16 shortcut that Greg Cook did, and essentially, the code does this:
Code:
The polynomial is x^12 + x^5 + x^0, a.k.a. \$1021 (00010000 00100001)
            [CRC_HI] [CRC_LO]
            abcdefgh ijklmnop
                │
                ▼
         ┌─ ijklmnop abcdefgh = Input, shifted left 8 bits (wraps around because x^0)
XOR      │  ...abcde fgh..... = x^5
Together ┤  efgh.... ....abcd = x^12
         └─ abcd...a bcd..... = Feedback
                │
                ▼
            [CRC_HI] [CRC_LO]

I had to pick apart those strings and assemble them like that in order for it to make sense. The first three lines are straightforward enough; wherever there's a 1 in the polynomial, you XOR a shifted copy of the input data. However, the "feedback" line didn't make sense until I really thought about it.
If you start with an input of \$0100, then you get \$1021 out of it. Makes sense, because the very last of the 8 shifts will shift out a 1, and that'll make you XOR the polynomial into your input.
If you start with \$0200, you get \$2042, which, as expected, is the polynomial again, but shifted left once.
If you start with \$1000, you get \$1231. What? Why isn't this the polynomial, just shifted left 4 times?

The reason is that you need to think about what's going on when you use a traditional CRC routine. You do XOR a copy of the polynomial shifted left 4 times, but this creates a 1 bit that changes the result of one of your upcoming shifts before you've finished shifting all 8 bits.
On the fourth shift, you XOR the polynomial, which places a 1 such that in another four shifts, you're going to XOR the polynomial a second time. So basically, this is feedback.

The reason there are 4 bits of feedback in this specific case is because the polynomial, \$1021, has its most significant "1" bit four bits away from the final shift. If the polynomial were instead \$2021, then you'd have 5 bits of feedback. [Edited, I tested this]
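The whole diagram can be written out as a byte-at-a-time C update, where the `t ^= t >> 4` line is exactly the 4 bits of feedback discussed above (a bit-at-a-time version is included for comparison):

```c
#include <stdint.h>

/* Reference: bit-at-a-time CRC-16 with polynomial $1021, 8 shifts per byte. */
static uint16_t crc16_bitwise(uint16_t crc, uint8_t data)
{
    crc ^= (uint16_t)((unsigned)data << 8);
    for (int i = 0; i < 8; i++)
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                             : (uint16_t)(crc << 1);
    return crc;
}

/* The shortcut: XOR shifted copies of the input for each polynomial term,
   with t ^= t >> 4 supplying the feedback. */
static uint16_t crc16_update(uint16_t crc, uint8_t data)
{
    uint8_t t = (uint8_t)((crc >> 8) ^ data);
    t ^= (uint8_t)(t >> 4);                       /* feedback */
    return (uint16_t)((uint16_t)(crc << 8)        /* input shifted left 8 */
                      ^ ((unsigned)t << 12)       /* x^12 term */
                      ^ ((unsigned)t << 5)        /* x^5 term  */
                      ^ t);                       /* x^0 term  */
}
```

This reproduces the worked examples above: \$0100 in gives \$1021 out, and \$1000 in gives \$1231 out.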

So the key to creating a really quick CRC-32 routine (and thus, a very powerful PRNG) would be to find a polynomial with the fewest 1 bits in it.
For a RNG I use something that Blargg used to post here which basically does :
Code:
rand = (rand * 5 + 0x3611) & 0xffff

Works wonders, and can be done very quickly, without even using X or Y registers.
Bregalad, is there a mistake in your code? I just tested the distribution of this RNG, and it is really bad, practically no distribution at all.

I personally use a random random code, i.e. it is not mathematically designed, I just put it together by guesswork, and it has a more or less nice distribution, although I'm sure it is far from perfect. Vars are 16-bit words:

Code:
seed1+=(seed2>>1);
seed2-=(15^seed1);
return seed1;
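For reference, the snippet above as a self-contained C function (the initial seed values here are my own assumption; the post doesn't give them):

```c
#include <stdint.h>

/* Shiru's ad-hoc generator; seeds chosen arbitrarily (nonzero). */
static uint16_t seed1 = 1, seed2 = 1;

static uint16_t shiru_rand(void)
{
    seed1 += (uint16_t)(seed2 >> 1);
    seed2 -= (uint16_t)(15 ^ seed1);
    return seed1;
}
```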
For a RNG I use something that Blargg used to post here which basically does :
Code:
rand = (rand * 5 + 0x3611) & 0xffff

That's called a linear congruential generator. LCGs tend to have a "hyperplanes" problem unless you use only the high byte.

I couldn't find blargg's code in my search; might it have been somebody else?
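For illustration, here's the 16-bit LCG from above in C, keeping only the high byte as just suggested (the constants come from Bregalad's post; whether even the high byte is good enough is debated later in the thread):

```c
#include <stdint.h>

/* The rand = (rand * 5 + 0x3611) & 0xffff LCG, returning the high byte only,
   since the low-order bits of a power-of-two-modulus LCG are highly periodic. */
static uint16_t lcg = 1;   /* any seed works; the cycle covers all 65536 states */

static uint8_t lcg_rand(void)
{
    lcg = (uint16_t)(lcg * 5 + 0x3611);
    return (uint8_t)(lcg >> 8);
}
```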
I use something like this to experiment with various LFSRs and their periods:
Code:
#include <stdio.h>

static unsigned lfsr( unsigned n )
{
	unsigned c = n & 1;
	n >>= 1;
	if ( !c )
		n ^= 0x860E;
	
	return n;
}

int main()
{
	unsigned n = 1;
	
	int period = 0;
	do
	{
		period++;
		if ( period > 0x100000 )
			return 0;
		
		n = lfsr( n );
	}
	while ( n != 1 );
	
	printf( "%d\n", period );
	return 0;
}

Note how lfsr() is written similarly to how the 6502 asm would be: save the bit that would be shifted into carry, shift right, then test carry. There are lists of "maximal-length" lfsr seeds.

Also, CRC LFSR seeds are designed for particular characteristics that help detect bit errors, thus they are not going to be the most optimal to implement in 6502 for a random number generator. Look around for seeds that favor fewer instructions.
I don't remember exactly, but I'm 100% sure I got this code from this very board, and it was by a regular. I would have bet blargg, but perhaps it was another person. Perhaps it was in the old WWWThread boards, I'm not sure.

I'll post it anyways :
Code:
GetRandomNmr       ; See "linear-congruential random number generator" for more.
; rand = (rand * 5 + 0x3611) & 0xffff;
; return (rand >> 8) & 0xff;
lda RandomH     ; multiply by 5
sta RandomTemp
lda RandomL
asl a      ; rand = rand * 4 + rand
rol RandomTemp
asl a
rol RandomTemp
clc
adc RandomL
pha
lda RandomTemp
adc RandomH
sta RandomH
pla      ; rand = rand + 0x3611
clc
adc #\$11
sta RandomL
lda RandomH
adc #\$36
sta RandomH
rts      ; return high 8 bits

It works wonders for me, and is very fast.

LFSRs are great, but they are probably more suited to a hardware implementation, as they can be done with N flip-flops, several inverters and an XOR gate.
They're a bit slow to emulate in software in my opinion, as you'll need at the very least 16 bits of state, which means quite a lot of shift/rotate instructions to execute to get 8 new random bits. A single mult + add is definitely faster.

Back to the original topic, those CRC routines could also be very useful for lightweight password encryption in a video game.
I mean, take the first byte, compute the CRC, then XOR it with the second byte, take the CRC, etc... and you get all bytes encrypted.
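A sketch of that scheme in C, assuming a CRC-8 step with the \$07 polynomial (my choice; the post doesn't pick one) and chaining on the previous *encrypted* byte so the operation stays reversible:

```c
#include <stdint.h>
#include <stddef.h>

/* 8-shift CRC-8 step, polynomial $07 (illustrative choice). */
static uint8_t crc8(uint8_t v)
{
    for (int i = 0; i < 8; i++)
        v = (v & 0x80) ? (uint8_t)((v << 1) ^ 0x07) : (uint8_t)(v << 1);
    return v;
}

/* XOR each byte with the CRC of the previous (already scrambled) byte. */
static void scramble(uint8_t *buf, size_t n)
{
    for (size_t i = 1; i < n; i++)
        buf[i] ^= crc8(buf[i - 1]);
}

/* Undo it by walking backwards, so buf[i-1] is still ciphertext. */
static void unscramble(uint8_t *buf, size_t n)
{
    for (size_t i = n; i-- > 1; )
        buf[i] ^= crc8(buf[i - 1]);
}
```

This is obfuscation rather than real cryptography, but for password screens that's usually the point.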
Well, I'd be looking for a polynomial with as few terms as possible.

With some help from Wikipedia, I found the polynomial x^32 + x^31 + x^30 + x^10 + 1, and if my math is correct, that should be
11000000 00000000 00000100 00000001 = C0000401

So, when a 1 is shifted out from an ASL, the variable is XORed with C0000401.

I may be wrong though.

C0000401 creates a lot of feedback, which requires a lot of extra XORs along with shifts. Perhaps a polynomial that begins with 00xxxxxx would be better.
I mentioned in the other thread, but the LFSR generators don't necessarily have to generate all 8 bits. For full effectiveness you should generate as many new bits as you are consuming, but in practice 2 or 3 generations might be enough for what you're doing. Even just 1 generation is okay for some purposes.

Linear congruential generators like Bregalad just suggested can be pretty good too. They're what I normally use on a platform that has a multiply. Using 5 as the multiplier isn't the greatest, but I'm sure the results are useful enough. (Any two primes for the multiply and add will work, though it's normal to choose a multiplier that is a bit larger than that.)
Drag wrote:
Well, I'd be looking for a polynomial with as few terms as possible.

The polynomials can get extremely simple, for instance the APU Noise LFSR only has two taps. This does not tend to be optimal for "quality", but again, fine for some purposes. (In this case, 1 bit noise.)
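For reference, the APU noise LFSR mentioned here can be modeled in C: a 15-bit register whose feedback is bit 0 XOR bit 1 (in the normal mode), shifted right with the feedback entering at bit 14.

```c
#include <stdint.h>

static uint16_t noise = 1;   /* 15-bit shift register, nonzero */

/* One clock of the APU noise LFSR (mode 0); returns the output bit. */
static unsigned noise_step(void)
{
    unsigned feedback = (noise ^ (noise >> 1)) & 1;   /* bit 0 XOR bit 1 */
    noise = (uint16_t)((noise >> 1) | (feedback << 14));
    return noise & 1;
}
```

Two taps is as simple as a maximal LFSR gets; the period works out to 32767.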
Well yeah, unless I can find a decent 32-bit polynomial (decent meaning maximal-length, as few 1s as possible, ideally with them spread out and away from the most significant 8 bits, while also allowing a good random distribution), it's probably a lot easier to just use a traditional LFSR instead of trying to come up with an optimized one.
Super Mario Bros. Pseudo code:

Code:
if ( ((PseudoRandom[0] & 2) ^ (PseudoRandom[1] & 2)) == 0 )
    clc
else
    sec
endif

for x = 0 to 6 {
    ror PseudoRandom[ x ]
}

I should add: PseudoRandom[0] is initialized with \$A5.

Also you could say:

Code:
c = (PseudoRandom[0] & 2) ^ (PseudoRandom[1] & 2)

for x = 0 to 6 {
    ror PseudoRandom[ x ]
}
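The SMB pseudocode above, modeled in C (using the \$A5 initialization mentioned; the other six bytes are assumed to start at zero):

```c
#include <stdint.h>

static uint8_t pr[7] = { 0xA5, 0, 0, 0, 0, 0, 0 };  /* PseudoRandom[0] = $A5 */

/* One step: carry = bit 1 of byte 0 XOR bit 1 of byte 1,
   then ror that carry through all 7 bytes. */
static void smb_step(void)
{
    unsigned carry = ((pr[0] ^ pr[1]) >> 1) & 1;
    for (int x = 0; x < 7; x++) {
        unsigned out = pr[x] & 1;                     /* bit rotated out */
        pr[x] = (uint8_t)((pr[x] >> 1) | (carry << 7));
        carry = out;                                  /* carried into next byte */
    }
}
```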
All LFSRs must have the first bit set.

http://www.newwaveinstruments.com/resou ... stages.txt is a list of many different 32-bit LFSRs; they don't know of any with fewer taps.

If you're willing to use a 31-bit LFSR there's a few with just 2 taps: http://www.newwaveinstruments.com/resou ... stages.txt

Also, the sequence of every maximal-length LFSR contains 2ⁿ different 2ⁱ-bit-long sequences of both 1 and 0, where 2⁽ⁿ⁺ⁱ⁾-1=LFSR period length.
It works wonders for me, and is very fast.

I checked my code a few times (difficult to make a mistake), and what I'm getting from this is this picture - totally not random. I'm just generating two 8-bit numbers with this code to get X,Y for a point, and putting many thousands of them on the screen (points are black). Normal C rand or the code that I posted above both fill the whole screen leaving no gaps, as is expected from a properly working random generator.

Attachment: random.png
Drag wrote:
Well yeah, unless I can find a decent 32-bit polynomial (decent meaning maximal-length, as few 1s as possible, ideally with them spread out and away from the most significant 8 bits, while also allowing a good random distribution), it's probably a lot easier to just use a traditional LFSR instead of trying to come up with an optimized one.

I think the general rule of thumb is you want about half 1s to stir things up.
(dunno if that's valid)

Google "Galois LFSR" if you don't know what that is.

A common ploy is to combine two different types of PRNGs
in the hopes that they'll hide each others deficiencies.
Say, an LFSR and an LCG XORed together

So instead of a 32 bit PRNG you might combine two 16 bit ones

Here are lists of polynomials:
http://www.ece.cmu.edu/~koopman/lfsr/
Galois and Fibonacci are just two "opposite" ways to implement a particular LFSR (with Galois being simpler in software).
bogax wrote:
A common ploy is to combine two different types of PRNGs
in the hopes that they'll hide each others deficiencies.
Say, an LFSR and an LCG XORed together

I would not recommend this. A better quality generator will be superior to the XOR of two crappy generators, and it's not likely you're saving performance/complexity vs the alternative. It may sometimes be an improvement over using just one crappy generator (provided they don't have overlapping/dependent failings-- and you really need to test this to make sure, because it's more likely than you might think), but it's a poor way to solve the problem of low-quality random numbers.

Two points though:

1. For a lot of gameplay situations, a crummy random generator is sufficient, since a lot of the randomness comes from the timing of the human player and when they cause random decisions to happen. A lot of logical decisions are binary anyway, so 1 random bit from an LFSR can be totally fine. There are situations where you might need better quality numbers however, such as a turn based battle system, or trying to procedurally generate a level / background (shiru's test is one of many that will quickly show a poor generator). Consider how good you need the numbers to be before you decide to seek a more complex, higher quality PRNG.

2. How often is the PRNG being used? If we're talking about a big batch during level load only, or if they're only being used once a frame, or not even every frame, there's no reason to be skimpy about performance if quality random numbers would be preferred. You probably only need a fast PRNG if you have a continual stream of random numbers during gameplay.

Long story short, "good" and "fast" PRNGs both have their place. If you're going to cut and paste somebody else's code without evaluating, I'd say it's probably better to err on the side of "good" rather than "fast".
lidnariq wrote:
All LFSRs must have the first bit set.

http://www.newwaveinstruments.com/resou ... stages.txt is a list of many different 32-bit LFSRs; they don't know of any with fewer taps.

If you're willing to use a 31-bit LFSR there's a few with just 2 taps: http://www.newwaveinstruments.com/resou ... stages.txt

Also, the sequence of every maximal-length LFSR contains 2ⁿ different 2ⁱ-bit-long sequences of both 1 and 0, where 2⁽ⁿ⁺ⁱ⁾-1=LFSR period length.

Thanks lidnariq! I actually ended up using one of the polynomials from 32stages.txt. Specifically, [32, 23, 21, 16].
Ok everybody, I have a working 32-bit LFSR using the shortcut method I figured out. I'll need to see if I can improve it at all, since this was just quick and dirty. However, it runs in constant time, and it probably could be simplified even more; I'd just need to look at it some more.
Code:
game_random2
txa
pha
lda rnd4   ;3
tax      ;2  5
eor rnd2   ;3  8
sta rnd2   ;3  11
lsr rnd4   ;5  16
ror      ;2  18
lsr rnd4   ;5  23
ror      ;2  25
pha      ;3  28
txa      ;2  30
eor rnd4   ;3  33
sta rnd4   ;3  36
pla      ;4  40
lsr rnd4   ;5  45
ror      ;2  47
and #\$e0   ;2  49
eor rnd2   ;3  52
pha      ;3  55
lda rnd4   ;3  58
eor rnd3   ;3  61
sta rnd4   ;3  64
pla      ;4  68
sta rnd3   ;3  71
lda random   ;3  74
sta rnd2   ;3  77
txa      ;2  79
sta random   ;3  82
pla
tax
lda random
rts

I've had a simple NES program (the one this routine is for, actually) running a loop where it begins by copying 8 random bytes generated from this routine, and then repeatedly reads from the routine, looking for anything that matches this 8-byte string. After 10 minutes, it's still looking, so I think this has more than enough of a period for non-repeating randomness. However, I don't know how well the values are distributed; they look pretty random to me!

For the record, the above code is the equivalent of a Galois LFSR that uses \$00A10001 as its XOR, where each byte of output is the result of 8 shifts.

If anyone wants to use this code or run some tests on it, be my guest. (Also, if anyone wants me to explain how I came up with this code, I'll be happy to explain; I just don't want to post a wall of text if nobody's interested. )

Edit: One last thing! You need to initialize this routine by writing a nonzero value to random, rnd2, rnd3, or rnd4. I did it by writing \$80 to rnd4.
Remember the trick someone mentioned a while back here (Bregalad?) of continuously writing the result to \$4011 and listening for any audible patterns.
RANDU: A Cautionary Tale

Also, it may be worth running your PRNG for a while before you start testing for cycles. It is possible to enter a cycle from a seed value outside that cycle. (Also, obviously if you want to test longer cycles you can reimplement it on a modern computer. I usually do this first, and then implement it on the target hardware after.)
blargg wrote:
Remember the trick someone mentioned a while back here (Bregalad?) of continuously writing the result to \$4011 and listening for any audible patterns.

Hah, that's a pretty cool idea. I don't hear any buzzing or any subconsciously repeating tones or anything, it just sounds like white noise. It sounds even better than the APU's noise channel if you can believe it (which you probably can). Though, comparing with the PRNG I was using before this, I can tell this newer one runs faster.

rainwarrior wrote:
RANDU: A Cautionary Tale

Also, it may be worth running your PRNG for a while before you start testing for cycles. It is possible to enter a cycle from a seed value outside that cycle. (Also, obviously if you want to test longer cycles you can reimplement it on a modern computer. I usually do this first, and then implement it on the target hardware after.)

I did think of that, so I had the tester generate 4 random numbers before saving the initial string of 8. Hypothetically, if the polynomial I chose is indeed a maximal-length one, then that means the number of unique states for the shift register is (2^32)-1, meaning, the shift register should cycle between every possible combination of 32 bits, minus the all-zeros one. That means, the only closed-off cycle is the all-zeros one.

As for testing on a modern computer, I would have, but I don't have any IDEs conveniently set up at the moment, other than for NES programming. I guess it saves me the need to convert my routine back and forth.
rainwarrior wrote:
bogax wrote:
A common ploy is to combine two different types of PRNGs
in the hopes that they'll hide each others deficiencies.
Say, an LFSR and an LCG XORed toghether

I would not recommend this.

Not sure I'd go so far as to recommend it, but it certainly is a common ploy.

Quote:
A better quality generator will be superior to the XOR of two crappy generators,

Right no true scotsman would use a crappy generator

Quote:
and it's not likely you're saving performance/complexity vs the alternative.

I definitely disagree with that. What's your idea of the alternative? (i.e. a better generator
that's as simple and fast as an LFSR and an LCG)

Quote:
It may sometimes be an improvement over using just one crappy generator (provided they don't have overlapping/dependent failings-- and you really need to test this to make sure, because it's more likely than you might think),

I have no argument with that.

Quote:
but it's a poor way to solve the problem of low-quality random numbers.

Where do you get that idea?
Drag wrote:
then that means the number of unique states for the shift register is (2^32)-1, meaning, the shift register should cycle between every possible combination of 32 bits, minus the all-zeros one. That means, the only closed-off cycle is the all-zeros one.
Right. LFSRs are extremely well characterized (and bleed state, so it's easy to calculate the next bit and taps after watching the output for a while, but that's different from what we perceive as predictable): they are a white, wide-sense-stationary noise source with a too-perfect distribution of runs.

As a tangent, I managed to wikiwalk my way to the shrinking generator
I tested with a seed that I grabbed from a minute's worth of random numbers, \$C83C5447, and after another ten minutes of random numbers (starting with that seed), there still haven't been any cycles detected.

The only reason there'd be a closed-off cycle is if I messed up when converting the polynomial into its hexadecimal representation: I took [32, 23, 21, 16] (which is supposed to be a maximal-length polynomial), interpreted it as (x^32 + x^23 + x^21 + x^16 + x^0), which I then interpreted as \$00A10001 (I used x^0 instead of x^32). I wasn't sure if I was allowed to use x^0 instead of x^32, but it seems to be working so far.

So if everything is correct, this prng should give you 4,294,967,295 random numbers before it repeats the cycle. The only danger would be if it somehow got stuck in the all-zeros state, which I don't think is supposed to be naturally possible with an LFSR, be it Galois or Fibonacci.
bogax wrote:
Right no true scotsman would use a crappy generator

I'm not sure how this was a "no true scotsman" argument. If faced with a choice of XORing two 16 bit generators, or one equivalent 32 bit generator (or even a 24 bit generator), I believe you would find with empirical testing that the latter will give better results, and can be done with equivalent performance.

bogax wrote:
What's your idea of the alternative? (ie a better generator that's as simple and fast as an LFSR and an LCG)

A 32 bit LFSR or LCG will give better 16-bit results than a 16 bit LFSR XORed with a 16 bit LCG.

bogax wrote:
rainwarrior wrote:
it's a poor way to solve the problem of low-quality random numbers.

Where do you get that idea?

The idea comes from my experience with random number generators used for a variety of purposes. I've tried this technique before and I call it a poor solution because I think you can get better results from a single generator of similar computational complexity.

The reason I don't recommend XORing two generators together is that both signals have a periodicity that is much-much-much shorter than the equivalent single 32 bit generator. Even though they are interfering with each other via XOR, there are situations where both periods will show through this operation.

There are also many situations where the difference doesn't matter, and it's not a big deal how you build your PRNG, but I'm talking about the case where all else is more or less equal, and if your goal is to produce a more uniform, less predictable PRNG, I think the XOR approach is a bad idea. There are better ways to improve the output of your PRNG for equivalent computational cost.

I'm sorry to be dismissive of the idea, but this is my honest opinion of it.
Shiru, did you remember to take only the 8 most significant bits of the result when you plotted this ?
I think the least significant bits are not random but the most significant bits are.

At least I tested this RNG with the \$4011 white noise technique before using it and it worked fine.
I tried both LSB and MSB; the difference is subtle, MSB just gives thicker lines on the picture, but it still does not look random.
Have you put the right multiply and add constants ? Because I just tried to plot it in GNU octave, and it actually looks very random to me :
Code:
x = 1
for i=1:1000; x(i+1) = bitand(x(i)*5 + 0x3611, 0xffff); end
y = bitand(x, 0xff00) / 0x100
plot(y)

Perhaps there are other values that would work better than 5 and 0x3611; in fact I have no idea what makes these values work particularly well or not well. In any case, it would not make the thing more complex or slower.

PS : Oh I know ! What seed did you use ? I used 1, perhaps if you use another seed (such as 0) the sequence could be not random at all.
I tried various values for seed, they all give the same result.

I think I know how to explain the results: it simply produces very short repeating sequence, like 256 or something numbers. As my test array is 256x256, it simply can't produce enough variety to fill every possible position.
I just checked with GNU octave, and the RNG actually has a sequence of 65536 states, which means it is optimal for a 16-bit RNG (it could technically not get any better with 16-bit). It is also, of course, independent of the seed, since the loop goes through all possible states.

I execute this script as a proof :
Code:
x = 0;
for i=1:65537;  x(i+1) = bitand(x(i)*5 + 0x3611, 0xffff); end;
y = bitand(x, 0xff00) / 0x100;
% plot(y)
find(not(x))   % Print when the RNG rolls back to value '0'

% Compute the probability of all 256 possible values

for i=1:256; z(i) = length(find(not(y-i-1))); end;
plot(z)

It also shows that all values are almost equiprobable for the 8 MSBs.

I don't know how you test the RNG, and how you use a bi-dimensional array for it. Would you share the details? Thanks.
I often use quick and dirty random number generators, but to improve randomness I continuously clock the random number generator while waiting for VBlank. Since the time taken to process a frame is pretty random depending on which objects are active and all the actions the engine takes in different frames, that seems to work out well.
Shiru wrote:
I think I know how to explain the results: it simply produces very short repeating sequence, like 256 or something numbers. As my test array is 256x256, it simply can't produce enough variety to fill every possible position.

I tried plotting this RNG, too, and got this picture:

Attachment: rng-xy.png

The numbers were pulled in pairs (X and Y), so if you look at this picture carefully, it basically means that for any given value (0..255, on the X axis) returned by the RNG there are only 5-6 different (sequential) values that may follow it; in other words, it's very predictable.

So it's not a good RNG.
The picture that thefox got is the same that I'm getting using MSB.
And I guess that's why commonly used error detection protocols use CRC (based on LFSR) rather than a linear congruential hash like the DJB hash, right? Part of the drawback of the linear congruential structure is that high-order bits don't cascade back to low-order bits.

DJB hash:
hash[0] = 5381
hash[i + 1] = (33 * hash[i] + data[i]) mod 2^n

Why 33? It's a power of two plus or minus one (for quick multiplication) that's close to the number of letters in the Latin alphabet (for good distribution when used to hash strings).
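The DJB hash above, as a runnable C function:

```c
#include <stdint.h>

/* DJB string hash: hash = 33 * hash + byte, starting from 5381, mod 2^32. */
static uint32_t djb_hash(const char *s)
{
    uint32_t h = 5381;
    while (*s)
        h = h * 33 + (uint8_t)*s++;
    return h;
}
```

The 33 multiply compiles to a shift and an add, which is why it was cheap even on machines without hardware multiply.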
tokumaru wrote:
I continuously clock the random number generator while waiting for VBlank.

I'd just like to say that this is a pretty great idea. I wonder if you could even get decent results just incrementing a variable by one during the vblank wait loop. I bet you could!
edit:
Heh. It's true. You can do pretty well! (Probably not well enough, though)

This chart was made by solely incrementing \$FF during the Vblank loop. Each pixel was made using the random number from each set of two consecutive frames.
Frame 1, frame 2 = point 1's x and y
Frame 3, frame 4 = point 2's x and y
etc.

The very clear diagonal line was caused by me standing completely still. (I didn't have enemies or anything else too variable in the level I used to test, so the processing time was probably pretty stable.)

Edit 2: Continuation of the same line:

My game also has a bug which occasionally makes the game wait for a variable that vblank sets rather than for vblank itself. (It halts the program so no new tiles are put into the buffer before the previous ones have been updated by the NMI routine.) This caused the second diagonal line.

Still, pretty interesting!
Some commercial games update a random seed during wait-for-vblank as well, for which Dwedit had a heck of a time computing the exact result for PocketNES speed hacks. Ultimately the time between the end of the game logic and the next vblank is a complicated function of the entire game state, but it's still a function. Incrementing a seed while waiting for vblank is thus still a PRNG, providing no more entropy than any other way of hashing keypresses. I guess that lends credence to the concept of reading the controller dozens of times during menus with little or no animation going on, such as the title screen, to gather more entropy.
Even if the RNG is in the NMI, the random number is still going to be accessed at random times based on input in most cases. Unless you can cause an event at the same frame every time, it will seem random if the numbers are distributed well.
rainwarrior wrote:
bogax wrote:
Right no true scotsman would use a crappy generator

I'm not sure how this was a "no true scotsman" argument. If faced with a choice of XORing two 16 bit generators, or one equivalent 32 bit generator (or even a 24 bit generator), I believe you would find with empirical testing that the latter will give better results, and can be done with equivalent performance.

Well, of course a better generator will be better than a crappy one, especially for sufficiently
But that was meant as a rhetorical statement rather than an accusation. So insofar as it
was an invitation to elaborate, perhaps it was a bit disingenuous of me not to have
elaborated a bit more myself.
If it sounded like an accusation, I apologize.

rainwarrior wrote:
bogax wrote:
What's your idea of the alternative? (ie a better generator that's as simple and fast as an LFSR and an LCG)

A 32 bit LFSR or LCG will give better 16-bit results than a 16 bit LFSR XORed with a 16 bit LCG.

I don't think any LFSR of similar simplicity is going to be better, though it may not be any worse. (I don't think that's likely, but I haven't surveyed all the possibilities; do you have a candidate in mind?)

rainwarrior wrote:
bogax wrote:
rainwarrior wrote:
it's a poor way to solve the problem of low-quality random numbers.

Where do you get that idea?

The idea comes from my experience with random number generators used for a variety of purposes. I've tried this technique before and I call it a poor solution because I think you can get better results from a single generator of similar computational complexity.

Again, if you've got a recommendation I'd love to hear it

rainwarrior wrote:
The reason I don't recommend XORing two generators together is that both signals have a periodicity that is much-much-much shorter than the equivalent single 32 bit generator. Even though they are interfering with each other via XOR, there are situations where both periods will show through this operation.

There are also many situations where the difference doesn't matter, and it's not a big deal how you build your PRNG, but I'm talking about the case where all else is more or less equal, and if your goal is to produce a more uniform, less predictable PRNG, I think the XOR approach is a bad idea. There are better ways to improve the output of your PRNG for equivalent computational cost.

I'm sorry to be dismissive of the idea, but this is my honest opinion of it.

If we were using some more objective criterion, like which is best for Monte Carlo type calculations, I'd definitely agree with all you've said.
I think this is more a matter of taste, if it doesn't need to be random but only look random (which might actually be less random).
Here are some pictures that are meant to be illustrative, not necessarily representative; for one thing they only use 8 bits. They all use the same LCG x'=x*17+103.
The top row shows runs of 256; the bottom row is 65280 (which is a full run of the 8-bit LFSR-LCG combo).
I have used ARCFOUR, updated several times per frame, using the microphone for additional entropy, and initializing two of its parameters using the initial contents of RAM.

I do not know how well it compares to whatever else is suggested in this forum.
A big advantage of an entirely-software RNG is repeatability, either when debugging or when you want to generate the same sequence again from just a seed (think of SimCity's random land generation and the three?-digit code you could get if you ever wanted that land formation again). It's also testably-reliable, whereas something depending on hardware might generate low-quality numbers in some circumstances.
So for a dithering experiment I needed a lot of pseudo-random bytes but I felt that a 16 bit state space was too small. I found this thread and figured I could do a similar thing with the 32 bit Galois LFSR I was using at the time.

Code:
rng_step_bit:     ; 26 to 27 cycles (+6 for rts)
lda rand+3
lsr
ror rand+2
ror rand+1
ror rand+0
bcc +
eor #\$a3
+:
sta rand+3
rts

And this is a byte version of the above using techniques hinted at by Drag. To use it, read rand+0, then call rng_step_byte.

Code:
rng_step_byte:    ; 79 cycles (+6 for rts)
lda #\$00
rng_mix_byte:     ; 77 cycles (+6 for rts)
eor rand+0      ; mix in input byte, and start
sta rand+0      ; XORing this with the low byte
asl             ; starting with x^30 term.
asl
asl
asl
eor rand+0      ; x^26 term
asl
eor rand+0      ; x^25 term
asl
eor rand+3      ; XOR with original low byte.
sta rand+3
lda rand+0      ; Again for the high byte.
eor rand+0      ; x^26 term
lsr
lsr
lsr
lsr
eor rand+0      ; x^30 term
lsr
lsr
eor rand+0      ; x^32 term (or original high byte)
ldy rand+1      ; finally, rotate the bytes
sty rand+0      ; if Y needs to be preserved
ldy rand+2      ; stash A somewhere, then rotate
sty rand+1      ; the bytes with A.
ldy rand+3
sty rand+2
sta rand+3
rts