Improving RND

Discussions related to mathematics, numerical methods, graph plotting etc.
Post Reply
MattC
Posts: 10
Joined: Mon 16 Apr 2018, 06:17

Improving RND

Post by MattC » Mon 16 Apr 2018, 17:08

Hi,

I've produced a simple program (just for fun) that 'tosses a coin' and counts heads and tails runs, i.e. how many heads, for e.g., in a row. The program uses a simple "RND(2) - 1" test. The total run so far has reached over 70 billion 'tosses', and the heads-in-a-row has reached 33. However, I'm now concerned that, as the RND function is psudo, the figure would never increase as the series would repeat. Is this true, and what could I do to further randomise the figures?

Matt

guest
Posts: 104
Joined: Mon 02 Apr 2018, 09:12

Re: Improving RND

Post by guest » Mon 16 Apr 2018, 17:22

MattC wrote:
Mon 16 Apr 2018, 17:08
I'm now concerned that, as the RND function is psudo, the figure would never increase as the series would repeat.
The RND sequence length is 233-1, that is 8,589,934,591, so yes if you've run 70 billion 'tosses' you've already repeated the sequence several times! There are several alternative PRNGs at the wiki, including this one but you'd need to be an expert statistician (perhaps you are!) to know if any are suitable for your task.

Richard.

MattC
Posts: 10
Joined: Mon 16 Apr 2018, 06:17

Re: Improving RND

Post by MattC » Mon 16 Apr 2018, 18:25

Thank you, Richard.

Unfortunately, I'm not an expert statistician so I'm not sure about these. I'll check them out to see if I can make sense of them. I'm assuming reseeding simply moves the 'series pointer' to a new place, rather than changing the sequence altogether. I was wondering if multiplying the RND(2)-1 with, say, the LSB of the TIME index would produce a more random figure. Just a thought.

Incidentally, what's the reason for 33 bits in the RND sequence? Without checking I would have guessed (wrongly, obviously) 31 or 32.

Matt

guest
Posts: 104
Joined: Mon 02 Apr 2018, 09:12

Re: Improving RND

Post by guest » Mon 16 Apr 2018, 18:43

MattC wrote:
Mon 16 Apr 2018, 18:25
Incidentally, what's the reason for 33 bits in the RND sequence? Without checking I would have guessed (wrongly, obviously) 31 or 32.
Two ways to answer that question! (1) Ask Sophie Wilson - she presumably chose the algorithm in 1981 (if not before) but I'm not aware that she's been asked about it since. (2) Look at the number of feedback taps required for a maximal-length LFSR of each of those lengths: 31 and 33 require two taps, 32 requires four feedback taps.

Richard.

MattC
Posts: 10
Joined: Mon 16 Apr 2018, 06:17

Re: Improving RND

Post by MattC » Mon 16 Apr 2018, 19:57

Thanks again, Richard.

It's stretching my understanding a little, but I'll look into that, too.

I briefly investigated the possibility of using TIME as an additional source, but I came up with some confusing results which I can't seem to figure out.

The test program:
      AT% = 0
      BT% = 0
      HT% = 0
      FOR I% = 0 TO 9999999
        A% = RND(2) - 1
        B% = TIME MOD 2
        REM PRINT A%, B%, A% EOR B%
        IF A% = 1 THEN AT% += 1
        IF B% = 1 THEN BT% += 1
        IF A% EOR B% = 1 THEN HT% += 1
      NEXT
      PRINT "         A         B       TOT    TOSSES"
      PRINT AT%, BT%, HT%, I%
      PRINT AT%/I%*100, BT%/I%*100, HT%/I%*100

The program runs for around 10 seconds before giving the results. If A% and B% are random (and I know that TIME is not) then there should be around 50% on each, but there's not. It produces consistently (on several runs) ~50% for AT%, ~46% for BT% and ~73% for HT%. I've tried to analyse what's happening, but I'm failing. It certainly doesn't seem to be a good way to get an additional random element. What am I missing?

Matt

guest
Posts: 104
Joined: Mon 02 Apr 2018, 09:12

Re: Improving RND

Post by guest » Mon 16 Apr 2018, 20:23

MattC wrote:
Mon 16 Apr 2018, 19:57
What am I missing?
What were you expecting? All one can say about TIME is that, over the long term, it changes on average 100 times per second. There's nothing to say it will be odd for the same total time as it is even: it could legitimately be odd for twice as long as it's even (that's determined, in part, by the underlying 'timebase'). If it's as close as being even (or odd) 46% of the time I would have thought that was pretty good, and nothing to complain about! HT% should come out at theoretically 75% I think, so that's not too far out either.

Richard.

MattC
Posts: 10
Joined: Mon 16 Apr 2018, 06:17

Re: Improving RND

Post by MattC » Wed 18 Apr 2018, 19:45

Hi Richard,

I've reanalysed the program here - at least the TIME issue with BT% being smaller than 50% and I think I've worked out what's happening. It's the line IF B% = 1 THEN BT% += 1. There's a speed issue with this line. It takes longer to execute it if TIME MOD 2 is 1. If it is, it completes the process, adding 1 to BT%, if not, it ignores the rest of the command and carries on. Therefore, when TIME is odd it decreases the overall execution time and executes fewer loops decreasing the number of odds (1s) counted, decreasing the number of HEADS by a few per cent. Adding an additional line after each of the three condition, such as IF B% = 0 THEN BT% += 0, seems to balance the outputs - at least of AT% and BT% - causing them both to hit around 49.9-50.1%. However, HT% still seems to be around the 75% mark. I need to examine this area, now.

Matt

guest
Posts: 104
Joined: Mon 02 Apr 2018, 09:12

Re: Improving RND

Post by guest » Wed 18 Apr 2018, 21:06

MattC wrote:
Wed 18 Apr 2018, 19:45
Adding an additional line after each of the three condition, such as IF B% = 0 THEN BT% += 0, seems to balance the outputs
Even if that's the case, I would still want to emphasise that you should not necessarily expect TIME to be odd or even with equal probability. My background is principally in hardware electronics, not software, and I visualise the LSB of TIME as being the output from the first flip-flop in a counter chain. The waveform there has a certain frequency (in the case of TIME it's 50Hz) but also a certain mark:space ratio, and although one might expect that ratio to be close to 1:1 it need not necessarily be so. I will try to illustrate that with some ASCII art:

Master clock:  |   |    |   |    |   |    |   |    |   |    |   |    |   |    |   |    |
                ___      ___      ___      ___      ___      ___      ___      ___
Counter LSB:   |   |____|   |____|   |____|   |____|   |____|   |____|   |____|   |____|
However, HT% still seems to be around the 75% mark. I need to examine this area, now.
According to my analysis it should indeed be theoretically 75%. You could make it 50% by changing your code to:

Code: Select all

        IF (A% EOR B%) = 1 THEN HT% += 1
Richard.

MattC
Posts: 10
Joined: Mon 16 Apr 2018, 06:17

Re: Improving RND

Post by MattC » Thu 19 Apr 2018, 06:09

Thanks Richard.

I think I understand. Although, if the Master clock is 50Hz, does TIME tick over on its own in the middle to give it centiseconds; and if so, could this be another reason why the ratio might not be 1:1 - an inaccuracy of the timing of the intermediate 'flip'?

Also, with the code change, I was forgetting how perfectly logical one has to be with BB4W. (I'm saying this is good, not bad.) I assume the original would be read as IF A% EOR (B% = 1) THEN HT% += 1, which would indeed produce a theoretical exact 75%.

Thanks

Matt

guest
Posts: 104
Joined: Mon 02 Apr 2018, 09:12

Re: Improving RND

Post by guest » Thu 19 Apr 2018, 08:17

MattC wrote:
Thu 19 Apr 2018, 06:09
Although, if the Master clock is 50Hz, does TIME tick over on its own in the middle to give it centiseconds
You can't be sure what the 'master clock' (timer interrupt) in a PC will be. Back in the Windows 95 era it was commonly around 18 Hz or so (I think the exact frequency goes way back to the original IBM PC-AT), so TIME would make quite big jumps and many values would never be returned. In a modern PC the hardware interrupt rate is likely to be 1000 Hz (1 kHz) so TIME shouldn't 'miss a beat' but I still would not be prepared to speculate on the accuracy of its mark:space ratio.
I was forgetting how perfectly logical one has to be with BB4W.
This is nothing specifically to do with BBC BASIC for Windows or even BASIC in general. It is pretty much universally true in all programming languages that '=' has a higher priority than 'EOR' in the Operator Precedence. For example in the relevant Wikipedia article 'equality' is at priority level 7 and 'exclusive-or' is at level 9. BBC BASIC doesn't have so many levels but the same general rule applies (and always has since the earliest BBC Micro days).

Richard.

Post Reply