Statistics: March

A log is kept of the options and words used for each random search. I wrote a small Visual Basic program to analyze the log for the past 30 days, covering everything from the options used to the probability that a given word is used a certain number of times.

General:
 Number of searches through 30 days      1378
 Total number of words used in searches  4268
 Number of times safemode was used          8
 Number of words in the database         7111

Number of times individual words from the database have been used in the searches
 Number of words that have been used once         2215   31.2%
 Number of words that have been used twice         715   10.1%
 Number of words that have been used three times   156    2.2%
 Number of words that have been used four times     31   0.44%
 Number of words that have been used five times      6   0.08%
 Number of words that haven't been used           3988   56.1%

We can compare the results obtained above with those expected from probability theory. Independent trials that each succeed with the same probability p are called Bernoulli trials. The probability P(x) that exactly x successes occur in n such trials is given by the binomial distribution,

$$ P(x) = \binom{n}{x} p^x (1-p)^{n-x} $$

In this case, a success is a given word being chosen for a search, and x is the number of times that word was used. The number of trials n is 4268 (the total number of words used in searches), and the probability of success p is 1/7111, the probability of choosing a single word from the database of 7111 words. The results are summarized in the table below.

Results expected from probability theory
 Probability a word had been used once         32.9%
 Probability a word had been used twice         9.9%
 Probability a word had been used three times   2.0%
 Probability a word had been used four times    0.3%
 Probability a word had been used five times   0.04%
 Probability a word hasn't been used           54.9%
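
The expected percentages can be reproduced with a few lines of code. What follows is a minimal Python sketch, not the Visual Basic program mentioned earlier. The function and constant names are made up for illustration; the values n = 4268 and p = 1/7111 come from the text.

```python
from math import comb

N_WORDS = 7111    # words in the database (from the General figures)
N_TRIALS = 4268   # total number of words used in searches

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def usage_probabilities(max_uses=5):
    """Map x -> probability that a given word was used exactly x times."""
    p = 1 / N_WORDS  # chance that a single pick lands on one given word
    return {x: binomial_pmf(x, N_TRIALS, p) for x in range(max_uses + 1)}

if __name__ == "__main__":
    for x, prob in usage_probabilities().items():
        print(f"used {x} time(s): {prob:.2%}")
```

Python integers have arbitrary precision, so comb(4268, x) is computed exactly before the result is converted to a floating-point number.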

Comparing the results we obtained with those expected from probability theory, we see that they agree very well. As more searches are done, it is likely that the differences between actual results and theory will decrease even further.
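
The comparison can also be made in terms of word counts rather than percentages, by multiplying each binomial probability by the 7111 words in the database. The sketch below is one way to do that in Python. The names are hypothetical, and the observed counts are copied from the usage table above.

```python
from math import comb

N_WORDS = 7111    # words in the database
N_TRIALS = 4268   # total number of words used in searches

# Observed number of words used exactly x times (from the usage table).
OBSERVED = {0: 3988, 1: 2215, 2: 715, 3: 156, 4: 31, 5: 6}

def expected_count(x):
    """Expected number of database words used exactly x times."""
    p = 1 / N_WORDS
    return N_WORDS * comb(N_TRIALS, x) * p**x * (1 - p)**(N_TRIALS - x)

def comparison():
    """Return rows of (uses, observed, expected, observed - expected)."""
    rows = []
    for x in sorted(OBSERVED):
        exp = expected_count(x)
        rows.append((x, OBSERVED[x], exp, OBSERVED[x] - exp))
    return rows

if __name__ == "__main__":
    for x, obs, exp, diff in comparison():
        print(f"{x} uses: observed {obs:5d}  expected {exp:8.1f}  diff {diff:+7.1f}")
```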

Words used 5 times:
individuals, remove, rental, slipped, stored, tells

Words used 4 times:
abilities, advantages, autonomy, base, concerned, contacted, defendants, democratic, distinguished, distributing, distributions, door, firefighters, illustrated, investigator, magic, maintaining, none, overseas, politicians, predictions, relevant, remainder, retiring, sides, singing, specifically, stayed, supporter, transformed, underlying

Frequency of option 'Number of words'
 Number of times 1 word was used in search               27    2.0%
 Number of times 2 words were used in search              9    0.7%
 Number of times 3 words were used in search (default) 1195   86.7%
 Number of times 4 words were used in search            106    7.7%
 Number of times 5 words were used in search             41    3.0%

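Number of times a word from the database was repeated within a single search of 2-5 words:
Zero. For any one particular word this is extremely unlikely: the probability that a given word is chosen twice in a 2-word search is about 1 in 51 million (1/7111^2), and in a 5-word search about 1 in 5 million (10/7111^2, since 5 words form 10 pairs). The chance that some word, whichever one it is, repeats within a search is larger: about 1 in 7111 for a 2-word search, if the words are drawn independently.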
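
The 1 in 51 million and 1 in 5 million figures correspond to one particular word filling two of the slots in a search. The Python sketch below reproduces them and, for comparison, also computes the chance that any word at all repeats. It assumes each word in a search is drawn independently and uniformly from the 7111 words; that is an assumption, since the search may prevent duplicates outright. The function names are hypothetical.

```python
from math import comb

N_WORDS = 7111  # words in the database

def specific_word_repeat_odds(k):
    """Approximate N in '1 in N' odds that one given word fills two of k slots.

    Uses pairs / N_WORDS**2, which ignores the negligible chance of the
    word appearing three or more times.
    """
    pairs = comb(k, 2)  # number of ways to pick the two matching slots
    return N_WORDS**2 / pairs

def any_repeat_probability(k):
    """Probability that at least one word appears twice among k draws.

    Assumes independent, uniform draws with replacement.
    """
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (N_WORDS - i) / N_WORDS
    return 1 - p_all_distinct

if __name__ == "__main__":
    for k in (2, 5):
        print(f"{k}-word search: given word repeats 1 in "
              f"{specific_word_repeat_odds(k):,.0f}; "
              f"any word repeats 1 in {1 / any_repeat_probability(k):,.0f}")
```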