According to Wikipedia, the geometric distribution is either of two discrete probability distributions:

- The probability distribution of the number \(X\) of Bernoulli trials needed to get one success, supported on the set \(\{1, 2, 3, \dots\}\)
- The probability distribution of the number \(Y = X - 1\) of failures before the first success, supported on the set \(\{0, 1, 2, 3, \dots\}\)

Which of these one calls “the” geometric distribution is a matter of convention and convenience **if** you are doing an old-school longhand calculation. However, if you are using a statistical package such as SAS, R or SPSS, you need to know which definition it uses.

**R** *uses the number of failures (Y).*

Thus, the geometric distribution for the number of failures before the first success has the probability mass function:

$P(Y=k)=(1-p)^{k}p$

for \(k = 0, 1, 2, 3, \dots\), where \(p\) is the probability of success in each trial.
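As a quick check of the formula, the following minimal sketch compares the closed form with R's built-in *dgeom()*. The value \(p = 1/6\) and the range of \(k\) are illustrative assumptions, not part of the definition.

```r
# Compare the closed-form pmf with R's built-in dgeom().
p <- 1/6                       # illustrative success probability (assumption)
k <- 0:10                      # numbers of failures before the first success

manual  <- (1 - p)^k * p       # P(Y = k) from the formula above
builtin <- dgeom(k, prob = p)  # R's geometric pmf (failures convention)

all.equal(manual, builtin)     # TRUE
```

Under the trials convention, the same function can be used with a shift: \(P(X = n)\) corresponds to `dgeom(n - 1, prob = p)`.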

## Properties

- **Mean:** \(\frac{1-p}{p}\)
- **Variance:** \(\frac{1-p}{p^2}\)
- **Standard deviation:** \(\frac{\sqrt{1-p}}{p}\)
- **Skewness:** \(\frac{2-p}{\sqrt{1-p}}\)
- **Kurtosis (excess):** \(6+\frac{p^2}{1-p}\)
- **Moment generating function:** \(\frac{p}{1-(1-p)e^t}\), for \(t < -\ln(1-p)\)
- **Characteristic function:** \(\frac{p}{1-(1-p)e^{it}}\)
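The mean and variance can be checked empirically by simulation. This is only a sketch: the seed, the sample size and \(p = 0.3\) are arbitrary assumptions chosen for illustration.

```r
# Compare theoretical moments with simulated ones (failures convention).
set.seed(1)                   # arbitrary seed, only for reproducibility
p <- 0.3                      # illustrative success probability (assumption)
y <- rgeom(1e5, prob = p)     # 100,000 simulated failure counts

c(theoretical = (1 - p) / p,   simulated = mean(y))  # mean
c(theoretical = (1 - p) / p^2, simulated = var(y))   # variance
```

The simulated values should be close to the theoretical ones, though not identical, because they are estimates from a random sample.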

## Geometric Distribution in R

### Generate Geometric random numbers

In R we use the function **rgeom(N,p)** to generate geometric random numbers.
Parameters:

- **N:** number of geometric(p) counts to generate
- **p:** probability of success in each trial

**Example1:** If we roll a fair die and count the number of rolls before the first **6** appears, we have a geometric distribution with \(p = 1/6\).

So in R, if we generate 3 such counts, for example with *rgeom(3, 1/6)*, we might get 3, 2 and 14.

This result means:

- We had 3 failures before the first 6 appeared.
- We had 2 failures before the first 6 appeared.
- We had 14 failures before the first 6 appeared.
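A draw like the one described above can be sketched as follows. The seed is an arbitrary assumption added for reproducibility; the counts you get will generally differ from 3, 2 and 14, since they are random.

```r
# Simulate three "rolls until the first 6" experiments.
set.seed(42)                     # arbitrary seed (assumption); omit it for fresh draws
p_six <- 1/6                     # probability of rolling a 6 with a fair die

rolls <- rgeom(3, prob = p_six)  # failures before the first 6, three times
rolls                            # number of non-6 rolls in each experiment
rolls + 1                        # roll on which the first 6 appeared
```

Adding 1 converts R's failure counts \(Y\) into the trial counts \(X\) of the other convention.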

### Probability mass function dgeom() and cumulative distribution function pgeom()

**Example2:** Veronica is rolling a die. Calculate the probability of getting the first 3 on the 8th roll. In R, this translates to the probability of 7 failures before the first 3 appears.

To answer this question we use *dgeom()*.
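A minimal sketch of the call, assuming a fair die so that \(p = 1/6\):

```r
# P(first 3 on the 8th roll) = P(Y = 7): seven failures, then a success.
p_three <- 1/6                                # probability of rolling a 3 (fair die assumed)

prob_8th <- dgeom(7, prob = p_three)          # built-in pmf, failures convention
prob_8th                                      # about 0.0465
all.equal(prob_8th, (1 - p_three)^7 * p_three)  # same value from the formula
```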

The probability of getting the first 3 on the 8th roll is 0.04651, which is very low. Let’s see what the expected number of rolls before a 3 comes out is. For this we use the mean of the geometric distribution, \(\frac{1-p}{p}\).

The expected number of rolls before a 3 comes out is 5, so on average the first 3 appears on the 6th roll.
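The arithmetic can be sketched in R as follows, again assuming a fair die. The variable names are illustrative.

```r
# Expected failures before the first 3, and expected roll of the first 3.
p_three <- 1/6

mean_failures <- (1 - p_three) / p_three  # mean of Y (failures convention)
mean_trials   <- 1 / p_three              # mean of X = Y + 1 (trials convention)

mean_failures   # 5
mean_trials     # 6
```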

**Example3:** Lionel Messi scores on 28% of his shots (2012/13 Spanish La Liga).

a) What is the probability that Messi will not score a goal until his 7th try?

The probability of not scoring until his 7th try is 0.03901
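This value can be reproduced with *dgeom()*, treating each shot as an independent Bernoulli trial with \(p = 0.28\) (the independence of shots is a modelling assumption):

```r
# P(first goal on the 7th shot) = P(Y = 6): six misses, then a goal.
p_goal <- 0.28

prob_7th <- dgeom(6, prob = p_goal)
prob_7th                          # about 0.039
(1 - p_goal)^6 * p_goal           # same value from the formula
```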

b) What is the expected number of shots before he scores?

On average, Lionel Messi misses 2.571 shots before he scores, which means he scores about once every 3.571 shots.
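Both conventions can be computed directly. This is a short sketch with illustrative variable names.

```r
# Expected misses before the first goal, and expected shots per goal.
p_goal <- 0.28

mean_misses <- (1 - p_goal) / p_goal  # mean of Y: misses before the first goal
mean_shots  <- 1 / p_goal             # mean of X = Y + 1: shots up to and including the goal

round(mean_misses, 3)   # 2.571
round(mean_shots, 3)    # 3.571
```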

c) What is the probability that the first goal occurs in the first 5 shots?

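The above probability is the sum of the probabilities of:

- scoring on the first shot (no failures)
- missing 1 shot, then scoring
- missing 2 shots, then scoring
- missing 3 shots, then scoring
- missing 4 shots, then scoring

or, with \(X\) the number of shots and \(Y = X - 1\) the number of misses,

\(P(X \leqslant 5)=P(Y \leqslant 4)=P(Y=0)+P(Y=1)+P(Y=2)+P(Y=3)+P(Y=4)\)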

or simply use the cumulative distribution function *pgeom()* with 4 failures and \(p = 0.28\).
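A minimal sketch showing that the three routes agree: summing the individual probabilities, calling *pgeom()*, and taking the complement of five straight misses.

```r
# P(first goal within the first 5 shots) = P(Y <= 4).
p_goal <- 0.28

by_sum   <- sum(dgeom(0:4, prob = p_goal))  # P(Y=0) + ... + P(Y=4)
by_cdf   <- pgeom(4, prob = p_goal)         # cumulative distribution function
by_compl <- 1 - (1 - p_goal)^5              # 1 - P(five misses in a row)

c(by_sum, by_cdf, by_compl)                 # all about 0.8065
```

Note that the argument to *pgeom()* is 4, not 5, because R counts failures before the first success rather than trials.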