Description
This script use a Morte Carlo simulation to estimate \(\pi\) by finding proportion of points falling inside a unit circle out of all points inside a square centered at origin with width 2
The following function takes in one argument n which is number of pairs of x1 and x2, and return a matrix where each row is a pair of x1 and x2, there are n number of rows.
generate <- function(n) {
X <- matrix(runif(2*n, min = -1, max = 1), # generate n iid samples of (x1,x2) and store in n*2 matrix
nrow = n, ncol = 2, byrow = TRUE)
colnames(X) = paste("x", 1:2, sep = "") # change column names to "x1" and "x2"
return(X)
}
To achieve the goal I first write a function that take in a vector with 2 entries(x1 and x2), compute sum of their squares.
squaresum <- function(x) {
return(x[1]^2 + x[2]^2) # compute sum of squares for vector of length 2
}
The following function generates n number of (x1, x2) iid from uniform distribution[-1,1], compute sum of square for each sample. Then find proportion of sample with sum of squares equal or smaller than 1 (all the points falling within a unit circle center at origin), multiply the proportion by 4 which is area of a 2*2 square, I will get my estimate of area of a unit circle and it is also estimate of \(\pi\)
PIestimate <- function(n) {
sample <- generate(n) # generate n iid samples of x1 x2 from uniform(-1,1)
sum(apply(sample, 1, squaresum) <= 1)/n*4 # estimate area of a unit circle
}
The following is an example of using Morte Carlo simulation with 10000 times of repeat to estimate \(\pi\).
set.seed(123)
pi.estimate = PIestimate(100000)
sprintf("%.10f",pi.estimate)
## [1] "3.1404400000"
The probability that a point is inside the unit circle should be p which is given by \(p = \frac{\pi}{4}\) Our estimate for \(\hat{\pi} = 4*\hat{p}\) The standard error of estimator for \(\pi\) is
\(SE(\hat{\pi}) = 4*\sqrt{\frac{p*(1-p)}{n}})\)
We can use \(\hat{p}\) in place of p in above formulae, but I will use the conservative estimate that the standard error given n is greater when \(p=0.5\)
n = 100000
SE = 4 * sqrt(0.5*0.5/n) # standard error of pi-hat
pi.estimate = PIestimate(n) # estimate of pi
m = qnorm(0.975) # 97.5% quantile of standard normal distribution
confidence.interval = c(pi.estimate-m*SE, pi.estimate+m*SE) # confidence interval
print(confidence.interval)
## [1] 3.116804 3.141596
Indeed the 95% confidence interval cover the true value.
We need the width of 99% confidence interval for \(\pi\) should be at most 0.1, therefore we need \(2*2.57*4*\sqrt{\frac{0.5*(1-0.5)}{n}} \leq 0.1\) solve for n we know n should be at least 10567.
n = 20000
SE = 4 * sqrt(0.5*0.5/n) # standard error of pi-hat
pi.estimate = PIestimate(n) # estimate of pi
m = qnorm(0.995) # 99.5% quantile of standard normal distribution
confidence.interval = c(pi.estimate-m*SE, pi.estimate+m*SE) # confidence interval
print(confidence.interval)
## [1] 3.125172 3.198028
print(pi.estimate)
## [1] 3.1616
Indeed the confidence interval is accurate to 1st digit.
The only change is to change the uniform distribution where we iid sample x1 and x2 from [-1, 1] to [0,1].
generate2 <- function(n) {
X <- matrix(runif(2*n, min = 0, max = 1), # generate iid sample from Uniform[0,1]
nrow = n, ncol = 2, byrow = TRUE)
colnames(X) = paste("x", 1:2, sep = "") # change column names to "x1" and "x2"
return(X)
}
PIestimate2 <- function(n) {
sample <- generate(n) # generate n iid samples of x1 x2 from uniform(-1,1)
sum(apply(sample, 1, squaresum) <= 1) / n * 4 # get estimate of area of unit circle (i.e. pi)
}
set.seed(123)
PIestimate2(10000)
## [1] 3.1416
Since we estimate p by proportion of points inside unit circle in first quadrant out of all point in first quadrant, and estimate \(\pi\) by multiply estimate of p by 4, we are effectively have the same estimator for \(\pi\), therefore we do not need to change standard error and confidence interval.