STATS 506 Problem Set 2 Question 3

Description
This script use a Morte Carlo simulation to estimate \(\pi\) by finding proportion of points falling inside a unit circle out of all points inside a square centered at origin with width 2

(a)

The following function takes in one argument n which is number of pairs of x1 and x2, and return a matrix where each row is a pair of x1 and x2, there are n number of rows.

generate <- function(n) {
  X <- matrix(runif(2*n, min = -1, max = 1),  # generate n iid samples of (x1,x2) and store in n*2 matrix
              nrow = n, ncol = 2, byrow = TRUE)
  colnames(X) = paste("x", 1:2, sep = "") # change column names to "x1" and "x2"
  return(X)
}

(b)

To achieve the goal I first write a function that take in a vector with 2 entries(x1 and x2), compute sum of their squares.

squaresum <- function(x) {
  return(x[1]^2 + x[2]^2) # compute sum of squares for vector of length 2
}

The following function generates n number of (x1, x2) iid from uniform distribution[-1,1], compute sum of square for each sample. Then find proportion of sample with sum of squares equal or smaller than 1 (all the points falling within a unit circle center at origin), multiply the proportion by 4 which is area of a 2*2 square, I will get my estimate of area of a unit circle and it is also estimate of \(\pi\)

PIestimate <- function(n) {
  sample <- generate(n) # generate n iid samples of x1 x2 from uniform(-1,1) 
  sum(apply(sample, 1, squaresum) <= 1)/n*4 # estimate area of a unit circle
}

The following is an example of using Morte Carlo simulation with 10000 times of repeat to estimate \(\pi\).

set.seed(123)
pi.estimate = PIestimate(100000)
sprintf("%.10f",pi.estimate)

## [1] "3.1404400000"

(c)

The probability that a point is inside the unit circle should be p which is given by \(p = \frac{\pi}{4}\) Our estimate for \(\hat{\pi} = 4*\hat{p}\) The standard error of estimator for \(\pi\) is
\(SE(\hat{\pi}) = 4*\sqrt{\frac{p*(1-p)}{n}})\)
We can use \(\hat{p}\) in place of p in above formulae, but I will use the conservative estimate that the standard error given n is greater when \(p=0.5\)

n = 100000 
SE = 4 * sqrt(0.5*0.5/n) # standard error of pi-hat
pi.estimate = PIestimate(n) # estimate of pi
m = qnorm(0.975) # 97.5% quantile of standard normal distribution 
confidence.interval = c(pi.estimate-m*SE, pi.estimate+m*SE) # confidence interval
print(confidence.interval)

## [1] 3.116804 3.141596

Indeed the 95% confidence interval cover the true value.

(d)

We need the width of 99% confidence interval for \(\pi\) should be at most 0.1, therefore we need \(2*2.57*4*\sqrt{\frac{0.5*(1-0.5)}{n}} \leq 0.1\) solve for n we know n should be at least 10567.

n = 20000 
SE = 4 * sqrt(0.5*0.5/n) # standard error of pi-hat
pi.estimate = PIestimate(n) # estimate of pi
m = qnorm(0.995) # 99.5% quantile of standard normal distribution 
confidence.interval = c(pi.estimate-m*SE, pi.estimate+m*SE) # confidence interval
print(confidence.interval)

## [1] 3.125172 3.198028

print(pi.estimate)

## [1] 3.1616

Indeed the confidence interval is accurate to 1st digit.

(e)

The only change is to change the uniform distribution where we iid sample x1 and x2 from [-1, 1] to [0,1].

generate2 <- function(n) {
  X <- matrix(runif(2*n, min = 0, max = 1), # generate iid sample from Uniform[0,1]
         nrow = n, ncol = 2, byrow = TRUE)
  colnames(X) = paste("x", 1:2, sep = "") # change column names to "x1" and "x2"
  return(X)
}

PIestimate2 <- function(n) {
  sample <- generate(n) # generate n iid samples of x1 x2 from uniform(-1,1) 
  sum(apply(sample, 1, squaresum) <= 1) / n * 4 # get estimate of area of unit circle (i.e. pi)
}
set.seed(123)
PIestimate2(10000)

## [1] 3.1416

Since we estimate p by proportion of points inside unit circle in first quadrant out of all point in first quadrant, and estimate \(\pi\) by multiply estimate of p by 4, we are effectively have the same estimator for \(\pi\), therefore we do not need to change standard error and confidence interval.

STATS 506 Problem Set 2 Question 3

Yunbin Peng

October 20, 2017

(a)

(b)

(c)

(d)

(e)