C++代写:STAT40780 Data Programming

代写算法作业,用Rcpp实现Root-finding算法。

Root-finding

Notes

There are two questions on this assignment. The second requires you to have considered Lab 8 in the course material.

This assignment is worth 20% of the total grade.

Ensure your code is well-commented. You will be penalized for submitting code that is not well-commented.

Only the following files should be submitted:

  • Any .cpp files that form part of your solutions
  • A single R script file named in the format YourName assignment.R containing all of the R code required for this assignment (submit a single R script only!)
  • A single pdf document containing any other required information (e.g. descriptions of code implementations), named in the format YourName assignment.pdf.

Any additional files you submit will not be graded.

Root-finding algorithms

A root of a function g(θ) is a solution to the equation g(θ) = 0. Root-finding is an important computational problem that arises commonly across many disciplines. For example, in statistics, root-finding can be used in maximum likelihood estimation. Here you are required to find a maximum likelihood estimate for the Cauchy distribution with the help of root-finding algorithms implemented in C++ with Rcpp.

The Cauchy distribution with unknown location parameter θ and scale parameter set to γ = 1 is a continuous probability distribution, with density function.

For n independent observations x1, x2, … , xn from this Cauchy distribution, the log likelihood is calculated.

Consider the following numeric vector in R storing observations generated from a Cauchy distribution with location parameter θ and scale parameter γ= 1:

1
2
3
4
5
x <− c (12.262307, 10.281078, 10.287090, 12.734039,
11.731881, 8.861998, 12.246509, 11.244818,
9.696278, 11.557572, 11.112531, 10.550190,
9.018438, 10.704774, 9.515617, 10.003247,
10.278352, 9.709630, 10.963905, 17.314814)

A researcher wishes to estimate the maximum likelihood estimate θ for the Cauchy distribution from which the data in x were generated.

A maximum likelihood estimate for θ can be found by differentiating the log likelihood, equating to zero, and solving for θ i.e. we can find a maximum likelihood estimate for the above Cauchy distribution by finding a root of l(θ), the first derivative of the log likelihood. Taking the first derivative and equating to zero, gives the following equation.

A maximum likelihood estimate can be obtained by solving the equation l(θ) = 0 for θ i.e. by finding a root of the function l(θ). There are several numerical optimization algorithms available for finding the roots of continuous univariate functions such as l(θ), including the:

  • Bisection method
  • Newton’s method
  • Secant Method

For this assignment, implement each of the above three root finding algorithms in C++ with Rcpp, to find the MLE of θ for the data x.

You will need to research the algorithm for each method. They are widely available from any online search or within the accompanying PDF (with pseudo-code). Call each function bisectRcpp, newtonRcpp, and secantRcpp, respectively. The function calls (from a wrapper using a cxxfunction) should take the form:

1
2
bisectRcpp(x,a=*,b=*,n=1000,tol=0.00000001)
newtonRcpp(x, b=*,n=100 ,tol=0.00000001) secantRcpp(x,a=*,b=*,n=1000,tol=0.00000001)

where a and b represent initial values for the algorithms, n is the number of iterations in your algorithm, and tol is the tolerance - the level of accuracy for terminating before n is reached.

  • Each function should take the vector x as input from R (along with any other required inputs), and return the maximum likelihood estimate to R.
  • Ensure that all code submitted is well commented
  • Briefly describe each implementation in your submitted pdf document. This must be self-contained, i.e. do not refer to your commented code.
  • Evaluate and discuss the behaviour of each implemention with different starting values. Consider the following:
1
bisectRcpp (a,b)= {(9,11), (0,30), (-10,50), (15,50)} newtonRcpp b= {11, 9, 13} secantRcpp (a,b)= {(10,12), (7,12), (8,13), (8,14)} where some will work and others will fail.
  • Benchmark your implementations against each other by the following:
1
2
3
4
5
# Benchmark the implementations against each other
benchmark ( bisectRcpp( x, a=9, b=11, n=1000, tol=0.00000001 ),
secantRcpp( x, a=8, b=11, n=1000, tol=0.00000001 ),
newtonRcpp( x, b=11, n=100, tol=0.00000001 ),
order = "relative" )
  • Discuss their advantages and limitations. Comment on the suitability and efficiency of each implementation.

Note: the second derivative of the log likelihood is.

Welford’s variance algorithm with Rcpp - with missing values

Modify Welford’s variance algorithm (Lab 8) such that it can handle missing values.