R Assignment Help: CS106 R Script

Write an R script in RStudio that completes the requirements of each question; the tasks are fairly basic.

Requirement

Please read each question carefully and follow the instructions given. You will produce your answers by working in the R script editor in RStudio. The questions ask you to provide additional information in your comments to demonstrate that you have understood what is required. When you have completed the exercise, save your RStudio script as a text file (.txt) by copying and pasting its text into a text editor.

Question 1

Set up a new summative assessment R-script for the session today.

  1. In your script, you need a complete header with all the information required for transparent and accessible programming. After the header, include a welcome message that contains the date on which the script is run.
  2. Simulate and plot trigonometric functions of the form y = A*sin(kx). The function space x should run from -10 to 10 in steps of dx = 0.05. Simulate two functions, y.1 and y.2. The parameters for y.1 should be A = 1 and k = 3/4. The parameters for y.2 should be set by you so that y.2 has three times the amplitude and twice the frequency of y.1.
  3. Plot y.1 and y.2 together on the same plot with both fully visible. In your comments, explain what the frequency relates to in the plots you have produced and how changing frequency affects the cycles you see in your plot.
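A minimal sketch of how Question 1 could be approached in base R is given here. The header fields, variable names such as `A.1` and `k.2`, the colours and the message wording are illustrative assumptions, not part of the brief.

```r
# ------------------------------------------------------------
# Script:  summative_assessment.R   (hypothetical file name)
# Author:  <your name>              (placeholder)
# Date:    <date written>           (placeholder)
# Purpose: Simulate and plot y = A*sin(kx) for two parameter sets
# ------------------------------------------------------------

# Welcome message containing the date on which the script is run
cat("Welcome! This script was run on", format(Sys.Date(), "%d %B %Y"), "\n")

# Function space: -10 to 10 in steps of dx = 0.05
dx <- 0.05
x  <- seq(-10, 10, by = dx)

# y.1 uses the parameters given in the question
A.1 <- 1
k.1 <- 3 / 4
y.1 <- A.1 * sin(k.1 * x)

# y.2: three times the amplitude and twice the frequency of y.1
A.2 <- 3 * A.1
k.2 <- 2 * k.1
y.2 <- A.2 * sin(k.2 * x)

# Set ylim from the larger amplitude so that both curves are fully visible
plot(x, y.2, type = "l", col = "red", ylim = c(-A.2, A.2),
     xlab = "x", ylab = "y", main = "y = A*sin(kx)")
lines(x, y.1, col = "blue")
legend("topright", legend = c("y.1", "y.2"), col = c("blue", "red"), lty = 1)
```

In this form one cycle spans 2*pi/k units of x, so doubling k halves the length of each cycle. Over the 20 units plotted, y.1 completes roughly 2.4 cycles and y.2 roughly 4.8.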

Question 2

Behavioural scientists are interested in whether home-owners are more conscientious about their physical health than renters. To explore this, you will use the NHANES database to statistically evaluate the evidence.

  1. First create a dataframe reduced to include only working-age adult participants (Age 25 and over) and the three variables Age, PhysActive and HomeOwn.
  2. From the reduced dataframe, produce a separate 2 by 2 contingency table containing the frequencies (total numbers) of home owners who are physically active alongside those who are not physically active, and the same pair of frequencies for home renters. Ensure that the rows of the resulting contingency table are labelled.
  3. Run a chi-squared test to address the behavioural scientists' question about the relationship between home ownership and physical activity. Report your result and your interpretation of it in the console, including all necessary information (odds ratio too!). In your comments, explain which line runs the chi-squared test and what all the input arguments and outputs of the function mean.
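The steps above can be sketched as follows. So that the example runs without the NHANES package, it uses a made-up stand-in dataframe called `demo` with the same column names and factor levels; its values are random and carry no meaning. With the package installed, `demo` would be replaced by `NHANES::NHANES`. Dropping the "Other" housing category and switching off the continuity correction are assumptions made for this sketch.

```r
# Stand-in data with NHANES-style columns. Values are MADE UP for illustration.
set.seed(1)
n <- 400
demo <- data.frame(
  Age        = sample(18:80, n, replace = TRUE),
  PhysActive = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  HomeOwn    = factor(sample(c("Own", "Rent", "Other"), n, replace = TRUE,
                             prob = c(0.60, 0.35, 0.05)))
)

# 1. Reduce to adults aged 25 and over and the three variables of interest
adults <- subset(demo, Age >= 25, select = c(Age, PhysActive, HomeOwn))

# 2. Keep owners and renters only (assumption), then build a labelled 2 x 2 table
own.rent <- droplevels(subset(adults, HomeOwn %in% c("Own", "Rent")))
tab <- table(HomeOwn = own.rent$HomeOwn, PhysActive = own.rent$PhysActive)
print(tab)

# 3. This line runs the chi-squared test.
#    Input:  tab = the 2 x 2 table of counts; correct = FALSE turns off the
#            Yates continuity correction (assumption for this sketch).
#    Output: $statistic (X-squared), $parameter (degrees of freedom),
#            $p.value, and $expected (counts expected under independence).
result <- chisq.test(tab, correct = FALSE)

# Odds ratio: odds of being active for owners relative to renters
odds.ratio <- (tab["Own", "Yes"] * tab["Rent", "No"]) /
              (tab["Own", "No"]  * tab["Rent", "Yes"])

# Report in the console
cat(sprintf("X-squared(%d) = %.2f, p = %.3f, odds ratio = %.2f\n",
            as.integer(result$parameter), result$statistic,
            result$p.value, odds.ratio))
```

An odds ratio above 1 would indicate that owners have higher odds of being physically active than renters, and a value below 1 the reverse. Because the stand-in data are random, the numbers printed here say nothing about the real question.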

Question 3

Fifteen young students who play competitive school sports were involved in a study which measured variables such as Age, Football rating (by a coach), Reading level and Numeracy level. Their data was as follows: Age {12, 12, 13, 13, 14, 14, 14, 15, 15, 15, 15, 15, 16, 16, 17}, Football rating {3, 5, 3, 7, 8, 4, 4, 3, 7, 8, 3, 4, 2, 6, 4}, Reading level {41, 47, 33, 91, 72, 28, 47, 45, 36, 58, 71, 43, 47, 32, 69} and Numeracy level {27, 48, 23, 58, 62, 71, 33, 41, 32, 56, 73, 26, 43, 35, 47}. All scores are arranged in order according to individual participants.

  1. Combine the scores into a Student performance dataframe.
  2. The coaches were interested in checking if Football rating was predicted by literacy and numeracy scores. A more dominant contributor however might simply be student age. Use one multiple regression model to ask whether football scores are predicted by age, literacy and numeracy. Report the results obtained.
  3. Plot two scatter graphs of, first, Football rating against Reading level and then separately the Football rating against Numeracy. In your comments, discuss your plotting functions. Talk about how you would control the appearance of the graphs themselves in your code. Consider axes, points and trendlines.
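One way to set this up in base R is sketched here, using the scores listed in the question. The dataframe name `students`, the column names, and the plotting choices (point style, colours, axis labels) are assumptions made for illustration.

```r
# 1. Combine the scores into a student performance dataframe
students <- data.frame(
  Age      = c(12, 12, 13, 13, 14, 14, 14, 15, 15, 15, 15, 15, 16, 16, 17),
  Football = c(3, 5, 3, 7, 8, 4, 4, 3, 7, 8, 3, 4, 2, 6, 4),
  Reading  = c(41, 47, 33, 91, 72, 28, 47, 45, 36, 58, 71, 43, 47, 32, 69),
  Numeracy = c(27, 48, 23, 58, 62, 71, 33, 41, 32, 56, 73, 26, 43, 35, 47)
)

# 2. One multiple regression: Football rating predicted by age, literacy, numeracy
model <- lm(Football ~ Age + Reading + Numeracy, data = students)
print(summary(model))  # coefficients, standard errors, t, p, R-squared, F

# 3. Scatter graphs with simple trendlines
#    xlab/ylab control the axis labels, pch/col the points, abline() the trendline
plot(students$Reading, students$Football,
     xlab = "Reading level", ylab = "Football rating",
     main = "Football rating against Reading level",
     pch = 19, col = "darkgreen")
abline(lm(Football ~ Reading, data = students), col = "black", lwd = 2)

plot(students$Numeracy, students$Football,
     xlab = "Numeracy level", ylab = "Football rating",
     main = "Football rating against Numeracy level",
     pch = 19, col = "darkorange")
abline(lm(Football ~ Numeracy, data = students), col = "black", lwd = 2)
```

Axis ranges could additionally be fixed with `xlim` and `ylim`, and point size with `cex`, if the default appearance is not suitable.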

Question 4

Rumours are circulating around the university that students on our programme are exceptionally good at statistics.

  1. You are asked by History and Politics students which outcome is more likely: flipping four heads in a row with a fair coin, or picking two royal cards in a row (i.e. King or Queen) from a standard 52-card deck. Within your code, calculate the analytical probabilities of both alternatives to answer the question.
  2. The students also argue about the effectiveness of surge screening for COVID-19. Screening is being proposed within the Greater London population of 10 million. The prevalence of COVID-19 within this population in July 2021 is estimated to be 350 cases in every 100,000 of the population. The lateral flow tests proposed for this wider screening have a reported sensitivity of 98.2% and a specificity of 99.3%. Using the Bayesian formulation of the posterior, estimate the probability that a positive test result for a Londoner means the individual has the coronavirus. The prior can be assumed to be the prevalence level in London. Within your comments, list all the functions you used in your calculations. For each, state the meaning of their output.
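Both calculations can be sketched with plain arithmetic in R. The variable names are illustrative, and the card calculation assumes the two cards are drawn without replacement, which the question does not state explicitly.

```r
# --- Part 1: analytical probabilities ---
# Four heads in a row with a fair coin: independent flips, 0.5 each
p.heads <- 0.5^4

# Two royal cards (King or Queen) in a row: 8 such cards in a 52-card deck.
# ASSUMPTION: drawn without replacement, so the second draw is 7 out of 51.
p.royal <- (8 / 52) * (7 / 51)

cat("P(four heads)       =", round(p.heads, 4), "\n")
cat("P(two royal cards)  =", round(p.royal, 4), "\n")
cat("More likely outcome:",
    ifelse(p.heads > p.royal, "four heads", "two royal cards"), "\n")

# --- Part 2: Bayes' rule for a positive lateral flow test ---
prior       <- 350 / 100000   # prevalence, P(infected)
sensitivity <- 0.982          # P(positive | infected)
specificity <- 0.993          # P(negative | not infected)

# Total probability of a positive test (true positives + false positives)
p.positive <- sensitivity * prior + (1 - specificity) * (1 - prior)

# Posterior: P(infected | positive)
posterior <- (sensitivity * prior) / p.positive
cat("P(infected | positive test) =", round(posterior, 3), "\n")
```

Under these assumptions the four-heads outcome is the more likely one (0.0625 against roughly 0.0211), and the posterior comes out at about 0.33. In other words, at this prevalence around two in three positive results would be false positives, despite the high sensitivity and specificity.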

Question 5

It has been suggested that employment status has a substantial impact on mental health. In this question we use the NHANES database to explore this hypothesised link with two physiological measures associated with mental state: resting Pulse and Blood Pressure.

  1. First, create a reduced NHANES dataframe of Adults (25 or over) which includes Age, Work, Pulse and BPSysAve.
  2. Obtain a pair of random samples (N=50): the first of participants who are working, and the second of participants who are not working (i.e. including those not working or looking).
  3. Plot a comparison of the Pulses of each of the samples (working vs not working) using a violin plot and do the same with a separate violin plot for the Blood Pressure data.
  4. Run two t-tests, one for the Pulse hypothesis and one for the Blood Pressure hypothesis, and report the results in the console, including Bayes Factors. To do this question, you will have to use many functions. In your comments, identify whether each function used is part of the RStudio standard list or whether it comes from a particular toolbox. Where it comes from a toolbox, name the toolbox, e.g. knitr.
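A possible outline of this workflow is sketched here. As before, a made-up stand-in dataframe `demo` replaces NHANES so the example runs on its own; its values are random and meaningless, and `NHANES::NHANES` would take its place in practice. The sketch assumes N = 50 applies to each sample, and that ggplot2 and BayesFactor are the chosen packages for the violin plots and Bayes Factors; those parts run only if the packages are installed.

```r
# Stand-in data with NHANES-style columns. Values are MADE UP for illustration.
set.seed(106)
n <- 1000
demo <- data.frame(
  Age      = sample(18:80, n, replace = TRUE),
  Work     = factor(sample(c("Looking", "NotWorking", "Working"), n, replace = TRUE)),
  Pulse    = round(rnorm(n, mean = 72, sd = 12)),
  BPSysAve = round(rnorm(n, mean = 120, sd = 15))
)

# 1. Reduced dataframe of adults (25 or over); drop rows with missing values
adults <- subset(demo, Age >= 25, select = c(Age, Work, Pulse, BPSysAve))  # base R
adults <- na.omit(adults)                                                  # stats

# 2. Two random samples of N = 50 (ASSUMPTION: 50 in each sample)
N <- 50
working     <- adults[adults$Work == "Working", ]
not.working <- adults[adults$Work %in% c("NotWorking", "Looking"), ]
s.work    <- working[sample(nrow(working), N), ]                           # base R
s.notwork <- not.working[sample(nrow(not.working), N), ]
samples <- rbind(cbind(s.work, Group = "Working"),
                 cbind(s.notwork, Group = "Not working"))

# 3. Violin plots (ggplot2 package, only if installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  print(ggplot2::ggplot(samples, ggplot2::aes(x = Group, y = Pulse)) +
          ggplot2::geom_violin() + ggplot2::labs(y = "Resting pulse"))
  print(ggplot2::ggplot(samples, ggplot2::aes(x = Group, y = BPSysAve)) +
          ggplot2::geom_violin() + ggplot2::labs(y = "Systolic blood pressure"))
}

# 4. t-tests (stats package, shipped with R); Welch test by default
pulse.t <- t.test(Pulse ~ Group, data = samples)
bp.t    <- t.test(BPSysAve ~ Group, data = samples)
print(pulse.t)
print(bp.t)

# Bayes Factors (BayesFactor package, only if installed)
if (requireNamespace("BayesFactor", quietly = TRUE)) {
  print(BayesFactor::ttestBF(x = s.work$Pulse,    y = s.notwork$Pulse))
  print(BayesFactor::ttestBF(x = s.work$BPSysAve, y = s.notwork$BPSysAve))
}
```

Calling functions with the `package::function` prefix, as done for ggplot2 and BayesFactor here, makes it easy to see which functions come from an add-on package and which ship with R.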

STRUCTURE: Ensure that all your commenting has been done. Clearly separate out the different sections of your code.