Java Assignment Help: CS3016 Mobile Analytics



This assignment deals with location-based mobile marketing. You have data from a location-based marketing agency that handles geo-fencing campaigns on behalf of advertisers. Because the full data set is very large, you are given a random sample from two campaigns of a single advertiser, AMC Theaters. The advertising impressions are inserted into the mobile app being used on the device. The data include the following elements: impression size (e.g., 320x50 pixels), app category (e.g., IAB1), app review volume and valence, device OS (e.g., iOS), geo-fence lat/long coordinates, mobile device lat/long coordinates, and click outcome (0 or 1). The column names are self-explanatory, and a data dictionary file is also provided on Canvas.


Data Processing

  • a. Create dummy variable imp_large for the large impression
  • b. Create dummy variables cat_entertainment, cat_social and cat_tech for app categories
  • c. Create dummy variable os_ios for iOS devices
  • d. Create variable distance using the Haversine formula to calculate the distance for a pair of latitude/longitude coordinates. Distance (in kilometers) = 6371 * acos( cos( radians(LATITUDE1) ) * cos( radians( LATITUDE2 ) ) * cos( radians( LONGITUDE1 ) - radians(LONGITUDE2) ) + sin( radians(LATITUDE1) ) * sin( radians( LATITUDE2 ) ) )
  • e. Create variable distance_squared by squaring variable distance
  • f. Create variable ln_app_review_vol by taking natural log of app_review_vol
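
The derived variables in items (a) to (f) could be built in R roughly as follows. This is a minimal sketch on three made-up rows, not the assignment data. The raw column names (imp_size, app_cat, device_os, device_lat, device_lon, geofence_lat, geofence_lon), the choice of 728x90 as the "large" impression, and the mapping of IAB1, IAB14 and IAB19 to entertainment, social and tech are all assumptions; the data dictionary is the authority on each. The distance expression in item (d) is the spherical-law-of-cosines form of great-circle distance, and the sketch implements it as written.

```r
# Great-circle distance in km, following the expression in item (d).
# Inputs are decimal degrees; the function is vectorized.
geo_distance_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  cos_angle <- cos(lat1 * to_rad) * cos(lat2 * to_rad) *
    cos(lon1 * to_rad - lon2 * to_rad) +
    sin(lat1 * to_rad) * sin(lat2 * to_rad)
  # Clamp to [-1, 1] so floating-point error cannot make acos() return NaN
  cos_angle <- pmin(1, pmax(-1, cos_angle))
  6371 * acos(cos_angle)
}

# Toy rows with HYPOTHETICAL column names and values (not the real data)
toy <- data.frame(
  imp_size       = c("320x50", "728x90", "320x50"),
  app_cat        = c("IAB1", "IAB14", "IAB19"),
  device_os      = c("iOS", "Android", "iOS"),
  device_lat     = c(40.7580, 34.0522, 41.8781),
  device_lon     = c(-73.9855, -118.2437, -87.6298),
  geofence_lat   = c(40.7484, 34.1016, 41.8781),
  geofence_lon   = c(-73.9857, -118.3267, -87.6298),
  app_review_vol = c(1200, 35, 48000),
  stringsAsFactors = FALSE
)

# (a) large-impression dummy; "728x90" as the large size is an assumption
toy$imp_large <- as.integer(toy$imp_size == "728x90")

# (b) app-category dummies; the IAB code mapping is an assumption
toy$cat_entertainment <- as.integer(toy$app_cat == "IAB1")
toy$cat_social        <- as.integer(toy$app_cat == "IAB14")
toy$cat_tech          <- as.integer(toy$app_cat == "IAB19")

# (c) iOS dummy
toy$os_ios <- as.integer(toy$device_os == "iOS")

# (d), (e) distance between device and geo-fence, and its square
toy$distance <- geo_distance_km(toy$device_lat, toy$device_lon,
                                toy$geofence_lat, toy$geofence_lon)
toy$distance_squared <- toy$distance^2

# (f) natural log of review volume (log() is natural log in R)
toy$ln_app_review_vol <- log(toy$app_review_vol)
```

If app_review_vol contains zeros, log() returns -Inf for those rows, so it is worth checking the minimum of that column before taking the log.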

Descriptive Statistics

  • a. Summarize the data by calculating the summary statistics (i.e., mean, median, std. dev., minimum and maximum) for didclick, distance, imp_large, cat_entertainment, cat_social, cat_tech, os_ios, ln_app_review_vol and app_review_val.
  • b. Report the correlations among the above variables.
  • c. Plot the relationship of distance (x-axis) and click-through-rate (y-axis), and any other pairs of variables of interest.
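
One possible way to produce these summaries in base R is sketched below. It runs on synthetic data generated inside the block, with only a subset of the variables, so the numbers it prints say nothing about the real campaigns. The helper name summarize_vars and the 1 km bin width for the click-through-rate plot are illustrative choices, not part of the assignment.

```r
set.seed(1)  # synthetic data only, for illustration
n <- 500
df <- data.frame(
  didclick       = rbinom(n, 1, 0.05),
  distance       = runif(n, 0, 10),
  os_ios         = rbinom(n, 1, 0.5),
  app_review_val = runif(n, 1, 5)
)

# (a) mean, median, std. dev., min and max for every column
# (summarize_vars is a hypothetical helper name)
summarize_vars <- function(data) {
  t(sapply(data, function(x) c(
    mean = mean(x), median = median(x), sd = sd(x),
    min = min(x), max = max(x)
  )))
}
stats <- summarize_vars(df)
print(round(stats, 3))

# (b) pairwise Pearson correlations
corr <- cor(df)
print(round(corr, 3))

# (c) click-through rate by distance bin: the mean of a 0/1 click
# indicator within a bin is the CTR for that bin
breaks    <- seq(0, 10, by = 1)  # 1 km bins, an arbitrary choice
dist_bin  <- cut(df$distance, breaks = breaks, include.lowest = TRUE)
ctr       <- tapply(df$didclick, dist_bin, mean)
midpoints <- head(breaks, -1) + 0.5

plot(midpoints, ctr, type = "b",
     xlab = "Distance to geo-fence (km)",
     ylab = "Click-through rate")
```

Binning is used because didclick is binary, so a raw scatter plot of clicks against distance would show only two horizontal bands.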

Logistic Regression

  • a. Specify the following logistic regression model:
    Dependent variable: didclick
    Independent variables: distance, distance_squared, imp_large, cat_entertainment, cat_social, cat_tech, os_ios, ln_app_review_vol and app_review_val.
  • b. Estimate the model in R (using the glm function) and report the coefficients and p-values of the estimates.
  • c. Discuss your findings and their implications, limiting your answer to a page or so.
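
The model specification and estimation step might look like the sketch below. The data are simulated inside the block with arbitrary coefficients purely so the code runs end to end; only the variable names and the use of glm come from the assignment. The data frame name sim and the simulated effect sizes are assumptions.

```r
set.seed(42)  # simulated data only; effect sizes below are arbitrary
n <- 2000
app_cat <- sample(c("ent", "social", "tech", "other"), n, replace = TRUE)
sim <- data.frame(
  distance          = runif(n, 0, 10),
  imp_large         = rbinom(n, 1, 0.3),
  cat_entertainment = as.integer(app_cat == "ent"),
  cat_social        = as.integer(app_cat == "social"),
  cat_tech          = as.integer(app_cat == "tech"),
  os_ios            = rbinom(n, 1, 0.5),
  ln_app_review_vol = rnorm(n, 6, 2),
  app_review_val    = runif(n, 1, 5)
)
sim$distance_squared <- sim$distance^2

# Simulate clicks from a logistic model so the example has signal to recover
log_odds <- -1.5 - 0.30 * sim$distance + 0.02 * sim$distance_squared +
  0.40 * sim$imp_large + 0.30 * sim$os_ios
sim$didclick <- rbinom(n, 1, plogis(log_odds))

# Logistic regression: binomial family with the default logit link
fit <- glm(
  didclick ~ distance + distance_squared + imp_large +
    cat_entertainment + cat_social + cat_tech +
    os_ios + ln_app_review_vol + app_review_val,
  data = sim, family = binomial
)

# Coefficient table: Estimate, Std. Error, z value, Pr(>|z|)
results <- coef(summary(fit))
print(round(results[, c("Estimate", "Pr(>|z|)")], 4))
```

Coefficients are on the log-odds scale, so exp(coef(fit)) gives odds ratios. Because both distance and distance_squared enter the model, the effect of distance on the log-odds changes sign at -b1 / (2 * b2), where b1 and b2 are the two distance coefficients; whether that point falls inside the observed distance range is one thing to check when discussing the findings.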