完成SQL相关内容,包括DDL, DML, Critical等。

Overview
The purpose of this assignment is to test your ability to use and apply SQL concepts to complete tasks in a real-world scenario. Specifically, this assessment will examine your ability to use SQL Data Manipulation Language to return specific subsets of information which exist in a database and Data Definition Language to create new relational schema.
This assignment can be completed in group of two however your group member must be enrolled in the same course.
Submission
All submissions must be made through an electronic marking tool called Gradescope, which will also be used for providing feedback and automated marking. You will need to submit two types of files to the autograder:
- Query Files: For each question in sections A, B and C, you are required to submit a .sql or .txt file which contains your SQL query solution for that question (only one of these files, if you submit both, the .sql file will be graded). The file should only contain the SQL query(s), no additional text. The file should be named as per the description in the question. Additionally, the number of queries allowed to be run per question is also specified in each question’s description. More details will be provided about submitting to the Gradescope autograder closer to the assignment deadline. Note that you will be able to resubmit to the autograder an unlimited number of times before the deadline.
- Assignment PDF: After you have completed all the questions for this assignment, you should insert your answers into the template boxes where appropriate. Then you should export this document to a pdf and upload it to the respective Gradescope portal. Please note that this portal is simply a backup for Sections A, B and C. Only section D will be hand marked from your pdf submission.
When submitting to the autograder, please select all your .txt or .sql individually instead of uploading a zip file. Additionally, for student working in a group, only one group member should submit via Gradescope. The student submitting on behalf of their group must add their group member to their submission via gradescope.
Marking
The module 3 assignment is worth 30 course marks (of 100 course marks total for all assessment). Marking will be primarily completed using the autograder software, however we reserve the right to change to hand marking using the pdf submission should the need arise. Upon submitting to the autograder, you will be able to see the results of two basics tests: (i) file existence and (ii) compilation. More details will be provided regarding how you can interpret the results of these tests and what it means for your assignment grade during practicals. The marking distribution per section is as follows:
- Section A - SQL DDL
- Section B - SQL DML (UPDATE. INSERT DELETE)
- Section C - SQL DML (SELECT)
- Section D - Critical Thinking
- Section E - RiPPLE Task
Task
For this assignment you will be presented with the simplified schema of an online movie streaming service. You will be required to write a combination of SQL DML and DDL queries which answer higher level questions about the data constrained in the database or perform operations against the database’s schema and instance data.
Note: Your queries must compile using MySQL version 8.0. This is the same DBMS software as is used on your zones. You may use any MySQL function that have been used in class in addition to those specified in the questions. You may also use other MySQL functions not covered in this course to assist with manipulating the data if needed, however please ensure you read the MySQL documentation page first to ensure the functions works as intended.
Assignment Specification
A relational database has been setup to track customer browsing activity for an online movie streaming service called SurfTheStream. Movies are identified by a unique code that consists of a four-character prefix and four-digit suffix. Additionally, each movie is assigned a content rating which must be one of the following options: “G”, “PG”, “M”, “MA15+” or “R18+”. The first time a customer previews a movie is captured by the database. Customers may preview a movie before they stream it, however, they cannot preview a movie after they have started to stream it. You may assume “Duration” refers to the time in seconds a customer has spent streaming a particular movie after the “Timestamp”.
A simplified version of their database schema has been provided below including foreign key constraints. You should consult the provided blank database import file for further constraints which may be acting within the system.
Relational Schema
Customer [id, name, dob, bestFriend, subscriptionLevel]
Customer.bestFriend references Customer.id
Customer.subscriptionLevel references Subscription.level
Movie [prefix, suffix, name, description, rating, releaseDate]
Previews [customer, moviePrefix, movieSuffix, timestamp]
Previews.customer references Customer.id
Previews.{moviePrefix, movieSuffix} reference Movie.{prefix, suffix}
Streams [customer, moviePrefix, movieSuffix, timestamp, duration]
Streams.customer reference Customer.id
Streams.{moviePrefix, movieSuffix} reference Movie.{prefix, suffix}
Subscription [level]
For this assignment you will be required to write SQL queries to complete the following questions. Please use the submission boxes provided to record your answers, but do not forget to submit to the autograder as well!
Section A - SQL DDL
Example
Task
The company has decided to stop recording and tracking how many customers preview movies. Write an SQL query to reflect this change in the database schema.
Explanation
This change implies that the Previews table will no longer be needed.
SQL Solution
DROP TABLE Previews;
Question 1
Task
Write a SQL DDL query to implement the following relational schema and associated foreign keys.
Explanation
The relational schema for this the new table is as follows:
Table: MovieEmployee
Column | Data Type | Allow Nulls? | Primary Key? |
---|---|---|---|
moviePrefix | VARCHAR(4) | No | Yes |
movieSuffix | VARCHAR(4) | No | Yes |
employeeName | VARCHAR(100) | No | Yes |
role | {“Actor”, “Production”, “Other”} | No | Yes |
startDate | DATE | Yes | No |
Additionally, no employee should be able to start two roles in the same or different movies on the same day.
The foreign keys for this new table are as follows: MovieEmployee.{moviePrefix, movieSuffix} references Movie.{prefix, suffix}
Foreign key constraints should be implemented such that:
- Updates to a Movie’s prefix and/or suffix are automatically updated in MovieEmployee as well.
- A Movie cannot be deleted if there is an employee recorded as having worked on that movie.
Note: You may wish to consult the MySQL documentation on the enum datatype.
File Name
a1.txt or a1.sql
Maximum Number of Queries
3
Question 2
Task
SurfTheStream is interested to see how long customers are watching movie previews. Write an SQL query(s) to allow the database to capture these statistics.
Explanation
This query should add an attribute to the Previews table called “duration”. It should store a number greater than zero which corresponds to the number of seconds for which the customer watched the Preview. This attribute should not be null however any existing tuples in the Previews table should have their “duration” set to 100.
File Name
a2.txt or a2.sql
Maximum Number of Queries
3
Question 3
Task
To reduce query time, SurfTheStream has decided to make the Subscription table redundant and instead just store a customer’s subscription level simply in the Customer table without the foreign key.
Explanation
This change implies that the Subscription table should be removed. You may assume the foreign key constraint Customer.subscriptionLevel references Subscription.level is called “CustomerSubscriptionFK”.
File
Name a3.txt or a3.sql
Maximum Number of Queries
2
Section B - SQL DML (UPDATE, DELETE, INSERT)
Note: Modification made to the database schema made in previous questions are not persisted.
Example
Task
Set the content rating of all movies called “Bad Day” to be “PG”.
Explanation
The rating for each movie is captured using the rating attribute in the Movie table.
SQL Solution1
2UPDATE Movie SET rating = "PG"
WHERE name = "Bad Day"
Question 1
Task
Add the string, “This movie is a sequel. “ to the start of the description for any movie which is the second, third or fourth movie in a franchise.
Note: The space character at the end of the sentence needs to be included.
Explanation
You may assume the movie name for any movie which is the second movie in a franchise ends with “ 2”. For example, “Fast and Furious 2”. The same principle applies for the third and fourth movies in a franchise. For example, “Iron Man 3” and “Back to the Future 4”.
File Name
b1.txt or b1.sql
Maximum Number of Queries
1
Question 2
Task
In an effort to purge the system of fake user accounts the Chief Information Officer of SurfTheStream has authorised all customer accounts to be removed that meet either (or both) of the following conditions:
- The account name is less than 4 characters long. (E.g., “AAA”)
- The account name has no spaces. (E.g., “JackTheRealPerson’)
Explanation
You may assume that all the fake user accounts which meet one (or both) of the above criteria are not referenced by foreign keys in any other tables. Additionally, you may find the MySQL function CHAR_LENGTH helpful for this question. The MySQL documentation page for this function can be viewed here.
File Name
b2.txt or b2.sql
Maximum Number of Queries
1
Question 3
Task
A customer (with id = 1234) has requested that they no longer be shown any previews. Make the necessary modification to the database to accommodate this.
Explanation
The system will not show a movie preview if the customer has already previewed the movie. Therefore, for all movies which this customer has not yet previewed, create a fake preview record with the timestamp being the current time the query is run. You may assume the customer has not streamed any movies they have not previewed.
File Name
b3.txt or b3.sql
Maximum Number of Queries
1
Section C - SQL DML (SELECT)
Example
Task
Return the prefix and suffix of all movies with a rating of “M”.
Explanation
This query should return a table with two columns. The first column should correspond to the movie’s prefix and the second column to the movie’s suffix.
SQL Solution
1 | SELECT prefix, suffix |
Question 1
Task
Return the name, prefix and suffix of all movies which star the actor “Yeseul Jeon” or “Jack Smith” (or both) in ascending order by release date.
Explanation
This query should return a table with three columns, the first containing the movie name, the second containing the movie’s prefix and the third containing the movie’s suffix. For this question, you may assume that information regarding the actors of a particular movie is included in the movie’s description.
File Name
c1.txt or c1.sql
Maximum Number of Queries
1
Question 2
Task
Return the number of customers who have streamed a movie for longer than 2 minutes over the past 15 days.
Explanation
This query should return a table with one column that has a single numerical tuple. The 15-day time period is inclusive and should be correct to the second the query is run. Only streaming instances that started within the 15-day period should be counted.
File Name
c2.txt or c2.sql
Maximum Number of Queries
1
Question 3
Task
Return the number of movies which were released per rating category each day in 2021.
Explanation
This query should return a table with three columns, the first containing a date, the second containing a rating and the third containing the number of movies released in that rating group on that day. You do not need to days/rating combinations which had zero movies released.
File Name
c3.txt or c3.sql
Maximum Number of Queries
1
Question 4
Task
For each customer who has a best friend, return their id, name and age difference in comparison to their best friend.
Explanation
This query should return a table with fours columns, the first for the id of the customer, the second for the name of the customer, the third for the name of their best friend and the four for the age difference (in days). The age difference should be calculated using:
Customer's DOB - Customer's best friend's DOB
Additionally, you may find the MySQL function DATEDIFF helpful for this question. The MySQL documentation page for this function can be viewed here.
File Name
c4.txt or c4.sql
Maximum Number of Queries
1
Question 5
Task
Return the movie details for each movie which has been previewed less than 50 times by customers who were born before 2005.
Explanation
This query should return a table with four columns. The first containing movies prefixes, the second containing movie suffixes, the third containing movie names and the fourth containing the number of times that movie has been previewed.
File Name
c5.txt or c5.sql
Maximum Number of Queries
1
Question 6
Task
Return a distinct list of users who meet one of the following two conditions:
- The user has streamed the same movie more than once.
- In September 2021, the user started streaming a movie which they watched for at least 10 minutes.
Explanation
This query should return a table with two columns, the first being user ids and the second being users names. Hint: Non-Correlated Subquery
File Name
c6.txt or c6.sql
Maximum Number of Queries
1
Question 7
Task
Return the id and name of all customers who have previewed at least all the movies previewed by customers who were born on the “2000-10-10”. Customers born on “2000-10-10” should not be included in the final list.
Explanation
This query should return a table with two columns, the first being user ids and the second being users names. Hint: Correlated Subquery
File Name
c7.txt or c7.sql
Maximum Number of Queries
1
Question 8
Task
For each movie which is recorded in the database, return the name of the first customer to preview the movie (or NULL if no customer has previewed the movie) and the number of times that movie has been streamed.
Explanation
This query should return a table with five columns. The first containing the prefix of the movie, the second containing the suffix of the movie, the third containing the movie name, the fourth containing the name of the first customer to preview the movie, and the fifth containing the number of times that movie has been streamed. If there are several customers who all previewed the movie first, they should all be returned in duplicate rows with the only difference being customer’s name. Hint: VIEW.
File Name
c8.txt or c8.sql
Maximum Number of Queries
3
Section D - Critical Thinking
In this section you will be presented with an abstract scenario(s) relating to the UoD provided in the task description. For each question, you must complete the following:
- Propose two different strategies to complete the given task. Your strategies should outline and justify what type of data would be useful to answer the given task and how you could use various SQL techniques to obtain such insights from the existing schema.
- Pick one of those two strategies and write an SQL query(s) which implements that strategy.
Question 1
Task
SurfTheStream wants to select a list of movie previews which it will briefly play to customer when they open the SurfTheStream app. Propose a strategy for how they can identify which movie previews are most effective for customers and therefore should be included in this list.
Question 2
Task
SurfTheStream is looking for customers who can work for the company as movie critics and write blogs on new releases. Propose a strategy for how to identify customers who generally watch popular movies before they become very popular and therefore have good tastes to be qualified as movie critics.
Section E - RiPPLE Task
Using the RiPPLE online software, you must complete the following activities before the assignment due date:
- Resource Creation: Create one or more effective resource. For a learning resource to be considered as effective it needs to pass a moderation process which is administered by your peers and the teaching team. Teaching staff will be spot-checking to review moderations performed by just peers and change the outcome if necessary.
- Resource Moderation: Moderate 4 or more resources effectively. An effective moderation means that you have completed the moderation rubric and have provided a detailed justification for your judgement as well as constructive feedback on how the resource can be improved. Simply saying a resource is “good” does not qualify. Again, teaching staff will be spot-checking the quality of moderations and change the outcome when necessary.
- Answering Questions: Answer 10 or more questions correctly. To answer a resource correctly your first response must be correct. You can attempt as many questions as you want, and incorrect answers do not count against you. Only answers from the Practice tab are counted. Answering in-class RiPPLE activity questions does not count towards questions answers.
These tasks are to be completed through the RiPPLE platform, via the link available on Blackboard.
Note: For the above three activities, the resources you create, moderate and answer must be in the following categories on RiPPLE:
- SQL
- Functional-dependency
- Normalization
Creating, moderating or answering questions from other categories will not be counted towards your mark for the RiPPLE component of this assignment.