## Requirement

In Assignment 2, you will work with a dataset that contains voting data from a Canadian election. You can complete the whole assignment with only the concepts from Weeks 1 through 6 of the course.

## Goals of this Assignment

In completing Assignment 2, you will practice with the following concepts we have seen in the course so far:

• You will continue to use the Function Design Recipe to document, write, and test functions.
• You will write code that uses loops and the list data type in a variety of ways (looping over lists, list methods, list indexing, etc.)
• You will apply what you’ve learned about mutability, writing functions that mutate input only when it is appropriate to do so
• You will write code that uses conditionals, strings, boolean expressions, and other concepts from Assignment 1 and earlier in the course.
• You will reuse functions to help implement other functions according to their docstring description
• You will read a problem description in English and provided docstring examples, and implement function bodies to solve the problem according to these specifications
• You will continue to use Python 3, Wing 101, a checker module, and other tools

## Background Information

Voting theory is the study of voting systems. A voting system is an algorithm for computing the winner of an election given a list of candidates and a set of ballots. For example, you may be familiar with the Plurality voting system: each voter marks their ballot for exactly one candidate, and the candidate with the most votes is elected.

There are dozens of different voting systems and many different types of ballots. In this assignment, we’ll be investigating five systems (Plurality, Approval Voting, Range Voting, Borda Count, and Instant Run-Off) that use four different types of ballots.

In case you’re curious, we have provided a table showing you where each voting system is used. (This is only for the curious since everything you need to know about each voting system and Canadian elections is contained within this handout.)

## Canadian Elections: Ridings, Members of Parliament, and the House of Commons

Canada is divided into 338 geographical areas called ridings. In a standard Canadian election, one candidate from each political party runs in each riding. Each riding’s voters elect one of these candidates to the government using the Plurality voting system. These elected Members of Parliament (MPs) get a seat in the House of Commons. (So, there are 338 seats in the House of Commons.)

At present, there are five parties represented in the House of Commons; four of them put forward candidates in all provinces. To simplify the assignment somewhat, our simulation will include only those four parties:

• the Conservative Party of Canada (CPC) (Links to an external site.)
• the Green Party (Links to an external site.)
• the Liberals (Links to an external site.)
• the New Democratic Party (NDP) (Links to an external site.)

For the purposes of the assignment, we will not differentiate candidates from their parties; Ballots will include only the party names.

Starter Filesand extract the zip archive. A description of each of the files that we have provided is given in the paragraphs below.

### Starter code: voting_systems.py

The voting_systems.py file contains some constants, some sample data, and function headers and docstrings for the functions you will write. For each function, read both the handout and the header and docstring (especially the examples) to understand what the function should do.

The sample_votes.csv file contains vote data in comma-separated values (CSV) format. You must not modify this file.

### Simulation code: voting_simulation.py

The voting_simulation.py file will let you run a simulation that uses the code you write in voting_systems.py. You will not be submitting this file and there is no need to change it.

### Checker: a2_checker.py

As we did in Assignment 1, we have provided a checker program (a2_checker.py) that you can run on your own computer before submitting your assignment to MarkUs. There are some other files that will appear that we have included so that the checker runs properly. Do not edit them, and do not move or delete them.

## Types of Ballots

Four types of ballots are used in this assignment. This section refers to the constant PARTY_ORDER, which you can find at the top of the voting_systems.py file.

## The Data

For this assignment, you will use data from a Comma Separated Value (CSV) file named sample_votes.csv. Each row of this file contains the following information about a single voter:

• riding number: the number of the riding to which the voter belongs
• voter number: a number assigned to a voter
• rank ballot: the voter’s rank ballot
• range ballots: the voter’s range ballot
• approval ballots: the voter’s approval ballot

Notice that the row of data for each voter does not contain their singlecandidate ballot. We will determine their single-candidate ballot from their top choice in their rank ballot.

## Custom Data Type

Within our code, the data from each row in sample_votes.csv will be represented as a list that contains ints (for riding number and voter number) and sublists (for representing each of the rank, range and approval ballots).

Within the starter code in voting_systems.py, we will refer to this data as VoteData in our type contracts.

## What to do

Suppose you want to analyze how different voting systems change the results of elections. There are multiple voting systems to compare, and hundreds of votes and ridings to consider. To make your life easier, you will write Python functions to help you manage this information.

### Assumptions and Preconditions

You do not need to write any preconditions in this assignment, as we have provided the docstrings for the required functions for you. If you choose to write any of your own helper functions, you should write complete docstrings for them.

To simplify the assignment, you can make the following assumptions:

• There will only ever be 4 parties, and their names in PARTY_ORDER and a cleaned VoteData list will always be uppercase. Note that while the given PARTY_ORDER list is sorted in alphabetical order, that may not always be the case.
• If ties occur, they should be broken by choosing the party that comes first in PARTY_ORDER. We suggest you write your code ignoring ties at first, then test to see what happens and then think about how to implement this tie-breaking if necessary.
• All lists representing a single rank, range, or approval ballot will have the same number of elements as PARTY_ORDER (i.e. 4).

## Task 0: Creating testing data

We have provided you with docstrings in the starter code in voting_system.py, however many of the docstrings only have one example. We have provided you with one sample list of VoteData called SAMPLE_DATA_1. Your task here is to create a second list sample list called SAMPLE_DATA_2 that contains 3 VoteData items representing the data of the voters in the following image:

We have started this off for you, setting SAMPLE_DATA_2 to be an empty list in the starter code. As you work through the following tasks, start by adding a second example that uses SAMPLE_DATA_2 in any docstring that does not contain 2 examples.

The code we provided in voting_simulation.py reads data from a CSV file produces a List[List[str]]. Here is a sample of the kind of list returned.

You are to write the function clean_data, which should modify the list according to the following rules:

• ridings should be ints
• voter numbers should be ints
• rank ballots should be a list of strs
• range ballots should be a list of ints
• approval ballots should be a list of bools

After applying the clean_data function to the example list above, it should contain this.

Each of the three sublists above is of type VoteData.

You must not use the built-in function eval in clean_data. This function is one of the more challenging functions in A2, because it mutates a list. We suggest that you start with some of the other functions in Task 2, and come back to this one later.

Once the data has been cleaned, you can use the following functions to extract information from your data.

You can work on these functions even before you have completed the clean_data function by working with the provided sample data in the starter code, and by creating your own small lists of clean vote data for testing.
See the docstrings in the starter code for examples of how to work with that data.
Note: For all the functions in this task, the arguments to the functions should NOT be mutated.

Once the data has been cleaned and you are able to extract the specific data you need, you can write functions to analyze the vote data and figure out election results.

You can work on these functions even before you have completed the clean_data function by working with the provided sample data in the starter code, and by creating your own small lists of clean vote data for testing.

See the docstrings in the starter code for examples of how to work with that data.

### Task 3.1: Plurality Voting System

The plurality voting system uses single-candidate ballots. Implement a function which counts all single-candidate ballots from our vote data, as described below.

### Task 3.2: Approval Voting System

The approval voting system uses approval ballots. Implement a function which counts all approval ballots from our vote data, as described below.

### Task 3.3: Range Voting System

The range voting system uses range ballots. Implement a function which counts all range ballots from our vote data, as described below.

### Task 3.4: Borda Count Voting System

The Borda count voting system uses rank ballots. The Borda Count is determined by assigning points according to ranking. A party gets 3 points for each first-choice ranking, 2 points for each second-choice ranking and 1 point for each third-choice ranking. No points are awarded for being ranked fourth.

For example, the rank ballot [‘LIBERAL’, ‘GREEN’, ‘CPC’, ‘NDP’] would contribute 3 points to the Liberal count, 2 points to the Green count and 1 point to the CPC count.

### Task 3.5: Instant Run-Off Voting System

The calculations involved in Instant Run-off Voting are considerably more complex than the previous voting systems. To help you implement this system, we have defined two helper functions that you must implement and use in your solution. If you are having trouble with this task, we recommend focusing on the other parts of the assignment first, and then returning to this later.

The Instant Run-Off Voting (IRV) system uses rank ballots, and does the counting as shown in the flowchart below. Initially, it considers only the first choice in the rank ballot of each voter. If this gives one party a majority (more than* 50% of the votes), then this party wins the seat. If not, the party with the fewest votes is removed from every ballot. For example, if ‘GREEN’ is removed from the rank ballot [‘NDP’, ‘GREEN’, ‘LIBERAL’, ‘CPC’], the ballot becomes [‘NDP’, ‘LIBERAL’, ‘CPC’]. The process is repeated with the new shorter ballots. (* Update (Oct 20): the provided example in the docstring should return ‘NDP’.

Most of the vote data analysis functions we wrote in Task 3 return a list of integers representing the vote points received by each party, corresponding to their order in the constant list PARTY_ORDER. With this list of points, we can determine the winner for that particular voting system.

The points_to_winner function is meant to take in a list of points and return the name of the winning party. Complete this function’s body as described below.

## What not to do

As you’re coding, keep in mind the following rules:

• Do not add statements that call print, input, or open.
• Do not use any break or continue statements. We are imposing this restriction (and we have not even taught you these statements) because they are very easy to abuse, resulting in terrible code.
• Do not modify or add to the import statements provided in the starter code.

## Note: Using Constants

As in Assignment 1, your code should make use of the provided constants for the indexes in VoteData and for other special strings in the uncleaned data.

This will not only make your code easier to read, but if the columns in the data moved around, your code would still work.

You should also make use of the constant PARTY_ORDER throughout your code. You should not assume it will have particular values in it (i.e. you should not hardcode strings like ‘GREEN’ in your code.)