Python Assignment Help: COMSW3101 Introduction to Python

Introduction

This Python assignment contains five arithmetic problems to be solved.

Problem 1 - Decrypting Government Data

Your job is to summarize this government data about oil consumption.

  • The format of the file is rather bizarre: note that each line has data for two months, in two different years! (Plus, I had to hand-edit the file to make it parseable.)
  • Fortunately, Python is great for untangling and manipulating data.
  • Write a generator that reads from the given URL over the network and produces a summary line for one year’s data on each ‘next’ call
  • remember that data read via urllib.request arrives as bytes, not strings
  • The generator should read the lines of the oil2.txt file lazily: it should read only 13 lines for every two years of output. Note that a loop can contain any number of ‘yield’ statements.
  • Ignore the monthly data, just extract the yearly info
  • Drop the month column
  • In addition to the ‘oil’ generator function, my solution had a separate helper function, ‘def makeCSVLine(year, data):’
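The lazy-reading pattern from the list above can be sketched as follows. The layout of oil2.txt is not reproduced in this write-up, so the block format used here (12 monthly rows followed by one yearly row that holds two years) is invented purely for illustration, as are the name `oil_sketch` and the body of `makeCSVLine`. With a real URL, the `lines` argument could be the response object from `urllib.request.urlopen(url)`, which yields bytes lines when iterated.

```python
from itertools import islice

def makeCSVLine(year, data):
    """Join a year and its values into one comma-separated line."""
    return ','.join([str(year)] + [str(v) for v in data])

def oil_sketch(lines, block_size=13):
    """Yield two summary lines per block, reading one block at a time.

    ASSUMED layout (not the real file): the last line of every
    13-line block is b'<yearA> <valueA> <yearB> <valueB>'.
    """
    it = iter(lines)
    while True:
        block = list(islice(it, block_size))  # pulls at most 13 lines
        if len(block) < block_size:
            return                            # input exhausted
        # network reads give bytes, so decode before splitting
        fields = block[-1].decode('utf-8').split()
        yield makeCSVLine(fields[0], [fields[1]])
        yield makeCSVLine(fields[2], [fields[3]])

# stand-in for the network response: two blocks of bytes lines
fake = [b'month row\n'] * 12 + [b'2014 10 2013 20\n']
fake += [b'month row\n'] * 12 + [b'2012 30 2011 40\n']
print(list(oil_sketch(fake)))
```

Because `islice` pulls only one block per loop pass, asking for the first two summary lines consumes exactly 13 input lines, and the two `yield` statements in one loop body match the hint in the list above.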

Here are the first years of data, starting with 2014 and 2013:

Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
2014,2700903,-112867,246409332,-26397845,91.23,-5.72
2013,2813770,-283638,272807177,-40367786,96.95,-4.15
2012,3097408,-224509,313174963,-18407090,101.11,1.29
2011,3321917,-55160,331582053,79421544,99.82,25.15
2010,3377077,62290,252160509,63448733,74.67,17.74
2009,3314787,-275841,188711776,-153200712,56.93,-38.29
2008,3590628,-99940,341912488,104700835,95.22,30.95
2007,3690568,-43658,237211653,20584322,64.28,6.26
2006,3734226,-20445,216627331,40871990,58.01,11.20
2005,3754671,-66308,175755341,44012676,46.81,12.33
2004,3820979,144974,131742665,32575492,34.48,7.50
2003,3676005,257983,99167173,21883842,26.98,4.37
2002,3418022,-53045,77283331,2990437,22.61,1.21
2001,3471067,71827,74292894,-15583539,21.40,-5.04
2000,3399240,171148,89876433,38986812,26.44,10.68
1999,3228092,-14620,50889621,13637399,15.76,4.28
1998,3242712,173281,37252222,-16973685,11.49,-6.18
1997,3069431,175785,54225907,-704950,17.67,-1.32
1996,2893646,126333,54930857,11181204,18.98,3.17

Now that we have something that looks like a CSV file, we can do all kinds of things:

  • could save it to a file, then
    • Excel or OpenOffice could read it
    • Python has a CSV reader
  • with a little juggling, can easily pump the data into a pandas DataFrame
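As one illustration of the CSV reader option, here is a small sketch using the standard library `csv` module. It parses the header and the first two data rows copied from the listing above; the variable names are placeholders, and `io.StringIO` stands in for a real file.

```python
import csv
import io

# header plus two rows copied from the listing above
text = (
    "Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange\n"
    "2014,2700903,-112867,246409332,-26397845,91.23,-5.72\n"
    "2013,2813770,-283638,272807177,-40367786,96.95,-4.15\n"
)

# DictReader uses the first row as field names; values stay strings
rows = list(csv.DictReader(io.StringIO(text)))
prices = [float(r['Price']) for r in rows]
print(rows[0]['Year'], prices)
```

Note that the reader returns every field as a string, so numeric columns need an explicit conversion such as `float`.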

Input:

import io
import pandas as pd

# save the generated CSV lines to a file
with open('/tmp/oil.csv', 'w') as f:
    for l in oil(url):
        f.write(l + '\n')

# or build the whole CSV text in memory
o = oil(url)
ls = list(o)
s = '\n'.join(ls)

# we will cover StringIO next week - kind of an 'in-memory' file
df = pd.read_csv(io.StringIO(s))
df

Output:

    Year Quantity QuantityChange    Unknown   Unknown2   Price PriceChange
0 2014 2700903 -112867 246409332 -26397845 91.23 -5.72
1 2013 2813770 -283638 272807177 -40367786 96.95 -4.15
2 2012 3097408 -224509 313174963 -18407090 101.11 1.29
3 2011 3321917 -55160 331582053 79421544 99.82 25.15
4 2010 3377077 62290 252160509 63448733 74.67 17.74
5 2009 3314787 -275841 188711776 -153200712 56.93 -38.29
6 2008 3590628 -99940 341912488 104700835 95.22 30.95
7 2007 3690568 -43658 237211653 20584322 64.28 6.26
8 2006 3734226 -20445 216627331 40871990 58.01 11.20
9 2005 3754671 -66308 175755341 44012676 46.81 12.33
10 2004 3820979 144974 131742665 32575492 34.48 7.50
11 2003 3676005 257983 99167173 21883842 26.98 4.37
12 2002 3418022 -53045 77283331 2990437 22.61 1.21
13 2001 3471067 71827 74292894 -15583539 21.40 -5.04
14 2000 3399240 171148 89876433 38986812 26.44 10.68
15 1999 3228092 -14620 50889621 13637399 15.76 4.28
16 1998 3242712 173281 37252222 -16973685 11.49 -6.18
17 1997 3069431 175785 54225907 -704950 17.67 -1.32
18 1996 2893646 126333 54930857 11181204 18.98 3.17
19 1995 2767313 63116 43749653 5270236 15.81 1.58
20 1994 2704197 160822 38479417 10041 14.23 -0.90
21 1993 2543375 248805 38469376 -83679 15.13 -1.68

Input:

[df['Price'].mean(), df['Price'].min(), df['Price'].max()]

Output:

[46.63681818181818, 11.49, 101.11]

Problem 2

  • suppose we want to convert between C (Celsius) and F (Fahrenheit), using the equation 9 * C = 5 * (F - 32)
  • could write functions ‘c2f’ and ‘f2c’
  • do all computation in floating point for this problem

Input:

def c2f(c):
    return (9. * c + 5. * 32.) / 5.

def f2c(f):
    return 5. * (f - 32) / 9.

[c2f(0), c2f(100), f2c(32), f2c(212)]

Output:

[32.0, 212.0, 0.0, 100.0]

  • to write f2c, we solved the equation for C, and made a function out of the other side of the equation
  • to write c2f, we solved for F, . . .
  • there is another way to think about this
  • rearrange the equation into a symmetric form: 9 * C - 5 * F = -32 * 5
  • you can think of the equation above as a “constraint” between F and C: if you specify one variable, the other’s value is determined by the equation. in general, if we have c0 * x0 + c1 * x1 + … + cN * xN = total
  • the cI are fixed coefficients
  • specifying any N of the (N + 1) x’s will determine the remaining x variable
  • define a class, ‘Constraint’, that will do ‘constraint satisfaction’
  • you may find ‘dotnone’ to be helpful
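Solving the general constraint for a single unknown makes the "any N of the N + 1" claim concrete. Assuming x_k is the only unset variable and its coefficient c_k is nonzero, the rearrangement and the temperature example work out as:

```latex
x_k = \frac{\text{total} - \sum_{i \ne k} c_i x_i}{c_k}

% temperature example: coefficients (9, -5), total = -160, C = x_0 = 100
F = x_1 = \frac{-160 - 9 \cdot 100}{-5} = \frac{-1060}{-5} = 212
```

The sum over the known terms is the kind of quantity a helper like ‘dotnone’ can compute when the unknown slot holds None.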

Input:

# regular dot product, except that if one or both values in a pair are None,
# that term is defined to contribute 0 to the sum
def dotnone(l1, l2):
    '''another dot product variant'''
    total = 0
    for e1, e2 in zip(l1, l2):
        if not (e1 is None or e2 is None):
            total += e1 * e2
    return total

[dotnone([1,2,3], [4,5,6]), dotnone([1,None,3], [4,5,6]), dotnone([None,1], [2,None])]

Output:

[32, 22, 0]

Input:

# set up a constraint between C and F
# 1st arg is var names,
# 2nd arg is coefficients
# 3rd arg is total
c = Constraint('C F', [9, -5], -5 * 32)
# 1st arg - variable index or name
# 2nd arg - variable value
# setvar will fire when there is only one unset variable remaining
# it will print the variable values, return them in a list, and
# clear all variable values
c.setvar(0, 100)
# printed by setvar:
# C = 100.0
# F = 212.0

Output:

[100.0, 212.0]
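One possible shape for such a class is sketched below. It is only a guess based on the behavior described in the comments above (fire when one variable is left, print, return a list, clear), not the course solution; the attribute names and the choice to return None when more than one variable is still unset are assumptions.

```python
class Constraint:
    """Sketch of a linear constraint c0*x0 + ... + cN*xN = total."""

    def __init__(self, names, coeffs, total):
        self.names = names.split()
        self.coeffs = [float(c) for c in coeffs]
        self.total = float(total)
        self.values = [None] * len(self.coeffs)

    def setvar(self, var, value):
        # accept either a position or a variable name
        i = self.names.index(var) if isinstance(var, str) else var
        self.values[i] = float(value)
        unset = [k for k, v in enumerate(self.values) if v is None]
        if len(unset) != 1:
            return None                      # not enough information yet
        k = unset[0]
        # sum of the known terms; the unknown slot contributes nothing
        known = sum(c * v for c, v in zip(self.coeffs, self.values)
                    if v is not None)
        self.values[k] = (self.total - known) / self.coeffs[k]
        for name, v in zip(self.names, self.values):
            print(name, '=', v)
        result = self.values
        self.values = [None] * len(self.coeffs)   # clear for the next use
        return result

c = Constraint('C F', [9, -5], -5 * 32)
print(c.setvar(0, 100))     # set C by index
print(c.setvar('F', 32))    # set F by name
```

Storing everything as floats follows the "do all computation in floating point" note for this problem.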

Problem 3 - Hamlet

  • Python is very popular in ‘digital humanities’
  • MIT has the complete works of Shakespeare in a simple html format
  • You will do a simple analysis of Hamlet by reading the html file one line at a time (the usual iteration scheme) and doing pattern matching
  • The goal is to return a list containing the line count (linecnt), the total number of ‘speeches’ (look at the file format), and a dict showing the number of ‘speeches’ each character gives
  • Your program should read directly from the url given, but you may want to download a copy to examine the structure of the file.
  • remember that data read via urllib.request arrives as bytes, not strings
  • here’s a short sample of the file
<A NAME=speech25><b>HORATIO</b></a>
<blockquote>
<A NAME=1.1.37>Tush, tush, 'twill not appear.</A><br>
</blockquote>

<A NAME=speech26><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.38>Sit down awhile;</A><br>
<A NAME=1.1.39>And let us once again assail your ears,</A><br>
<A NAME=1.1.40>That are so fortified against our story</A><br>
<A NAME=1.1.41>What we have two nights seen.</A><br>
</blockquote>

<A NAME=speech27><b>HORATIO</b></a>
<blockquote>
<A NAME=1.1.42>Well, sit we down,</A><br>
<A NAME=1.1.43>And let us hear Bernardo speak of this.</A><br>
</blockquote>

<A NAME=speech28><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.44>Last night of all,</A><br>
<A NAME=1.1.45>When yond same star that's westward from the pole</A><br>
<A NAME=1.1.46>Had made his course to illume that part of heaven</A><br>
<A NAME=1.1.47>Where now it burns, Marcellus and myself,</A><br>
<A NAME=1.1.48>The bell then beating one,--</A><br>
<p><i>Enter Ghost</i></p>
</blockquote>

<A NAME=speech29><b>MARCELLUS</b></a>
<blockquote>
<A NAME=1.1.49>Peace, break thee off; look, where it comes again!</A><br>
</blockquote>

<A NAME=speech30><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.50>In the same figure, like the king that's dead.</A><br>
</blockquote>

Input:

hamlet(url)

Output:

[8881,
1150,
defaultdict(int,
{'All': 4,
'BERNARDO': 23,
'CORNELIUS': 1,
'Captain': 7,
'Danes': 3,
'FRANCISCO': 8,
'First Ambassador': 1,
'First Clown': 33,
'First Player': 8,
'First Priest': 2,
'First Sailor': 2,
'GUILDENSTERN': 33,
'Gentleman': 3,
'Ghost': 14,
'HAMLET': 359,
'HORATIO': 112,
'KING CLAUDIUS': 102,
'LAERTES': 62,
'LORD POLONIUS': 86,
'LUCIANUS': 1,
'Lord': 3,
'MARCELLUS': 36,
'Messenger': 2,
'OPHELIA': 58,
'OSRIC': 25,
'PRINCE FORTINBRAS': 6,
'Player King': 4,
'Player Queen': 5,
'Prologue': 1,
'QUEEN GERTRUDE': 69,
'REYNALDO': 13,
'ROSENCRANTZ': 49,
'Second Clown': 12,
'Servant': 1,
'VOLTIMAND': 2})]
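A minimal sketch of the pattern-matching approach, assuming every speech starts with a line shaped like the `<A NAME=speechN><b>NAME</b></a>` lines in the sample. The function name `count_speeches` and the regular expression are illustrative choices, not the course solution. With a real URL, `lines` could be the response from `urllib.request.urlopen(url)`; here a few bytes lines copied from the sample stand in for it.

```python
import re
from collections import defaultdict

# matches lines such as <A NAME=speech25><b>HORATIO</b></a>
SPEECH = re.compile(r'<A NAME=speech\d+><b>(.*?)</b>', re.IGNORECASE)

def count_speeches(lines):
    """Return [line count, speech count, speeches per character]."""
    linecnt = 0
    speeches = 0
    per_character = defaultdict(int)
    for raw in lines:                  # one line at a time
        linecnt += 1
        line = raw.decode('utf-8', errors='replace')  # bytes -> str
        m = SPEECH.search(line)
        if m:
            speeches += 1
            per_character[m.group(1)] += 1
    return [linecnt, speeches, per_character]

sample = [
    b'<A NAME=speech25><b>HORATIO</b></a>\n',
    b'<blockquote>\n',
    b"<A NAME=1.1.37>Tush, tush, 'twill not appear.</A><br>\n",
    b'</blockquote>\n',
    b'\n',
    b'<A NAME=speech26><b>BERNARDO</b></a>\n',
    b'<blockquote>\n',
    b'<A NAME=1.1.38>Sit down awhile;</A><br>\n',
    b'</blockquote>\n',
]
print(count_speeches(sample))
```

Lines of dialogue also begin with `<A NAME=`, so the pattern requires the literal `speech` prefix to avoid counting them.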

Problem 4

  • in class, we discussed two different ways to represent a polynomial
    • polylist, a ‘dense’ representation, that holds the coefficients in a list
    • polydict, a ‘sparse’ representation, that holds (exponent, coefficient) pairs in a dict
  • add a method, ‘topolydict()’ to class ‘polylist’, that converts the polylist into a polydict
  • add a method, ‘topolylist()’ to class ‘polydict’, that converts the polydict into a polylist
  • note that polylist->polydict will always work, but polydict->polylist can fail, because a polylist cannot represent negative exponents. in this case, raise a ValueError
  • just to tell them apart, polylist prints with a leading ‘+’
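The two conversions can be sketched on plain containers. The `polylist` and `polydict` classes from class are not shown here, so these are standalone functions with hypothetical names; they assume that list index i holds the coefficient of X ** i and that zero coefficients are left out of the sparse form.

```python
def list_to_dict(coeffs):
    """Dense -> sparse: index i holds the coefficient of X ** i."""
    # zero coefficients are dropped (an assumption about polydict)
    return {exp: c for exp, c in enumerate(coeffs) if c != 0}

def dict_to_list(terms):
    """Sparse -> dense; fails if any exponent is negative."""
    if any(exp < 0 for exp in terms):
        raise ValueError('a dense list cannot hold negative exponents')
    if not terms:
        return []
    coeffs = [0] * (max(terms) + 1)   # missing exponents default to 0
    for exp, c in terms.items():
        coeffs[exp] = c
    return coeffs

print(list_to_dict([0, 10, 5]))           # {1: 10, 2: 5}
print(dict_to_list({2: 3, 1: 2, 0: 1}))   # [1, 2, 3]
```

Inside the real classes, the same logic would presumably live in the ‘topolydict()’ and ‘topolylist()’ methods and wrap the result in the other class.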

Input:

pl1 = polylist([1, 2, 3])
pl2 = polylist([0, 10, 5])
pd1 = polydict({2:3, 1:2, 0:1})
pd2 = polydict({1:10, 2:5})
pd3 = polydict({-1:10, 2:5})
[pl1, pl2, pd1, pd2, pd3]

Output:

[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * X,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1]

Input:

[pl1.topolydict(), pl2.topolydict(), pd1.topolylist(), pd2.topolylist()]

Output:

[3 * X ** 2 + 2 * X + 1, 5 * X ** 2 + 10 * X, + 3 * X ** 2 + 2 * X + 1, + 5 * X ** 2 + 10 * X]

Problem 5

  • define the __mul__ method for polydict

Input:

[pl1, pl2, pd1, pd2, pd3, pd1 * pd2, pd1 * pd3, pd2 * pd3]

Output:

[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * X,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1,
15 * X ** 4 + 40 * X ** 3 + 25 * X ** 2 + 10 * X,
15 * X ** 4 + 10 * X ** 3 + 5 * X ** 2 + 30 * X + 20 + 10 * X ** -1,
25 * X ** 4 + 50 * X ** 3 + 50 * X + 100]
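The multiplication itself can be sketched on plain {exponent: coefficient} dicts. This is an illustration under the assumption that polydict stores its terms that way; `mul_terms` is a hypothetical standalone function, and a real `__mul__` would presumably wrap the result in a polydict.

```python
from collections import defaultdict

def mul_terms(a, b):
    """Multiply two sparse polynomials given as {exponent: coefficient}."""
    product = defaultdict(int)
    for e1, c1 in a.items():
        for e2, c2 in b.items():
            product[e1 + e2] += c1 * c2   # exponents add, coefficients multiply
    # drop terms that cancelled to zero
    return {e: c for e, c in product.items() if c != 0}

pd1 = {2: 3, 1: 2, 0: 1}
pd2 = {1: 10, 2: 5}
pd3 = {-1: 10, 2: 5}
print(mul_terms(pd1, pd2))
```

Every term of one polynomial meets every term of the other, and terms that land on the same exponent are accumulated, which is why negative exponents need no special handling in the sparse form.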