bslatkin / effectivepython

Effective Python: Second Edition — Source Code and Errata for the Book

Home Page: https://effectivepython.com

effectivepython's Issues

not actually a generator

in item 9, the example of composing generators,
roots = ((x, x**0.5) for x in it)
is actually just one generator returning tuples.

Item 25: Doesn't seem accurate to say "calls"

Originally reported in #40:

On a closely related note, in the paragraph after the last block of code on p. 72, it doesn't seem accurate to say that TimesFiveCorrect.__init__ calls PlusTwoCorrect.__init__; that's just the order in which those initializers are run.

Item 53, p. 193 word missing

"Users on a system may upgrade one package to a new version but not others, which could dependencies."

There's a verb missing here, probably something like "affect" or "impact".

Item 21 Python 2 example

The error message in the book from calling the keyword-only function in Python 2 with too many positional arguments is wrong. It should be:

Trying to pass keyword-only arguments by position won’t work, just like in Python 3.
safe_division_d(1, 0, False, True)
>>>
TypeError: safe_division_d() takes 2 positional arguments but 4 were given
Trying to pass unexpected keyword arguments also won’t work.
safe_division_d(0, 0, unexpected=True)
>>>
TypeError: Unexpected **kwargs: {'unexpected': True}

Item 40, ex. 7: code works, but may mislead readers

First of all, congratulations for the awesome example of Game of Life using coroutines and yield from. It's the most interesting example of yield from outside of the asyncio context I've found.

One nitpick: in Item 40, ex. 7, you have this snippet:

try:
    count = it.send(EMPTY) # Send q8 state, retrieve count
except StopIteration as e:
    print('Count: ', e.value) # Value from return statement

This code does display the count, but the assignment inside the try clause never actually happens. Some of your readers may be misled by this, and the comment in the same line reinforces the potential misunderstanding.

I'd rewrite it along these lines:

try:
    it.send(EMPTY)  # Send q8 state, driving the coroutine to end
except StopIteration as e:
    count = e.value  # Value from return statement
    print('Count: ', count)

This is a tiny detail, otherwise the example and your explanations in the book are outstanding!

Cheers!
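
To see the point about the assignment in isolation, here is a minimal sketch using a made-up coroutine (double_it is hypothetical, not the book's Game of Life code). When send() drives a generator to its return statement, send() raises StopIteration, so the name on the left of the assignment is never bound; the returned value travels on the exception instead.

```python
def double_it():
    # Hypothetical stand-in for the book's coroutine.
    received = yield 'ready'
    return received * 2  # Surfaces as StopIteration.value


it = double_it()
next(it)  # Advance to the first yield

count = None
try:
    count = it.send(21)  # Raises StopIteration; the assignment never runs
except StopIteration as e:
    assert count is None  # Still unassigned at this point
    count = e.value       # Value from the return statement

print('Count:', count)
```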

-> character missing from Item 6

Page 14, Line 9 Item 6:

UnicodeDecodeError: 'utf-8' codec can’t decode byte 0x9d in
position 0: invalid start byte

The wrap-around symbol (->) is missing; it should read:
UnicodeDecodeError: 'utf-8' codec can’t decode byte 0x9d in
(>) position 0: invalid start byte

Item 17: Iter() on iterator

Hi,

In Item 17:
"The protocol states that when an iterator is passed to the iter built-in function, iter will return the iterator itself."

In the corresponding code:
if iter(numbers) is iter(numbers):

shouldn't it be:
if iter(numbers) is numbers:

Sorry if I am missing anything.

-Santhosh Kasa.

Slatkin, Brett. Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development Series) (Kindle Locations 1714-1715). Pearson Education. Kindle Edition.

Item 17 - normalize_defensive

In the normalize_defensive() function, page 42, would it be more natural to use

"if numbers is iter(numbers)"

rather than

"if iter(numbers) is iter(numbers)" ?

Both are correct, but the former looks a bit clearer.
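
A quick way to compare the two spellings discussed in these Item 17 reports is to run both checks against a container and against an iterator. This is only a sketch with a made-up list; it suggests the two tests agree in both cases, so the choice is about readability.

```python
numbers = [1, 2, 3]   # A container
it = iter(numbers)    # An iterator over that container

# Each tuple holds (iter(x) is x, iter(x) is iter(x))
container_checks = (iter(numbers) is numbers, iter(numbers) is iter(numbers))
iterator_checks = (iter(it) is it, iter(it) is iter(it))

print('container:', container_checks)
print('iterator: ', iterator_checks)
```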

Item 36, the result of code is not correct on Windows

def run_sleep(period):
    proc = subprocess.Popen(['sleep', str(period)])
    return proc

start = time.time()
procs = []
for _ in range(10):
    proc = run_sleep(0.1)
    procs.append(proc)

for proc in procs:
    proc.communicate()
end = time.time()
print('Finished in %.3f seconds' % (end - start))

The result is "Finished in 1.068 seconds" when run on Windows, though it's fine on Linux.
I think using this code to illustrate "Decoupling the child process from the parent means that the parent process is free to run many child processes in parallel." may confuse readers.

Item 21 python 2 example needs to use floating point numbers

In this part, all of the numerators passed in need to be floats.

# Python 2
def safe_division_d(number, divisor, **kwargs):
    ignore_overflow = kwargs.pop('ignore_overflow', False)
    ignore_zero_div = kwargs.pop('ignore_zero_division', False)
    if kwargs:
        raise TypeError('Unexpected **kwargs: %r' % kwargs)
    # ...
Now, you can call the function with or without keyword arguments.
safe_division_d(1, 10)
safe_division_d(1, 0, ignore_zero_division=True)
safe_division_d(1, 10**500, ignore_overflow=True)
Trying to pass keyword-only arguments by position won’t work, just like in Python 3.
safe_division_d(1, 0, False, True)
>>>
TypeError: safe_division_d() takes 2 positional arguments but 4 were given
Trying to pass unexpected keyword arguments also won’t work.
safe_division_d(0, 0, unexpected=True)
>>>
TypeError: Unexpected **kwargs: {'unexpected': True}

Item 24, p. 69, L13: classes vs subclasses

Hi bslatkin.

Now you can write other GenericInputData and GenericWorker classes as you wish ...

I think that should be "other ... subclasses" instead of "other ... classes".

Best.
Hayao (a.k.a. XaroCydeykn on Twitter)

syntax highlighting in item 13, p. 26

Three lines from the bottom of p. 26: raise should be in purple.

Item 25: the number 5

In this item the number 5 is used in three distinct ways: i) as the initial value, ii) as the number that is added on in PlusFive, and iii) as the number that multiplies the value in TimesFive and TimesFiveCorrect.

The latter two uses are clear enough, but perhaps using a different number (e.g. 3) for the initial value might help distinguish between the first use and the other two. This applies especially in the case of expressions like 5 * (5 + 2). Drawing the right conclusion is not too difficult given the explanations in the text, but using a different starting value would eliminate this issue altogether.

(Obviously, this is not an erratum, merely a suggestion for possible improvement).

Item 32 - __getattribute__(self, name) with super

Hi Brett,

Thank you for your book. I have tried to run the following code based on your code.

class LazyDB(object):
    def __init__(self):
        self.exists = 5

    def __getattr__(self, name):
        value = 'Value of __getattr__ %s' % name
        return value

    def __getattribute__(self, name):
        print '__getattribute__: %s' % name
        return super().__getattribute__(name)


if __name__ == '__main__':
    data = LazyDB()
    print data.exists

And I get the following error:

(!1007) $ python getatt.py
__getattribute__: exists
Traceback (most recent call last):
  File "getatt.py", line 16, in <module>
    print data.exists
  File "getatt.py", line 11, in __getattribute__
    return super().__getattribute__(name)
TypeError: super() takes at least 1 argument (0 given)

By the way, I would like to know your opinion about functional programming with Python.

Thank you
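
The print statements in the script above suggest it was run under Python 2, where the zero-argument form of super() is not available. A sketch of the same class using the explicit two-argument form, which both Python 2 and Python 3 accept (the access to data.missing is added here for illustration):

```python
from __future__ import print_function  # Lets the same file run on Python 2.7


class LazyDB(object):
    def __init__(self):
        self.exists = 5

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails
        value = 'Value of __getattr__ %s' % name
        return value

    def __getattribute__(self, name):
        print('__getattribute__: %s' % name)
        # Explicit class and instance arguments work on Python 2 and 3
        return super(LazyDB, self).__getattribute__(name)


data = LazyDB()
print(data.exists)
print(data.missing)
```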

Item 57 PDB example shows stop point on pdb line, but that's wrong

This code:

def complex_func(a, b, c):
  a = b * c
  import pdb; pdb.set_trace()
  a *= 2
  return a

complex_func(1, 2, 3)

When run, will show this prompt:

> .../scratch.py(4)complex_func()
-> a *= 2

The book shows this on page 208, which is wrong:

-> import pdb; pdb.set_trace()
(Pdb)

Return statement for Python 2 generators is possible, but only without a value

I say this in Item #40:

"The second limitation is that there is no support for the return statement in Python 2 generators. To get the same behavior that interacts correctly with try/except/finally blocks, you need to define your own exception type and raise it when you want to return a value."

I meant return with a value. You can do a "bare" return in Python 2 and it will do try/except/finally correctly:

>>> def foo():
...   try:
...     yield 1
...     return
...   finally:
...     print('hi')
... 
>>> foo()
<generator object foo at 0x10754f050>
>>> it = foo()
>>> it.next()
1
>>> it.next()
hi
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Item 30, page 92

Bucket class needs:

from datetime import datetime
from datetime import timedelta

More informative Errata

Hey, thanks for the great book.

It would be helpful if we had more granular Errata information compared to the current page. Something on a per-printing basis like for this book.

Right now, the only information that we have is whether an issue is confirmed and if it is "Fixed for the next printing" (which printing? which date?). It would be practical to see which corrections apply if I have, say, the 2nd printing in my hands.

Item 32, p. 104 - BrokenDictionaryDB does nothing with init parameters

I think the infinite recursion example directly following the class definition will still work, but is the intent of the constructor to set the incoming data parameter to self._data?

Item 15, p. 32: definition of closure function seems wrong

def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)

if __name__ == '__main__':
    numbers = [8, 3, 1, 2, 5, 4, 7, 6]
    group = [2, 3, 5, 7]
    sort_priority(numbers, group)  # sorts in place: [2, 3, 5, 7, 1, 4, 6, 8]
    print(numbers)
    print(sort_priority.__closure__)  # prints None

There is a line in the middle of page 32 that says: "This is how the sort method can accept a closure function as the key argument." But the helper function is actually only a nested function rather than a closure function.
A closure function must be a nested function, but a nested function is not necessarily a closure function.

To be a closure function:

1. It must itself be a nested function.
2. It must refer to a value defined in the enclosing function.
3. The enclosing function must return the nested function.

Obviously the helper function doesn't meet #3 above: it isn't returned by the enclosing function.
Also, if helper were a closure function, sort_priority.__closure__ should return something like
(<cell at 0x10d95a868: int object at 0x7f99e340bf68>, <cell at 0x10da08750: int object at 0x7f99e340bf50>) rather than None.
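
One detail worth checking here: __closure__ is an attribute of the function that captures variables, so any cells would show up on helper rather than on sort_priority. A minimal sketch, with a module-level observed dict added purely for inspection (it is not part of the book's code):

```python
observed = {}  # Hypothetical scratch space for inspecting helper


def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    # Record what helper captured before it is used as the sort key
    observed['freevars'] = helper.__code__.co_freevars
    observed['closure'] = helper.__closure__
    values.sort(key=helper)


numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = [2, 3, 5, 7]
sort_priority(numbers, group)

print(numbers)
print(sort_priority.__closure__)  # The outer function captures nothing
print(observed['freevars'])       # Names helper takes from the enclosing scope
print(observed['closure'])        # Cell objects held by helper
```

If this behaves as expected, helper holds a cell for group even though it is never returned from sort_priority.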

Whitespace not preserved on `pprint` output

Here's an example from page 75:

>>>
{'left': {'left': None,
          'right': {'left': None, 'right': None, 'value': 9},
          'value': 7},
'right': {'left': {'left': None, 'right': None, 'value': 11},
           'right': None,
           'value': 13},
'value': 10}

The 'value' key should have a space before it so it lines up with 'left'. There's another example on page 76 and probably many others throughout the book.

Item 19: Unit conversion should use multiply not divide

This function has a / where it should have a *. You should multiply by units_per_kg not divide.

def flow_rate(weight_diff, time_diff,
              period=1, units_per_kg=1):
    return ((weight_diff / units_per_kg) / time_diff) * period

From the example:

weight_diff = 0.5
time_diff = 3
>>> 0.5 / 3
0.16666666666666666
>>> 0.5 / 3 * 3600
600.0
>>> 600 * 2.2
1320.0
>>> 600 / 2.2
272.7272727272727
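
A sketch of the change proposed above, assuming weight_diff is in kilograms, time_diff is in seconds, and units_per_kg is how many of the target unit make up one kilogram (2.2 for pounds). Those unit assumptions are read off the example numbers, not taken from a confirmed erratum.

```python
def flow_rate(weight_diff, time_diff, period=1, units_per_kg=1):
    # Multiply to convert kilograms into the requested unit
    return ((weight_diff * units_per_kg) / time_diff) * period


weight_diff = 0.5
time_diff = 3
kg_per_hour = flow_rate(weight_diff, time_diff, period=3600)
pounds_per_hour = flow_rate(weight_diff, time_diff,
                            period=3600, units_per_kg=2.2)
print(kg_per_hour, pounds_per_hour)
```

With multiplication, the pounds-per-hour figure comes out larger than the kilograms-per-hour figure, which matches the 600 * 2.2 line in the arithmetic above.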

Item 25: Inconsistent method resolution order

p. 71 describes MRO as "depth-first, left-to-right", but the example on p. 72 is breadth-first.

According to

http://docstore.mik.ua/orelly/other/python/0596001886_pythonian-chp-5-sect-2.html#pythonian-CHP-5-FIG-1

the former is "classic" and the latter is "new-style".

On a closely related note, in the paragraph after the last block of code on p. 72, it doesn't seem accurate to say that TimesFiveCorrect.__init__ calls PlusTwoCorrect.__init__; that's just the order in which those initializers are run.
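
The resolution order can be printed rather than inferred. The sketch below assumes a diamond hierarchy like the one in the book; TimesFiveCorrect and PlusTwoCorrect are named in this report, while MyBaseClass and GoodWay are assumed names for the shared base and the final subclass.

```python
class MyBaseClass:
    def __init__(self, value):
        self.value = value


class TimesFiveCorrect(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)  # Next class in the instance's MRO
        self.value *= 5


class PlusTwoCorrect(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 2


class GoodWay(TimesFiveCorrect, PlusTwoCorrect):
    def __init__(self, value):
        super().__init__(value)


mro_names = [cls.__name__ for cls in GoodWay.__mro__]
print(mro_names)

foo = GoodWay(5)
print(foo.value)
```

For a GoodWay instance, the super().__init__ call written inside TimesFiveCorrect dispatches to PlusTwoCorrect.__init__, because that is the next class in the instance's MRO. Whether that counts as "calls" or "order in which initializers run" is the wording question raised above.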

Item 39, Page 134: ClosableQueue close() method

This isn't an error in your book but more of a gotcha; you define the ClosableQueue class as a subclass for Queue.Queue (I'm using Python 2.7 here); ClosableQueue implements a close() method to inform queue consumers that no more items will be added to the queue. This is totally fine with Queue.Queue, but if you want to take it to the next level with multiprocessing.Queue, you encounter an unexpected problem because multiprocessing.Queue already implements a close() method that does something different:

Indicate that no more data will be put on this queue by the current process. The background thread will quit once it has flushed all buffered data to the pipe. This is called automatically when the queue is garbage collected.

As an unwitting user, I subclassed ClosableQueue from multiprocessing.Queue, which resulted in the base class close() method being overridden -- not what I wanted!

You may want to add that caveat to the book or consider changing the ClosableQueue name to something like FinalizableQueue.

P.S.
Loved the book!

Item #23: No need for a lambda expression

David writes:

question: you have an example with:

names.sort(key=lambda x: len(x))

Why a lambda rather than just 'key=len'?

That's in reference to this part of the book:

Here, I sort a list of names based on their lengths by
providing a lambda expression as the key hook:

names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
names.sort(key=lambda x: len(x))
print(names)

I should remove the lambda

I think I put it in there to emphasize how the key parameter takes a function. But I could just say I can pass in the len built-in function and be done with it. As it is, I'm kind of confusing it with currying functions, which was not the goal of this item.
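
The version without the lambda would look roughly like this, passing the len built-in directly as the key hook:

```python
names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
names.sort(key=len)  # len is already a one-argument function
print(names)
```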

Item 41, p. 149 grammar

"However, the cost of doing so is high and may introduce bugs."

To me this reads like "the cost of doing so may introduce bugs", whereas you want to say that "doing so may introduce bugs", no? Perhaps this is fixed by something like: "However, doing so has a high cost and may introduce bugs."

Item 21: For overflow error you must use a float

This doesn't work:

result = safe_division(1, 10**500, True, False)

It must be 10.0

result = safe_division(1, 10.0**500, True, False)
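
Which operand has to be a float can be checked directly. In CPython 3, as far as I can tell, dividing two ints underflows quietly to 0.0, a float numerator makes the huge int divisor fail conversion to float, and 10.0**500 overflows while the argument is being evaluated, before safe_division is ever entered. Python 2's integer division differs, so treat this as a Python 3 sketch only; outcome is a hypothetical helper.

```python
def outcome(thunk):
    # Report either the computed value or the name of the error raised
    try:
        return thunk()
    except OverflowError:
        return 'OverflowError'


int_over_int = outcome(lambda: 1 / 10**500)
float_over_int = outcome(lambda: 1.0 / 10**500)
float_power = outcome(lambda: 10.0**500)

print('1 / 10**500   ->', int_over_int)
print('1.0 / 10**500 ->', float_over_int)
print('10.0**500     ->', float_power)
```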

Item 48 should say "decimal places" instead of "decimal points"

Item 48, Page 172

"28 decimal points"

That's wrong. It should be decimal places.

p. 218 index entry

"class variable
super built_in function with, 73"

This should, obviously, say built-in.

Since we're on the subject, perhaps there should be a distinct index entry on built-in functions (and possibly another one on special methods). Such index entries would be useful when using the book as a reference.

BTW, is the index listing page numbers of only first instances/definitions? I'm asking because e.g. an iter method is used on p. 135 but not mentioned under iter in the index.

(I've finished reading the book, so I probably won't be posting any more issues. I thought I'd take the opportunity to say that I really enjoyed Effective Python. Keep up the good work!)

item 6: odds and evens are opposites

Item 6, page 13, a technicality: odds = a[::2] actually gives only the even indexes (it starts with index 0, which is even) and should therefore be evens = a[::2].

Likewise, evens = a[1::2] should be odds = a[1::2], since it starts on the odd index 1.

Kind regards,
Bo Rydberg
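
A small sketch with the naming suggested above, using a made-up list of letters so that the values cannot be confused with the indexes:

```python
a = ['a', 'b', 'c', 'd', 'e', 'f']
evens = a[::2]    # Items at indexes 0, 2, 4
odds = a[1::2]    # Items at indexes 1, 3, 5
print(evens)
print(odds)
```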

Document how to chain iterator send() for Python 2

Since there's no yield from, you need to use a loop in order to chain iterators. However, with the approach explained in Item 40, a simple loop doesn't properly pass sent values through. You have to nest the send() calls like this:

    it = count_neighbors(y, x)
    try:
        value = next(it)
        while True:
            value = it.send((yield value))
    except MyReturn as e:
        neighbors = e.value

A full example showing the problem is worked out in this gist:

https://gist.github.com/bslatkin/6934d53441198f3db30d32cb7f1dbf73#file-my_coroutine_in_py_27-py-L33
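
The nested-send pattern above can be exercised end to end with a small stand-in. In this sketch, inner and outer are hypothetical generators, and MyReturn is assumed to be a plain exception carrying a value, as in the Python 2 workaround from Item 40. It is written to run on Python 3, though the delegation loop itself avoids yield from.

```python
class MyReturn(Exception):
    def __init__(self, value):
        self.value = value


def inner():
    # Accumulates three sent values, then "returns" via MyReturn
    total = 0
    for _ in range(3):
        received = yield total
        total += received
    raise MyReturn(total)


def outer():
    it = inner()
    try:
        value = next(it)
        while True:
            # Forward whatever the caller sends straight into inner
            value = it.send((yield value))
    except MyReturn as e:
        result = e.value
    yield ('done', result)


driver = outer()
results = [next(driver), driver.send(10), driver.send(20), driver.send(5)]
print(results)
```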

Item #15: 'nonlocal' workaround for Python 2

Instead of using lists or dictionaries to send data out of inner scope, I've seen the following alternatives that seem cleaner and more explicit:

Using the function object as namespace container

# Example 9
def sort_priority(numbers, group):
    sort_priority.found = False
    def helper(x):
        if x in group:
            sort_priority.found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return sort_priority.found

Using namespace classes

# Example 9
def sort_priority(numbers, group):
    class NS: found = False
    def helper(x):
        if x in group:
            NS.found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return NS.found
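
For comparison, this is roughly what the same function looks like on Python 3, where the nonlocal statement makes the workarounds above unnecessary. It is a sketch of the Python 3 form only and does not run on Python 2.

```python
def sort_priority(numbers, group):
    found = False

    def helper(x):
        nonlocal found  # Rebind the variable in the enclosing scope
        if x in group:
            found = True
            return (0, x)
        return (1, x)

    numbers.sort(key=helper)
    return found


numbers = [8, 3, 1, 2, 5, 4, 7, 6]
found = sort_priority(numbers, {2, 3, 5, 7})
print(found, numbers)
```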

Item 24, p. 68: InputData vs GenericInputData

"are up to the InputData concrete subclass to interpret":

as shown in the code that follows, this is actually referring to GenericInputData concrete subclasses, not InputData ones. This construction is probably a result of the way the previous paragraph (on p. 67) is set up, where InputData is essentially used as a synonym of GenericInputData.

Use "numerator" not "number" in item #19

"With the call remainder(20, 7), it’s not evident which argument is the number and which is the divisor without looking at the implementation of the remainder method."

I should use "numerator" throughout because this is confusing.

Item 8, suggestion of two expressions

The rule of thumb is to avoid using more than two expressions in a list comprehension. This could be two conditions, two loops, or one condition and one loop.

This sentence could be more specific. What I mean here is that you have the primary expression, which is the value the list comprehension takes on, and then the dependent expressions in conditions and loops:

[exp1 exp2 exp3]

The goal is to minimize the complexity of this so only 2 out of 3 are complex.

# Two conditions (exp1 is a simple loop, exp2 and exp3 are complex conditions)
[x for x in y if x % 2 == 0 and x % 3 == 0]

# A loop and a condition (exp1 is a more complex expression, exp2 is a simple loop, exp3 is a complex condition)
[math.exp(x) for x in y if x % 2 == 0]

# Two loops (exp1 is a complex expression, exp2 and exp3 are loops)
[math.exp(x) for y in z for x in y]

Item 5, page 13. Small typo(?)

Hi!

You wrote:

If you assign a slice with no start or end indexes, you’ll replace its
entire contents with a copy of what’s referenced (instead of allocating
a new list).

b = a
print('Before', a)
a[:] = [101, 102, 103]
assert a is b # Still the same list object
print('After ', a) # Now has different contents

I think it's not interesting to print a afterward, since we want to see that b has not changed, especially as you have already described slice assignment in the previous example.

Sorry, if I'm wrong.

Best wishes and thanks for the book!

Item 26, p. 74: `_traverse` method of `ToDictMixin` class has unused parameter `key`

As far as I can tell, the key parameter isn't used for anything in the method body except to be passed back in for the recursive call.

Item 32 Things to Remember: "__getattr__ only gets called once"

"__getattr__ only gets called once when accessing a missing
attribute"

should read

"__getattr__ only gets called when accessing a missing
attribute"

i.e., the "once" is wrong.

Fix /tmp references for Windows

Item 3, page 7 and Item 13, page 26. Windows complains about /tmp. It's easier just to use the current directory, like this:

handle = open('random_data.txt')  # May raise IOError
try:
    data = handle.read()  # May raise UnicodeDecodeError
finally:
    handle.close()        # Always runs after try:

Item 29, p. 89: BoundedResistance has no attribute _ohms

The parent class Resistor has no _ohms attribute; it defines ohms instead.
BoundedResistance refers to _ohms nonetheless, so the example won't run as it is:

class Resistor(object):
    def __init__(self, ohms):
        self.ohms - ohms
        self.voltage = 0
        self.current = 0

class BoundedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)

    @property
    def ohms(self):
        return self._ohms

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError
        self._ohms = ohms


r3 = BoundedResistance(1e3)
r3.ohms = 0

The output is:

Traceback (most recent call last):
  File "attr.py", line 22, in <module>
    r3 = BoundedResistance(1e3)
  File "attr.py", line 9, in __init__
    super().__init__(ohms)
  File "attr.py", line 3, in __init__
    self.ohms - ohms
  File "attr.py", line 13, in ohms
    return self._ohms
AttributeError: 'BoundedResistance' object has no attribute '_ohms'
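
Note that the traceback points at the line self.ohms - ohms, which reads the property (a subtraction) rather than assigning to it. A sketch of the same classes with an assignment in Resistor.__init__, on the assumption that the minus sign is a transcription slip; with it, the subclass setter runs during construction and creates _ohms. The error message text is made up.

```python
class Resistor:
    def __init__(self, ohms):
        self.ohms = ohms  # Assignment, so a subclass property setter runs here
        self.voltage = 0
        self.current = 0


class BoundedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)

    @property
    def ohms(self):
        return self._ohms

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError('%f ohms must be > 0' % ohms)
        self._ohms = ohms


r3 = BoundedResistance(1e3)
error = None
try:
    r3.ohms = 0
except ValueError as e:
    error = e  # Validation rejects the value; _ohms is left unchanged

print(r3.ohms, repr(error))
```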

Item 39, p. 135 syntax highlighting

for _ in range(1000):

in and 1000 should be in purple/blue as elsewhere in the book (e.g. on p. 131).

Classes in improper order

Using Python 2 (which forced me to change the metaclass=Meta argument into the statement __metaclass__ = Meta, though I don't think it should matter), I found that the example would not run with the classes defined in the given order (Meta, DatabaseRow, Field, BetterCustomer), because when class DatabaseRow is first executed, it invokes Meta.__new__(), which compares the type of an object against type 'Field', and Field has not yet been defined. If, however, I switch the positions of classes DatabaseRow and Field, then everything works.

p. 117 paths of execution & workload

"When two distinct paths of execution in a program make forward progress in parallel, the time it takes to do the total work is cut in half; the speed of execution is faster by a factor of two."

This implies that there is no overhead whatsoever. More importantly, however, it implicitly assumes that the two distinct paths of execution are given the same workload. This certainly does not have to be the case: if one path of execution is given, say, 10% of the total workload, then the overall time is not cut in half.
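
The arithmetic behind this objection can be sketched in a few lines. The model is an assumption made for illustration: two paths, zero overhead, and total time set by whichever path has more work. parallel_speedup is a hypothetical helper name.

```python
def parallel_speedup(share_of_first_path):
    # Wall time is set by the slower path; overhead is ignored (assumption)
    slowest = max(share_of_first_path, 1 - share_of_first_path)
    return 1 / slowest


print(parallel_speedup(0.5))  # Even split between the two paths
print(parallel_speedup(0.1))  # One path gets only 10% of the work
```

Under this model only the even split gives a factor of two; a 10/90 split gives a speedup of about 1.11.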

Item 27, p. 82: get method indenting

The definition of get in ApiClass seems to be indented by one space less than __init__ in Child. This shouldn't matter, of course, but it appears to reflect inconsistent indenting in ApiClass itself (probably due to the page break). Everywhere else, the "d" in "def" is in the column below the second "s" in "class".

Item 41, p. 148 greatest common denominator

On p. 146 we were told the gcd function finds the greatest common divisor. On p. 148, however, this has turned into the greatest common denominator. (This is probably the result of thinking of a "lowest common denominator").

Item 45: Use datetime Instead of time for Local Clocks

p.165 there's a snippet demonstrating pytz:

arrival_nyc = '2014-05-01 23:33:24'
nyc_dt_naive = datetime.strptime(arrival_nyc, time_format)
eastern = pytz.timezone('US/Eastern')
nyc_dt = eastern.localize(nyc_dt_naive)
utc_dt = pytz.utc.normalize(nyc_dt.astimezone(pytz.utc))
print(utc_dt)

I don't quite understand the line utc_dt = pytz.utc.normalize(nyc_dt.astimezone(pytz.utc)), which seems to me to convert to a UTC datetime twice.

I tried utc_dt = pytz.utc.normalize(nyc_dt), and I feel that's sufficient unless I'm missing something.

Item 44, p. 162 pickle vs unpickle

"You can specify a stable identifier for the function to use for unpickling an object.

...

copyreg.pickle(BetterGameState, pickle_game_state)

...

you can see that the import path to pickle_game_state is encoded

...

b'\x80\x03c__main__\nunpickle_game_state

...

you can't change the path of the module in which the unpickle_game_state function is present"

It's not obvious to me that there's something wrong here, but the repeated jumps from pickling to unpickling can be confusing. For example, the main text speaks of the encoding of pickle_game_state but the output of print(serialized[:35]) contains unpickle_game_state.

I think this is a result of the fact that the first thing that pickle_game_state returns is unpickle_game_state, but even so I was confused reading this portion of the text.

Item 35, P.113, L-9, Typographic error?

Hi bslatkin.

In Item 35, P.113, L-9,

Then, the return value of that assigned to Customer.field_name.

I think that Customer.field_name should have been Customer.first_name.

Best.
Hayao Suzuki

item 33 - output shows result of pprint, but print listed - minor

As a very minor point (page 106 of printed book)

class Meta(type):
    def __new__(meta, name, bases, class_dict):
        print((meta, name, bases, class_dict))            # change print -> pprint?
        return super().__new__(meta, name, bases, class_dict)

is the code snippet, and the output listed is:

(<class '__main__.Meta'>,
 'MyClass',
 (<class 'object'>,),
 {'__module__': '__main__',
  '__qualname__': 'MyClass',
  'foo': <function MyClass.foo at 0x1014e2d08>,
  'stuff': 123})

This is the output which would result from pprint (not print).
Perhaps the code snippet should be updated to reflect this.

item 6: slight improvement of "Things to Remember"

Item 6, page 15, bullet 3 reads "..., consider doing two assignments (one to slice, another to stride) ...", but since page 14 states "The first operation should try to reduce the size of the resulting slice by as much as possible.", maybe part of bullet 3 should be phrased "..., consider doing two assignments (one to stride, another to slice) ...".