rougier / numpy-100

100 numpy exercises (with solutions)

License: MIT License

numpy-100's Introduction

100 numpy exercises

This is a collection of numpy exercises from the numpy mailing list, Stack Overflow, and the numpy documentation. I've also created some problems myself to reach the 100 limit. The goal of this collection is to offer a quick reference for both old and new users, and also to provide a set of exercises for those who teach. For extended exercises, make sure to read From Python to NumPy.

→ Test them on Binder
→ Read them on GitHub

Note: markdown and ipython notebook are created programmatically from the source data in source/exercises.ktx. To modify the content of these files, please change the text in the source and run the generators.py module with a python interpreter with the libraries under requirements.txt installed.

The keyed text format (ktx) is a minimal, human-readable key-value format for storing text (markdown or other) indexed by keys.

This work is licensed under the MIT license.

Variants in Other Languages

Julia: 100 Julia Exercises.

numpy-100's Issues

Problem 54 error

With the original code, I encounter this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-234023c5571a> in <module>()
      5                 6,  ,  , 7, 8\n
      6                  ,  , 9,10,11\n""")
----> 7 Z = np.genfromtxt(s, delimiter=",", dtype=np.int)
      8 print(Z)

C:\Users\yiyuezhuo\Anaconda3\lib\site-packages\numpy\lib\npyio.py in genfromtxt(fname, dtype, comments, delimiter, skip_header, skip_footer, converters, missing_values, filling_values, usecols, names, excludelist, deletechars, replace_space, autostrip, case_sensitive, defaultfmt, unpack, usemask, loose, invalid_raise, max_rows)
   1480                     first_line = (
   1481                         asbytes('').join(first_line.split(comments)[1:]))
-> 1482             first_values = split_line(first_line)
   1483     except StopIteration:
   1484         # return an empty array if the datafile is empty

C:\Users\yiyuezhuo\Anaconda3\lib\site-packages\numpy\lib\_iotools.py in _delimited_splitter(self, line)
    217     def _delimited_splitter(self, line):
    218         if self.comments is not None:
--> 219             line = line.split(self.comments)[0]
    220         line = line.strip(asbytes(" \r\n"))
    221         if not line:

TypeError: Can't convert 'bytes' object to str implicitly

I read the NumPy documentation and found:

Note that generators must return byte strings in Python 3k.

When I change StringIO to BytesIO and add the b prefix before the string, I get the correct result.
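The workaround described above can be sketched as follows. This is a minimal sketch, assuming the three-row input visible in the traceback; the built-in int is used because np.int was removed from newer NumPy releases.

```python
from io import BytesIO
import numpy as np

# Byte-string input, as the issue suggests; blank fields are missing values.
s = BytesIO(b"""1, 2, 3, 4, 5
6,  ,  , 7, 8
 ,  , 9,10,11
""")

# dtype=int stands in for the removed np.int alias.
Z = np.genfromtxt(s, delimiter=",", dtype=int)
print(Z)
```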

Fail to load the ipython notebook

Hi, I got something wrong with the ipython notebook in Binder.

what I got:

When I opened the 100 numpy exercises.ipynb file with my local IPython, I got the same error.
My IPython version: 4.2.1

Apprentice exercise 7 will not run as written

The scipy comparison will not run as written and throws an AttributeError on the line D = scipy.spatial.distance.cdist(Z,Z). To resolve this, you need to import scipy.spatial on the previous line.

Why can we normalize an array the way No. 22 does?

I know what np.mean() does and how np.std() is computed, but I couldn't figure out why combining them normalizes the array.

Why does dividing work here, as in Z = (Z - np.mean(Z)) / np.std(Z)?
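One way to see why this works is to check the result numerically: subtracting the mean shifts the data so its mean is 0, and dividing by the standard deviation rescales it so its spread is 1. A small sketch, using a seeded generator only to make the run repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.random((5, 5))

# Shift to zero mean, then scale to unit standard deviation.
Zn = (Z - Z.mean()) / Z.std()

print(Zn.mean())  # very close to 0
print(Zn.std())   # very close to 1
```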

18 - Alternative that might be clearer

Z = np.tile([[0, 1], [1, 0]], (4, 4))
print(Z)

a couple of suggestions

Here are a couple of examples I've stumbled upon while teaching and learning. All of these are variations on existing themes, and none is very challenging.
Feel free to use as you see fit.

(demonstrates chaining of bitwise operations)
Given a 1D array, negate all elements which are between 3 and 8, in place.

>>> a = np.arange(11)
>>> a[(3 < a) & (a <= 8)] *= -1


Sum a small array faster than np.sum

>>> a = np.arange(10)
>>> np.add.reduce(a)


Given two arrays, ``x`` and ``y``, construct the Cauchy matrix, 
:math:`C_{i,j} = 1/(x_i - y_j)`

>>> x = np.arange(8)
>>> y = x + 0.5
>>> C = 1.0 / np.subtract.outer(x, y)
>>> print(np.linalg.det(C))


Repeat the previous exercise, but this time do the determinant computation
in logspace to avoid a possible over/underflow.

>>> x = np.arange(8)
>>> y = x + 0.5
>>> C = 1.0 / np.subtract.outer(x, y)
>>> # using the fact that log det C == tr log C
>>> from scipy.linalg import logm
>>> np.trace(logm(C))


Given an integer ``n`` and a 2D array ``x``, select from ``x`` the rows which
can be interpreted as draws from a multinomial distribution with ``n`` degrees,
i.e., the rows which only contain integers and which sum to ``n``.

>>> n = 4
>>> x = np.asarray([[1, 0, 3, 8],
...                 [2, 0, 1, 1],
...                 [1.5, 2.5, 1, 0]])
>>> mask = np.logical_and.reduce(np.mod(x, 1) == 0, axis=-1)
>>> mask &= (x.sum(axis=-1) == n)
>>> x[mask]
array([[ 2.,  0.,  1.,  1.]])
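For the log-space determinant suggestion above, a NumPy-only alternative may be worth mentioning. This sketch swaps in np.linalg.slogdet instead of scipy's logm; it returns the sign and the log of the absolute value of the determinant separately, which also covers the case where the determinant is negative.

```python
import numpy as np

x = np.arange(8)
y = x + 0.5
C = 1.0 / np.subtract.outer(x, y)

# sign is +1 or -1 here; logabsdet is log(|det C|).
sign, logabsdet = np.linalg.slogdet(C)

# Direct computation, for comparison only.
det_direct = np.linalg.det(C)
print(sign, logabsdet, det_direct)
```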

Simpler solution for Q.25

Given a 1D array, negate all elements which are between 3 and 8, in place.

Current solution:

Z = np.arange(11)
Z[(3 < Z) & (Z <= 8)] *= -1
print(Z)

Cleaner solution:

Z = np.arange(11)
Z[3:9] *= -1
print(Z)
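A quick check suggests the two versions are not equivalent. The mask selects values greater than 3 and up to 8, whereas the slice 3:9 selects positions 3 to 8, so it also negates the value 3. The slice approach also relies on each value being equal to its index, which only holds for np.arange. A sketch comparing them:

```python
import numpy as np

# Mask-based version from the current solution.
masked = np.arange(11)
masked[(3 < masked) & (masked <= 8)] *= -1

# Slice proposed in this issue: positions 3..8.
sliced = np.arange(11)
sliced[3:9] *= -1

# Slice matching the mask for this particular array: positions 4..8.
shifted = np.arange(11)
shifted[4:9] *= -1

print(masked)
print(sliced)
print(shifted)
```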

Jupyter notebook files are not opening

When I try to view the .ipynb files on GitHub:

100_Numpy_exercises.ipynb
100_Numpy_exercises_with_hint.ipynb

I get the error

Sorry, something went wrong. Reload?

I also tried to view them on https://nbviewer.jupyter.org/ but without success either:
https://nbviewer.jupyter.org/github/rougier/numpy-100/blob/master/100_Numpy_exercises.ipynb
https://nbviewer.jupyter.org/github/rougier/numpy-100/blob/master/100_Numpy_exercises_with_hint.ipynb

Error reading JSON notebook

Same when running locally on my computer.

Note: 100_Numpy_exercises_no_solution.ipynb is opening without problem.

Q36: np.ceil yields a wrong answer

Q: Extract the integer part of a random array using 5 different methods
A:

...
print (np.ceil(Z)-1)
....

But it yields wrong results when a float has no fractional part:

In[]:
Z = np.random.uniform(0,10,10)

Z = np.hstack([Z, [1., -1.]])

print (Z - Z%1)
print (np.floor(Z))
print (np.ceil(Z)-1)
print (Z.astype(int))
print (np.trunc(Z))

Out[]:
[ 4.  1.  2.  6.  0.  9.  0.  8.  4.  8.  1. -1.]
[ 4.  1.  2.  6.  0.  9.  0.  8.  4.  8.  1. -1.]
[ 4.  1.  2.  6.  0.  9.  0.  8.  4.  8.  0. -2.]
[ 4  1  2  6  0  9  0  8  4  8  1 -1]
[ 4.  1.  2.  6.  0.  9.  0.  8.  4.  8.  1. -1.]

The np.ceil(Z)-1 method should be removed.
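The same effect can be reproduced on a small fixed array, without random input. The values below are chosen for illustration: two have a fractional part and two are whole numbers.

```python
import numpy as np

Z = np.array([0.5, 2.7, 1.0, 3.0])

floored = np.floor(Z)        # integer part for non-negative values
ceil_minus_one = np.ceil(Z) - 1

print(floored)         # [0. 2. 1. 3.]
print(ceil_minus_one)  # [0. 2. 0. 2.]  (whole numbers are off by one)
```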

Solution for 76 (can be wrong when the input is a view of an array)

76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1])

# Author: Joe Kington / Erik Rigtorp
from numpy.lib import stride_tricks

def rolling(a, window):
    shape = (a.size - window + 1, window)
    strides = (a.itemsize, a.itemsize)
    return stride_tricks.as_strided(a, shape=shape, strides=strides)

Z = rolling(np.arange(10), 3)
print(Z)

When a strided view of an array is used as input:

a = np.arange(10)
a = a[::2]
Z = rolling(a, 3)
print(Z)

The output is:

[[0 1 2]
 [1 2 3]
 [2 3 4]]

My solution is using strides instead of itemsize. Therefore, the answer looks like this:

from numpy.lib import stride_tricks

def rolling(a, window):
    shape = (a.size - window + 1, window)
    strides = (a.strides[0], a.strides[0])
    return stride_tricks.as_strided(a, shape=shape, strides=strides)

In this way, the result is correct.

[[0 2 4]
 [2 4 6]
 [4 6 8]]
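As a further option, and assuming NumPy 1.20 or newer is available, numpy.lib.stride_tricks.sliding_window_view builds the same windows and takes the input's own strides into account, so a strided view is handled without computing strides by hand. A small sketch:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Strided view: every second element of 0..9.
a = np.arange(10)[::2]

# Read-only view of all length-3 windows over a.
Z = sliding_window_view(a, 3)
print(Z)
```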

couldn't run initialise.py

An error occurred when I ran initialise.py:

`UnicodeDecodeError Traceback (most recent call last)
D:\cs\mechine learning\numpy-100-master\initialise.py in
1 import numpy as np
2
----> 3 import generators as ge
4
5

D:\cs\mechine learning\numpy-100-master\generators.py in
34
35 HEADERS = ktx_to_dict(os.path.join('source', 'headers.ktx'))
---> 36 QHA = ktx_to_dict(os.path.join('source', 'exercises100.ktx'))
37
38

D:\cs\mechine learning\numpy-100-master\generators.py in ktx_to_dict(input_file, keystarter)
9
10 with open(input_file, 'r+') as f:
---> 11 lines = f.readlines()
12
13 k, val = '', ''

UnicodeDecodeError: 'gbk' codec can't decode byte 0x86 in position 59: illegal multibyte sequence`

When I worked around this error by changing 'r+' to 'rb', another error occurred, so I'm confused.
Is there a problem with my environment, or is something else wrong?
If it helps, my Python version is 3.7.

Solution to "How to find rows of A that contain elements of each row of B" exercise

Hi,
It seems the solution given is wrong unless I misunderstood the question. The question is:

Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A
that contain elements of each row of B regardless of the order of the elements in B?

The solution provided selects rows of A that contain (1) at least two unique elements of B (can be the same unique element of B repeated twice) or (2) at least one element that occurs 2 or more times in B. The rows of B are irrelevant. This seems not to answer the question correctly.
Example:

B = np.array([[0,1],[2,2]])
A = np.array([[3,3,3],[0,1,3],[0,0,3],[1,1,3],[2,3,3],[0,2,3],[1,2,3]])

The correct solution should pick rows of A that contain 0 and 2 or 1 and 2 (i.e. each row of B is represented in a given row of A) - so only rows 5 and 6.
Current solution will also pick rows of A that contain [0,1], 2x 0, 2x 1 or 1x 2 - so rows 1,2,3,4.

C = (A[..., np.newaxis, np.newaxis] == B)
rows = (C.sum(axis=(1,2,3)) >= B.shape[1]).nonzero()[0]
print(rows)

[1 2 3 4 5 6]

The correct solution could be:

C = (A[..., np.newaxis, np.newaxis] == B)
rows = np.where(C.any((3,1)).all(1))[0]
print(rows)

[5 6]
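Both variants can be run side by side on the example arrays from this issue. The sketch below only repeats the two expressions quoted above, with imports added so it is self-contained.

```python
import numpy as np

B = np.array([[0, 1], [2, 2]])
A = np.array([[3, 3, 3], [0, 1, 3], [0, 0, 3], [1, 1, 3],
              [2, 3, 3], [0, 2, 3], [1, 2, 3]])

# C[i, j, k, l] is True where A[i, j] == B[k, l].
C = (A[..., np.newaxis, np.newaxis] == B)

# Current solution: counts matches without regard to the rows of B.
current = (C.sum(axis=(1, 2, 3)) >= B.shape[1]).nonzero()[0]

# Proposed solution: every row of B must be hit by some element of the row of A.
proposed = np.where(C.any((3, 1)).all(1))[0]

print(current)
print(proposed)
```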

the 5th question's solution doesn't work.

%run python -c "import numpy; numpy.info(numpy.add)"

It yields:

ERROR:root:File `u'`python.py'` not found.
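The %run magic expects a script file, which would explain the "not found" message. Inside a notebook, one option is to call numpy.info directly rather than start a new interpreter. A sketch, which writes to a buffer only so the text can be inspected (by default numpy.info prints to standard output):

```python
import io
import numpy as np

# Collect the documentation text for np.add in a string buffer.
buf = io.StringIO()
np.info(np.add, output=buf)

doc_text = buf.getvalue()
print(doc_text.splitlines()[0])
```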

#53 IS BACKWARDS

The question asks how to convert a FLOAT into an INTEGER however the answer is written as if you were to convert an INTEGER into a FLOAT. So either the question should be reversed, or the answer to the problem should be reversed.
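For reference, a float-to-integer conversion that reuses the same memory could look like the sketch below. It is one possible approach, not necessarily the intended answer: a view reinterprets the buffer as int32, and assigning through the view converts the values.

```python
import numpy as np

Z = np.array([1.5, 2.25, 99.9], dtype=np.float32)

# Same buffer, reinterpreted as 32-bit integers (no copy).
Y = Z.view(np.int32)

# Assignment converts each float to an integer, written into the shared buffer.
Y[:] = Z
print(Y)
```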

There is a wrong answer in 16

Two answers are given to 16, one is using np.pad and the second is using indexing.

Z = np.ones((5,5))
Z = np.pad(Z, pad_width=1, mode='constant', constant_values=0)
print(Z)

# Using fancy indexing
Z[:, [0, -1]] = 0
Z[[0, -1], :] = 0
print(Z)

But the first one adds a new border, while the second one just replaces the current border.
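The difference shows up in the resulting shapes. In this sketch the two approaches are applied to separate copies of the same 5x5 array, so that they can be compared directly.

```python
import numpy as np

Z = np.ones((5, 5))

# np.pad returns a larger array with a new ring of zeros.
padded = np.pad(Z, pad_width=1, mode="constant", constant_values=0)

# Index assignment keeps the shape and overwrites the outer ring.
overwritten = Z.copy()
overwritten[:, [0, -1]] = 0
overwritten[[0, -1], :] = 0

print(padded.shape)       # (7, 7)
print(overwritten.shape)  # (5, 5)
```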

Alternate solution to Q29

Q29. How to round away from zero a float array

Solution:

Z = np.random.uniform(-10, 10, 10)
Z = np.where(Z>0, np.ceil(Z), np.floor(Z))
print(Z)

There is a wrong answer in 66

Considering a (w,h,3) image of (dtype=ubyte), compute the number of unique colors (★★★)

Author: Nadav Horesh

w,h = 16,16
I = np.random.randint(0,2,(h,w,3)).astype(np.ubyte)
F = I[...,0] * 256 * 256 + I[...,1] * 256 +I[...,2]
n = len(np.unique(F))
print(np.unique(I))

In NumPy 1.19.0, in F = I[...,0] * 256 * 256 + I[...,1] * 256 + I[...,2], the constant 256 is treated as a uint16, so F is converted to uint16. I[...,0]*256*256 would need to be a uint32; otherwise it overflows to a zero matrix, so n comes out as the wrong answer, 4.

It should be corrected to F = I[...,0]*65536 + I[...,1]*256 + I[...,2]
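Since the result type of mixing a ubyte array with a Python integer depends on the NumPy version, a version-independent sketch is to widen the image to uint32 explicitly before packing the channels. The cross-check with np.unique(..., axis=0) is added here only for verification.

```python
import numpy as np

rng = np.random.default_rng(0)
w, h = 16, 16
I = rng.integers(0, 2, (h, w, 3)).astype(np.ubyte)

# Widen first so the packed value cannot overflow the 8-bit channel type.
I32 = I.astype(np.uint32)
F = I32[..., 0] * 65536 + I32[..., 1] * 256 + I32[..., 2]
n = len(np.unique(F))

# Cross-check: count unique RGB rows directly.
n_rows = len(np.unique(I.reshape(-1, 3), axis=0))
print(n, n_rows)
```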

Alternative solution for 66

w, h = 16, 16
img = np.random.randint(0, 256, (w, h, 3)).astype(np.ubyte)
colors = np.unique(img.reshape(-1, 3), axis=0)
n = len(colors) 
print(n)

Alternative solution for 21

You can also use:

np.tile(np.identity(2),(4,4))

No.9's solution is wrong in 100_Numpy_exercises_with_hints_with_solutions.md file

No.9's solution may be miscopied from No.10's solution.
I suggest the solution should be revised from Before to After.

9. Create a 3x3 matrix with values ranging from 0 to 8 (★☆☆)

hint: reshape

Before

nz = np.nonzero([1,2,0,0,4,0])
print(nz)

After

np.arange(0,9).reshape(3,3)

17. add `print(np.nan in set([np.nan])) # True`

print(np.nan == np.nan) # False
print(np.nan in set([np.nan])) # True
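The second line is True because membership tests check object identity before equality, and np.nan is a single shared object. Two separately created NaN floats behave differently, as this sketch shows:

```python
import numpy as np

equal = (np.nan == np.nan)          # NaN never compares equal to itself
same_object = (np.nan in {np.nan})  # identity check succeeds

# Two distinct NaN objects: neither identical nor equal.
a, b = float("nan"), float("nan")
distinct = (b in {a})

print(equal, same_object, distinct)  # False True False
```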

20 - solution returns index of 101st element

In question 20, it is assumed that the 100th element is at index 100, which is inconsistent with question 6, where the 5th element is assumed to be at index 4.
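Assuming the exercise refers to an array of shape (6, 7, 8), the two readings give different answers. With zero-based indexing, the 100th element is at flat index 99:

```python
import numpy as np

shape = (6, 7, 8)

# Flat index 99 is the 100th element; flat index 100 is the 101st.
idx_100th = tuple(int(i) for i in np.unravel_index(99, shape))
idx_101st = tuple(int(i) for i in np.unravel_index(100, shape))

print(idx_100th)  # (1, 5, 3)
print(idx_101st)  # (1, 5, 4)
```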

About No.25

Given a 1D array, negate all elements which are between 3 and 8, in place

the code is
Z[(3 < Z) & (Z <= 8)] *= -1

According to the title, "between 3 and 8, in place", shouldn't it be
Z[(3 < Z) & (Z < 8)] *= -1, without the =? Between 3 and 8 there are only four elements: 4, 5, 6, 7.

A doubt about question No.74

Given an array C that is a bincount, how to produce an array A such that np.bincount(A) == C? (★★★)

The given answer is shown as below.
C = np.bincount([1,1,2,3,4,4,6])
A = np.repeat(np.arange(len(C)), C)
print(A)

There may be a problem with the wording of question No. 74: the given answer only reproduces a non-strictly increasing 1-D array, not every kind of 1-D array.
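A bincount discards the order of the original values, so any array A with the same counts satisfies np.bincount(A) == C. The sketch below starts from an unsorted array, chosen for illustration, and shows that the answer returns its sorted counterpart:

```python
import numpy as np

# Unsorted input with the same values as in the exercise.
original = np.array([4, 1, 6, 1, 3, 4, 2])
C = np.bincount(original)

# Rebuild an array with the same counts; the result is sorted.
A = np.repeat(np.arange(len(C)), C)

print(A)
print(np.array_equal(np.bincount(A), C))
```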

Game of Life - 'lifeless boundary'

Hi,
The implementation of the Game of Life sets all the boundary cells (i.e. rows and columns indexed 0 and -1) to 0, independent of the statuses of the adjacent cells.

E.g. if the top-left corner is:
[[0,1,...],
[1,1,...],
then the cell [0,0] should be recognized as 'birth' and become '1'. Instead it is set to 0, as all cells on the boundaries.

Is this intended behavior, or should it be corrected? I can add a version/correction that also allows live cells on the boundaries.
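One possible shape for such a correction is sketched below. It is an assumption about how the fix could look, not the repository's implementation: the grid is padded with a ring of dead cells, so boundary cells get a full neighbourhood. The function name step is hypothetical.

```python
import numpy as np

def step(Z):
    # Pad with dead cells so every cell of Z has eight neighbours.
    P = np.pad(Z, 1, mode="constant", constant_values=0)
    # Neighbour count for each cell of the original grid.
    N = (P[:-2, :-2] + P[:-2, 1:-1] + P[:-2, 2:] +
         P[1:-1, :-2]               + P[1:-1, 2:] +
         P[2:, :-2]  + P[2:, 1:-1]  + P[2:, 2:])
    birth = (N == 3) & (Z == 0)
    survive = ((N == 2) | (N == 3)) & (Z == 1)
    return (birth | survive).astype(Z.dtype)

# Corner pattern from the example above, on a 3x3 grid.
Z = np.array([[0, 1, 0],
              [1, 1, 0],
              [0, 0, 0]])
Z1 = step(Z)
print(Z1)
```

With this rule the cell [0,0] has three live neighbours and becomes alive.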

examples 32 and 33 are identical

and this one is really cool BTW!

96. unique rows

It can be done directly by numpy unique function specifying the axis
np.unique(Z, axis=0)

#12 is wrong

I think the following is correct.
Z = np.random.randn((3,3,3))
print(Z)
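Note that np.random.randn takes the dimensions as separate integer arguments, so passing a tuple raises a TypeError, while np.random.random takes a shape tuple. The two also draw from different distributions (standard normal versus uniform on [0, 1)). A sketch of both calls:

```python
import numpy as np

# Uniform samples in [0, 1): the shape is passed as a tuple.
Z_uniform = np.random.random((3, 3, 3))

# Standard normal samples: the dimensions are separate arguments.
Z_normal = np.random.randn(3, 3, 3)

print(Z_uniform.shape, Z_normal.shape)
```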

Problem #22. Different from actual definition of Normalization

Normalization theoretically means subtracting the mean and dividing by the standard deviation, so that the data overall has zero mean and unit variance.

So the code becomes

Z = np.random.random((5,5))
a = (Z - np.mean(Z)) / np.std(Z)
print(a)

Error when running initialise.py

%run initialise.py

UnicodeDecodeError Traceback (most recent call last)
~\mygit\numpy-100\initialise.py in
1 import numpy as np
2
----> 3 import generators as ge
4
5

~\mygit\numpy-100\generators.py in
34
35 HEADERS = ktx_to_dict(os.path.join('source', 'headers.ktx'))
---> 36 QHA = ktx_to_dict(os.path.join('source', 'exercises100.ktx'))
37
38

~\mygit\numpy-100\generators.py in ktx_to_dict(input_file, keystarter)
9
10 with open(input_file, 'r+') as f:
---> 11 lines = f.readlines()
12
13 k, val = '', ''

~\Anaconda3\lib\encodings\cp1251.py in decode(self, input, final)
21 class IncrementalDecoder(codecs.IncrementalDecoder):
22 def decode(self, input, final=False):
---> 23 return codecs.charmap_decode(input,self.errors,decoding_table)[0]
24
25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 53: character maps to <undefined>

I think this error is due to the default cp1251 encoding on my localized Windows 10.
I changed line 10 in generators.py to
with open(input_file, 'r+', encoding='utf-8') as f:
and problem was solved.
I'm new to GitHub. Should I do something else, such as open a pull request?
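The effect of the explicit encoding argument can be illustrated with a throwaway file. The file name and its contents below are hypothetical stand-ins for the real .ktx source; the point is only that the file is decoded as UTF-8 regardless of the platform's default code page.

```python
import os
import tempfile

# Hypothetical sample file containing non-ASCII characters.
path = os.path.join(tempfile.mkdtemp(), "sample.ktx")
with open(path, "w", encoding="utf-8") as f:
    f.write("< q1\nCreate a 3x3 matrix (★☆☆)\n")

# Reading with encoding='utf-8' does not depend on the locale default.
with open(path, "r", encoding="utf-8") as f:
    lines = f.readlines()

print(lines)
```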

The solution to 65 doesn't make sense

The task is as follows: How to accumulate elements of a vector (X) to an array (F) based on an index list (I)? To me this means that F should contain some elements of X, in the order defined by the indices vector.

The proposed solution looks like this:

X = [1,2,3,4,5,6]
I = [1,3,9,3,4,1]
F = np.bincount(I,X)
print(F)

Which results in: [0. 7. 0. 6. 5. 0. 0. 0. 0. 3.]. X doesn't contain 0 or 7. Why is this a solution?
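The output makes sense if "accumulate" is read as "sum into": with weights, np.bincount computes F[i] as the sum of all X[k] for which I[k] == i. Index 1 receives 1 + 6 = 7, index 3 receives 2 + 4 = 6, and so on. The sketch below reproduces the result with np.add.at, used here purely as a cross-check:

```python
import numpy as np

X = [1, 2, 3, 4, 5, 6]
I = [1, 3, 9, 3, 4, 1]

# Weighted bincount: sums the entries of X that share an index in I.
F = np.bincount(I, X)

# Same accumulation written out with unbuffered in-place addition.
G = np.zeros(max(I) + 1)
np.add.at(G, I, X)

print(F)
print(G)
```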

minor error in Problem 92

When I run problem 92, I get the error below. It seems the dimension of the array should be an int.

TypeError Traceback (most recent call last)
in
1 # Author: Ryan G.
2
----> 3 x = np.random.rand(5e7)
4
5 get_ipython().run_line_magic('timeit', 'np.power(x,3)')

mtrand.pyx in mtrand.RandomState.rand()

mtrand.pyx in mtrand.RandomState.random_sample()

mtrand.pyx in mtrand.cont0_array()

TypeError: 'float' object cannot be interpreted as an integer
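Wrapping the size in int() avoids the error, since 5e7 is a float literal. The sketch below uses a smaller size than the exercise (an assumption made only to keep memory use low) and checks that three ways of cubing the array agree:

```python
import numpy as np

# The exercise uses 5e7; a smaller size is used here for illustration.
n = int(5e4)
x = np.random.rand(n)

a = np.power(x, 3)
b = x * x * x
c = np.einsum("i,i,i->i", x, x, x)

print(np.allclose(a, b), np.allclose(a, c))
```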

Alternative Solution to Q76

Since the numpy reference advises against using "numpy.lib.stride_tricks.as_strided" when possible, I came up with another solution:

Z = np.random.randint(1,20,10)
window = 4
indices = np.arange(Z.size - window + 1)[:,None] + np.arange(window)
X = Z[indices]
print(X)

Solution for 9th question (answers to the 9th and 10th questions are accidentally the same in the solution list)

Create a 3x3 matrix with values ranging from 0 to 8 (★☆☆)
ANS: a = np.arange(0,9).reshape(3,3)
print(a)

Thanks have a good-day!

Wrapping code in answers for PDF output

In my fork of this repo, see https://github.com/deeplook/numpy-100, I'm generating double-sided PDF index cards (call them quiz cards or flash cards if you like) from the markdown file 100 Numpy exercises.md with code extracted from an unrelated project. There it would be nice to manually wrap the Python code of the answers a bit more to make it fit on the cards. Would you accept a PR doing this? This might be a first step in including the code to generate the PDF output in your own repo, but I don't know if you want that, of course. It would help avoid maintaining different versions of the same Markdown file, though.

The solution to 49 doesn't work

I got an exception when running the 49th solution. The exception is as follows:
ValueError: threshold must be numeric and non-NAN, try sys.maxsize for untruncated representation
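Following the hint in the error message, passing sys.maxsize as the threshold prints every value. This sketch uses the np.printoptions context manager so that the setting is not changed globally; the 40x40 array is just an example large enough to be summarized by default.

```python
import sys
import numpy as np

Z = np.zeros((40, 40))

# Inside the context, arrays are printed in full.
with np.printoptions(threshold=sys.maxsize):
    full = str(Z)

# Outside the context, large arrays are summarized with "...".
summarized = str(Z)

print(len(full), len(summarized))
```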

Better way of showing the numpy configuration

In exercise 2, I think np.show_config() is a better way of showing the configuration than np.__config__.show().

27. round away from 0 solution is wrong

trunc() should be round(), like this:
print (np.round(Z + np.copysign(0.5, Z)))

Example: an original value of 1.4 becomes 1.9, which then rounds to 2 (correct) or truncates to 1 (incorrect).

Alternatively, this might be more readable, but maybe it teaches less. Maybe having multiple solutions for each problem would be most educational?

Z[Z<0] = np.floor(Z[Z<0])
Z[Z>0] = np.ceil(Z[Z>0])
print (Z)
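One caveat with the np.round variant: values that are already whole numbers get shifted to a half and may then round to the next integer (1.0 becomes 1.5, which rounds to 2.0). A sketch of another option, combining np.ceil on the absolute value with np.copysign, is shown next to it for comparison; the test values are chosen for illustration.

```python
import numpy as np

Z = np.array([-1.4, -1.0, 0.0, 1.0, 1.4, 2.5])

# Round the magnitude up, then restore the sign.
away = np.copysign(np.ceil(np.abs(Z)), Z)

# Variant from this issue: shift by half a unit, then round.
shifted = np.round(Z + np.copysign(0.5, Z))

print(away)
print(shifted)
```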

rougier/numpy-100 exercise - Que 16. alternate solution

How to add a border (filled with 0's) around an existing array? (★☆☆)

Proposed Solution -

a=np.random.randint(1,10,(5,5))
print(a)
a[0:,(0,-1)]=0
a[(0,-1),1:-1]=0
print(a)

Please give feedback on this; I am new to Python.

minor spelling check

23. Create a custom dtype that describes a color as four unisgned bytes (RGBA) (★☆☆)

unisgned --> unsigned

spelling mistake

In Exercise no.34

"Consider two random array A anb B, check if they are equal (★★☆)"

"and" was typed as "anb".

Find the nearest value from a given value in an array (★★☆)

How to find the closest value (to a given scalar) in a vector? (★★☆)

Separate wording from solution

Hi !

Very nice set of exercises, I am walking through it!
It would be nice to be able to scroll without fear of seeing the answer, and to do the exercises in any order.

Kind regards