oreilly-japan / deep-learning-from-scratch

"Deep Learning from Scratch" (『ゼロから作る Deep Learning』, O'Reilly Japan, 2016)

License: MIT License

Python 5.78% Jupyter Notebook 94.22%

deep-learning-from-scratch's Issues

Is the numerical_gradient function wrong?

Here is the test code:

import numpy as np

def numerical_gradient(f, x):
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x)
    
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = float(tmp_val) + h
        fxh1 = f(x) # f(x+h)
        
        x[idx] = tmp_val - h 
        fxh2 = f(x) # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2*h)
        
        x[idx] = tmp_val # restore the original value
        it.iternext()   
        
    return grad

def func(x):
    if x.ndim == 1:
        return np.sum(x**2)
    else:
        return np.sum(x**2, axis=1)

if __name__ == "__main__":
    x = np.array([1, 2])
    grad = numerical_gradient(func, x)
    print(grad)

The output is [5000 15000].

Then I added x = x.astype(float) at the beginning of numerical_gradient():

def numerical_gradient(f, x):
    x = x.astype(float)
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x)
    
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = tmp_val + h
        fxh1 = f(x) # f(x+h)
        
        x[idx] = tmp_val - h 
        fxh2 = f(x) # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2*h)
        
        x[idx] = tmp_val # restore the original value
        it.iternext()   
        
    return grad

The output is [2. 4.]

Why is the type conversion considered unnecessary?
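A likely explanation, sketched under the assumption that the only difference between the two runs is the dtype of x: np.array([1, 2]) is an integer array, so writing tmp_val + h or tmp_val - h back into it discards the fractional part. For the first element, x + h is stored as 1 and x - h as 0, giving (5 - 4) / (2*h) = 5000. The snippet below isolates that truncation; the variable names are illustrative only.

```python
import numpy as np

h = 1e-4

# Integer array, as created by np.array([1, 2]) in the test code.
x_int = np.array([1, 2])
x_int[0] = 1 + h          # stored as 1: the fraction is dropped
plus_int = int(x_int[0])
x_int[0] = 1 - h          # stored as 0: 0.9999 is truncated toward zero
minus_int = int(x_int[0])

# Float array: the perturbation survives the assignment.
x_float = np.array([1, 2]).astype(float)
x_float[0] = 1 + h
plus_float = float(x_float[0])

print(plus_int, minus_int, plus_float)
```

Creating the input as np.array([1.0, 2.0]) avoids the problem without touching numerical_gradient.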

Question on Chapter 7.4.4: pooling layer output reshape

I have a question about the following line of code. Going by the illustration in the book, I think this line

deep-learning-from-scratch/common/layers.py

Line 266 in cb755e9

out = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)

perhaps should be

out = out.reshape(N, C, out_h, out_w)

because the tensor here seems to have a different structure from the one in the convolution computation. However, I'm not quite sure about this, so please do not hesitate to correct me if I'm wrong.

Thanks in advance!
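One way to check this is to compare both reshapes against a direct pooling computation. The sketch below assumes an im2col that orders rows as (N, out_h, out_w) and columns as (C, filter_h, filter_w); it is a reimplementation written for this check, not a copy of the repository file. Under that assumption the max values come out in (N, out_h, out_w, C) order, so the transpose is needed whenever C > 1.

```python
import numpy as np

def im2col(x, fh, fw, stride=1, pad=0):
    # Assumed layout: rows = (N, out_h, out_w), columns = (C, fh, fw).
    N, C, H, W = x.shape
    out_h = (H + 2*pad - fh)//stride + 1
    out_w = (W + 2*pad - fw)//stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    col = np.zeros((N, C, fh, fw, out_h, out_w))
    for y in range(fh):
        for xx in range(fw):
            col[:, :, y, xx, :, :] = img[:, :, y:y + stride*out_h:stride,
                                         xx:xx + stride*out_w:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N*out_h*out_w, -1)

x = np.arange(2*3*4*4, dtype=float).reshape(2, 3, 4, 4)
N, C, H, W = x.shape
pool = 2                      # 2x2 pooling with stride 2
out_h, out_w = H // pool, W // pool

col = im2col(x, pool, pool, stride=pool).reshape(-1, pool*pool)
out = np.max(col, axis=1)     # one max per (n, i, j, c), in that order

pooled_transposed = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)
pooled_direct = out.reshape(N, C, out_h, out_w)

# Reference: non-overlapping 2x2 max taken directly on the image.
reference = x.reshape(N, C, out_h, pool, out_w, pool).max(axis=(3, 5))

print(np.array_equal(pooled_transposed, reference))
print(np.array_equal(pooled_direct, reference))
```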

sys.path.append(os.pardir) did not work; specifying the full path in place of os.pardir fixed it

With sys.path.append(os.pardir), I got the following error.

Traceback (most recent call last):
  File "ch05/gradient_check.py", line 5, in <module>
    from dataset.mnist import load_mnist
ModuleNotFoundError: No module named 'dataset'

Replacing os.pardir with the full path (e.g. Users/user_name/deep-learning-from-scratch) made it work.

Environment:

macOS High Sierra
Python 3.6.2
anaconda 1.6.5

Ch. 4, page 79, formula (4.1)

I read the Chinese edition, in which formula (4.1) is

E = (1/2)sigma(yk - tk)^2

and the following program is also

0.5 * np.sum((y-t)**2)

I wonder whether the formula is wrong, because the mean squared error as usually given online is

(1/n)sigma(yk-tk)^2, where the coefficient is 1/n instead of 1/2.
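As a quick numerical check, the two definitions differ only by a constant factor: the 1/2 version equals n/2 times the 1/n version, so both are smallest at the same y. A common reason for the 1/2 is that it cancels the 2 produced by differentiating the square. The values of y and t below are made up for illustration.

```python
import numpy as np

y = np.array([0.1, 0.7, 0.2])   # illustrative network output
t = np.array([0.0, 1.0, 0.0])   # illustrative one-hot label
n = y.size

half_sse = 0.5 * np.sum((y - t) ** 2)   # formula (4.1) as quoted above
mse = np.sum((y - t) ** 2) / n          # the 1/n form

# Gradients with respect to y.
grad_half_sse = y - t
grad_mse = 2.0 * (y - t) / n

print(half_sse / mse)                    # constant ratio n / 2
print(grad_half_sse / grad_mse)          # same ratio for every component
```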

About 6.1.3 (Drawbacks of SGD) on p. 168

I plotted the gradient of the function on p. 168, but instead of the zigzag graph in Figure 6-3 on p. 170, I got a graph like the one below.
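Whether the path zigzags depends on the learning rate, which may explain the difference. The sketch below assumes the function is f(x, y) = x**2/20 + y**2; the start point (-7.0, 2.0) and both learning rates are assumptions chosen for illustration. With a rate of 0.95 each step multiplies y by 1 - 0.95*2 = -0.9, so y flips sign every step. With 0.1 it shrinks without changing sign.

```python
import numpy as np

def df(x, y):
    # Gradient of the assumed f(x, y) = x**2 / 20 + y**2.
    return x / 10.0, 2.0 * y

def sgd_path(lr, steps=10, start=(-7.0, 2.0)):
    # Plain SGD update: position <- position - lr * gradient.
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = df(x, y)
        x, y = x - lr * gx, y - lr * gy
        path.append((x, y))
    return np.array(path)

zigzag = sgd_path(lr=0.95)   # y alternates in sign
smooth = sgd_path(lr=0.1)    # y decays monotonically

print(zigzag[:4, 1])
print(smooth[:4, 1])
```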

Errors in pre_node_num in deep_convnet.py

pre_node_num = [1*3*3, 16*3*3, 16*3*3, 32*3*3, 64*3*3, 64*4*4]
The node_num is determined by filter_num * feature_map_h * feature_map_w, so I think the node_num values from 1*3*3 to 64*3*3 are wrong.

Simple practice with one dimensional data 1

Define input data1 (n x 1) = {P(T-n+1), P(T-n+2), P(T-n+3), .., P(T)}
answer = 0 if P(T)-d <= P(T+m) <= P(T)+d
       = 1 if P(T+m) > P(T)+d
       = 2 if P(T+m) < P(T)-d

Deep Learning from Scratch (ゼロから作るDeepLearning)

rmsprpo is a typo of rmsprop

https://github.com/oreilly-japan/deep-learning-from-scratch/blob/master/common/trainer.py

I believe rmsprpo in the code below (lines 25-26) is a typo of rmsprop.

    optimizer_class_dict = {'sgd':SGD, 'momentum':Momentum, 'nesterov':Nesterov,
                            'adagrad':AdaGrad, 'rmsprpo':RMSprop, 'adam':Adam}

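Chapter 8: file name of the pretrained weight parameters

Line 5 on p. 243 of the 13th printing says

The pretrained weight parameters are provided as ch08/deep_conv_net_params.pkl.

but the actual file name is ch08/deep_convnet_params.pkl. The two differ in whether there is an _ between conv and net.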

About the initialization size of the img array in col2im

deep-learning-from-scratch/common/util.py

Lines 71 to 99 in 77eba24

    
def col2im(col, input_shape, filter_h, filter_w, stride=1, pad=0):
    """

    Parameters
    ----------
    col :
    input_shape : shape of the input data (e.g. (10, 1, 28, 28))
    filter_h :
    filter_w
    stride
    pad

    Returns
    -------

    """
    N, C, H, W = input_shape
    out_h = (H + 2*pad - filter_h)//stride + 1
    out_w = (W + 2*pad - filter_w)//stride + 1
    col = col.reshape(N, out_h, out_w, C, filter_h, filter_w).transpose(0, 3, 4, 5, 1, 2)

    img = np.zeros((N, C, H + 2*pad + stride - 1, W + 2*pad + stride - 1))
    for y in range(filter_h):
        y_max = y + stride*out_h
        for x in range(filter_w):
            x_max = x + stride*out_w
            img[:, :, y:y_max:stride, x:x_max:stride] += col[:, :, y, x, :, :]

    return img[:, :, pad:H + pad, pad:W + pad]

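Regarding the initialization of img in the code above,
img = np.zeros((N, C, H + 2*pad + stride - 1, W + 2*pad + stride - 1))
I think it creates superfluous zero elements when stride > 1.

They are removed by the slice at return, so the result should not be affected,
but I am posting this because they use extra memory and make the code harder to understand.

As an improvement, I think the following array size would suffice when initializing img. What do you think?
img = np.zeros((N, C, H + 2*pad, W + 2*pad))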
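The proposal can be checked empirically. The largest row index written is (filter_h - 1) + stride*(out_h - 1), which is at most H + 2*pad - 1 by the definition of out_h, so the smaller buffer should always be large enough. The sketch below reuses the col2im logic quoted above and adds a hypothetical slack parameter (not part of the repository) to compare both buffer sizes over a handful of configurations.

```python
import numpy as np

def col2im_with_slack(col, input_shape, filter_h, filter_w, stride=1, pad=0, slack=None):
    # `slack` is a hypothetical parameter: extra rows/columns in the scratch buffer.
    N, C, H, W = input_shape
    if slack is None:
        slack = stride - 1   # buffer size used in the quoted code
    out_h = (H + 2*pad - filter_h)//stride + 1
    out_w = (W + 2*pad - filter_w)//stride + 1
    col = col.reshape(N, out_h, out_w, C, filter_h, filter_w).transpose(0, 3, 4, 5, 1, 2)
    img = np.zeros((N, C, H + 2*pad + slack, W + 2*pad + slack))
    for y in range(filter_h):
        y_max = y + stride*out_h
        for x in range(filter_w):
            x_max = x + stride*out_w
            img[:, :, y:y_max:stride, x:x_max:stride] += col[:, :, y, x, :, :]
    return img[:, :, pad:H + pad, pad:W + pad]

rng = np.random.default_rng(0)
same_everywhere = True
for H in range(5, 9):
    for f in (2, 3):
        for stride in (1, 2, 3):
            for pad in (0, 1):
                out = (H + 2*pad - f)//stride + 1
                col = rng.standard_normal((out * out, 2 * f * f))
                a = col2im_with_slack(col, (1, 2, H, H), f, f, stride, pad)
                b = col2im_with_slack(col, (1, 2, H, H), f, f, stride, pad, slack=0)
                same_everywhere = same_everywhere and bool(np.array_equal(a, b))

print(same_everywhere)
```

This only covers square inputs and the listed sizes; it is a spot check, not a proof for every configuration.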

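Deep Learning from Scratch (ゼロから作るディープラーニング)

The need for the float cast in the numerical_gradient function in common/gradient.py

At the relevant place, tmp_val is converted to float, but as on line 45, I believe the conversion is unnecessary.

I would appreciate it if you could check.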

Question about the algorithm of col2im

Dear author/team, I am a reader of the book and would like help confirming part of the logic of the algorithm used in col2im, at line 92, shown below.

deep-learning-from-scratch/common/util.py

Line 92 in ea91786

img = np.zeros((N, C, H + 2*pad + stride - 1, W + 2*pad + stride - 1))

Are the two stride - 1 terms added to the height and width of img there just to prevent an "index out of bound" error? I tried some examples without them and did not hit any error, so I am curious in which case that error would occur. I also found that col2im is not the inverse function of im2col when H + 2*pad - FH is divisible by stride.

Thank you!

import dataset.mnist in the files under ch03 and later does not work

Instead of
sys.path.append(os.pardir)
using
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))) )
made the import work, but I am a complete Python beginner, so I would welcome a better fix.

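Typo

I already sent this by email, but there has been no reply after two weeks, so I am writing it here.

Line 8 on page 232 has self.last_layer = SoftmaxWithLoss(), but the explanation on that page and the other code write it as lastLayer. Since the other variables are written in snake case and the code on GitHub uses last_layer throughout, last_layer appears to be correct.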

About the result of col2im

Thank you, I am always learning from this repository.
In col2im at line 97 of
deep-learning-from-scratch/common/util.py
the mapping from col to img uses addition assignment (+=) rather than plain assignment. Is that correct?

for y in range(filter_h):
    y_max = y + stride*out_h
    for x in range(filter_w):
        x_max = x + stride*out_w
        img[:, :, y:y_max:stride, x:x_max:stride] += col[:, :, y, x, :, :] # <- should this be img[:, :, y:y_max:stride, x:x_max:stride] = col[:, :, y, x, :, :] ?

With '+=' the unit test below failed, but after replacing it with '=' it passed.

    def test_im2col(self):
        input = np.array([[[[ 0,  1,  2], 
                            [ 4,  5,  6], 
                            [ 8,  9, 10]], 
                           [[12, 13, 14], 
                            [16, 17, 18], 
                            [20, 21, 22]]]], dtype=float)

        outExp = np.array([[0, 1, 4, 5, 12, 13, 16, 17],
                           [1, 2, 5, 6, 13, 14, 17, 18],
                           [4, 5, 8, 9, 16, 17, 20, 21],
                           [5, 6, 9,10, 17, 18, 21, 22]], dtype=float)
        
        output = util.im2col(input, 2, 2, 1, 0)
        self.assertEqual(outExp.tolist(), output.tolist())
        x = util.col2im(output, input.shape, 2, 2, 1, 0)
        self.assertEqual(x.tolist(), input.tolist())

Sorry to trouble you, but please check.
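A possible reading, offered as an assumption about intent rather than a statement from the author: col2im is meant for the backward pass of convolution, where gradients from overlapping windows have to be accumulated. In that role it is the adjoint of im2col rather than its inverse, so a round trip returns each pixel multiplied by the number of windows that cover it. The sketch below reimplements both functions in the assumed layout and shows those counts for the 3x3 input of the test above.

```python
import numpy as np

def im2col(x, fh, fw, stride=1, pad=0):
    N, C, H, W = x.shape
    out_h = (H + 2*pad - fh)//stride + 1
    out_w = (W + 2*pad - fw)//stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    col = np.zeros((N, C, fh, fw, out_h, out_w))
    for y in range(fh):
        for xx in range(fw):
            col[:, :, y, xx, :, :] = img[:, :, y:y + stride*out_h:stride,
                                         xx:xx + stride*out_w:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N*out_h*out_w, -1)

def col2im(col, input_shape, fh, fw, stride=1, pad=0):
    N, C, H, W = input_shape
    out_h = (H + 2*pad - fh)//stride + 1
    out_w = (W + 2*pad - fw)//stride + 1
    col = col.reshape(N, out_h, out_w, C, fh, fw).transpose(0, 3, 4, 5, 1, 2)
    img = np.zeros((N, C, H + 2*pad + stride - 1, W + 2*pad + stride - 1))
    for y in range(fh):
        for xx in range(fw):
            # += accumulates contributions from overlapping windows.
            img[:, :, y:y + stride*out_h:stride, xx:xx + stride*out_w:stride] += col[:, :, y, xx, :, :]
    return img[:, :, pad:H + pad, pad:W + pad]

x = np.array([[[[0, 1, 2], [4, 5, 6], [8, 9, 10]],
               [[12, 13, 14], [16, 17, 18], [20, 21, 22]]]], dtype=float)

round_trip = col2im(im2col(x, 2, 2), x.shape, 2, 2)
# How many 2x2 windows cover each pixel.
count = col2im(im2col(np.ones_like(x), 2, 2), x.shape, 2, 2)

print(count[0, 0])
print(np.allclose(round_trip, x * count))
```

With non-overlapping windows (for example a 2x2 filter with stride 2 on a 4x4 input) every count is 1 and the round trip does return the input.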

About the hyperparameter optimization code in Chapter 6

hyperparameter_optimization.py
does not run as currently written.
Line 19,
validation_num = x_train.shape[0] * validation_rate
worked after I changed it to
validation_num = int(x_train.shape[0] * validation_rate)
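The failure can be reproduced in isolation: multiplying by a float rate yields a float, and NumPy rejects a float as a slice bound. The array shape and the rate of 0.20 below are assumptions for illustration, not values taken from the script.

```python
import numpy as np

x_train = np.zeros((500, 784))   # illustrative shape
validation_rate = 0.20           # illustrative rate

validation_num = x_train.shape[0] * validation_rate   # a float
try:
    x_train[:validation_num]
    float_slice_ok = True
except TypeError:
    # Slice bounds must be integers.
    float_slice_ok = False

validation_num = int(x_train.shape[0] * validation_rate)
x_val = x_train[:validation_num]

print(float_slice_ok, x_val.shape)
```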

_pickle.PicklingError on Pythonista 3

http://omz-software.com/pythonista/ Python 3.5.1 and NumPy 1.8.0

Downloading t10k-labels-idx1-ubyte.gz ... 
Done
Downloading train-labels-idx1-ubyte.gz ... 
Done
Downloading t10k-images-idx3-ubyte.gz ... 
Done
Downloading train-images-idx3-ubyte.gz ... 
Done
Converting train-images-idx3-ubyte.gz to NumPy Array ...
Done
Converting train-labels-idx1-ubyte.gz to NumPy Array ...
Done
Converting t10k-images-idx3-ubyte.gz to NumPy Array ...
Done
Converting t10k-labels-idx1-ubyte.gz to NumPy Array ...
Done
Creating pickle file ...
Traceback (most recent call last):
  File "/private/var/mobile/Containers/Shared/AppGroup/C2DC6C38-3D98-4394-B761-B6CA01482067/Pythonista3/Documents/from GitHub/deep-learning-from-scratch-master/dataset/mnist.py", line 128, in <module>
    init_mnist()
  File "/private/var/mobile/Containers/Shared/AppGroup/C2DC6C38-3D98-4394-B761-B6CA01482067/Pythonista3/Documents/from GitHub/deep-learning-from-scratch-master/dataset/mnist.py", line 79, in init_mnist
    pickle.dump(dataset, f, -1)
_pickle.PicklingError: Can't pickle <built-in function _reconstruct>: import of module 'multiarray' failed

---

>>> import platform
>>> platform.python_version()
'3.5.1'
>>> import numpy
>>> numpy.__version__
'1.8.0'

I'm confused by the numerical_gradient function used in the TwoLayerNet class in Chapter 4

Here is the part of the code my question is about:

#In ch04/two_layer_net.py

def numerical_gradient(self, x, t):
    loss_W = lambda W: self.loss(x, t)
    
    grads = {}
    grads['W1'] = numerical_gradient(loss_W, self.params['W1'])
    grads['b1'] = numerical_gradient(loss_W, self.params['b1'])
    grads['W2'] = numerical_gradient(loss_W, self.params['W2'])
    grads['b2'] = numerical_gradient(loss_W, self.params['b2'])
    
    return grads

#In common/gradient.py

def numerical_gradient(f, x):
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x)

    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = tmp_val + h
        fxh1 = f(x) # f(x+h)
    
        x[idx] = tmp_val - h 
        fxh2 = f(x) # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2*h)
    
        x[idx] = tmp_val # restore the original value
        it.iternext()   
    
    return grad

I'm not sure whether the parameter W of the lambda function loss_W has any meaning, since the body never uses it.
How does f(x) in numerical_gradient() in gradient.py work? I think the argument will not affect the value of the loss function.
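The argument W is indeed ignored, but the code still works because numerical_gradient modifies x in place, and x is the same array object that self.params holds. self.loss therefore sees every perturbation. The sketch below shows this with a hypothetical TinyNet class standing in for TwoLayerNet; passing a copy of the array breaks the link and the gradient comes out as zero.

```python
import numpy as np

def numerical_gradient(f, x):
    h = 1e-4
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = tmp_val + h      # mutates the caller's array
        fxh1 = f(x)
        x[idx] = tmp_val - h
        fxh2 = f(x)
        grad[idx] = (fxh1 - fxh2) / (2*h)
        x[idx] = tmp_val          # restore the original value
        it.iternext()
    return grad

class TinyNet:
    # Hypothetical stand-in for TwoLayerNet, with a single parameter array.
    def __init__(self):
        self.params = {'W': np.array([1.0, 2.0, 3.0])}

    def loss(self):
        return np.sum(self.params['W'] ** 2)

net = TinyNet()
loss_W = lambda W: net.loss()     # W is ignored, as in the question

grad_shared = numerical_gradient(loss_W, net.params['W'])
grad_copy = numerical_gradient(loss_W, net.params['W'].copy())

print(grad_shared)   # close to 2 * W
print(grad_copy)     # all zeros: the copy is not what loss() reads
```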

Solution: MNIST data cannot be loaded because of HTTP Error 403: Forbidden

Environment

OS: macOS Sonoma 14.4.1

Python: 3.12.3

Package manager: Poetry (version 1.8.2)

Installed libraries (output of poetry show)

contourpy 1.2.1
cycler 0.12.1
fonttools 4.51.0
kiwisolver 1.4.5
matplotlib 3.8.4
numpy 1.26.4
packaging 24.0
pillow 10.3.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
six 1.16.0

Symptom

Running ch03/neuralnet_mnist.py raises urllib.error.HTTPError: HTTP Error 403: Forbidden, as shown below, and the MNIST data cannot be downloaded.
Opening the download link in a browser also returns 403 Forbidden.

Downloading train-images-idx3-ubyte.gz ... 
Traceback (most recent call last):
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/neuralnet_mnist.py", line 35, in <module>
    x, t = get_data()
           ^^^^^^^^^^
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/neuralnet_mnist.py", line 11, in get_data
    (x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, flatten=True, one_hot_label=False)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/../dataset/mnist.py", line 111, in load_mnist
    init_mnist()
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/../dataset/mnist.py", line 80, in init_mnist
    download_mnist()
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/../dataset/mnist.py", line 47, in download_mnist
    _download(v)
  File "/Users/takumi/github/deep-learning-from-scratch/ch03/../dataset/mnist.py", line 40, in _download
    response = urllib.request.urlopen(request).read()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 215, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 521, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 630, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 559, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 492, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/urllib/request.py", line 639, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Solution

Change line 13 of dataset/mnist.py as follows.
Before

url_base = 'http://yann.lecun.com/exdb/mnist/'

After

url_base = 'https://storage.googleapis.com/cvdf-datasets/mnist/'

Reference used for the solution

https://github.com/cvdfoundation/mnist?tab=readme-ov-file#download
- Download from the links attached to Training images, Training labels, Testing images, and Testing labels in the README.

@koki0702 Sorry for the trouble, but could you please apply this fix?

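Displaying the MNIST images in Chapter 3 appears to require the pillow library

In the part of Chapter 3 that displays an MNIST image (a sample image for handwritten digit recognition),
running mnist_show.py gives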

% python mnist_show.py
Traceback (most recent call last):
  File "mnist_show.py", line 6, in <module>
    from PIL import Image
ModuleNotFoundError: No module named 'PIL'

an error saying that the PIL library does not exist.
Installing pillow resolved it.

pip3 install pillow

It would be helpful if a note about this could be added somewhere.

URL I referred to
https://qiita.com/ukwksk/items/483d1b9e525667b77187

Suggested improvement to the softmax function code

This is a proposal for more concise code for the softmax function.

deep-learning-from-scratch/common/functions.py

Lines 31 to 39 in 77eba24

    
def softmax(x):
    if x.ndim == 2:
        x = x.T
        x = x - np.max(x, axis=0)
        y = np.exp(x) / np.sum(np.exp(x), axis=0)
        return y.T

    x = x - np.max(x) # guard against overflow
    return np.exp(x) / np.sum(np.exp(x))

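I assume the code above transposes x first, operates on it, and transposes it back because np.sum collapses a dimension.
The improved code is below.

def softmax(x):
    x = x - np.max(x, axis=-1, keepdims=True)   # guard against overflow
    return np.exp(x) / np.sum(np.exp(x), axis=-1, keepdims=True)

Using keepdims=True keeps the number of dimensions of x throughout the operations,
and specifying axis=-1 operates on the last dimension whether x is 1-D or 2-D.
I think this code is more intuitive and tidier.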
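A quick comparison of the two versions on 1-D and 2-D inputs, including large values that would overflow without the max subtraction. The function names softmax_original and softmax_keepdims are labels chosen here for the two snippets above, and the inputs are made up for the check.

```python
import numpy as np

def softmax_original(x):
    # Version quoted from common/functions.py above.
    if x.ndim == 2:
        x = x.T
        x = x - np.max(x, axis=0)
        y = np.exp(x) / np.sum(np.exp(x), axis=0)
        return y.T
    x = x - np.max(x)
    return np.exp(x) / np.sum(np.exp(x))

def softmax_keepdims(x):
    # Proposed version: normalize along the last axis.
    x = x - np.max(x, axis=-1, keepdims=True)
    return np.exp(x) / np.sum(np.exp(x), axis=-1, keepdims=True)

x1 = np.array([0.3, 2.9, 4.0])
x2 = np.array([[1000.0, 1001.0, 1002.0],
               [0.3, 2.9, 4.0]])

same_1d = bool(np.allclose(softmax_original(x1), softmax_keepdims(x1)))
same_2d = bool(np.allclose(softmax_original(x2), softmax_keepdims(x2)))
row_sums = softmax_keepdims(x2).sum(axis=1)

print(same_1d, same_2d, row_sums)
```

For inputs with three or more dimensions the two versions are not directly comparable, since the original handles only the 1-D and 2-D cases explicitly.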

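Chapter 3 [p. 76]: the pickle library appears to need importing

The first edition uses pickle in a section on p. 76 of Chapter 3,
but it seems you need to write import pickle beforehand to load the library (pickle is part of the Python standard library, so nothing has to be installed separately).
The text does not seem to mention this, so I am posting just in case!

Putting pp. 74-77 together gives something like the following.
(excluding the image display part on p. 75)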

import os, sys
sys.path.append(os.pardir)
import numpy as np
from dataset.mnist import load_mnist
from PIL import Image
import pickle # newly added line

def img_show(img):
    pil_img = Image.fromarray(np.uint8(img))
    pil_img.show()

def get_data():
    (x_train, t_train),(x_test, t_test) = \
    load_mnist(normalize = True, flatten = True, one_hot_label=False)
    return x_test, t_test

def init_network():
    with open("sample_weight.pkl", 'rb') as f:
        network = pickle.load(f)
    return network

def predict(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']
    
    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 =np.dot(z2, W3) + b3
    y = softmax(a3)
    
    return y

def sigmoid(x):
    return 1/(1 + np.exp(-x))

def softmax(a):
    c = np.max(a)
    exp_a = np.exp(a - c)
    sum_exp_a = np.sum(exp_a)
    y = exp_a / sum_exp_a
    
    return y

x,t = get_data()
network = init_network()

accuracy_cnt = 0
for i in range(len(x)):
    y = predict(network, x[i])
    p = np.argmax(y)
    if p == t[i]:
        accuracy_cnt += 1

print("Accuracy:" + str(float(accuracy_cnt) / len(x)))

(End)

Chapter 1 imread question

If you only change the file extension of the picture, you will get this error on the command line:

ValueError: invalid PNG header
You need to convert the picture with image-processing software instead.

deep_convnet.py

In deep_convnet.py:
pre_node_nums = np.array([1*3*3, 16*3*3, 16*3*3, 32*3*3, 32*3*3, 64*3*3, 64*4*4, hidden_size])

How is the number of neurons in each convolution layer computed? Why not pre_node_nums = np.array([1*3*3, 16, 16, 32, 32, 64, 64*4*4, hidden_size])?

thanks
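One reading of the list, offered as an assumption rather than the author's stated intent: each entry is the number of inputs feeding a single neuron of that layer (input channels x filter height x filter width for a convolution layer), which is the n used by He initialization in sqrt(2/n). Under that reading, 16*3*3 is the fan-in of a 3x3 filter applied to 16 input channels. The layer list and hidden_size below are illustrative values, not read from the file.

```python
import numpy as np

hidden_size = 50   # assumed value, for illustration only

# (input channels, filter size) for each convolution layer, inferred from the list above.
conv_layers = [(1, 3), (16, 3), (16, 3), (32, 3), (32, 3), (64, 3)]

# Fan-in of one output neuron: every input channel contributes a k x k patch.
fan_in = [channels * k * k for channels, k in conv_layers]

# Fully connected layers: flattened 64 x 4 x 4 feature map, then the hidden layer.
fan_in += [64 * 4 * 4, hidden_size]

# He initialization scale, sqrt(2 / n), one value per layer.
he_scales = np.sqrt(2.0 / np.array(fan_in))

print(fan_in)
print(he_scales)
```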

On switching weight_decay_lambda in ch06/overfit_weight_decay.py

Both "6.4.1 Overfitting" and "6.4.2 Weight decay" use ch06/overfit_weight_decay.py as the experiment code, but I had to switch the part of the code shown below.

The following part of ch06/overfit_weight_decay.py is set up for "6.4.2 Weight decay".

#weight_decay_lambda = 0
weight_decay_lambda = 0.1

To use it for "6.4.1 Overfitting", the part above had to be changed to

weight_decay_lambda = 0
#weight_decay_lambda = 0.1

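Comparing the code in the text with ch06/overfit_weight_decay.py showed that an edit was needed, but at first I did not notice, ran ch06/overfit_weight_decay.py as is, and was puzzled that the graph in the book did not match the result I actually got.

It would be kinder either to add a note about this edit next to "(the relevant file is ch06/overfit_weight_decay.py)" in "6.4.1 Overfitting", or to split the file in two.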

deep-learning

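p. 169: about Figure 6-2

Figure 6-2 on p. 169 is captioned
"Figure 6-2: Gradient of f(x,y) = x^2/20 + y^2", and the arrows (gradients) in the figure point toward the region near y=0.
However, when I actually compute the gradient of f above, it points in the opposite direction.

# Formulas used in the calculation and their result (gradient direction opposite to Figure 6-2)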
def f(x, y):
    return (x**2 / 20.0 + y**2)
    
def df(x, y):
    return x / 10.0, 2.0*y

Full source code

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

def f(x, y):
    return (x**2 / 20.0 + y**2)
    
def df(x, y):
    return x / 10.0, 2.0*y

x = np.arange(-10.0, 10.0, 1)
y = np.arange(-5.0, 5, 1)
X,Y = np.meshgrid(x,y)
Z = f(X,Y)

U, V = df(X, Y)  # compute the gradient of f at each point

###############
# plot
###############

fig = plt.figure(figsize=(20, 5))

# Create the vector plot of the gradient.
ax1 = fig.add_subplot(131) # draw in the first cell of a 1-row, 3-column grid
ax1.set_title('f gradient')
ax1.set_xlabel('x')
ax1.set_ylabel('y')
quiver = ax1.quiver(U,V)

plt.show()

Is the gradient in Figure 6-2 meant to depict the gradient of the loss function in equation 6.1?

W <- W - η(∂L/∂W)     (6.1)

If so, I obtained the result in Figure 6-2 with the following.

def f(x, y):
    return -(x**2 / 20.0 + y**2)
    
def df(x, y):
    return -x / 10.0, -2.0*y

ch3/mnist_show.py fails with HTTPError

Running ch3/mnist_show.py, the program that downloads and displays MNIST,
produces the following error.

urllib.error.HTTPError: HTTP Error 403: Forbidden

I resolved it by changing the function _download in dataset/mnist.py to the following.

def _download(file_name):
    file_path = dataset_dir + "/" + file_name

    if os.path.exists(file_path):
        return

    print("Downloading " + file_name + " ... ")
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"
        }
    request = urllib.request.Request(url_base+file_name, headers=headers)
    response = urllib.request.urlopen(request).read()
    with open(file_path, mode='wb') as f:
        f.write(response)
    print("Done")

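The cause seems to be that access to the download site http://yann.lecun.com/ is not permitted.
The request has to tell the server that it comes from a browser.
In my case, spoofing the header as Firefox resolved it.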

Chapter 6: Batch Normalization sample code

p. 189, Section 6.3.2 (12th printing)
The text reads as if running ch06/batch_norm_test.py
produces Figure 6-18,
but Figure 6-19 is what actually appears.
