Hi,
Thanks for creating this absolutely awesome learning resource!
For context, I'm passing all tests up to and including the task3_1 set.
I've been stuck on 3.2 for a while, and thought I'd double-check the "ground truth" for the matmul tests marked task3_2 (i.e. the values stored in c2). However, when I print the expected results and recompute them separately with numpy, the two don't match.
[
  [
    [0.6272 0.3107 0.0176]
    [0.8124 0.4486 0.9398]]] @
[
  [
    [0.6620 0.4447 0.7729 0.1804]
    [0.8839 0.0619 0.2097 0.8598]
    [0.7512 0.8540 0.1345 0.8480]]] =
[
  [0.8249 0.8901 0.2055 1.2106]
  [0.8249 0.8901 0.2055 1.2106]]
>>> import numpy as np
>>> np.array([[[0.6272, 0.3107, 0.0176], [0.8124, 0.4486, 0.9398]]]) @ np.array([[[0.6620, 0.4447, 0.7729, 0.1804], [0.8839, 0.0619, 0.2097, 0.8598], [0.7512, 0.8540, 0.1345, 0.8480]]])
array([[[0.70305525, 0.31317857, 0.55228387, 0.39521154],
[1.6403041 , 1.19163182, 0.84837848, 1.32921364]]])
[
  [
    [0.0000 0.0000]
    [0.0000 0.0000]]
  [
    [0.0000 0.0000]
    [0.0000 0.1000]]] @
[
  [
    [0.0000 0.0000]
    [0.0000 0.1000]]] =
[
  [
    [0.0000 0.0000]
    [0.0000 0.0100]]
  [
    [0.0000 0.0000]
    [0.0000 0.0100]]]
>>> import numpy as np
>>> np.array([[[0.0000, 0.0000], [0.0000, 0.0000]], [[0.0000, 0.0000], [0.0000, 0.1000]]]) @ np.array([[[0.0000, 0.0000],[0.0000, 0.1000]]])
array([[[0. , 0. ],
[0. , 0. ]],
[[0. , 0. ],
[0. , 0.01]]])
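To make the mismatch explicit, here is the first case again as an allclose check (a, b, and expected below are just the values printed above, typed back in by hand):
>>> import numpy as np
>>> a = np.array([[[0.6272, 0.3107, 0.0176], [0.8124, 0.4486, 0.9398]]])
>>> b = np.array([[[0.6620, 0.4447, 0.7729, 0.1804], [0.8839, 0.0619, 0.2097, 0.8598], [0.7512, 0.8540, 0.1345, 0.8480]]])
>>> expected = np.array([[0.8249, 0.8901, 0.2055, 1.2106], [0.8249, 0.8901, 0.2055, 1.2106]])
>>> np.allclose(a @ b, expected)
False
The second case disagrees in the same way: numpy leaves the first batch element all zeros, while the expected output has 0.01 in both batch elements.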
So I'm writing this because I'm wondering whether:
- the tests prior to 3.2 failed to catch some bug on my end that messes up the "ground truth" target, or
- the tests for 3.2 are themselves buggy.
Do you have any thoughts on this?
Thanks again for putting together this masterpiece.
Edit: Also, skipping ahead to run_fast_tensor.py with the CPU backend seems to work (i.e. training takes place, the loss goes down, and the metrics go up), so I'll just ignore the two failing tests for now until I run into seemingly related issues. Next up, CUDA!
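In case it's useful for anyone else poking at this, here is the rough kind of spot-check I plan to keep around while moving on: compare the fast backend's batched matmul against numpy on random inputs. The names minitorch.TensorBackend, minitorch.FastOps, minitorch.tensor, and Tensor.to_numpy are written from memory of the scaffold, so treat this as a sketch and adjust to whatever your checkout actually exposes.

import numpy as np
import minitorch

# Spot-check the fast backend's batched matmul against a numpy reference.
# NOTE: the API names below (TensorBackend, FastOps, tensor, to_numpy) are
# assumptions from memory of the minitorch scaffold -- rename to match your repo.
backend = minitorch.TensorBackend(minitorch.FastOps)

a_np = np.random.rand(2, 3, 4)
b_np = np.random.rand(2, 4, 5)

a = minitorch.tensor(a_np.tolist(), backend=backend)
b = minitorch.tensor(b_np.tolist(), backend=backend)

# If the fast matmul is correct, this should print True.
print(np.allclose((a @ b).to_numpy(), a_np @ b_np))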