jerrykurata / machinelearningwithpython


Starter files for Pluralsight course: Understanding Machine Learning with Python

License: GNU General Public License v3.0

Jupyter Notebook 100.00%

machinelearningwithpython's Issues

Pima-Prediction-checkpoint.ipynb

Can you update the Pima-Prediction-checkpoint.ipynb checkpoint file with the rest of the notebook from the course? Thanks!

Different Result Although Same Logic and Data Source

I followed your tutorial for Pima-Prediction and used "./data/pima-data.csv".

My result:

Your result (origin exercise file):

What makes it different?

List Comprehension Error?

y_train[y_train[:] == 1]
and
y_train[y_train[:] == 0]

both return 537 rows.

Code from examples in course?

Hi Jerry, really enjoying the course, but is the code from the video examples available somewhere? If not, it would be helpful to have since it's not always possible to see all of it in the video.

False positives and false negatives swapped?

Hi, I found a small mistake in one of the course videos (sorry if this is the wrong place to post such issues, I tried to leave feedback directly on pluralsight but couldn't figure out how to do so).

Specifically, in the video "Evaluating the Naive Bayes Model", at time 1:30, I believe the FP and FN labels are swapped. FP should be bottom left, FN should be top right. I noticed this when trying to compute the recall, which (if FN were equal to 33) would be TP / (TP + FN) = 52 / (52 + 33) = 0.61, not 0.65 which we should be getting.

I also recomputed the number of true/false positives and negatives from scratch using numpy primitives to verify this:

num_tp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 1).sum()
num_fp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 0).sum()
num_fn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 1).sum()
num_tn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 0).sum()
print("Number of true positives: {0}".format(num_tp))
print("Number of false positives: {0}".format(num_fp))
print("Number of false negatives: {0}".format(num_fn))
print("Number of true negatives: {0}".format(num_tn))

Output:

Number of true positives: 52
Number of false positives: 33
Number of false negatives: 28
Number of true negatives: 118
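
As a quick check on the counts above, the metrics can be recomputed by hand. This is only a sketch using the four numbers from the output; the variable names and the matrix layout (positive class first, rows = actual, columns = predicted) are assumptions for illustration, not the course's code.

```python
# Counts taken from the output reported above.
tp, fp, fn, tn = 52, 33, 28, 118

# Recall uses false negatives; precision uses false positives.
recall = tp / (tp + fn)
precision = tp / (tp + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)

# Layout with the positive class listed first:
# rows are actual classes, columns are predicted classes.
matrix = [
    [tp, fn],  # actual 1: predicted 1, predicted 0
    [fp, tn],  # actual 0: predicted 1, predicted 0
]

print("Recall:    {0:0.2f}".format(recall))
print("Precision: {0:0.2f}".format(precision))
print("Accuracy:  {0:0.2f}".format(accuracy))
```

With FN = 28 the recall comes out at 0.65, and the 0.61 figure is what you get when 33 (the false positives) is used in the denominator, which is the precision.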

Incorrect data in training

This code looks wrong

print("Training True : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 1]), (len(y_train[y_train[:] == 1])/len(df.index) * 100.0)))
print("Training False : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 0]), (len(y_train[y_train[:] == 0])/len(df.index) * 100.0)))
print("Test True : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 1]), (len(y_test[y_test[:] == 1])/len(df.index) * 100.0)))
print("Test False : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 0]), (len(y_test[y_test[:] == 0])/len(df.index) * 100.0)))

Training True : 537 (69.92%)
Training False : 537 (69.92%)
Test True : 231 (30.08%)
Test False : 231 (30.08%)

When counting the occurrences of 1 with len(y_train[y_train[:] == 1]), it returns every row instead of only the matching ones. In fact, if you change the condition to == 5, it still returns the full length of the array.
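
One possible explanation, offered as a guess rather than a confirmed diagnosis: if y_train is a pandas DataFrame instead of a NumPy array (for example, if .values was left off when building y), indexing with a boolean DataFrame keeps every row and fills the non-matching cells with NaN, so len() is always the full row count. The small frame below is made-up data, and the column name "diabetes" is an assumption.

```python
import pandas as pd

# Made-up labels standing in for y_train; not the course data.
labels = pd.DataFrame({"diabetes": [1, 0, 0, 1, 0]})

# Masking a DataFrame with a boolean DataFrame keeps the shape
# and writes NaN where the condition is False.
masked_ones = labels[labels[:] == 1]
masked_fives = labels[labels[:] == 5]
print(len(masked_ones), len(masked_fives))  # both equal len(labels)

# Converting to a NumPy array first gives the expected filtering.
y = labels.values  # shape (5, 1)
num_true = len(y[y[:] == 1])
num_false = len(y[y[:] == 0])
print(num_true, num_false)

# Summing the boolean mask is a shorter way to count matches.
count_true = int((y == 1).sum())
```

If type(y_train) prints a DataFrame in your notebook, this is probably what is happening; if it is already an ndarray, the cause lies elsewhere.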

Unable to read the file pima-data.csv

Hi,
I'm trying to read the pima-data.csv file
df=pd.read_csv("path")
It returns an error stating
ParserError: Error tokenizing data. C error: Expected 1 fields in line 12, saw 2

I tried specifying sep as well, with no luck.

I also tried read_html("path") and ended up with the error
ValueError: No tables found

I also tried read_excel("path") and ended up with the error
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'<!DOCTYP'
Could someone help me out?
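
The b'<!DOCTYP' in the last error suggests the saved file may be an HTML page (for example, the GitHub web page for the file) rather than the raw CSV. That is an inference from the error text, not something confirmed here. A minimal sketch for checking this before calling pd.read_csv; looks_like_html is a hypothetical helper name, not a pandas function.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML document.

    Hypothetical helper: only inspects the first few bytes.
    """
    head = Path(path).read_bytes()[:probe_bytes].lstrip().lower()
    return head.startswith(b"<!doctype") or head.startswith(b"<html")
```

If looks_like_html("./data/pima-data.csv") returns True, downloading the raw file again (or cloning the repository) and re-running pd.read_csv is a reasonable next step.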

Module sklearn.cross_validation deprecated

I got a warning that the cross_validation module will be removed soon. What is the alternative for splitting the data?

Anaconda3\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20. "This module will be removed in 0.20.", DeprecationWarning)
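
As the warning says, the functions moved to sklearn.model_selection, and train_test_split can be imported from there with the same call. A minimal sketch follows; the random arrays are stand-ins for the Pima features and labels, and test_size=0.30 and random_state=42 are assumed values rather than settings confirmed from the course notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split  # replaces sklearn.cross_validation

# Stand-in data with 768 rows; not the real Pima dataset.
rng = np.random.default_rng(0)
X = rng.random((768, 8))
y = rng.integers(0, 2, size=768)

# Same call as before, only the import location changes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

print(X_train.shape, X_test.shape)
```

With 768 rows and a 30% test size this gives 537 training rows and 231 test rows, matching the sizes quoted in the other issues.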
