Giter Site home page Giter Site logo

machinelearningwithpython's People

Contributors

jerrykurata avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

machinelearningwithpython's Issues

Incorrect data in training

Hi

This code looks wrong

print("Training True : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 1]), (len(y_train[y_train[:] == 1])/len(df.index) * 100.0)))
print("Training False : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 0]), (len(y_train[y_train[:] == 0])/len(df.index) * 100.0)))
print("Test True : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 1]), (len(y_test[y_test[:] == 1])/len(df.index) * 100.0)))
print("Test False : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 0]), (len(y_test[y_test[:] == 0])/len(df.index) * 100.0)

Training True : 537 (69.92%)
Training False : 537 (69.92%)
Test True : 231 (30.08%)
Test False : 231 (30.08%)

When counting the occurences of 1, with len(y_train[y_train[:] == 1]), it returns all the items match that. In fact, if you change the condition to ==5, it still returns the full length of the array

False positives and false negatives swapped?

Hi, I found a small mistake in one of the course videos (sorry if this is the wrong place to post such issues, I tried to leave feedback directly on pluralsight but couldn't figure out how to do so).

Specifically, in the video "Evaluating the Naive Bayes Model", at time 1:30, I believe the FP and FN labels are swapped. FP should be bottom left, FN should be top right. I noticed this when trying to compute the recall, which (if FN were equal to 33) would be TP / (TP + FN) = 52 / (52 + 33) = 0.61, not 0.65 which we should be getting.

I also recomputed the number of true/false positives and negatives from scratch using numpy primitives to verify this:

num_tp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 1).sum()
num_fp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 0).sum()
num_fn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 1).sum()
num_tn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 0).sum()
print("Number of true positives: {0}".format(num_tp))
print("Number of false positives: {0}".format(num_fp))
print("Number of false negatives: {0}".format(num_fn))
print("Number of true negatives: {0}".format(num_tn))

Output:

Number of true positives: 52
Number of false positives: 33
Number of false negatives: 28
Number of true negatives: 118

Module sklearn.cross_validation deprecated

I got a notice about the cross_validation module will be removed soon, what is the alternative for data splitting?

Anaconda3\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20. "This module will be removed in 0.20.", DeprecationWarning)

Unable to read the file pima-data.csv

Hi,
I'm trying to read the pima-data.csv file
df=pd.read_csv("path")
It returns an error stating
ParserError: Error tokenizing data. C error: Expected 1 fields in line 12, saw 2

Tried specifiying sep as well , no luck.

Also tried read_html("path") and ended up with the error
ValueError: No tables found

Also tried read_excel("path") and ended up with the erro
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'<!DOCTYP'
Could someone help me out !!!!!!

Code from examples in course?

Hi Jerry, really enjoying the course, but is the code from the video examples available somewhere? If not, it would be helpful to have since it's not always possible to see all of it in the video.
understanding_machine_learning_with_python_

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.