jerrykurata / machinelearningwithpython
Starter files for Pluralsight course: Understanding Machine Learning with Python
License: GNU General Public License v3.0
Can you update the Pima-Prediction-checkpoint.ipynb checkpoint file with the rest of the notebook from the course? Thanks!
Hi, I found a small mistake in one of the course videos (sorry if this is the wrong place to post such issues; I tried to leave feedback directly on Pluralsight but couldn't figure out how to do so).
Specifically, in the video "Evaluating the Naive Bayes Model", at time 1:30, I believe the FP and FN labels are swapped. FP should be bottom left, FN should be top right. I noticed this when trying to compute the recall: if FN were equal to 33, as labeled in the video, recall would be TP / (TP + FN) = 52 / (52 + 33) = 0.61, not the 0.65 we should be getting. With FN = 28, recall is 52 / (52 + 28) = 0.65, as expected.
I also recomputed the number of true/false positives and negatives from scratch using numpy primitives to verify this:
num_tp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 1).sum()
num_fp = np.logical_and(nb_predict_test == 1, y_test.transpose() == 0).sum()
num_fn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 1).sum()
num_tn = np.logical_and(nb_predict_test == 0, y_test.transpose() == 0).sum()
print("Number of true positives: {0}".format(num_tp))
print("Number of false positives: {0}".format(num_fp))
print("Number of false negatives: {0}".format(num_fn))
print("Number of true negatives: {0}".format(num_tn))
Output:
Number of true positives: 52
Number of false positives: 33
Number of false negatives: 28
Number of true negatives: 118
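To make the arithmetic above easy to check, here is a minimal sketch that arranges the four reported counts with rows as actual classes and columns as predicted classes, both ordered [1, 0]. That is the layout sklearn.metrics.confusion_matrix produces when called with labels=[1, 0]; the sketch assumes the video uses that ordering. The variable names are illustrative only.

```python
# Counts reported in the issue above.
tp, fp, fn, tn = 52, 33, 28, 118

# Rows are actual classes, columns are predicted classes, both ordered [1, 0].
matrix = [
    [tp, fn],  # actual 1: predicted 1 (TP), predicted 0 (FN, top right)
    [fp, tn],  # actual 0: predicted 1 (FP, bottom left), predicted 0 (TN)
]

recall = tp / (tp + fn)          # uses the top row only
precision = tp / (tp + fp)       # uses the left column only
swapped_recall = tp / (tp + fp)  # what you get if FP is mistaken for FN

print("Recall: {0:.2f}".format(recall))
print("Precision: {0:.2f}".format(precision))
print("Recall with FP/FN swapped: {0:.2f}".format(swapped_recall))
```

The swapped value is numerically the precision, which is why the mislabeled matrix yields 0.61 instead of 0.65.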
Hi
This code looks wrong
print("Training True : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 1]), (len(y_train[y_train[:] == 1])/len(df.index) * 100.0)))
print("Training False : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 0]), (len(y_train[y_train[:] == 0])/len(df.index) * 100.0)))
print("Test True : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 1]), (len(y_test[y_test[:] == 1])/len(df.index) * 100.0)))
print("Test False : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 0]), (len(y_test[y_test[:] == 0])/len(df.index) * 100.0)))
Training True : 537 (69.92%)
Training False : 537 (69.92%)
Test True : 231 (30.08%)
Test False : 231 (30.08%)
When counting the occurrences of 1 with len(y_train[y_train[:] == 1]), it returns the length of the whole array rather than only the matching items. In fact, if you change the condition to == 5, it still returns the full length of the array.
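One possible cause (an assumption, since the issue does not show how y_train was built) is that y_train is a pandas DataFrame rather than a NumPy array: masking a DataFrame with a boolean DataFrame keeps the original shape and fills non-matching cells with NaN, so len() never changes. A hedged sketch of a count that does not depend on the container type follows; class_counts and y_demo are hypothetical names, not part of the course code.

```python
import numpy as np

def class_counts(y, label):
    """Return (count, percent) of the entries in y that equal label."""
    # Flatten first so lists, (n,) arrays, (n, 1) arrays and DataFrames
    # are all handled the same way.
    flat = np.asarray(y).ravel()
    count = int(np.count_nonzero(flat == label))
    return count, 100.0 * count / flat.size

# Hypothetical labels shaped (n, 1), like a single-column target.
y_demo = np.array([[1], [0], [0], [1], [0]])

true_count, true_pct = class_counts(y_demo, 1)
false_count, false_pct = class_counts(y_demo, 0)
missing_count, missing_pct = class_counts(y_demo, 5)

print("True  : {0} ({1:0.2f}%)".format(true_count, true_pct))
print("False : {0} ({1:0.2f}%)".format(false_count, false_pct))
print("==5   : {0} ({1:0.2f}%)".format(missing_count, missing_pct))
```

Note that the percentage here is taken relative to the size of the array being counted, not len(df.index), so the true and false percentages within each split add up to 100.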
Hi,
I'm trying to read the pima-data.csv file
df=pd.read_csv("path")
It returns an error stating
ParserError: Error tokenizing data. C error: Expected 1 fields in line 12, saw 2
I tried specifying sep as well, with no luck.
Also tried read_html("path") and ended up with the error
ValueError: No tables found
Also tried read_excel("path") and ended up with the error
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'<!DOCTYP'
Could someone help me out?
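The b'<!DOCTYP' fragment in the read_excel error suggests the file on disk is an HTML page (for example, a saved repository web page) rather than the raw CSV. The sketch below is one way to check that before calling pd.read_csv; looks_like_html and the temporary file names are hypothetical and only there for illustration.

```python
import os
import tempfile

def looks_like_html(path, probe_bytes=512):
    """Return True if the file at path starts like an HTML document."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype") or head.startswith(b"<html")

# Demonstration with two throwaway files (contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    html_path = os.path.join(tmp, "saved-page.csv")
    csv_path = os.path.join(tmp, "real.csv")
    with open(html_path, "wb") as f:
        f.write(b"<!DOCTYPE html>\n<html><body>not data</body></html>")
    with open(csv_path, "wb") as f:
        f.write(b"a,b\n1,2\n")

    html_result = looks_like_html(html_path)
    csv_result = looks_like_html(csv_path)

print("saved-page.csv is HTML:", html_result)
print("real.csv is HTML:", csv_result)
```

If the check returns True for pima-data.csv, downloading the raw file again (or cloning the repository) and pointing read_csv at that copy is the likely fix.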
I got a notice that the cross_validation module will be removed soon. What is the alternative for data splitting?
Anaconda3\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20. "This module will be removed in 0.20.", DeprecationWarning)
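As the warning says, the splitting functions moved to sklearn.model_selection, and train_test_split can be imported from there with the same call style. A minimal sketch, assuming scikit-learn 0.18 or later; the arrays are placeholders with the same row count as the Pima data, not the real features.

```python
import numpy as np
from sklearn.model_selection import train_test_split  # replaces sklearn.cross_validation

# Placeholder data: 768 rows, 2 made-up feature columns, alternating labels.
X = np.arange(768 * 2).reshape(768, 2)
y = np.arange(768) % 2

# 70/30 split with a fixed seed so the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

print("Training rows: {0}".format(len(X_train)))
print("Test rows: {0}".format(len(X_test)))
```

With 768 rows and test_size=0.30 this gives 537 training rows and 231 test rows.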