roamanalytics / roamresearch

License: Apache License 2.0

Jupyter Notebook 12.64% Python 0.71% R 0.02% C 0.01% C++ 0.03% TeX 0.03% JavaScript 0.13% Shell 0.01% HTML 86.43%

roamresearch's People

andnieves gittu4 denson adamsqi tomakant peterwilliams97 sweatyrichard h101010 wn9081 leeway-liu beifeizhou ruizheliuoa ravitejaanantha nelsonli1990 aidanproy sanjeeku supanat jdc08161063 vhcg77 nunofernandes-plight smilexin timniven zahoorahmad mahmud83 kelvinson anuragreddygv323 tiancaohz souvikjana9993 jackd danyalandriano pinussilvestris arbazkhan002 yifengtao balodhi valeman levyforchh shineloveyc srl94 elocinrose dangxuanhong gz16s dbrroxane korterling lhvu2 suranjanas akg0 markgraves lanwei02 vahbuna xrosliang eliekawerk cgpotts jainnitk benjamin-du brant-95 sugar1988

roamresearch's Issues

RXCUI and ICD10 Codes

Hi,

Could you please explain more about the RXCUI codes of drugs and ICD10 codes of diseases? Where are they from? Thank you very much!

Error with hdf5 weights files when trying to reproduce Feature4Healthcare

Hi,

I can't download the weights files correctly through this link: https://s3-us-west-2.amazonaws.com/allennlp/. When I download a file with the hdf5 extension, such as elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5, and try to reproduce the experiment from the article Effective Feature Representation for Clinical Text Concept Extraction, I get an error about a corrupt weight file. The downloaded file appears to be corrupt because I don't have full access permissions.

Please help me resolve this "public access denied" error: either grant public access to the weights files or send the file to my personal Gmail address ([email protected]). I would really like to reproduce the experiment!

Thanks!
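
One way to confirm that a download is not a real weights file is to look at its first bytes. The sketch below is only an illustration: the helper name `looks_like_hdf5` is hypothetical, and it checks for the standard 8-byte HDF5 signature at offset 0 only (HDF5 also allows the signature at later offsets when a user block is present, which this sketch ignores). An access-denied response saved under an .hdf5 name would typically fail this check.

```python
# Standard 8-byte HDF5 format signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature.

    Hypothetical helper for illustration; it only inspects offset 0.
    """
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

Calling `looks_like_hdf5` on the downloaded file and getting `False` would suggest that the server returned something other than the weights (for example an error page), rather than a problem in the experiment code.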

hammering home the point

Just a comment on this post:
https://github.com/roaminsight/roamresearch/blob/master/BlogPosts/Average_precision/Average_precision_post.ipynb

It's true that linear interpolation for PR curves is overly optimistic and we should fix that. See this paper:
http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_DavisG06.pdf

However, linear interpolation is totally fine for ROC curves and your argument in "Hammering home the point" is actually wrong.
The way you picked the rounding point, you actually ended up with a better classifier. You can achieve any classifier that is linearly interpolated between points by flipping a weighted coin on which of the two end-points to use (for ROC, not PR).
There is also a more subtle issue with how you do the interpolation. Choosing any point for interpolation based on its P/R values means that you have already observed these points, so you can no longer use them as your test set. Interpolation that skips "bad" points is therefore only allowed when using a validation set.
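
The coin-flip argument for ROC curves can be checked numerically. The following is a minimal sketch on synthetic data, not code from the notebook: the scores, the two thresholds (0.4 and 0.6), the mixing weight, and all function names are assumptions chosen for illustration. It builds two threshold classifiers, mixes them per example with a weighted coin, and compares the resulting ROC point with the linear interpolation of the two end-points.

```python
import random


def roc_point(labels, predictions):
    """Return (false positive rate, true positive rate) for 0/1 predictions."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return fp / neg, tp / pos


def threshold_classifier(scores, threshold):
    """Predict positive when the score is at or above the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def coin_flip_mix(preds_a, preds_b, weight_a, rng):
    """Per example, use classifier A with probability weight_a, else B."""
    return [a if rng.random() < weight_a else b
            for a, b in zip(preds_a, preds_b)]


rng = random.Random(0)
# Synthetic data (assumption): positives score higher on average.
labels = [1 if rng.random() < 0.5 else 0 for _ in range(20000)]
scores = [rng.gauss(0.65 if y else 0.35, 0.2) for y in labels]

preds_a = threshold_classifier(scores, 0.4)
preds_b = threshold_classifier(scores, 0.6)
weight_a = 0.3

fpr_a, tpr_a = roc_point(labels, preds_a)
fpr_b, tpr_b = roc_point(labels, preds_b)
fpr_mix, tpr_mix = roc_point(
    labels, coin_flip_mix(preds_a, preds_b, weight_a, rng))

# Linear interpolation between the two end-points in ROC space.
expected_fpr = weight_a * fpr_a + (1 - weight_a) * fpr_b
expected_tpr = weight_a * tpr_a + (1 - weight_a) * tpr_b

print("mixed classifier:    ", (fpr_mix, tpr_mix))
print("linear interpolation:", (expected_fpr, expected_tpr))
```

On a sample this large the two printed points should agree closely, because both rates are linear in the mixing weight. Precision is a ratio whose denominator also depends on the mix, which is why the same argument does not carry over to PR curves.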

Using hyperopt_search with regressors

Hello,

I am trying to use hyperopt_search with random_forest_regression and gradient_boosting_regressor, but I get the following error:

TypeError: estimator should be an estimator implementing 'fit' method, <hyperopt.pyll.base.Apply object at 0x000000000D093048> was passed

One of the lines in the traceback checks whether the estimator is a classifier:

C:\user\Anaconda3\lib\site-packages\sklearn\cross_validation.py in cross_val_score(estimator, X, y, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch)
   1571 
   1572     cv = check_cv(cv, X, y, classifier=is_classifier(estimator))
-> 1573     scorer = check_scoring(estimator, scoring=scoring)
   1574     # We clone the estimator to make sure that all the folds are
   1575     # independent, and that it is pickle-able.

I am wondering if the error comes from using a regressor and not a classifier.
Thank you for your help.

Regards.
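
The error message names a `hyperopt.pyll.base.Apply` object, which suggests that what reached `cross_val_score` was an unsampled search-space expression rather than an estimator, so the same message would likely appear for a classifier too. The sketch below only imitates that kind of check with standard-library Python; `UnsampledSpace`, `ToyRegressor`, and `require_fit` are hypothetical stand-ins, not scikit-learn or hyperopt code.

```python
class UnsampledSpace:
    """Hypothetical stand-in for a search-space expression: it has no `fit`."""


class ToyRegressor:
    """Hypothetical stand-in for a regressor: any object with a `fit` method."""

    def fit(self, X, y):
        # Store the mean target as a trivial "model".
        self.mean_ = sum(y) / len(y)
        return self


def require_fit(estimator):
    """Imitate a check that rejects objects without a `fit` method."""
    if not hasattr(estimator, "fit"):
        raise TypeError(
            "estimator should be an estimator implementing 'fit' method, "
            f"{estimator!r} was passed"
        )
    return estimator


# A regressor passes the check; being a regressor is not the problem.
require_fit(ToyRegressor())

try:
    require_fit(UnsampledSpace())
except TypeError as error:
    message = str(error)
    print(message)
```

If this reading is right, the thing to look at is how the regressor entries are defined in the search space and whether a concrete estimator is constructed from the sampled parameters before scoring.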

Increasing threshold - recall is increased?

Hello,

In the article about Average Precision Score, you wrote the following:
"If we have a spread of scores, then as we increase the threshold from 0 to 1, we will incrementally improve the recall by moving positive examples from the negative bin B− to the positive bin B+. Notice that the recall never decreases as we increase the threshold."
Is this correct? It seems to me that recall increases when we decrease the threshold from 1 to 0 (as the threshold decreases, positive examples move from the negative bin B− to the positive bin B+).

Thank you.
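
The direction of the effect can be checked on a toy example. This is a small sketch with made-up labels and scores, assuming the convention that an example is placed in the positive bin when its score is at or above the threshold; the function name `recall_at` is hypothetical.

```python
def recall_at(labels, scores, threshold):
    """Recall when examples with score >= threshold are predicted positive."""
    positives = sum(labels)
    true_positives = sum(
        1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    return true_positives / positives


# Made-up data for illustration: four positives and two negatives.
labels = [1, 0, 1, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
recalls = [recall_at(labels, scores, t) for t in thresholds]

for t, r in zip(thresholds, recalls):
    print(f"threshold={t:.2f}  recall={r:.2f}")
```

Under this convention recall falls from 1.0 to 0.0 as the threshold rises from 0 to 1, so it never decreases when the threshold is lowered from 1 to 0, which matches the reading proposed in the question.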
