Comments (30)
Thanks @neuronalX!
This means the paper is accepted - congratulations to @sukhoy and co-authors!
I will take care of the remaining steps on Monday.
from submissions.
Thanks for the submission! I will invite reviewers soon.
from submissions.
@mlosch @neuronalX @thmosqueiro @xuedong @gdetor
Can any of you review this submission? Note that it is a letter (on reproducibility methodology for machine learning), not a replication.
from submissions.
Thank you @khinsen for the invitation to review.
I will not be available the next 2 weeks, but if it is fine, I can do it for mi-august at latest.
from submissions.
@khinsen I can review it
from submissions.
Thanks @gdetor and @neuronalX for accepting to review this submission! Mid-August is OK.
from submissions.
Summary - General Comments
In this work, the authors propose and implement a patch for the LIBLINEAR library, which implements randomized learning algorithms. The key points of the proposed work are (i) the use of a cross-platform pseudo-random number generator (PRNG) and (ii) the PRNG private state in each thread. By doing so the authors make possible the reproducibility of an experiment using the LIBLINEAR/SFMT libraries.
-
Overall the text is well-written. The authors provide all the details of their approach and methods and they describe well-enough the steps one should follow to implement their method.
-
The instructions for patching and using their source code are well-written.
-
Does the proposed method scales properly? (it's not clear in the text)
-
Page 4, last sentence in the Results section: Are these results obtained from the proposed method? Please clarify.
Source Code
I compiled and ran the source code on a Linux machine (running Ubuntu with GCC 7.4.0) without facing any problem. The example provided by the authors in the main text runs smoothly. A few more tests I ran revealed a robust patched code.
The GCC attribute the authors use in this work is available for GCC versions 4.1.3 and higher (I'm not 100% sure). Can the authors add a list with requirements and dependencies?
Minor Suggestions
-
It would be nice if the authors could provide a shell (or Python) script that downloads, compiles, installs and patches the source code. Furthermore, the same script can download the example training data set.
-
The title of the paper can be more precise (right now is sort of vague since LIBLINEAR solves essentially classification and regression problems).
-
In tables 1 and 2, the authors can make the text, indicating their results, bold.
from submissions.
Thanks @gdetor for your review!
@neuronalX Did you have a chance to look at the paper already?
from submissions.
from submissions.
@neuronalX Thanks for the status report!
from submissions.
General comments
The paper is well written, and it is nice to have a figure to explain the behavior. It is surprising that the results are actually better with the patch. Is the patch planned to be integrated in LIBLINEAR?
I agree with @gdetor on several points:
- the title should be changed in order to include LIBLINEAR
- automatic script in python which would install everything or at least give all the commands to do so (for any OS)
Problem with the code compilation
On Mac OS X which compiler did you use exactly? clang (default replacing gcc)? or forcing gcc?
Because I have some issues installing OPENMP :
- after patching the code and running make I get the following error:
"unsupported option '-fopenmp'
$ make
make -C .. lib
c++ -Wall -Wconversion -O3 -fPIC -fopenmp -c -o linear.o linear.cpp
clang: error: unsupported option '-fopenmp'
make[1]: *** [linear.o] Error 1
make: *** [lib] Error 2
After a search on Internet, several post converge on these this possible answer: to not use the classical redirection to clang, but use a version of gcc directly (here 8). But this solution doesn't work either:
$ gcc-8 -Wall -Wconversion -O3 -fPIC -fopenmp -c -o linear.o linear.cpp
linear.cpp:10:10: fatal error: SFMT/SFMT.h: No such file or directory
#include "SFMT/SFMT.h"
^~~~~~~~~~~~~
compilation terminated.
Then, I found more complex explanations to solve the problem, but they require too much commands and file modification to perform. There should be an easier solution.
Could you please indicate which is the easiest solution?
from submissions.
Thanks @neuronalX !
@sukhoy Can you address the suggestions made by the two reviewers?
from submissions.
Dear Editor and Reviewers,
Thank you for the quick review of our manuscript and for the helpful suggestions
for improvements and clarifications. Our point-by-point responses are given
below.
Response to the Editor
@sukhoy Can you address the suggestions made by the two reviewers?
All changes recommended by the reviewers have been applied in the revised
version of the paper. Our point-by-point responses are given below.
As requested by both reviewers, the title was updated to incude LIBLINEAR.
The new title is "Eliminating the Variability of Cross-Validation Results with
LIBLINEAR due to Randomization and Parallelization".
Response to Reviewer 1
General Comments
Does the proposed method scales properly? (it's not clear in the text)
The proposed modifications do not affect the scalability. In fact, because the
PRNG state is made thread-local, it is no longer necessary to synchronize the
PRNG state accesses in parallel threads to achieve thread-safety.
Two sentences that clarify this issue have been added to the third paragraph in
the Introduction section. Thank you for bringing this up.
Page 4, last sentence in the Results section: Are these results obtained from
the proposed method? Please clarify.
Yes, the results mentioned in the last sentence of the Results section were
obtained with the modified version (i.e., all modifications from Section 3 were
applied). The end of that paragraph was modified to clarify this point.
Source Code
The GCC attribute the authors use in this work is available for GCC versions
4.1.3 and higher (I'm not 100% sure). Can the authors add a list with
requirements and dependencies?
Several sentences that list compiler versions and their distributions were added
to the README.TXT file that is included in the source code repository. The first
paragraph in Section 3 was also modified to include the compiler versions.
Minor Suggestions
It would be nice if the authors could provide a shell (or Python) script that
downloads, compiles, installs and patches the source code. Furthermore, the
same script can download the example training data set.
Thanks for the recommendation. A python script that does this was added to the
repository. It is called 'installer.py'. It was tested on Linux and macOS.
The title of the paper can be more precise (right now is sort of vague since
LIBLINEAR solves essentially classification and regression problems).
This point was also raised by Reviewer 2. The title has been updated.
In tables 1 and 2, the authors can make the text, indicating their results,
bold.
Thank you for the suggestion. The numbers in the last three columns in both
Table 1 and Table 2 are now shown in bold. The first two paragraphs in the
Results section were updated to reflect this.
Response to Reviewer 2
Is the patch planned to be integrated in LIBLINEAR?
It is up to the developers of LIBLINEAR to decide if they want to incorporate
this patch into their code.
the title should be changed in order to include LIBLINEAR
This point was also raised by Reviewer 1. The title has been updated.
automatic script in python which would install everything or at least give all
the commands to do so (for any OS)
A Python script that automatically downloads, patches, and builds LIBLINEAR
using the standard command-line tools has been added to the patch repository.
A similar suggestion was made by Reviewer 1.
On Mac OS X which compiler did you use exactly? clang (default replacing gcc)?
or forcing gcc??
We used a version of clang from the Homebrew package manager that includes
OpenMP support. It can be installed using the command "brew install llvm".
However, a recent version of gcc with OpenMP support would also work. The
compiler version can be set using the 'CC' and 'CXX' environment variables. For
example, gcc can be selected by running 'make' as follows:
$ CXX=g++ CC=gcc make
After a search on Internet, several post converge on these this possible
answer: to not use the classical redirection to clang, but use a version of
gcc directly (here 8). But this solution doesn't work either:
$ gcc-8 -Wall -Wconversion -O3 -fPIC -fopenmp -c -o linear.o linear.cpp
linear.cpp:10:10: fatal error: SFMT/SFMT.h: No such file or directory
#include "SFMT/SFMT.h"
^~~~~~~~~~~~~
compilation terminated.
We believe that the compilation issue might have been triggered because the SFMT
source code had not been deployed to the source code directory. This is
described in step 1 in Section 3.
Could you please indicate which is the easiest solution?
The README.TXT file was updated with additional instructions and version
numbers, which could be used to complement the patching instructions described
in Section 3 of the paper.
from submissions.
Thanks @sukhoy !
@gdetor and @neuronalX , do the changes and comments address all the issues you raised?
from submissions.
@khinsen The authors addressed all my comments and the code runs smoothly. The installer script works out-of-the-box. I endorse the manuscript for publication.
@sukhoy There is only one typo on page 3 item 5 in the listing: There is a missing bracket in the for loop. Other than that everything looks fine.
from submissions.
Thanks @gdetor!
from submissions.
@neuronalX
from submissions.
@neuronalX
from submissions.
I've reminded him as well :) Should be done by the end of this week.
from submissions.
- Thank you for the changes. I have installed llvm with brew and tried the python script.
- The python script works with Python >=3.6 but not below.
- Could the authors notify this requirement?
- If the error below is only because of my mac configuration, then let's move forward if the code work for the other reviewer, because my comments have been taken into account, and it is much easier and faster to test now.
- I obtain an error at the end of the python script or when running "make"
$ make
/usr/local/opt/llvm/bin/clang++ -Wall -Wconversion -O3 -fPIC -fopenmp -o train train.c tron.o linear.o blas/blas.a SFMT/SFMT.o
clang-9: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]
train.c:77:32: warning: implicit conversion changes signedness: 'int' to
'size_t' (aka 'unsigned long') [-Wsign-conversion]
line = (char *) realloc(line,max_line_len);
~~~~~~~ ^~~~~~~~~~~~
train.c:163:39: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
double *target = Malloc(double, prob.l);
~~~~~~~~~~~~~~~~~~~~^~
train.c:8:40: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
^ ~
train.c:250:79: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
...= (int *) realloc(param.weight_label,sizeof(int)*param.nr_weight);
~~~~~~~^~~~~~~~~
train.c:251:73: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
...= (double *) realloc(param.weight,sizeof(double)*param.nr_weight);
~~~~~~~^~~~~~~~~
train.c:367:21: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
line = Malloc(char,max_line_len);
~~~~~~~~~~~~^~~~~~~~~~~~~
train.c:8:40: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
^ ~
train.c:387:30: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
prob.y = Malloc(double,prob.l);
~~~~~~~~~~~~~~~~~~~^~
train.c:8:40: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
^ ~
train.c:388:45: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
prob.x = Malloc(struct feature_node *,prob.l);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
train.c:8:40: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
^ ~
train.c:389:53: warning: implicit conversion changes signedness: 'int' to
'unsigned long' [-Wsign-conversion]
x_space = Malloc(struct feature_node,elements+prob.l);
~~~~~~^
train.c:8:40: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
^
8 warnings generated.
ld: library not found for -lomp
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [train] Error 1
from submissions.
Thanks @neuronalX !
@sukhoy Any comments? The OpenMP issue could indeed be specific to macOS or even to homebrew, I have seen this in other contexts.
from submissions.
Dear Editor and Reviewers,
Thank you for the quick review of the revised version of our manuscript and for
the suggestions that helped us to further improve the article and the source
code. Our point-by-point responses are given below.
Response to the Editor
@sukhoy Any comments? The OpenMP issue could indeed be specific to macOS or
even to homebrew, I have seen this in other contexts.
All changes recommended by the reviewers have been made and all issues have been
addressed in the revised versions of the paper and the source code. The
installer script now works with older versions of Python. We were also able to
find a solution for the macOS/Homebrew link error.
Response to Reviewer 1
@sukhoy There is only one typo on page 3 item 5 in the listing: There is a
missing bracket in the for loop. Other than that everything looks fine.
Thank you for the suggestion. The text on page 3 of the paper, in item 5 was
updated to include the missing bracket.
Response to Reviewer 2
The python script works with Python >=3.6 but not below.
Could the authors notify this requirement?
Thank you for reporting this issue. The script was updated to work with older
versions of Python, i.e., 2.7 and later. Python 2.7 is still the version of
Python that is pre-installed on macOS by Apple. The script still runs with more
recent versions of Python, e.g., Python 3.5, 3.6, and 3.7.
I obtain an error at the end of the python script or when running "make"
Thank you for reporting this error. It turns out that with Homebrew on macOS
it is necessary to install two packages to have a working setup of the OpenMP
library and the Clang compiler suite, i.e., both 'libomp' and 'llvm'. If libomp
is missing, then the code fails to link.
The README.TXT file and the message printed by the installer script have been
updated to include libomp together with llvm in the list of packages. That is,
both files now include the "brew install libomp llvm" command line instead of
"brew install llvm". This resolves the link issue.
We haven't noticed this issue during the previous round of corrections because
libomp had been installed on our system much earlier. Unfortunately, the llvm
package doesn't list libomp as an explicit dependency and by default it is not
installed together with the llvm package.
The root cause of this problem is that Apple has excluded OpenMP from the
official distribution of XCode, which makes it necessary to rely on third-party
compiler packages that have their own implicit dependencies.
from submissions.
Thanks @sukhoy ! It's nice to see reviewers contributing to improving a submission.
@gdetor and @neuronalX , does this look good to you?
from submissions.
@khinsen It looks good to me.
from submissions.
@neuronalX Are you happy as well with the current state of the submission?
from submissions.
The authors make the Python script compatible with many versions of Python, so it's fine for me.
After running the following provided command:
CXX=/usr/local/opt/llvm/bin/clang++ CC=/usr/local/opt/llvm/bin/clang python installer.py
it installed and lauched the experiment, and I obtained the same resultats than in the paper:
[...]
Cross Validation Accuracy = 96.9124%
[DONE] Running the demo for parallelized 5-fold CV.
@khinsen In summary, OK for publication.
from submissions.
Dear Editor and Reviewers,
Thank you for the quick and detailed review of our paper.
We are looking forward to seeing the PDF of the final version next week!
from submissions.
The article is published and available:
- On GitHub: https://github.com/ReScience/articles/tree/master/10.5281_zenodo.3528175
- On Zenodo: https://zenodo.org/record/3528175
@sukhoy Please accept pull request sukhoy/RSC_paper#1 which updates the article metadata in your repository.
from submissions.
It's online at https://rescience.github.io/read/
from submissions.
I merged the pull request.
Also, I am updating the PDF file in my repository to match the published version.
It is nice to see it published.
from submissions.
Related Issues (20)
- [Re] Spread of alpha-synuclein pathology through the brain connectome is modulated by selective vulnerability and predicted by network analysis HOT 37
- [Re] Three-dimensional wake topology and propulsive performance of low-aspect-ratio pitching-rolling plates HOT 41
- [Re] The Evolution of Virulence in Pathogens with Vertical and Horizontal Transmission HOT 78
- Reproducibility and reusability limitations in RegulatoryCircuits: analysis and solutions HOT 28
- [Re] Reproductive pair correlations and the clustering of organisms HOT 54
- [Re] Object Detection Meets Knowledge Graphs HOT 50
- [Re] An anatomically constrained neural network model of fear conditioning HOT 27
- [Re] Groups of diverse problem-solvers outperform groups of highest-ability problem-solvers - most of the time HOT 68
- [Re] A general model of hippocampal and dorsal striatal learning and decision making HOT 40
- [¬Re]Simulating socioeconomic-based affirmative action HOT 66
- [Re] A Detailed Data-Driven Network Model of Prefrontal Cortex Reproduces Key Features of In Vivo Activity HOT 25
- [Re] Biodiversity of plankton by species oscillations and chaos HOT 42
- [Re] Periodic forcing of a limit-cycle oscillator: fixed points, Arnold tongues, and the global organization of bifurcations HOT 15
- [¬Re] Setting Inventory Levels in a Bike Sharing Network HOT 18
- [Re] A circuit model of auditory cortex HOT 16
- [Re] Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks HOT 31
- [Re] Predictive model for the size of bubbles and droplets created in microfluidic T-junctions HOT 15
- [Re] Inter-areal balanced amplification enhances signal propagation in a large-scale circuit model of the primate cortex HOT 67
- [Re] Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress HOT 39
- [Re] The Discriminative Kalman Filter for Bayesian Filtering with Nonlinear and Non-Gaussian Observation Models HOT 17
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from submissions.