neurodata / df-dn-paper
Conceptual & empirical comparisons between decision forests & deep networks
Home Page: https://dfdn.neurodata.io
License: Other
Specifically, use a left-out set of tabular datasets to tune the hyperparameters. Consider 5-fold cross-validation (tune on 4 folds, test on 1 fold) to evaluate classifier performance. Within each dataset, performance is also computed with 5-fold cross-validation.
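A minimal sketch of the scheme above, assuming tabular datasets arrive as `(X, y)` NumPy arrays; the random-forest model and the parameter grid are illustrative stand-ins, not the repo's actual tuning code:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

def tune_on_left_out(X_tune, y_tune, param_grid):
    # Tune on a left-out dataset: 5-fold CV fits on 4 folds,
    # scores on the remaining fold for each candidate setting.
    search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
    search.fit(X_tune, y_tune)
    return search.best_params_

def evaluate(X, y, best_params):
    # Within each evaluation dataset, performance is again 5-fold CV,
    # using the hyperparameters chosen on the left-out set.
    clf = RandomForestClassifier(random_state=0, **best_params)
    return cross_val_score(clf, X, y, cv=5).mean()
```

The key point is that `tune_on_left_out` never sees the evaluation datasets, so the reported 5-fold accuracy is not biased by hyperparameter selection.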
When creating train+val loaders for tuning CNNs for vision data, the following traceback is received:
[INFO 05-12 21:06:37] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter lr. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 05-12 21:06:37] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter momentum. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 05-12 21:06:37] ax.service.utils.instantiation: Inferred value type of ParameterType.INT for parameter epoch. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 05-12 21:06:37] ax.service.utils.instantiation: Inferred value type of ParameterType.STRING for parameter optimizer. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 05-12 21:06:37] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='lr', parameter_type=FLOAT, range=[1e-06, 0.4], log_scale=True), RangeParameter(name='momentum', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='epoch', parameter_type=INT, range=[15, 40]), ChoiceParameter(name='optimizer', parameter_type=STRING, values=['SGD', 'Adam'], is_ordered=False, sort_values=False)], parameter_constraints=[]).
[INFO 05-12 21:06:37] ax.modelbridge.dispatch_utils: Using Bayesian optimization since there are more ordered parameters than there are categories for the unordered categorical parameters.
[INFO 05-12 21:06:37] ax.modelbridge.dispatch_utils: Using Bayesian Optimization generation strategy: GenerationStrategy(name='Sobol+GPEI', steps=[Sobol for 8 trials, GPEI for subsequent trials]). Iterations after 8 will take longer to generate due to model-fitting.
[INFO 05-12 21:06:37] ax.service.managed_loop: Started full optimization with 20 steps.
[INFO 05-12 21:06:37] ax.service.managed_loop: Running optimization trial 1...
[INFO 05-12 21:06:39] ax.service.managed_loop: Running optimization trial 2...
[INFO 05-12 21:06:40] ax.service.managed_loop: Running optimization trial 3...
[INFO 05-12 21:06:41] ax.service.managed_loop: Running optimization trial 4...
[INFO 05-12 21:06:43] ax.service.managed_loop: Running optimization trial 5...
[INFO 05-12 21:06:44] ax.service.managed_loop: Running optimization trial 6...
[INFO 05-12 21:06:45] ax.service.managed_loop: Running optimization trial 7...
[INFO 05-12 21:06:47] ax.service.managed_loop: Running optimization trial 8...
[INFO 05-12 21:06:48] ax.service.managed_loop: Running optimization trial 9...
[INFO 05-12 21:06:51] ax.service.managed_loop: Running optimization trial 10...
[INFO 05-12 21:06:52] ax.service.managed_loop: Running optimization trial 11...
[INFO 05-12 21:06:54] ax.service.managed_loop: Running optimization trial 12...
[INFO 05-12 21:06:56] ax.service.managed_loop: Running optimization trial 13...
[INFO 05-12 21:06:58] ax.service.managed_loop: Running optimization trial 14...
[INFO 05-12 21:07:00] ax.service.managed_loop: Running optimization trial 15...
[INFO 05-12 21:07:02] ax.service.managed_loop: Running optimization trial 16...
[INFO 05-12 21:07:04] ax.service.managed_loop: Running optimization trial 17...
[INFO 05-12 21:07:06] ax.service.managed_loop: Running optimization trial 18...
[INFO 05-12 21:07:08] ax.service.managed_loop: Running optimization trial 19...
[INFO 05-12 21:07:10] ax.service.managed_loop: Running optimization trial 20...
[INFO 05-12 21:07:10] ax.modelbridge.base: Untransformed parameter 0.40000000000000013 greater than upper bound 0.4, clamping
[WARNING 05-12 21:07:12] ax.modelbridge.cross_validation: Metric accuracy was unable to be reliably fit.
[WARNING 05-12 21:07:12] ax.service.utils.best_point: Model fit is poor; falling back on raw data for best point.
[WARNING 05-12 21:07:12] ax.service.utils.best_point: Model fit is poor and data on objective metric accuracy is noisy; interpret best points results carefully.
(array([], dtype=int64),)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
[<ipython-input-15-8f0b5b112073>](https://localhost:8080/#) in <module>()
44 )
45
---> 46 run_cnn32()
[<ipython-input-14-5a5512f0afa6>](https://localhost:8080/#) in run_cnn32()
369 for i in train_valid_indices:
370 print(np.where(classes_new == dataset_copy.targets[i]))
--> 371 dataset_copy.targets[i] = np.where(classes_new == dataset_copy.targets[i])[0][0]
372
373 train_valid_sampler = torch.utils.data.sampler.SubsetRandomSampler(train_valid_indices)
IndexError: index 0 is out of bounds for axis 0 with size 0
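The printed `(array([], dtype=int64),)` shows that `np.where(classes_new == dataset_copy.targets[i])` matched nothing, so indexing `[0][0]` raises the `IndexError` (this typically happens when the cell is re-run and the targets have already been remapped to new indices). A minimal sketch of one way to guard against this; the names mirror the traceback, but the precomputed mapping and the guard are an assumption, not the repository's actual fix:

```python
import numpy as np

def remap_targets(targets, classes_new, indices):
    # Precompute old-label -> new-index once, instead of calling
    # np.where() inside the loop for every sample.
    label_to_idx = {label: i for i, label in enumerate(classes_new)}
    for i in indices:
        old = targets[i]
        if old not in label_to_idx:
            # Re-running the remapping hits already-remapped labels,
            # making np.where() return an empty array -> IndexError.
            raise ValueError(
                f"label {old!r} not in classes_new; targets may already be remapped"
            )
        targets[i] = label_to_idx[old]
    return targets
```

Failing fast with a clear message here is preferable to letting the empty `np.where` result surface as an opaque `IndexError` deep in the loader setup.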
Unify evaluation metrics or justify differences
Saving the raw predictions of the classification tasks makes it easier to revise the evaluation metrics later (changing, adding, or deleting them). Because the test sets are randomly generated, the test labels need to be saved at the same time.
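A hedged sketch of what saving predictions alongside their test labels could look like, assuming both are NumPy arrays; the file layout and names are illustrative, not the repo's conventions:

```python
import os
import numpy as np

def save_run(name, y_pred, y_test, out_dir="results"):
    # Store predictions together with the matching randomly generated
    # test labels, so any metric can be recomputed later from disk.
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{name}.npz")
    np.savez(path, y_pred=y_pred, y_test=y_test)
    return path
```

With both arrays in one `.npz` file, a metric change is just a re-read and recompute, with no need to rerun the classifiers.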
A replacement for the FSDD dataset is proposed for the following reasons:
Code that needs to be changed:
df-dn-paper/benchmarks/tabular/toolbox.py
Lines 53 to 76 in 0a7d447
Code that can be used as a reference:
df-dn-paper/benchmarks/vision/toolbox.py
Lines 184 to 209 in 0a7d447
e.g.
Currently, the tabular benchmarks use an MLP. We want something more complex that could utilize the GPU.
The current selection is SVHN.
Step of #29
List of candidates to consider (all following the sklearn API):
Currently, the tabular data code is written specifically for DL and RF, in that order. Hence, any change to the number of models or to their order requires changes throughout the code. Saving the models' results in dictionaries instead of lists may solve this problem.
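A minimal sketch of the dictionary-based refactor; the model and dataset names here are hypothetical, not the repo's actual identifiers:

```python
# Results keyed by model name, then dataset name, so adding, removing,
# or reordering models needs no positional bookkeeping.
results = {}

def record(model_name, dataset_name, accuracy):
    results.setdefault(model_name, {})[dataset_name] = accuracy

record("RF", "cc18_adult", 0.85)
record("DN", "cc18_adult", 0.83)
```

Downstream code can then iterate over `results.items()` regardless of how many models were run or in what order.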