Select features randomly.
Run algorithms with the selected combination.
dataset: DataFrame to cluster
model: Name of the clustering algorithm to run (e.g. 'kmeans')
- Randomly select features from the list of numeric features.
- Encode and scale the data using scale_encode_combination().
- Run the selected algorithm: one of test_kmeans(), test_gaussian(), test_clarans(), test_dbscan(), or test_mean_shift().
df = pd.read_csv('housing.csv')
df.fillna(df.mean(numeric_only=True), inplace=True)
medianHouseValue = df['median_house_value']
df.drop(['median_house_value'], axis=1, inplace=True)
auto_ml(df, 'kmeans')
All the results produced by the selected model.
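The steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the helper names, the `n_features`/`n_trials` parameters, and the use of KMeans alone (the doc also lists gaussian, clarans, dbscan, and mean shift) are assumptions, and the scaling/encoding step via scale_encode_combination() is elided for brevity.

```python
# Hedged sketch of auto_ml(): random feature selection, then one clustering run
# per trial. Parameter names and defaults here are illustrative assumptions.
import random

import pandas as pd
from sklearn.cluster import KMeans


def auto_ml(dataset, model, n_features=3, n_trials=5):
    """Randomly pick feature subsets and run the chosen model on each."""
    numeric_features = dataset.select_dtypes('number').columns.tolist()
    results = []
    for _ in range(n_trials):
        # Randomly select a subset of the numeric features.
        combination = random.sample(
            numeric_features, k=min(n_features, len(numeric_features)))
        X = dataset[combination]
        if model == 'kmeans':
            labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)
        else:
            # The full version would dispatch to test_gaussian(), test_clarans(),
            # test_dbscan(), or test_mean_shift() here.
            raise ValueError(f"unsupported model: {model}")
        results.append((combination, labels))
    return results
```

In this sketch, each trial returns the chosen feature combination alongside the cluster labels, so the caller can compare which feature subsets produced the most coherent clusters.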
Scale and encode the dataset with all 15 combinations (5 scalers x 3 encoders).
dataset: DataFrame to be scaled and encoded
numerical_feature_list: Features to scale
categorical_feature_list: Features to encode
- For each scaler in [StandardScaler(), MinMaxScaler(), RobustScaler(), MaxAbsScaler(), Normalizer()]
- For each encoder in [OrdinalEncoder(), OneHotEncoder(), LabelEncoder()]
- Save each resulting dataset in a dictionary.
for combination in feature_combination_list:
    data_combination = scale_encode_combination(dataset, combination, ['ocean_proximity'])
    for data_name, data in data_combination.items():
        data = data[combination]
        test_kmeans(data)
        test_gaussian(data)
        test_clarans(data)
        test_dbscan(data)
        test_mean_shift(data)
Dictionary containing all the DataFrame combinations of scalers and encoders.
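A sketch of how scale_encode_combination() could build that dictionary follows. This is an assumption-laden illustration: for brevity it loops over a single encoder (OrdinalEncoder), whereas the documented version iterates all three encoders to produce 15 combinations, and the key-naming scheme is invented here.

```python
# Hedged sketch of scale_encode_combination(): one DataFrame per
# (scaler, encoder) pair, keyed by the class names of the pair.
import pandas as pd
from sklearn.preprocessing import (StandardScaler, MinMaxScaler, RobustScaler,
                                   MaxAbsScaler, Normalizer, OrdinalEncoder)


def scale_encode_combination(dataset, numerical_feature_list, categorical_feature_list):
    """Return a dict mapping combination names to transformed DataFrames."""
    scalers = [StandardScaler(), MinMaxScaler(), RobustScaler(),
               MaxAbsScaler(), Normalizer()]
    # Sketch: only OrdinalEncoder; the full version also iterates
    # OneHotEncoder() and LabelEncoder() for 15 total combinations.
    encoders = [OrdinalEncoder()]
    result = {}
    for scaler in scalers:
        for encoder in encoders:
            df = dataset.copy()
            df[numerical_feature_list] = scaler.fit_transform(df[numerical_feature_list])
            df[categorical_feature_list] = encoder.fit_transform(df[categorical_feature_list])
            name = f"{type(scaler).__name__}_{type(encoder).__name__}"
            result[name] = df
    return result
```

Keying the dictionary by class names keeps the per-combination results easy to trace back to the preprocessing that produced them.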