Answers for the exercises about machine-learning-for-finance

packtpublishing commented on July 24, 2024

Answers for the exercises

from machine-learning-for-finance.

Comments (2)

JannesKlaas commented on July 24, 2024

Hey @tedchou12 ,
there are no solution files published anywhere. Most of the questions are quite open-ended, and there are multiple ways to solve them. But once you have solved them, it should be quite obvious that you did.

If you are stuck anywhere, or unsure, feel free to send me an email! I will give it a look :)


tedchou12 commented on July 24, 2024

Sorry for asking the question so quickly...
In chapter 2 of the book, on fraud detection, I am trying to run the code locally. My code is very similar to what is described in the book:

import pandas as pd
import numpy as np
from sklearn.metrics import f1_score, confusion_matrix
from keras.layers import Embedding, Dense, Activation, Reshape, Input, Concatenate
from keras.models import Model, Sequential
from keras.optimizers import SGD
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

df = pd.read_csv('PS_20174392719_1491204439457_log.csv')
df = df.rename(columns={'oldbalanceOrg':'oldBalanceOrig', 'newbalanceOrig':'newBalanceOrig', \
                        'oldbalanceDest':'oldBalanceDest', 'newbalanceDest':'newBalanceDest'})


df['type'] = 'type_' + df['type'].astype(str)
dummies = pd.get_dummies(df['type'])
df = pd.concat([df, dummies], axis=1)


df['hour'] = df['step'] % 24
df['isNight'] = np.where((2 <= df['hour']) & (df['hour'] <= 6), 1, 0)

del df['type']
del df['step']
del df['nameOrig']
del df['nameDest']
del df['type_CASH_IN']
del df['type_DEBIT']
del df['type_PAYMENT']
del df['hour']

y_df = df['isFraud']
x_df = df.drop('isFraud', axis=1)

y = y_df.values
X = x_df.values


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

sm = SMOTE(random_state=42)
# print(len(y_train))
X_train_res, y_train_res = sm.fit_resample(X_train, y_train)
# print(len(y_train_res))

# nn - level 1
alpha = 0.00001
model = Sequential()
model.add(Dense(1, input_dim=9))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=alpha), metrics=['acc'])

model.fit(X_train_res, y_train_res, epochs=5, batch_size=256, validation_data=(X_valid, y_valid))

y_pred = model.predict(X_test)
y_pred[y_pred > 0.5] = 1
y_pred[y_pred < 0.5] = 0

f1_s = f1_score(y_pred=y_pred, y_true=y_test)
print(f1_s)

# nn - level 2
alpha = 0.00001
model = Sequential()
model.add(Dense(16, input_dim=9))
model.add(Activation('tanh'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=alpha), metrics=['acc'])

model.fit(X_train_res, y_train_res, epochs=5, batch_size=256, validation_data=(X_valid, y_valid))

y_pred = model.predict(X_test)
y_pred[y_pred > 0.5] = 1
y_pred[y_pred < 0.5] = 0

f1_s = f1_score(y_pred=y_pred, y_true=y_test)
print(f1_s)

It runs, but the output and the F1 score look odd:

# for level 1
ted.chou@IITPC20-0109 ch2 % python3 fin_keras_predictive.py

2021-05-21 02:54:00.916335: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-21 02:54:01.352387: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/5
29935/29935 [==============================] - 27s 583us/step - loss: 479051.4881 - acc: 0.8800 - val_loss: 94051.3516 - val_acc: 0.8695
Epoch 2/5
29935/29935 [==============================] - 17s 554us/step - loss: 429615.8285 - acc: 0.8908 - val_loss: 4390205.5000 - val_acc: 0.5978
Epoch 3/5
29935/29935 [==============================] - 18s 589us/step - loss: 427322.5454 - acc: 0.8919 - val_loss: 117590.3438 - val_acc: 0.8590
Epoch 4/5
29935/29935 [==============================] - 17s 557us/step - loss: 435540.8527 - acc: 0.8911 - val_loss: 120546.6719 - val_acc: 0.8424
Epoch 5/5
29935/29935 [==============================] - 17s 582us/step - loss: 414160.2561 - acc: 0.8923 - val_loss: 124063.6250 - val_acc: 0.8490
0.01581431334622824

# for level 2
Epoch 1/5
29935/29935 [==============================] - 19s 621us/step - loss: 0.8930 - acc: 0.5381 - val_loss: 1.0736 - val_acc: 0.2382
Epoch 2/5
29935/29935 [==============================] - 20s 684us/step - loss: 0.7505 - acc: 0.6186 - val_loss: 0.7672 - val_acc: 0.5732
Epoch 3/5
29935/29935 [==============================] - 19s 636us/step - loss: 0.6106 - acc: 0.7352 - val_loss: 0.6389 - val_acc: 0.7062
Epoch 4/5
29935/29935 [==============================] - 19s 650us/step - loss: 0.5742 - acc: 0.7549 - val_loss: 0.6248 - val_acc: 0.7198
Epoch 5/5
29935/29935 [==============================] - 20s 654us/step - loss: 0.5547 - acc: 0.7691 - val_loss: 0.5958 - val_acc: 0.7760
0.009077803688785245

The two-layer network has a lower F1 score than the one-layer network, which suggests I have done something wrong.
The training loss of the one-layer network is also very large, which seems odd.
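
One plausible reason for a loss in the hundreds of thousands is that the raw balance columns feed very large values into the single Dense unit. The sketch below is not the book's code. It assumes the loss is evaluated on the pre-sigmoid logit, which depends on the Keras/TensorFlow version in use. The weight and feature values are hypothetical. It only illustrates that the logit form of binary cross-entropy grows linearly with the size of the logit:

```python
import numpy as np

def bce_from_logits(z, y):
    """Numerically stable binary cross-entropy computed from a logit z."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    # Equivalent to -y*log(sigmoid(z)) - (1-y)*log(1-sigmoid(z))
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

# Hypothetical single-feature example: weight 0.5, true label 0 (not fraud).
w = 0.5
small_x = 1.2          # a feature value after standardization
large_x = 1_000_000.0  # a raw, balance-sized feature value

loss_small = bce_from_logits(w * small_x, 0)
loss_large = bce_from_logits(w * large_x, 0)
print(float(loss_small), float(loss_large))
```

With the large input the per-sample loss is about as large as the logit itself, so a few confident mistakes on unscaled rows can dominate the average. In the two-layer model the tanh activation bounds the hidden values, which may be why its loss stays near 1.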

Unfortunately, I think the code is exactly as it appears in the book. One thing I suspect is that I didn't normalize the data, but the book doesn't mention normalization in this chapter either.
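
A minimal sketch of the normalization step suspected above, assuming column-wise standardization is wanted. The arrays are synthetic stand-ins for `X_train` and `X_test`, and the `_demo` names are hypothetical. The statistics come from the training split only, so no information from the test split leaks into training:

```python
import numpy as np

# Synthetic stand-ins with 9 columns, matching input_dim=9 in the models above.
rng = np.random.default_rng(42)
X_train_demo = rng.uniform(0, 1e6, size=(1000, 9))
X_test_demo = rng.uniform(0, 1e6, size=(200, 9))

# Fit the scaling statistics on the training split only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)
std[std == 0] = 1.0  # guard against constant columns

# Apply the same statistics to every split.
X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std

print(X_train_scaled.mean(axis=0).round(3))
print(X_train_scaled.std(axis=0).round(3))
```

`sklearn.preprocessing.StandardScaler` does the same with `fit_transform` on the training data and `transform` on the other splits. In the script above, one reasonable place for this step is right after the `train_test_split` calls and before `sm.fit_resample`, so that SMOTE interpolates between scaled rows. Whether it restores the F1 scores reported in the book has not been verified here.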

Thanks!
Ted

