packtpublishing / hands-on-machine-learning-for-algorithmic-trading

Hands-On Machine Learning for Algorithmic Trading, published by Packt

License: MIT License

Jupyter Notebook 99.82% Python 0.17% Shell 0.01%

hands-on-machine-learning-for-algorithmic-trading's Introduction

This is the code repository for Hands-On Machine Learning for Algorithmic Trading, published by Packt.

Design and implement investment strategies based on smart algorithms that learn from data using Python

What is this book about?

The explosive growth of digital data has boosted the demand for expertise in trading strategies that use machine learning (ML). This book enables you to use a broad range of supervised and unsupervised algorithms to extract signals from a wide variety of data sources and create powerful investment strategies.

This book covers the following exciting features:

  • Implement machine learning techniques to solve investment and trading problems
  • Leverage market, fundamental, and alternative data to research alpha factors
  • Design and fine-tune supervised, unsupervised, and reinforcement learning models
  • Optimize portfolio risk and performance using pandas, NumPy, and scikit-learn
  • Integrate machine learning models into a live trading strategy on Quantopian

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders. For example, Chapter02.

The code will look like the following:

interesting_times = extract_interesting_date_ranges(returns=returns)
interesting_times['Fall2015'].to_frame('pf') \
.join(benchmark_rets) \
.add(1).cumprod().sub(1) \
.plot(lw=2, figsize=(14, 6), title='Post-Brexit Turmoil')

Following is what you need for this book: Hands-On Machine Learning for Algorithmic Trading is for data analysts, data scientists, and Python developers, as well as investment analysts and portfolio managers working within the finance and investment industry. If you want to perform efficient algorithmic trading by developing smart investing strategies using machine learning algorithms, this is the book for you. Some understanding of Python and machine learning techniques is mandatory.

With the following software and hardware list you can run all code files present in the book (Chapter 1-15).

Software and Hardware List

Chapter   Software required                            OS required
2-20      Python 2.7/3.5, SciPy 0.18, NumPy 1.11+,     Windows, Mac OS X, and Linux (Any)
          Matplotlib 2.0, scikit-learn 0.18+,
          Gensim, Keras 2+

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Get to Know the Author

Stefan Jansen, CFA, is Founder and Lead Data Scientist at Applied AI, where he advises Fortune 500 companies and startups across industries on translating business goals into a data and AI strategy, builds data science teams, and develops ML solutions. Before his current venture, he was Managing Partner and Lead Data Scientist at an international investment firm where he built the predictive analytics and investment research practice. He was also an executive at a global fintech startup operating in 15 markets, worked for the World Bank, advised central banks in emerging markets, and has worked in six languages on four continents. Stefan holds Master's degrees from Harvard and Berlin University and teaches data science at General Assembly and Datacamp.

Suggestions and Feedback

Click here if you have any feedback or suggestions.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781789346411

hands-on-machine-learning-for-algorithmic-trading's Issues

Quantopian is closed

Could you please suggest an alternative platform for running backtests? I'm currently looking at QuantConnect, but I'm not sure whether it is a good alternative.

package conflicts

Following the directions for creating the environment in the installation.md I get conflicts.

conda env create -f environment_linux.yml

I figured I would try and update a few of these and give a pull request back with updates, however I failed miserably. I tried the suggestion from @TheStoneMX in #9 and followed the directions in the post he gave. I appended '37' to the name to signify python 3.7 as I surmised this might be the issue:

conda create --name ml4t37 python=3.7
conda activate ml4t37
conda env update --file environment_linux.yml

In either case, I get an amazing amount of package conflicts (9,266 lines of conflict errors which I am attaching).

error.log

This enormous number of conflicts leads me to believe that I have done something wrong, as it seems excessive. I have reproduced this on both an Arch Linux machine and an Ubuntu Bionic Beaver machine. Any suggestions on what I might have done wrong?

Memory Error (and a few updates to original code)

Hey!

Love the work so far. I've noticed a couple of changes to make though. In the first Jupyter Notebook in Chapter 2 ("01_build_itch_order_book.ipynb"), Seaborn was not imported so I ran into an error in the "Buy-Sell Order Distribution" section, so add:

import seaborn as sns

...to the top. Also, I was having an issue in the Download/Unzip section at the beginning. Even though I had already downloaded and unzipped the sample file, when I went to run the book again, it was starting to unzip the .gz file all over again, so I also changed the line:

unzipped = data_path / (filename.stem + '.bin')

...to:

unzipped = data_path / (os.path.splitext(SOURCE_FILE)[0] + '.bin')

...and also need to add:

import os

...to the top. This will then properly see that there's already an unzipped file there and won't start to unzip the .gz file again.
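
The check described above can be sketched as a small idempotent helper. This is only a sketch under assumptions: the name `ensure_unzipped` is hypothetical and not part of the notebook, and it assumes the unzipped file should sit next to the archive with a `.bin` extension, as in the lines quoted above.

```python
import gzip
import shutil
from pathlib import Path


def ensure_unzipped(gz_path: Path) -> Path:
    """Decompress gz_path next to itself unless the .bin file already exists."""
    gz_path = Path(gz_path)
    # 'sample.gz' -> 'sample.bin', in the same directory as the archive
    unzipped = gz_path.with_name(gz_path.stem + '.bin')
    if not unzipped.exists():
        # Stream the archive to disk so the whole file is never held in memory
        with gzip.open(gz_path, 'rb') as src, open(unzipped, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    return unzipped
```

Because the target path is built once from the archive path and the same object is used for the `exists()` test, a second run returns immediately instead of unzipping again.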

Now, I'm running into a memory error and I'm wondering if an external hard drive is actually the solution, or if Windows is just running out of memory from trying to work with such a huge file? My traceback leading up to the memory error looks like this:

Empty DataFrame
Columns: [Message Type, # Trades]
Index: []
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2010099 entries, 0 to 2010098
Data columns (total 9 columns):
timestamp 2010099 non-null datetime64[ns]
buy_sell_indicator 1873082 non-null float64
shares 1995581 non-null float64
price 1995581 non-null float64
type 2010099 non-null object
executed_shares 54956 non-null float64
execution_price 500 non-null float64
shares_replaced 14159 non-null float64
price_replaced 14159 non-null float64
dtypes: datetime64[ns](1), float64(7), object(1)
memory usage: 138.0+ MB
<class 'pandas.io.pytables.HDFStore'>
File path: data\order_book.h5
/AAPL/buy frame_table (typ->appendable,nrows->177108242,ncols->2,indexers->[index],dc->[])
/AAPL/messages frame (shape->[2010099,9])
/AAPL/sell frame_table (typ->appendable,nrows->183264614,ncols->2,indexers->[index],dc->[])
/AAPL/trades frame (shape->[59796,3])
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 59796 entries, 2019-03-27 04:00:56.459428646 to 2019-03-27 19:54:05.600648466
Data columns (total 3 columns):
shares 59796 non-null int32
price 59796 non-null int32
cross 59796 non-null int32
dtypes: int32(3)
memory usage: 1.1 MB
None
100,000 0:01:02.811120
200,000 0:01:27.826051
300,000 0:01:28.564991
400,000 0:01:27.096437
500,000 0:01:29.712940
600,000 0:01:29.556783
700,000 0:01:31.360356
800,000 0:01:35.391212
900,000 0:01:35.037438
1,000,000 0:01:56.868902
1,100,000 0:02:12.290003
1,200,000 0:01:42.614183
1,300,000 0:01:51.042871
1,400,000 0:01:41.064018
1,500,000 0:01:48.513128
1,600,000 0:01:41.310528
1,700,000 0:01:52.142803
1,800,000 0:01:44.541379
1,900,000 0:01:48.800505
2,000,000 0:01:48.460462
A 924117
D 869968
X 2789
E 52299
P 6995
F 2282
U 14159
C 473
dtype: int64
<class 'pandas.io.pytables.HDFStore'>
File path: data\order_book.h5
/AAPL/buy frame_table (typ->appendable,nrows->265662363,ncols->2,indexers->[index],dc->[])
/AAPL/messages frame (shape->[2010099,9])
/AAPL/sell frame_table (typ->appendable,nrows->274896921,ncols->2,indexers->[index],dc->[])
/AAPL/trades frame (shape->[59796,3])
Traceback (most recent call last):
File "01_build_itch_order_book.py", line 464, in <module>
sell = store['{}/sell'.format(stock)].reset_index().drop_duplicates()
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 4630, in drop_duplicates
duplicated = self.duplicated(subset, keep=keep)
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 4687, in duplicated
labels, shape = map(list, zip(*map(f, vals)))
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 4668, in f
vals, size_hint=min(len(self), _SIZE_HINT_LIMIT))
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
return func(*args, **kwargs)
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\algorithms.py", line 613, in factorize
na_value=na_value)
File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\algorithms.py", line 460, in _factorize_array
na_value=na_value)
File "pandas\_libs\hashtable_class_helper.pxi", line 1209, in pandas._libs.hashtable.Int64HashTable.factorize
File "pandas\_libs\hashtable_class_helper.pxi", line 1104, in pandas._libs.hashtable.Int64HashTable._unique
MemoryError

So I'm assuming that every run of the program chews up a lot of working memory. Is there a way to work around this, can I work with a smaller ITCH sample, or is there anything else you'd recommend? I'd love to be able to continue working with this. My laptop has plenty of hard drive space, but only 8 GB of RAM.
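
A rough back-of-the-envelope estimate, using only the row counts printed in the two store listings above, suggests why 8 GB may not be enough. The sketch assumes that after `reset_index()` the sell frame has three columns (the index plus the two stored columns) and that each value takes 8 bytes; both are assumptions, not figures taken from the notebook.

```python
# Row counts copied from the two HDFStore listings in the traceback output.
sell_rows_before = 183_264_614
sell_rows_after = 274_896_921

# Assumption: index + 2 data columns, each value stored in 8 bytes
# (int64 or float64). Real pandas overhead would come on top of this.
COLUMNS = 3
BYTES_PER_VALUE = 8


def frame_gib(rows: int) -> float:
    """Rough in-memory size of the frame in GiB, ignoring pandas overhead."""
    return rows * COLUMNS * BYTES_PER_VALUE / 2**30


print(f"sell table: ~{frame_gib(sell_rows_after):.1f} GiB before any copies")

# Exact ratio check between the two listings: the table grew by a factor 1.5.
print(sell_rows_after * 2 == sell_rows_before * 3)
```

Under these assumptions the raw frame alone is already close to the machine's total RAM, and `drop_duplicates()` needs further working memory for hashing on top of that. The exact 1.5x growth between the two listings is consistent with each rerun appending another copy of the same rows to the existing `order_book.h5`, so deleting that file before rerunning may be worth trying; this is an inference from the numbers, not something the notebook confirms.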

Thanks for your input!

Unsupported parameter in pandas read_excel

The Chapter 2 notebook 01_build_itch_order_book attempts to read message type definitions from an xlsx file using the pandas read_excel method.

One of the parameters passed is the file's encoding. However, the current version of pandas doesn't support the encoding argument, so the call raises an error.

Happily, simply removing the parameter allows the file to be loaded without problems, though I don't know whether this causes backwards compatibility issues for people running older versions of pandas.
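
One way to stay compatible with both older and newer versions is to pass only the keyword arguments the installed function actually declares. The helper below is a hypothetical sketch, not part of the notebook or of pandas; `read_new` and the file name `defs.xlsx` are stand-ins used purely for illustration.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments it does not declare."""
    params = inspect.signature(func).parameters
    # A function that takes **kwargs accepts everything, so filter nothing
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD
                      for p in params.values())
    if not accepts_any:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **kwargs)


# Stand-in for a reader whose newer version no longer takes `encoding`.
def read_new(path, sheet_name=0):
    return (path, sheet_name)


print(call_with_supported_kwargs(read_new, 'defs.xlsx',
                                 sheet_name='messages', encoding='latin1'))
```

In the notebook the same idea would presumably be applied to `pd.read_excel`, on the assumption that its signature can be introspected in the installed version. Silently dropping an argument hides the difference between versions, so simply removing `encoding` from the call, as described above, remains the more transparent fix.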

data-prep.ipynb

Hi there,
In Chapter 2 of the book you mention the Jupyter notebook named above, but I can't find it in any directory.

Thanks, Angelo

ResolvePackageNotFound: error

Hello. I recently bought the book and am enjoying it. Thank you.

I'm trying to run the code.

When I tried to create my environment using 'conda env create -f environment.yml', the following error occurred.

What should I do? (I am a Windows user.)

ResolvePackageNotFound:

  • binutils_impl_linux-64=2.28.1
  • gxx_impl_linux-64=7.2.0
  • gxx_linux-64=7.2.0
  • libgcc-ng=8.2.0
  • libstdcxx-ng=8.2.0
  • readline=7.0
  • gcc_linux-64=7.2.0
  • gmp=6.1.2
  • libuuid=1.0.3
  • gstreamer=1.14.0
  • graphviz=2.40.1
  • dbus=1.13.2
  • binutils_linux-64=7.2.0
  • expat=2.2.6
  • libgfortran-ng=7.3.0
  • gcc_impl_linux-64=7.2.0
  • ncurses=6.1
  • gst-plugins-base=1.14.0
  • libedit=3.1.20170329
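
Several of the packages in this list (for example gcc_linux-64 and libgcc-ng) are Linux-only builds, so conda cannot resolve them on Windows. A commonly used workaround, sketched here as an assumption and not as the repository's recommended fix, is to remove the reported entries from a copy of the environment file and let conda resolve the rest. The function name `drop_packages` and the sample file content are made up for illustration.

```python
def drop_packages(yml_text: str, missing: set) -> str:
    """Return yml_text without dependency lines whose package is in missing."""
    kept = []
    for line in yml_text.splitlines():
        entry = line.strip()
        if entry.startswith('- '):
            # '- gcc_linux-64=7.2.0' -> 'gcc_linux-64'
            name = entry[2:].split('=')[0].strip()
            if name in missing:
                continue  # skip packages conda reported as not found
        kept.append(line)
    return '\n'.join(kept) + '\n'


# Illustrative input only; the real environment.yml is much longer.
sample = """name: ml4t
dependencies:
  - python=3.6
  - gcc_linux-64=7.2.0
  - pandas
  - libgcc-ng=8.2.0
"""

print(drop_packages(sample, {'gcc_linux-64', 'libgcc-ng'}))
```

The set passed as `missing` would be filled with the names from the ResolvePackageNotFound message. The remaining pinned versions may still conflict on another platform, so this is a starting point, not a guaranteed fix.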

550 error on Chapter 2 notebook 01_build_itch_order_book

I have installed all packages from an updated version from packtpublishing and I am getting an error:

Downloading... ftp://emi.nasdaq.com/ITCH/Nasdaq_ITCH/03272019.NASDAQ_ITCH50.gz
---------------------------------------------------------------------------
error_perm                                Traceback (most recent call last)
~\anaconda3\envs\ml4trading\lib\urllib\request.py in ftp_open(self, req)
   1564         try:
-> 1565             fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
   1566             type = file and 'I' or 'D'

~\anaconda3\envs\ml4trading\lib\urllib\request.py in connect_ftp(self, user, passwd, host, port, dirs, timeout)
   1586         return ftpwrapper(user, passwd, host, port, dirs, timeout,
-> 1587                           persistent=False)
   1588 

~\anaconda3\envs\ml4trading\lib\urllib\request.py in __init__(self, user, passwd, host, port, dirs, timeout, persistent)
   2407         try:
-> 2408             self.init()
   2409         except:

~\anaconda3\envs\ml4trading\lib\urllib\request.py in init(self)
   2419         _target = '/'.join(self.dirs)
-> 2420         self.ftp.cwd(_target)
   2421 

~\anaconda3\envs\ml4trading\lib\ftplib.py in cwd(self, dirname)
    630         cmd = 'CWD ' + dirname
--> 631         return self.voidcmd(cmd)
    632 

~\anaconda3\envs\ml4trading\lib\ftplib.py in voidcmd(self, cmd)
    277         self.putcmd(cmd)
--> 278         return self.voidresp()
    279 

~\anaconda3\envs\ml4trading\lib\ftplib.py in voidresp(self)
    250         """Expect a response beginning with '2'."""
--> 251         resp = self.getresp()
    252         if resp[:1] != '2':

~\anaconda3\envs\ml4trading\lib\ftplib.py in getresp(self)
    245         if c == '5':
--> 246             raise error_perm(resp)
    247         raise error_proto(resp)

error_perm: 550 The system cannot find the file specified. 

During handling of the above exception, another exception occurred:

URLError                                  Traceback (most recent call last)
<ipython-input-6-8ab12e3a4ec2> in <module>
----> 1 file_name = may_be_download(urljoin(FTP_URL, SOURCE_FILE))
      2 date = file_name.name.split('.')[0]

<ipython-input-4-4c71a35e1865> in may_be_download(url)
      7     if not filename.exists():
      8         print('Downloading...', url)
----> 9         urlretrieve(url, filename)
     10     unzipped = data_path / (filename.stem + '.bin')
     11     if not (data_path / unzipped).exists():

~\anaconda3\envs\ml4trading\lib\urllib\request.py in urlretrieve(url, filename, reporthook, data)
    246     url_type, path = splittype(url)
    247 
--> 248     with contextlib.closing(urlopen(url, data)) as fp:
    249         headers = fp.info()
    250 

~\anaconda3\envs\ml4trading\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

~\anaconda3\envs\ml4trading\lib\urllib\request.py in open(self, fullurl, data, timeout)
    524             req = meth(req)
    525 
--> 526         response = self._open(req, data)
    527 
    528         # post-process response

~\anaconda3\envs\ml4trading\lib\urllib\request.py in _open(self, req, data)
    542         protocol = req.type
    543         result = self._call_chain(self.handle_open, protocol, protocol +
--> 544                                   '_open', req)
    545         if result:
    546             return result

~\anaconda3\envs\ml4trading\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

~\anaconda3\envs\ml4trading\lib\urllib\request.py in ftp_open(self, req)
   1581         except ftplib.all_errors as exp:
   1582             exc = URLError('ftp error: %r' % exp)
-> 1583             raise exc.with_traceback(sys.exc_info()[2])
   1584 
   1585     def connect_ftp(self, user, passwd, host, port, dirs, timeout):

~\anaconda3\envs\ml4trading\lib\urllib\request.py in ftp_open(self, req)
   1563             dirs = dirs[1:]
   1564         try:
-> 1565             fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
   1566             type = file and 'I' or 'D'
   1567             for attr in attrs:

~\anaconda3\envs\ml4trading\lib\urllib\request.py in connect_ftp(self, user, passwd, host, port, dirs, timeout)
   1585     def connect_ftp(self, user, passwd, host, port, dirs, timeout):
   1586         return ftpwrapper(user, passwd, host, port, dirs, timeout,
-> 1587                           persistent=False)
   1588 
   1589 class CacheFTPHandler(FTPHandler):

~\anaconda3\envs\ml4trading\lib\urllib\request.py in __init__(self, user, passwd, host, port, dirs, timeout, persistent)
   2406         self.keepalive = persistent
   2407         try:
-> 2408             self.init()
   2409         except:
   2410             self.close()

~\anaconda3\envs\ml4trading\lib\urllib\request.py in init(self)
   2418         self.ftp.login(self.user, self.passwd)
   2419         _target = '/'.join(self.dirs)
-> 2420         self.ftp.cwd(_target)
   2421 
   2422     def retrfile(self, file, type):

~\anaconda3\envs\ml4trading\lib\ftplib.py in cwd(self, dirname)
    629             dirname = '.'  # does nothing, but could return error
    630         cmd = 'CWD ' + dirname
--> 631         return self.voidcmd(cmd)
    632 
    633     def size(self, filename):

~\anaconda3\envs\ml4trading\lib\ftplib.py in voidcmd(self, cmd)
    276         """Send a command and expect a response beginning with '2'."""
    277         self.putcmd(cmd)
--> 278         return self.voidresp()
    279 
    280     def sendport(self, host, port):

~\anaconda3\envs\ml4trading\lib\ftplib.py in voidresp(self)
    249     def voidresp(self):
    250         """Expect a response beginning with '2'."""
--> 251         resp = self.getresp()
    252         if resp[:1] != '2':
    253             raise error_reply(resp)

~\anaconda3\envs\ml4trading\lib\ftplib.py in getresp(self)
    244             raise error_temp(resp)
    245         if c == '5':
--> 246             raise error_perm(resp)
    247         raise error_proto(resp)
    248 

URLError: <urlopen error ftp error: error_perm('550 The system cannot find the file specified. ',)>

It seems like the URL has changed, but I am not sure where I can get the updated one.
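
Until a working URL is known, the download step can at least fail with a clearer message and accept a file that was fetched manually. The following is a sketch along the lines of the `may_be_download` function visible in the traceback; the name `fetch_or_use_local` and the injectable `retrieve` parameter are hypothetical additions, and no replacement URL is implied.

```python
from pathlib import Path
from urllib.error import URLError
from urllib.request import urlretrieve


def fetch_or_use_local(url: str, data_path: Path, retrieve=urlretrieve) -> Path:
    """Return the local copy of url's file, downloading only if it is missing."""
    data_path = Path(data_path)
    data_path.mkdir(parents=True, exist_ok=True)
    filename = data_path / url.split('/')[-1]
    if filename.exists():
        return filename  # already present, e.g. placed there manually
    try:
        print('Downloading...', url)
        retrieve(url, filename)
    except URLError as exc:
        # Turn the long FTP traceback into one actionable message
        raise FileNotFoundError(
            f'Could not download {url} ({exc.reason}). If the file has moved, '
            f'download it manually and place it at {filename}.'
        ) from exc
    return filename
```

With this in place, a copy of 03272019.NASDAQ_ITCH50.gz obtained from elsewhere and saved in the data directory would be picked up without contacting the FTP server.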

Error when creating ml4t environment using the yml file

Hello
I have tried to create an environment using the environment.yml file provided but I get the following error:

Solving environment: failed

ResolvePackageNotFound:

  • gcc_linux-64=7.2.0
  • binutils_impl_linux-64=2.28.1
  • gxx_linux-64=7.2.0
  • gst-plugins-base=1.14.0
  • gstreamer=1.14.0
  • gmp=6.1.2
  • pango=1.42.4
  • dbus=1.13.2
  • gcc_impl_linux-64=7.2.0
  • binutils_linux-64=7.2.0
  • gxx_impl_linux-64=7.2.0
  • ncurses=6.1
  • libgcc-ng=8.2.0
  • libstdcxx-ng=8.2.0
  • libuuid=1.0.3
  • readline=7.0
  • expat=2.2.6
  • fribidi=1.0.5
  • libgfortran-ng=7.3.0
  • graphviz=2.40.1
  • libedit=3.1.20170329

Is there any way I can fix this? Thanks for the help.
