nasdaq-itch-5.0-parser's Introduction

NASDAQ-ITCH-5.0-Parser

Parses and prints the NASDAQ ITCH 5.0 data

Thanks to Quannabe. I just modified the YAML file for the new ITCH 5.0 format.

#How To

The main function is located in src/Parse.java. Just enter the file name and path of the ITCH file that you want to parse.

Run Code

Compile the code and run it. Navigate to the src directory and run the following commands in your terminal.

  javac Parse.java
  java Parse [ITCH file path]

(Path can be left blank to read from stdin.)

ITCH Format Variations

Support is included for custom ITCH formats. See itch5.yaml for an example of how to construct an ITCH format configuration. To include a custom ITCH format configuration:

  java Parse -y [YAML config] [ITCH file path]

Genium

Genium is supported out of the box. Use the genium2.yaml config included in the repo:

  java Parse -y ../genium2.yaml [ITCH file path]

#Data

Download raw ITCH 5.0 data from the following link:

ftp://emi.nasdaq.com/ITCH/08022014.NASDAQ_ITCH50.gz

#Data Format

http://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHspecification.pdf

(For Nasdaq Genium)

http://business.nasdaq.com/Docs/ITCHRefDataGuideNFXv2_00_tcm5044-18017.pdf

nasdaq-itch-5.0-parser's People

Contributors

amay22, arvindshmicrosoft, mister-meeseeks, shlomoa

nasdaq-itch-5.0-parser's Issues

timestamp problems

The spec says the timestamp has length 6 and is in nanoseconds after midnight, but the output shows 9 digits. For example, a sample record is [A,? , ,517516613, 1177389, B, 100, MSFT, 445000], where ? is a non-printable character and 517516613 is supposed to be the timestamp. Can you clarify what this time is supposed to be?

problems with code and suggestions

I suggest you try a different dataset from Nasdaq, because the dataset you picked is rather limited in message types, and so the parser crashes and burns on other data.

Using data from 08/22/2014, the output should look something like this:
[A, 7832, 0, 34383145181253, 9138132, S, 100, WLL, 810000]
[F, 7845, 0, 34383145198745, 9138133, S, 100, WMW, 268800, NITE]
[F, 7845, 0, 34383145201499, 9138134, S, 100, WMW, 268800, ATDF]
[F, 7845, 0, 34383145204380, 9138135, S, 100, WMW, 268800, CANT]
[F, 7845, 0, 34383145207174, 9138136, S, 100, WMW, 268800, TRIM]
[D, 7845, 0, 34383145238037, 8467825]
[D, 7845, 0, 34383145240887, 8467826]
[U, 7845, 0, 34383145288744, 8780722, 9138137, 500, 210300]

To interpret the timestamp (column 4), divide it by 1 million to get milliseconds after midnight, then use the Excel function
=TEXT(E6236/86400000,"hh:mm:ss.000") to display it as a time of day with millisecond precision. For example, 34383145288744 becomes 09:33:03.145.
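The same conversion can be sketched in Java without a spreadsheet. This is only an illustration, not code from the repo: the class and method names (ItchTimestamp, toClockTime) are hypothetical, and it assumes the value is nanoseconds since midnight as the spec describes.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the parser.
final class ItchTimestamp {
    // HH:mm:ss plus three fractional digits (milliseconds, truncated).
    private static final DateTimeFormatter MILLIS =
            DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    // Interprets the value as nanoseconds since midnight and formats it
    // as a wall-clock time.
    static String toClockTime(long nanosSinceMidnight) {
        return LocalTime.ofNanoOfDay(nanosSinceMidnight).format(MILLIS);
    }
}
```

For the example value above, ItchTimestamp.toClockTime(34383145288744L) gives 09:33:03.145, matching the Excel result.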

Missing case for when the field is an integer 2 bytes in length. This is why you get non-printable characters: the code currently treats it like a char.

Missing case for the timestamp. The code currently treats it as a 4-byte int, so it reads only 4 of the 6 bytes, leaving a much smaller number!

So the switch statement should look something like:

    switch ((Integer) fieldArray.get(0)) {
        case 1:
            value = (String) getChar(arr);
            break;
        case 2:
            value = (String) getInt(arr);
            break;
        case 3:
            value = (String) getString(arr, (Integer) fieldArray.get(1));
            break;
        case 4:
            value = (String) getLong(arr);
            break;
        case 5: // for 2 byte integer
            value = (String) getShort(arr);
            break;
        case 6: // for timestamp
            value = (String) getArbitrayLengthNumber(arr);
            break;
    }

    // Reads a 2-byte big-endian integer (requires java.nio.ByteBuffer).
    public Object getShort(byte[] payload) {
        return Integer.toString(ByteBuffer.wrap(payload).getShort());
    }

    // Reads a big-endian integer of any length up to 8 bytes, e.g. the 6-byte timestamp.
    public Object getArbitrayLengthNumber(byte[] payload) {
        long value = 0;
        for (int i = 0; i < payload.length; i++) {
            value = (value << 8) + (payload[i] & 0xff);
        }
        return Long.toString(value);
    }

Problems in your yaml file:
Missing message "N", which is the Retail Price Improvement Indicator (Section 4.7). You need this because the datasets are from before it was discontinued; without it the program crashes and burns.

For message type "K", you are missing the TrackingNumber field.

The additional corrections to the yaml file are as follows (it's OK to leave the decimal changes out; it will still run):

formats:
- MessageType : [1,1]  # ok
- Timestamp : [2,6]  # should be integer length 6
- StockLocate : [1,2]  # should be integer length 2
- TrackingNumber : [1,2]  # should be integer length 2
- Stock : [3,8]  # ok
- Shares : [2,4]  # ok
- Price : [2,4]  # length is 4, but has 4 decimal places
- BuySellIndicator : [1,1]  # ok
- OrderReferenceNumber : [4,8]  # ok
- MatchNumber : [4,8]  # ok
- Attribution : [3,4]  # ok
- CanceledShares : [2,4]  # ok
- CrossPrice : [2,4]  # length is 4, but has 4 decimal places
- CrossShares : [4,8]  # ok
- CrossType : [1,1]  # ok
- CurrentReferencePrice : [2,4]  # length is 4, but has 4 decimal places
- EventCode : [1,1]  # ok
- ExecutedShares : [2,4]  # ok
- ExecutionPrice : [2,4]  # length is 4, but has 4 decimal places
- FarPrice : [2,4]  # length is 4, but has 4 decimal places
- FinancialStatusIndicator : [1,1]  # ok
- Imbalance : [4,8]  # ok
- ImbalanceDirection : [1,1]  # ok
- MPID : [3,4]  # ok
- MarketCatagory : [1,1]  # ok
- MarketMakerMode : [1,1]  # ok
- MarketParticipantState : [1,1]  # ok
- NearPrice : [2,4]  # length is 4, but has 4 decimal places
- NewOrderReferenceNumber : [4,8]  # ok
- OriginalOrderReferenceNumber : [4,8]  # ok
- PairedShares : [4,8]  # ok
- PriceVariationIndicator : [1,1]  # ok
- PrimaryMarketMaker : [1,1]  # ok
- Printable : [1,1]  # ok
- Reason : [3,4]  # ok
- RegSHOAction : [1,1]  # ok
- Reserved : [1,1]  # ok
- RoundLotSize : [2,4]  # ok
- RoundLotsOnly : [1,1]  # ok
- TradingState : [1,1]  # ok
- IssueClassification : [1,1]  # ok
- IssueSubType : [1,2]  # incorrect: alpha, two bytes length
- Authenticity : [1,1]  # ok
- SSThresholdIndicator : [1,1]  # ok
- IPOFlag : [1,1]  # ok
- LULD : [1,1]  # ok
- ETPFlag : [1,1]  # ok
- ETPFactor : [1,4]  # incorrect: integer, 4 bytes length
- InverseIndicator : [1,1]  # ok
- LocateCode : [1,2]  # incorrect: integer, 2 bytes length
- Level1 : [3,8]  # length is 8, but has 8 decimal places
- Level2 : [3,8]  # length is 8, but has 8 decimal places
- Level3 : [3,8]  # length is 8, but has 8 decimal places
- BreachedLevel : [1,1]  # ok
- IPOQuotRelTime : [1,4]  # incorrect: integer, 4 bytes length
- IPOQuotRelQual : [1,1]  # ok
- IPOPrice : [1,4]  # integer length is 4, but has 4 decimal places
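The "has 4 decimal places" notes above mean the raw integer carries implied decimals. A minimal sketch of applying them, assuming the raw field has already been decoded to a long; ItchPrice and toDecimal are hypothetical names, not code from the repo:

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of the parser.
final class ItchPrice {
    // Places the decimal point 'decimals' digits from the right of the
    // raw integer, e.g. 4 for Price fields and 8 for Level1..Level3.
    static String toDecimal(long rawPrice, int decimals) {
        return BigDecimal.valueOf(rawPrice, decimals).toPlainString();
    }
}
```

With the sample output earlier in this issue, ItchPrice.toDecimal(810000, 4) yields 81.0000. BigDecimal is used here instead of double so that no digits are lost to binary floating point.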

bugs?

Nice improvement, but there seems to be a bug. There is an fArray, but no fArray1 defined in parseDS.java! See below (did you copy the code to GitHub correctly?):

    public void buildFormats() {
        fMap = new HashMap<>();
        ArrayList fArray = (ArrayList) yMap.get("formats");

        fArray.stream().map((fArray1) -> (Map<Object, Object>) fArray1).forEach((tempMap) -> {
            fMap.put(tempMap.keySet().toArray()[0], tempMap.values().toArray()[0]);
        });
    }

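A note on reading the snippet above: in Java's lambda syntax, fArray1 is the parameter of the lambda passed to map(...), so it is declared at that point rather than as a separate variable. A loop-based sketch of what the pipeline appears to do follows. FormatMapSketch is a hypothetical standalone class, and it assumes each list element is a single-entry map, as the formats section of the YAML suggests.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone rewrite of the stream pipeline as a plain loop.
final class FormatMapSketch {
    static Map<Object, Object> buildFormats(List<?> formats) {
        Map<Object, Object> fMap = new HashMap<>();
        for (Object element : formats) {   // 'element' plays the role of fArray1
            Map<?, ?> tempMap = (Map<?, ?>) element;
            // Copy the first (assumed only) key/value pair of each entry.
            fMap.put(tempMap.keySet().toArray()[0], tempMap.values().toArray()[0]);
        }
        return fMap;
    }
}
```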
  1. You were asking about the dataset I used. This time I used 02022015.NASDAQ_ITCH50.gz.

  2. The specs say prices have 4 decimals, but the output seems to show only 2. I'm not sure whether most of the data has only two decimals, but can the code handle 4? Periodically, makers will jump in front of another buy by posting a price such as 50.0199 instead of 50.0200.

  3. This is not a problem with your code but with the ITCH specs. If you look at the specs for message types "P" and "Q", you will see that one specifies shares as integer 4 and the other as integer 8. In the yaml file, shares is only defined once, so I'm guessing that the code might not report the integer 8 case correctly. It's worthwhile checking into.

String index out of range: 0

I was trying it with the following sample file:
http://optionsdata.baruch.cuny.edu/data1/delivery/nasdaq/S20180501-v50_sample.txt
I am calling the parser class from a JFrame. It parses 2 records and then hits an index-out-of-range error:

[S, 0, 0, 10949877220447, O]
[R, 1, 0, 14337433606119, A, , N, 536870912, d, N, CZ, , P, N, , 1, 1308622848, ]
Exception in thread "Thread-2" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:658)
at nasdaqitchparser.Parsers.messageIn(Parsers.java:45)
at nasdaqitchparser.Parse.parse(Parse.java:65)

Can you please suggest what to do?

NullPointerException parsing 10302018.NASDAQ_ITCH50

I am trying to parse the uncompressed file 10302018.NASDAQ_ITCH50 (originally from ftp://emi.nasdaq.com/ITCH/10302018.NASDAQ_ITCH50.gz). It fails consistently with the below stack:

Exception in thread "main" java.lang.NullPointerException
at ParseDS.getFields(ParseDS.java:48)
at Parsers.messageIn(Parsers.java:47)
at Parse.parse(Parse.java:61)
at Parse.main(Parse.java:93)

The tail of the output file shows the below lines:

[D, 1517, 0, 34790347615211, 11491833]
[A, 8592, 0, 34790347635580, 41345596, B, 100, XLP, 54.96]
[A, 8592, 0, 34790347640759, 41345600, S, 100, XLP, 55.03]
[D, 8265, 0, 34790347645719, 41339864]
[U, 7379, 0, 34790347647139, 41344468, 41345608, 14, 104.42]
[A, 3599, 0, 34790347666980, 30131022, B, 100, HAS, 92.23]
[A, 5518, 0, 34790347672809, 30821291, S, 100, NKE, 73.41]
[A, 8265, 0, 34790347706962, 41345616, B, 300, VOOG, 140.6]
[A, 3599, 0, 34790347707980, 30131026, B, 100, HAS, 92.23]
[D, 8216, 0, 34790347708330, 41345308]

I re-ran with -enableassertions and it says:

Exception in thread "main" java.lang.AssertionError: File type missing: J

The YAML file does not seem to have an entry for J. Neither does the spec (https://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHspecification.pdf). Is this corrupt data?

Since it is a NullPointerException I thought I would report this. Hopefully you can reproduce this and help with a fix. Many thanks in advance!
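One way to turn a failure like this into a clearer error is to check the format lookup before using it. The sketch below is not the repo's ParseDS; MessageFormatLookup, register, and fieldsFor are hypothetical names, and it assumes formats are stored in a map keyed by the one-character message type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table: message type -> ordered field names.
final class MessageFormatLookup {
    private final Map<Character, String[]> fieldsByType = new HashMap<>();

    void register(char messageType, String... fieldNames) {
        fieldsByType.put(messageType, fieldNames);
    }

    // Returns the configured fields, or fails with a descriptive message
    // instead of letting a null propagate into a NullPointerException.
    String[] fieldsFor(char messageType) {
        String[] fields = fieldsByType.get(messageType);
        if (fields == null) {
            throw new IllegalArgumentException(
                    "No format configured for message type: " + messageType);
        }
        return fields;
    }
}
```

With a check like this, an unconfigured type such as J would be reported by name even without running with -enableassertions.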
