Team members:
Sihan Qi sq543
Sihan Liu sl6964
Ruihan Zhuang rz1391
The answers to the questions are in TAQ2.pdf.
Python version: 3.10
Packages and modules used: pandas, numpy, matplotlib, gzip, statistics, scipy, ast, statsmodels, unittest, _struct, _collections, glob
All implementation files are in the HW2_calculations folder.
HW2_calculations:
-- Find_largest.py: Find the 200 largest stocks in the S&P 500, filter out the NaN values, and save the result to No_nan_dates_and_largest_200_tickers_in_20070619-20070921.csv
The test for this file is in Test_Find_largest
-- Reading_loop.py: Implement the reading loop. The output file is written to the output folder.
-- Regression.py: Perform regression on the stock data, and use bootstrap to find the p-value. The tests for both the regress function and the bootstrap function are in Test_Regression.py
-- Regression_half.py: Split the data into two halves (larger stocks, smaller stocks) and compare the regression results. The results are in the PDF.
-- Residual_analysis.py: Perform residual analysis on the regression residuals. The results are in the PDF.
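
For readers unfamiliar with the bootstrap step in Regression.py, the following is a minimal sketch of one common way to get a bootstrap p-value for a regression slope. It is not the code in Regression.py: the function names ols_fit and bootstrap_p_value, the single-regressor model, the pairs resampling scheme, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def ols_fit(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


def bootstrap_p_value(x, y, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: slope = 0 (hypothetical helper).

    Resamples (x, y) pairs with replacement, refits the slope each time,
    and centres the bootstrap slopes on the original estimate so that
    their spread approximates the sampling distribution under H0.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    _, b_hat = ols_fit(x, y)
    n = len(x)
    boot_slopes = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample row indices with replacement
        _, boot_slopes[i] = ols_fit(x[idx], y[idx])
    p_value = np.mean(np.abs(boot_slopes - b_hat) >= abs(b_hat))
    return b_hat, p_value


# Synthetic data for illustration only (not the stock data used in the homework).
demo_rng = np.random.default_rng(42)
demo_x = demo_rng.normal(size=200)
demo_y = 1.0 + 2.0 * demo_x + 0.5 * demo_rng.normal(size=200)
demo_slope, demo_p = bootstrap_p_value(demo_x, demo_y)
print(f"slope = {demo_slope:.3f}, bootstrap p-value = {demo_p:.4f}")
```

Resampling pairs rather than residuals avoids assuming homoskedastic errors; the actual implementation may use a different scheme.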
All unit test files are in the unit_test_2 folder. The unit_test folder.
Update MyDirectories.py in the src folder first, then run the corresponding unit test file to run the project.