This is a repository for our university training project.
Here are the main features we offer.
Creates the required directories.
Counts the data entries in a file. It will be deprecated soon.
Fits the total number of entries by season. MATLAB is required to use it.
A simple reader for CSV-format data. It implements a time parser that converts the common time format (MM/DD/YYYY HH:MM) into a machine-friendly form (an array of 365 days by 1440 minutes). A summary dataset is then created.
The usage is:
g++ csv-reader.cpp -o csv-reader
(chmod +x csv-reader)
./csv-reader
It generates all the files that Gnuplot needs.
A data visualization module that presents the data in a more readable form. A bash script runs the plotting automatically.
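The time conversion described above can be sketched in a few lines. This is only an illustration in Python, not the code in csv-reader.cpp: the names `parse_time` and `summarize` are hypothetical, and the extra row for leap years is an assumption (the description above mentions 365 days).

```python
from datetime import datetime

MINUTES_PER_DAY = 1440
DAYS = 366  # assumption: one spare row so a leap year's last day still fits


def parse_time(stamp):
    """Convert 'MM/DD/YYYY HH:MM' into (day-of-year index, minute-of-day index)."""
    dt = datetime.strptime(stamp, "%m/%d/%Y %H:%M")
    day_index = dt.timetuple().tm_yday - 1      # 0-based day of the year
    minute_index = dt.hour * 60 + dt.minute     # 0..1439
    return day_index, minute_index


def summarize(stamps):
    """Count entries per (day, minute) cell, a stand-in for the summary dataset."""
    counts = [[0] * MINUTES_PER_DAY for _ in range(DAYS)]
    for stamp in stamps:
        day, minute = parse_time(stamp)
        counts[day][minute] += 1
    return counts
```

Indexing by day and minute keeps every lookup a constant-time array access, which is presumably why a fixed-size array is used instead of raw timestamps.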
chmod +x plotting.sh
./plotting.sh
- Only covers data from 2014-2017, from a Chinese weather provider.
- The data is not well formatted.
- Well-formatted data from Wunderground.
- Richer dataset with broader classification.
It generates PNG files.
Creates a mapping between dock numbers and geographical coordinates.
The usage is:
python3 MapParser.py
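As a rough illustration of the mapping step only, the sketch below builds a dock-number lookup from CSV text and serializes it as JSON. It does not reproduce MapParser.py: the column names `dock_no`, `lat` and `lon`, the function name `build_dock_map`, and the sample row are all hypothetical placeholders.

```python
import csv
import io
import json


def build_dock_map(csv_text):
    """Map each dock number to a (latitude, longitude) pair.

    Assumes hypothetical columns 'dock_no', 'lat' and 'lon'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["dock_no"]): (float(row["lat"]), float(row["lon"]))
            for row in reader}


# Placeholder row for illustration only, not real station data.
sample = "dock_no,lat,lon\n1,10.5,-20.25\n"
dock_map = build_dock_map(sample)
print(json.dumps(dock_map))
```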
Performs stepwise regression on the weather data from 2011-2016. The candidate data transformations are:
- Linear process
- Quadratic process
- Cubic process
- Exponential process
- Logarithmic process
- Fractional process
The stepwise process selects the necessary terms. All the data is then combined, and the model is verified against the 2017 data.
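The selection procedure above can be sketched as a forward stepwise search over the listed transformations. This is a simplified illustration with NumPy and a single input variable, not the project's own implementation. Several points are assumptions: "fractional" is read as the reciprocal 1/x, the input is taken to be positive, and the stopping rule (a relative gain threshold `min_gain`) is a stand-in for whatever criterion the project actually uses.

```python
import numpy as np

# Candidate transformations of one variable x (assumed strictly positive).
# Reading "fractional" as 1/x is an assumption.
TRANSFORMS = {
    "linear": lambda x: x,
    "quadratic": lambda x: x ** 2,
    "cubic": lambda x: x ** 3,
    "exponential": np.exp,
    "logarithmic": np.log,
    "fractional": lambda x: 1.0 / x,
}


def rss(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ coef) ** 2))


def forward_stepwise(x, y, min_gain=0.05):
    """Greedily add the transform that lowers the RSS the most.

    Stops when the best remaining candidate improves the RSS by no more
    than min_gain (relative to the current RSS).
    """
    candidates = {name: f(x) for name, f in TRANSFORMS.items()}
    selected = []
    best = rss([], y)
    while len(selected) < len(candidates):
        trials = {
            name: rss([candidates[s] for s in selected] + [col], y)
            for name, col in candidates.items() if name not in selected
        }
        name = min(trials, key=trials.get)
        if best - trials[name] <= min_gain * best:
            break
        selected.append(name)
        best = trials[name]
    return selected
```

To mirror the verification step, one would fit on the 2011-2016 rows with the selected terms and compare the predictions against the 2017 rows.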
This tool consists of two parts written in different programming languages (C++ and MATLAB).
Separates exactly ONE day from the dataset. You do not need to call it yourself.
First, the code invokes data_slicer through system calls:
g++ ./data_slicer.cpp -o ./data_slicer
./data_slicer [year] [month] [day]
Then it reads the needed files and generates a real-time flux graph.
Finally, it generates a statistics matrix showing the places with more than 10 bikes on a ride per minute.
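The first step above, building and calling data_slicer, could also be driven from a short script. The sketch below is an illustration in Python rather than the project's own driver code. The helper names are hypothetical, and passing the date as plain integers is an assumption about the argument format.

```python
import subprocess


def slicer_commands(year, month, day):
    """Return the build and run commands for data_slicer as argument lists."""
    build = ["g++", "./data_slicer.cpp", "-o", "./data_slicer"]
    # Assumption: the date is passed as plain integers, e.g. 2017 6 1.
    run = ["./data_slicer", str(year), str(month), str(day)]
    return build, run


def slice_one_day(year, month, day):
    """Compile data_slicer, then extract the given day from the dataset."""
    for command in slicer_commands(year, month, day):
        # check=True raises CalledProcessError if a step fails.
        subprocess.run(command, check=True)
```

Passing argument lists instead of a single shell string avoids quoting problems and keeps the two steps easy to inspect before they run.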
- About 1.8 s to process about 3,000,000 data entries on a Linux distro.
- About 13 s to process about 20,000,000 data entries on a Linux distro.
- Linux is required (untested on Windows or WSL).
- The open data can be downloaded from Capital Bikeshare's website.
- Python requirements: BeautifulSoup, requests, and the standard-library modules time, csv, and json.
- C++ requirements: a g++ version newer than 6.0 is recommended.
- MATLAB requirements: the latest version is recommended. A version in which "+" is overloaded for string concatenation is required.
- Gnuplot requirements: a version newer than 5.0 is required.