jingqingz / baidutraffic

This repo includes introduction, code and dataset of our paper Deep Sequence Learning with Auxiliary Information for Traffic Prediction (KDD 2018).

Python 100.00%
dataset deep-learning traffic traffic-data traffic-prediction

baidutraffic's People

Contributors

bbliao, jingqingz

baidutraffic's Issues

Some useful intermediate files cannot be found

Hi! Many of the data files loaded in [WideDeep_Controller] cannot be found...

coarse_file = open(config.data_path + "wide_features/event_link_set_all_poi_type_feature_coarse_beijing_1km.pkl", "rb")
fine_file = open(config.data_path + "wide_features/event_link_set_all_poi_type_feature_fine_beijing_1km.pkl", "rb")
info_file = open(config.data_path + "wide_features/event_link_set_all_beijing_1km_link_info_feature.pkl", "rb")
time_file = open(config.data_path + "wide_features/time_feature_15min.pkl", "rb")

So how can I get these files? Thank you!
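
Before running the controller, it can help to list which of the expected intermediate files are actually present. The following is a minimal sketch and not part of the repository: `find_missing_files` is a hypothetical helper, `./data/` is an assumed location, and the file list is copied from the `open()` calls quoted in this issue.

```python
import os

# Relative paths copied from the open() calls quoted in this issue.
WIDE_FEATURE_FILES = [
    "wide_features/event_link_set_all_poi_type_feature_coarse_beijing_1km.pkl",
    "wide_features/event_link_set_all_poi_type_feature_fine_beijing_1km.pkl",
    "wide_features/event_link_set_all_beijing_1km_link_info_feature.pkl",
    "wide_features/time_feature_15min.pkl",
]


def find_missing_files(data_path, relative_paths=WIDE_FEATURE_FILES):
    """Return the relative paths that do not exist under data_path (hypothetical helper)."""
    return [p for p in relative_paths
            if not os.path.isfile(os.path.join(data_path, p))]


if __name__ == "__main__":
    # "./data/" is an assumed location; replace it with your config.data_path.
    for path in find_missing_files("./data/"):
        print("missing:", path)
```

This only reports what is absent; it does not say how the files were generated.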

Questions about the GPS coordinates

I am a little confused about the GPS coordinates provided in the file 'link_gps'.
I tried to plot the road locations on Google Maps, but there is some offset: the plotted points often do not seem to fall on a road. So I'm wondering:
1. Which coordinate system are they in: WGS84, GCJ-02, or Baidu's BD-09?
2. Do the GPS coordinates in the dataset correspond to real roads? How can I find the specific road for a given pair of GPS coordinates, or does that need to be kept secret?

Thank you in advance
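
The thread does not state which coordinate system 'link_gps' uses, so the following is only a way to test one hypothesis. It is a minimal sketch of the widely circulated approximate BD-09 to GCJ-02 conversion, plus a haversine distance to measure how far a point moves. The function names are hypothetical, and the sample point is the one quoted in another issue on this page. If converted points land closer to roads on a GCJ-02 map, that would suggest (not prove) the data is in BD-09.

```python
import math

X_PI = math.pi * 3000.0 / 180.0


def bd09_to_gcj02(lon, lat):
    """Approximate BD-09 -> GCJ-02 conversion (commonly published formula)."""
    x = lon - 0.0065
    y = lat - 0.006
    z = math.sqrt(x * x + y * y) - 0.00002 * math.sin(y * X_PI)
    theta = math.atan2(y, x) - 0.000003 * math.cos(x * X_PI)
    return z * math.cos(theta), z * math.sin(theta)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Sample coordinate quoted elsewhere on this page; its system is unknown.
    lon, lat = 116.391026, 39.922581
    new_lon, new_lat = bd09_to_gcj02(lon, lat)
    print(new_lon, new_lat, haversine_m(lon, lat, new_lon, new_lat))
```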

Issues downloading the dataset from Baidu

Hello, I'm having an issue downloading the dataset. I need to create an account on Baidu, which does not seem to work because I do not have a Chinese phone number. Could someone help me with an easier way to download the dataset? Thank you so much.

How to run

Hello, I looked at readme.md, but it does not say how to run this code. What is the command to run it?
python train.py?
Thank you!

Question regarding baseline models

For the baseline models that you implemented (specifically RF and SVR), were they trained solely on the temporal data or on the entire spatio-temporal data?

Still have questions about enode and snode

#20
Hi, thank you for the answers to the earlier issues, but I still have some questions.
I am trying to recover the topology of the dataset. I want to construct a graph G(V,E) that represents the spatial information, where V is the set of vertices and E is the set of edges. At first I assumed that snode and enode map to vertices and links map to edges, like this:
[image: bmASDgRwQdK9uiUYJ3Tu1Q_thumb_1867]
The outlines represent the true roads, the thin lines are links, and the big dots are nodes (snode or enode). But issue #16 says that snode and enode are also links, which confuses me. It leads to ambiguous representations in situations like the one below:
[image: zZX7s4Z8R2iIWB+8J%9GBw_thumb_1864]
The link in the middle is connected to 6 links, and I need 2 of them to represent the enode and snode of that link. I am not sure which links I should choose.
Take the illustration from issue #16, shown below. If we add link2, link3 and link4, then for link2, one of (link1, enode) could be its snode and one of (link3, link4) could be its enode. If so, it will be really hard to recover the topology of the dataset in G(V,E) form, because I can only get edge information but not vertices. And if all the snode and enode are links, I am not sure what should be considered the vertices of the graph.
[image: Ed5M0p1bSZy7UvnA2Cs1xQ_thumb_1863]

Hoping for your reply.
Thank you.
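
The two readings discussed in this issue can be sketched side by side. This is a minimal illustration on hypothetical toy records, not the dataset's confirmed semantics: the `(link_id, snode_id, enode_id)` column order and the names `build_node_graph` and `build_link_graph` are assumptions.

```python
from collections import defaultdict

# Hypothetical toy records: (link_id, snode_id, enode_id). Real rows would
# come from road_network_sub-dataset; the column order here is an assumption.
LINKS = [
    ("link1", "n1", "n2"),
    ("link2", "n2", "n3"),
    ("link3", "n3", "n4"),
    ("link4", "n3", "n5"),
]


def build_node_graph(links):
    """Reading A: snode/enode are vertices, each link is a directed edge."""
    adjacency = defaultdict(list)
    for link_id, snode, enode in links:
        adjacency[snode].append((enode, link_id))
    return dict(adjacency)


def build_link_graph(links):
    """Reading B: links are vertices; a -> b if link a ends where link b starts."""
    starts_at = defaultdict(list)
    for link_id, snode, _ in links:
        starts_at[snode].append(link_id)
    return {link_id: list(starts_at.get(enode, []))
            for link_id, _, enode in links}


if __name__ == "__main__":
    print(build_node_graph(LINKS))
    print(build_link_graph(LINKS))
```

Reading B needs no vertex coordinates at all, which may be convenient if only link-level information is available.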

Can't figure out how to run the code

I'd like to thank you first for the code and the paper. I'm getting missing-file errors even though I changed the repository paths. Could you write quick step-by-step instructions on what to do before running train.py? It would be really helpful. Thank you so much!

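How to obtain a traffic flow dataset

Hello,
I see that a comparison of traffic datasets is given here, and they mainly contain road speed data. Is there any road traffic flow data? If not, are there other public datasets that provide road traffic flow data? Thanks!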

What does 'event_link_set_all_poi_type_feature_coarse_beijing_1km.pkl' mean?

I checked your code. In the function load_data, you open the files "event_traffic_beijing_1km_mv_avg_15min_completion.pkl", "event_link_set_all_poi_type_feature_fine_beijing_1km.pkl" and "event_link_set_all_beijing_1km_link_info_feature.pkl", but where do these data files come from? In other words, how can I get this data?
Thanks!

What does Link GPS represent?

Hi, I have downloaded the dataset and ran into a problem similar to the closed issue #5.
In the closed issue #5, you replied: "Yes, the gps of the link_id, snode_id, and enode_id are all in the link_gps file since they are all link."

But the link_id in the file 'link_gps' has 13 digits, whereas the node_id has only 10 digits. So it seems that the GPS coordinates of snode_id and enode_id are not contained in the link_gps file.

I have the following questions.
1) What does the link_gps represent? The midpoint of a road segment?
2) How can I get the snodegps and enodegps information of the road segments?
3) According to your paper, "The origin traffic speed dataset contains the traffic speed of ∼450k road segments". But I see only 44,172 unique link_id items, not '~450k', in both 'road_network_sub-dataset' and 'link_gps'.
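
The digit-length and membership observations above can be reproduced with a few lines. This is a minimal sketch under assumptions: the helper names are hypothetical, the 10-digit node ID is invented for illustration, and how the IDs are read from the files is left out.

```python
from collections import Counter


def id_length_histogram(ids):
    """Count how many IDs have each number of digits."""
    return Counter(len(str(i)) for i in ids)


def ids_missing_from(reference_ids, candidate_ids):
    """Return candidate IDs that are absent from reference_ids, sorted."""
    reference = {str(i) for i in reference_ids}
    return sorted({str(i) for i in candidate_ids} - reference)


if __name__ == "__main__":
    # link IDs quoted on this page; the node ID below is a made-up example.
    link_gps_ids = ["1144134225930", "1014574024344"]
    node_ids = ["1234567890", "1144134225930"]
    print(id_length_histogram(link_gps_ids))
    print(ids_missing_from(link_gps_ids, node_ids))
```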

What do snode and enode represent?

Hi. My questions may seem a bit trivial, but I could not find an explicit explanation in the paper.

  1. Is the entire network composed of road segments, with snode and enode representing the endpoints of these segments? Are these the endpoints of the edges of the graph?
  2. The paper says that we are given the snode and enode GPS, but issue #6 says that the GPS represents the midpoint of a road segment.
  3. Where is the extra information on social attributes described in the paper, such as weekdays, weekends, public holidays, peak hours and off-peak hours?
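
If the social attributes are not shipped with the dataset, some of them can be derived from timestamps. This is a minimal sketch, not the authors' feature pipeline: the peak-hour set is an assumption, public holidays must be supplied by the caller, and `time_features` is a hypothetical name. The 15-minute slot follows the interval suggested by the file name time_feature_15min.pkl.

```python
from datetime import date, datetime

# Assumed peak hours (07:00-08:59 and 17:00-18:59); not taken from the paper.
PEAK_HOURS = {7, 8, 17, 18}


def time_features(ts, holidays=frozenset()):
    """Derive simple calendar features from a datetime (hypothetical helper)."""
    return {
        "slot_15min": ts.hour * 4 + ts.minute // 15,  # 0..95 within the day
        "is_weekend": ts.weekday() >= 5,              # Saturday or Sunday
        "is_holiday": ts.date() in holidays,          # caller-supplied dates
        "is_peak": ts.hour in PEAK_HOURS,
    }


if __name__ == "__main__":
    # Arbitrary example timestamp and an example holiday set.
    print(time_features(datetime(2017, 4, 3, 17, 45), holidays={date(2017, 4, 3)}))
```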

Road segment information is not complete

Hello,

Thanks for the impressive work. I am working on a similar project and am interested in using your dataset, but the road-segment dataset contains only the start node of each street. In your paper you mention several items, such as roadGps, length, etc. Here are my questions; I would appreciate your help. I will definitely cite your paper and acknowledge you.
1- How can I get the main dataset for roads? I really need both the start and end points of each street. I see you have some intermediate files, but I am outside of China and cannot get those files. Also, the intermediate files contain nothing about the GPS coordinates of streets, that is, both start and end.
2- I already downloaded your files from the backup link, but a sample from the road-network subset looks like this: "1144134225930 116.391026 39.922581". It is not as complete as described in the paper. Is there any way I can get the end points of streets?

Much appreciated in advance.

Can't find intermediate file: query_distribution_beijing_1km_k_150_filtfilt.pkl

I can't find the intermediate file query_distribution_beijing_1km_k_150_filtfilt.pkl that is loaded in the dataloader:

   data = pickle.load(open(config.data_path + "query_distribution_beijing_1km_k_%d_filtfilt.pkl" % config.impact_k, "rb"),
                      encoding='latin1')

I tried to run get_query_distribution_feature_beijing_1km_sqe.py to generate it, but it takes a very long time.
Is the file provided directly anywhere?
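
Since regenerating the file is slow, it may be worth caching the result once it has been computed. This is a minimal sketch, assuming the expensive step can be wrapped in a function: `load_legacy_pickle`, `load_or_build` and `build_fn` are hypothetical names, and `encoding="latin1"` mirrors the call quoted above, which is the usual way to read pickles written by Python 2.

```python
import os
import pickle


def load_legacy_pickle(path):
    """Load a pickle that may have been written by Python 2."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")


def load_or_build(path, build_fn):
    """Return the cached object at path, or build it once and cache it."""
    if os.path.isfile(path):
        return load_legacy_pickle(path)
    data = build_fn()  # the slow step, executed only when no cache exists
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data
```

This does not make the first run faster; it only avoids repeating it.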

Questions about Event Discovery Algorithm

Hello, I've read your paper "Deep Sequence Learning with Auxiliary Information for Traffic Prediction" and found that the code of the Event Discovery Algorithm may not be in accordance with its description in the paper.
@ /src/preprocessing/new_anomalty1109.py line 38-43
The code shows that the rule to discover an event is:
(d(x,y,t) - d(x,y,t-7d)) > 300 && (d(x,y,t) / d(x,y,t-7d)) > 0.2 (the second condition makes no sense)
In the paper, the rule to discover an event is:
(d(x,y,t) - d(x,y,t-7d)) > 300 && (d(x,y,t) - d(x,y,t-7d)) / d(x,y,t-7d) > 0.2

Could not find GPS of node?

Hi, I have downloaded the dataset and ran into a problem similar to the closed issue #6.
I have mapped snode_id and enode_id to the new 13-digit link_id. However, they are not contained in the link_gps file.

So how can I get the GPS coordinates of these node_ids?

Can't find data files

As written in dataloader.py (shown below), data files such as event_link_set_beijing_1km, event_traffic_beijing_1km_mv_avg_15min_completion.pkl, etc. can't be found in BaiduTraffic or in the dataset (obtained from https://ai.baidu.com/broad/download?dataset=traffic).

event_set_file = open(config.data_path + "event_link_set_beijing_1km", "r")
traffic_data_file = open(config.data_path + "event_traffic_beijing_1km_mv_avg_15min_completion.pkl", "rb")

  1. Could you give suggestions on how to get these files?
  2. Is there a brief description of the usage steps?

Thanks.

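No coordinate information for snode and enode in link_gps

Hello! While using the data today, I was surprised to find that the latitude and longitude coordinates of the snodeid and enodeid in the road_network_sub-dataset.v2 data do not exist in link_gps... What is the reason? Thank you!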

Duplicate links in road_network_sub-dataset with different properties

I want to process your data to reveal some graph structure, but I found duplicate link_ids in road_network_sub-dataset, and the duplicated entries have completely different values of width, snodeid, enodeid or length. I am quite confused about how this could happen.
The following are some link_ids I found duplicated in road_network_sub-dataset:
[1014574024344, 1163808777119, 1573596569624, 1490171293608, 1144042225780, 1934286704917]
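
A check like the one described here can be sketched as follows. This is a minimal illustration, not the reporter's script: `find_conflicting_duplicates` is a hypothetical name, the row layout `(link_id, width, snodeid, enodeid, length)` is an assumption, and the property values in the example are invented.

```python
from collections import defaultdict


def find_conflicting_duplicates(rows):
    """Map each link_id seen with more than one distinct property tuple to those tuples."""
    seen = defaultdict(set)
    for link_id, *props in rows:
        seen[link_id].add(tuple(props))
    return {link_id: sorted(variants)
            for link_id, variants in seen.items() if len(variants) > 1}


if __name__ == "__main__":
    # Assumed layout: (link_id, width, snodeid, enodeid, length); values invented.
    rows = [
        (1014574024344, 30, 101, 102, 55.0),
        (1014574024344, 45, 201, 202, 80.0),
        (1163808777119, 30, 301, 302, 12.5),
        (1163808777119, 30, 301, 302, 12.5),  # exact repeat, not a conflict
    ]
    print(find_conflicting_duplicates(rows))
```

Exact repeats are ignored, so only link_ids with genuinely different properties are reported.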

One file does not exist

The file "pagerank_1km.txt" does not exist in the directory. Could you tell me where I can find it?

Dataset

Hello, may I ask if the dataset is data from all road sections in Beijing, or has it been processed to only have data adjacent to special event road sections?

No node GPS in Road Network Sub-dataset

Hi, I have downloaded the dataset. In the road_network_sub-dataset, I can not see the snodegps and enodegps mentioned in ReadMe, or any other files which can map the node id with the corresponding gps.
Could you upload this part?

Traffic data missing for some road_segment_id

I was reading the road_segment_id values from neighbour_1km.txt and using each road_segment_id as a key to fetch the speed value from the traffic_speed_sub-dataset. Some road_segment_id values cannot be found in traffic_speed_sub-dataset. For example:
One line read from neighbour_1km.txt (all the IDs have been mapped with link_id_hash_map):
['1597566463414', '1462215565312', '1503121983912', '1770727782763', '1462073565097', '1503110983921', '1597550463401', '1682917053527', '1687870507191', '1597560463393', '1770727782763']
1503110983921 is missing in traffic_speed_sub-dataset.
We assume that all the IDs in neighbour_1km.txt should be contained in the traffic_speed_sub-dataset. Could you please explain this?
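
The lookup described in this issue amounts to a set-membership check. This is a minimal sketch: `missing_speed_ids` is a hypothetical helper, and the set of speed IDs in the example is constructed only to mirror the report that 1503110983921 is absent; loading the real files is left out.

```python
def missing_speed_ids(neighbour_ids, speed_ids):
    """Return neighbour IDs absent from speed_ids, in first-seen order, without repeats."""
    available = set(speed_ids)
    reported = set()
    missing = []
    for link_id in neighbour_ids:
        if link_id not in available and link_id not in reported:
            missing.append(link_id)
            reported.add(link_id)
    return missing


if __name__ == "__main__":
    # The line quoted in this issue.
    neighbours = ['1597566463414', '1462215565312', '1503121983912',
                  '1770727782763', '1462073565097', '1503110983921',
                  '1597550463401', '1682917053527', '1687870507191',
                  '1597560463393', '1770727782763']
    # Stand-in for the keys of traffic_speed_sub-dataset (assumed for illustration).
    speed_keys = set(neighbours) - {'1503110983921'}
    print(missing_speed_ids(neighbours, speed_keys))
```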
