
dtd2mysql's People

Contributors

dependabot[bot], eoghanmurray, jasonpaige, linusnorton, miklcct, mk-fg, orzeuek, qinwenshi


dtd2mysql's Issues

Error extracting Fare Data

Unable to extract latest Fare Data - RJFAF851.ZIP
Command:
DATABASE_USERNAME=root DATABASE_PASSWORD=*** DATABASE_NAME=fares dtd2mysql --fares ~/Desktop/RJFAF851.ZIP
Error:
Error processing RJFAF851.RST with data RTRCOF0001R00000359D TANError: Non-nullable field received null value: "0000"

Missing stops when train reverses movement direction / cif location suffixes

When checking the resulting GTFS feed for train_uid Q21052 against the southeasternrailway.co.uk API, I found this curious mismatch:

Diff details:
  Matching journey trip [ gtfs -vs- api ]:
  ...
  TWI 18:22:00          TWI 18:22:00
  STW 18:25:00          STW 18:25:00
  TED 18:29:00          TED 18:29:00
  HMW 18:32:00          HMW 18:32:00
  KNG 18:42:00          KNG 18:42:00
                      > HMW 18:44:00
                      > TED 18:46:00
  FLW 18:51:00          FLW 18:51:00
  HMP 18:54:00          HMP 18:54:00
  ...

Note how at KNG the train apparently reverses direction and passes the two previous stops (in reverse order), then goes off somewhere else.

This is also indicated by the "RN" activity flag for the KNG stop (see #14 for more details on these) in the CIF data, and both the MCA file and the data imported into MySQL (via --timetable) have these reverse-stops in there, but with a "suffix":

+----------+----------+----------+
| location | ts_arr   | ts_dep   |
+----------+----------+----------+
...
| SHCKLGJ  | NULL     | NULL     |
| TEDNGTN  | 18:28:00 | 18:29:00 |
| HAMWICK  | 18:31:00 | 18:32:00 |
| KGSTON   | 18:34:00 | 18:42:00 |
| HAMWICK2 | 18:44:00 | 18:44:00 |
| TEDNGTN2 | 18:46:00 | 18:46:00 |
| SHCKLGJ2 | NULL     | NULL     |
| FULWELL  | 18:51:00 | 18:51:00 |
...

Such numeric suffixes seem to be part of the CIF specification, as per page 21 of "CIF USER SPEC v29 FINAL.pdf" (see #14 for the URL) or page 14 of RSPS5046.

Both basically say that the CIF field should be "tiploc + suffix", but it seems to be parsed without normalization (splitting the suffix into its own value), and the resulting field is used as a tiploc in a JOIN when building GTFS, which it technically isn't.

Not sure if this "suffix" is ever useful: given that stops are sequential anyway, it can easily be derived if necessary, so I'd suggest dropping it entirely in the parser by using the first 7 chars of that field and ignoring the 8th.
Or, for completeness, maybe it can be stored in a separate db field during the --timetable operation.

Doing either of these should fix the produced GTFS data in such "multiple passes through the same stop(s)" cases.
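The "first 7 chars" idea can be sketched as a small parsing helper. This is purely illustrative (names are hypothetical, not dtd2mysql code):

```typescript
// Hypothetical helper: split the 8-char CIF location field into the
// 7-char tiploc and the optional 1-char suffix used when a train
// calls at the same tiploc more than once.
interface TiplocRef {
  tiploc: string;          // first 7 chars, trailing spaces trimmed
  suffix: string | null;   // 8th char, or null when absent
}

function splitLocationField(field: string): TiplocRef {
  const tiploc = field.slice(0, 7).trimEnd();
  const suffix = field.length > 7 && field[7].trim() !== "" ? field[7] : null;
  return { tiploc, suffix };
}

// "HAMWICK2" (second call at HAMWICK after the reversal at KNG)
// splits into tiploc "HAMWICK" and suffix "2"; "FULWELL" has no suffix.
```

With the suffix split out like this, the JOIN can use the plain tiploc while the suffix is either dropped or stored in its own column.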

max_old_space_size too small and not configurable?

I've been getting the FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory error and traced it to the node max_old_space_size argument.

This appears to be set to 3000 in ./bin/dtd2mysql.sh

I tried increasing it to 4096 and then finally to 16384, and that seems to have solved it.

This was from calling dtd2mysql as follows:

$ DATABASE_USERNAME=dtd2mysql DATABASE_NAME=timetable NODE_OPTIONS='--max_old_space_size=16384' /usr/bin/ts-node ./src/index.ts --gtfs-zip ~/gtfs.zip

However, when I run the system-installed version, as follows, it still seems to hit the error:
$ DATABASE_USERNAME=dtd2mysql DATABASE_NAME=timetable NODE_OPTIONS='--max_old_space_size=16384' dtd2mysql --gtfs-zip ~/gtfs.zip

Which I presume is due to the presence of --max_old_space_size=3000 in /usr/bin/dtd2mysql (just checking that now...)

Incorrect public departure time for join-association point

Looking at the following mismatch from gtfs-webcheck (G80218_G80219 on 2017-10-22):

[OfaA] Trip: <Trip G80218_G80219 [- -] [GLQ - DEE - ARB - MTS - STN - ABD]>
[OfaA] Date/time: 2017-10-22 10:40:00
[OfaA] Diff details:
[OfaA]   Matching journey trip [ gtfs -vs- api ]:
[OfaA]     <Trip G80218_G80219 [2017-10-22 10:45:00 2017-10-22 13:36:00] [GLQ - BBG - LNZ - CRO - FKK - PMT - LIN - DAM - INK - BTS - KDY - MNC - LDY - CUP - LEU - DEE - CAN - ARB - MTS - LAU - STN - ABD]>
[OfaA]   GLQ 10:45:00                                                   GLQ 10:45:00
[OfaA]   DEE 12:24:00                                                 | DEE 12:18:00
[OfaA]   ARB 12:40:00                                                   ARB 12:40:00
[OfaA]   MTS 12:55:00                                                   MTS 12:55:00
[OfaA]   STN 13:16:00                                                   STN 13:16:00
[OfaA]   ABD 13:36:00                                                   ABD 13:36:00

And this CIF data for this day/stop:
https://gist.github.com/mk-fg/879ef92c50709071e1d30d0f673ec2f5
https://gist.github.com/mk-fg/60a3ef765ae7588c8ba8d0097e6c0473

...
  G80219  206494 O 2017-10-22 2017-10-22 ......7 A LEU 11:46 11:47 T
  G80219  206494 O 2017-10-22 2017-10-22 ......7 A DEE --:-- --:-- K/-U
    scheduled_arrival_time: 11:59:30
    scheduled_departure_time: 12:24:00
    scheduled_pass_time: NULL
    public_arrival_time: NULL
    public_departure_time: NULL
  G80219  206494 O 2017-10-22 2017-10-22 ......7 A CAN --:-- --:--
...
  G80218  206493 O 2017-10-22 2017-10-22 ......7 A LEU --:-- --:--
  G80218  206493 O 2017-10-22 2017-10-22 ......7 A DEE 12:18 --:-- TF
    scheduled_arrival_time: 12:18:00
    scheduled_departure_time: NULL
    scheduled_pass_time: NULL
    public_arrival_time: 12:18:00
    public_departure_time: NULL
...
cif associations:
  ...
  G80219 G80218    2087 O JJ - DEE 2017-10-22 2017-10-22 ......7
gtfs trips:
  ...
  [189522] headsign=G80218_G80219 svc=379 svc_days=2:
    <TS ......7 [2017-10-15 2017-10-22] {}>
    stop sequence:
      GLQ 10:45 10:45 P-
      LIN 11:08 11:09 --
      INK 11:28 11:29 --
      DEE 12:18 12:24 PD
      ARB 12:40 12:40 PD
      ...

Currently "DEE 12:18 12:24" seems to be produced from the public arrival time of G80218 and the scheduled departure time of G80219. That mixes public and scheduled times, which seems incorrect, and disagrees with the southeasternrailway.co.uk interpretation, where the departure is apparently set from the arrival time without any such fallback.

I'd expect the scheduled departure time to almost always be later than the public departure, so using it in GTFS can be misleading whenever there is a significant gap between the two for whatever technical reasons.
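The "don't mix public and scheduled times" rule could look roughly like this. A sketch only, with hypothetical function names; the field names mirror the MySQL columns quoted in the CIF dump above:

```typescript
// Hypothetical rule: at a given stop, use public times when any are
// present, and fall back to scheduled times only when the stop has no
// public times at all. Never combine one of each for the same stop.
interface StopTimes {
  public_arrival_time: string | null;
  public_departure_time: string | null;
  scheduled_arrival_time: string | null;
  scheduled_departure_time: string | null;
}

function gtfsTimes(s: StopTimes): { arrival: string | null; departure: string | null } {
  const hasPublic = s.public_arrival_time !== null || s.public_departure_time !== null;
  if (hasPublic) {
    // Public stop: reuse the known public time for the missing side
    // instead of reaching for a scheduled value.
    return {
      arrival: s.public_arrival_time ?? s.public_departure_time,
      departure: s.public_departure_time ?? s.public_arrival_time,
    };
  }
  // Non-public stop: scheduled times only.
  return { arrival: s.scheduled_arrival_time, departure: s.scheduled_departure_time };
}
```

For the DEE join stop above (public arrival 12:18, no public departure) this would yield 12:18/12:18 instead of the mixed 12:18/12:24.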

TOCs from Z-records aren't imported

The BX records in the Z-train files aren't processed in the code. This results in the imported schedule data lacking the operator for the Z-trains.

Missing calendar entries for schedule after applying association

Hey,

I implemented associations in mk-fg/open-track-dtd2mysql-gtfs, and running it and dtd2mysql on the same data, I'm noticing quite a few differences in schedules with associations, which look like a bug in dtd2mysql.

In particular, with all schedules for these trains from my data (which have assocs with each other): C73290 C74089 C74104 C74105

Trips/stop_times match, and calendars match between implementations for all trips but this one:

EUS[21:15] - WFJ[21:32/21:33] - CRE[23:53/23:56] - WBQ[1+00:22/1+00:24] - PRE[1+00:58/1+01:00] -
CAR[1+02:24] - EDB[1+03:57/1+04:43] - INK[1+05:01] - KDY[1+05:20/1+05:21] - LEU[1+05:48/1+05:49]
- DEE[1+06:11] - CAN[1+06:25/1+06:26] - ARB[1+06:34] - MTS[1+06:50/1+06:52] -
STN[1+07:15/1+07:17] - ABD[1+07:39]

For which py gives 142 running days and ts only 10:

gtfs_py:
  <TS ....5.. [2017-05-26 2017-12-08] {2017-08-25}>
  <TS .23.... [2017-12-05 2017-12-06] {}>
  <TS 1...... [2017-05-22 2017-12-04] {2017-08-28}>
  <TS .234... [2017-05-23 2017-11-30] {}>

gtfs_ts:
  <TS .23.... [2017-12-05 2017-12-06] {}>
  <TS 1...... [2017-06-05 2017-06-12] {}>
  <TS .234... [2017-06-06 2017-06-15] {}>

(format is: <TS $weekdays [$date_start $date_end] {$exception_dates}>)

Checking e.g. where <TS ....5.. [2017-05-26 2017-12-08] ...> came from in the py output, there are these overlapping entries, not cancelled by anything (the mismatched weekdays are due to the "over next-midnight" association):

C74089   41627 P 2017-05-27 2017-12-09 .....6.
C74104   39484 P 2017-05-26 2017-12-08 ....5..

And the relevant association entry seems to be:

C74104 C74089     370 P VV N EDB 2017-05-26 2017-12-08 ....5..

So there should be "C74104_C74089" association trip(s) for this timespan in TS gtfs output, right?

What's also interesting about C74089 is that it has this schedule:

C74089   39294 P 2017-05-24 2017-12-01 ..345..

Which would overlap with the C74104_C74089 one created via the next-midnight association from ".....6.", as it'd start on the previous day relative to the stops in the C74104 schedule, if I'm not mixing anything up.

I wonder if that might be what's causing the issue here.

Just in case, here's a link to a somewhat unnecessarily long diff with all the db entries (from my CIF data):
https://gist.github.com/mk-fg/f4da61e5d753be55870e99949d9752bc

Incorrect association date range - stp=C entries get ignored

For latest timetable from atoc (ttis653.zip), had these GTFS trips created for P76279_P77109 association:
https://gist.github.com/mk-fg/bee3f96ac9815f6f5bb344d0081ab005#file-query-cif-gtfs-p76279_p77109-txt-L80-L85

Note the calendar timespan on the last highlighted line: it goes from 2017-05-21 to 2017-09-10 (on Sundays) without any exception dates, while the CIF timetable data has clear stp=C cancellation entries for this association, highlighted in the link just above that.

Both gtfs-webcheck and the python implementation disagree with that (the latter producing a "2017-05-21 2017-08-27" range for the gtfs trip), so it looks like a bug somewhere in the processing of stp=C association entries, as they don't seem to have any effect on the results in this particular case.
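For illustration, applying stp=C entries as exception dates might look like this. The types and names are hypothetical, not the project's actual calendar model:

```typescript
// Hypothetical calendar model: a permanent (stp=P) date range plus a
// set of excluded dates. stp=C cancellations falling inside the range
// are added as exceptions instead of being ignored.
interface Calendar {
  start: string;            // YYYY-MM-DD
  end: string;              // YYYY-MM-DD
  excludes: Set<string>;    // exception dates
}

function applyCancellations(cal: Calendar, cancelledDates: string[]): Calendar {
  const excludes = new Set(cal.excludes);
  for (const d of cancelledDates) {
    // ISO dates compare correctly as strings, so a range check suffices.
    if (d >= cal.start && d <= cal.end) excludes.add(d);
  }
  return { ...cal, excludes };
}
```

The point is simply that every in-range stp=C date must end up as an exception (or calendar_dates removal) in the output; dates outside the permanent range can be dropped.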

Data too long for log filename

Fatal error when running --timetable with a bespoke filename longer than 12 characters. I suggest truncating the supplied filename to match the table definition.

...
Finished processing ttisf893.mca
Error: Data too long for column 'filename' at row 1
at PromisePool.query (...mysql2\promise.js:340:22)
at ImportFeedCommand.updateLastFile (...dtd2mysql\dist\src\cli\ImportFeedCommand.js:105:24)
at ImportFeedCommand.doImport (...dtd2mysql\dist\src\cli\ImportFeedCommand.js:67:20)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async ImportFeedCommand.run (...dtd2mysql\dist\src\cli\ImportFeedCommand.js:38:13) {
code: 'ER_DATA_TOO_LONG',
errno: 1406,
sqlState: '22001',
sqlMessage: "Data too long for column 'filename' at row 1"
}
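The suggested truncation is trivial to sketch as a standalone helper. The 12-character width is taken from the report above and should be read as an assumption about the column definition, not a verified schema constant:

```typescript
// Hypothetical helper: clamp the supplied feed filename to the width of
// the log table's `filename` column before the INSERT that currently
// fails with ER_DATA_TOO_LONG.
const FILENAME_COLUMN_WIDTH = 12; // assumed from the error report above

function truncateFilename(name: string, width: number = FILENAME_COLUMN_WIDTH): string {
  return name.length > width ? name.slice(0, width) : name;
}

// A stock name like "ttisf893.mca" (exactly 12 chars) passes through
// untouched; longer bespoke names are cut down to fit the column.
```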

--gtfs-zip error failing to build files and zip file

Another issue sorry!

--gtfs-zip seems to throw up the following errors:

Writing agency.txt
(node:12658) UnhandledPromiseRejectionWarning: Error: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 't1.tiploc_code' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
    at PromisePool.query (/usr/local/lib/node_modules/dtd2mysql/node_modules/mysql2/promise.js:344:20)
    at CIFRepository.getStops (/usr/local/lib/node_modules/dtd2mysql/dist/src/gtfs/repository/CIFRepository.js:35:41)
    at OutputGTFSCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSCommand.js:26:50)
    at OutputGTFSZipCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSZipCommand.js:18:28)
    at container.getCommand.then.c (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:7:18)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
    at Function.Module.runMain (module.js:695:11)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3
(node:12658) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:12658) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:12658) UnhandledPromiseRejectionWarning: Error: Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'trains.physical_station.minimum_change_time' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
    at PromisePool.query (/usr/local/lib/node_modules/dtd2mysql/node_modules/mysql2/promise.js:344:20)
    at CIFRepository.getTransfers (/usr/local/lib/node_modules/dtd2mysql/dist/src/gtfs/repository/CIFRepository.js:20:41)
    at OutputGTFSCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSCommand.js:25:54)
    at OutputGTFSZipCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSZipCommand.js:18:28)
    at container.getCommand.then.c (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:7:18)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
    at Function.Module.runMain (module.js:695:11)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3
(node:12658) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
Writing links.txt
Writing trips.txt, stop_times.txt and routes.txt
Writing calendar.txt
Writing calendar_dates.txt
{ Error: Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'trains.physical_station.minimum_change_time' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
    at PromisePool.query (/usr/local/lib/node_modules/dtd2mysql/node_modules/mysql2/promise.js:344:20)
    at CIFRepository.getTransfers (/usr/local/lib/node_modules/dtd2mysql/dist/src/gtfs/repository/CIFRepository.js:20:41)
    at OutputGTFSCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSCommand.js:25:54)
    at OutputGTFSZipCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/OutputGTFSZipCommand.js:18:28)
    at container.getCommand.then.c (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:7:18)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
    at Function.Module.runMain (module.js:695:11)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3
  message: 'Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column \'trains.physical_station.minimum_change_time\' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by',
  code: 'ER_WRONG_FIELD_WITH_GROUP',
  errno: 1055,
  sqlState: '42000',
  sqlMessage: 'Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column \'trains.physical_station.minimum_change_time\' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by' }
(node:12658) PromiseRejectionHandledWarning: Promise rejection was handled asynchronously (rejection id: 2)
(node:12658) PromiseRejectionHandledWarning: Promise rejection was handled asynchronously (rejection id: 1)

As a result, some text files, such as stops.txt, are not created. Thus, the zip file is not created either.

Michael

list both stations and platforms in the GTFS output

The stops.txt generated in the GTFS currently lists only the train stations, with an empty location_type field and no platform definitions, and doesn't distinguish between different parts of a station (such as St Pancras high and low levels).

This makes the generated output useless for giving step-by-step directions to reach a platform in order to take a train, or for providing directions to transfer between trains within a complex station.

As the source data contains the platform information, it should be possible to do this properly in the GTFS file by listing all stations with location_type = 1 and all platforms with location_type = 0, and referencing the appropriate platform from the trips. If we can then match the coordinates of the platforms from OpenStreetMap, we can provide routing down to the level of how a passenger reaches the platform from the station entrance, how they should change between trains, and how to guide them to the lifts for step-free access.
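The requested stops.txt structure might be sketched like this. The helper is hypothetical; in practice the platform list would come from the schedule records:

```typescript
// Hypothetical shape of the proposed stops.txt rows: one station record
// (location_type = 1) plus one child record per platform
// (location_type = 0, parent_station pointing back at the station).
interface GtfsStop {
  stop_id: string;
  stop_name: string;
  location_type: 0 | 1;
  parent_station: string;    // "" for top-level stations
  platform_code?: string;
}

function stationWithPlatforms(crs: string, name: string, platforms: string[]): GtfsStop[] {
  const station: GtfsStop = { stop_id: crs, stop_name: name, location_type: 1, parent_station: "" };
  const children: GtfsStop[] = platforms.map(p => ({
    stop_id: `${crs}_${p}`,              // assumed id scheme for illustration
    stop_name: `${name} platform ${p}`,
    location_type: 0,
    parent_station: crs,
    platform_code: p,
  }));
  return [station, ...children];
}
```

stop_times rows would then reference the platform-level stop_id, so a router can match each platform against OpenStreetMap geometry.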

GTFS produced by timetable revision 930 contains invalid calendar

The GTFS produced by National Rail timetable revision 930 contains an invalid calendar which crashes OpenTripPlanner: Invalid interval, the end 2023-11-17 is before the start 2023-11-20.

The following output is from the validator:

start_and_end_range_out_of_order

Two date or time fields are out of order.

Date or time fields have been found out of order in calendar.txt, feed_info.txt and stop_times.txt.


filename        The name of the faulty file.
csvRowNumber    The row number of the faulty record.
entityId        The faulty service id.
startFieldName  The start value's field name.
startValue      The start value.
endFieldName    The end value's field name.
endValue        The end value.
"calendar.txt" 5050 "5049" "start_date" "20231120" "end_date" "20231117"
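A defensive filter before writing calendar.txt would catch rows like this one. A sketch only; GTFS dates are YYYYMMDD strings, so plain string comparison orders them correctly:

```typescript
// Hypothetical pre-write check: drop any calendar row whose end_date
// precedes its start_date, instead of emitting an invalid interval that
// crashes consumers such as OpenTripPlanner.
interface CalendarRow {
  service_id: string;
  start_date: string;  // YYYYMMDD
  end_date: string;    // YYYYMMDD
}

function validCalendars(rows: CalendarRow[]): CalendarRow[] {
  return rows.filter(r => r.start_date <= r.end_date);
}
```

The underlying cause (how a reversed range is produced at all) still needs fixing, but a filter like this keeps the feed loadable in the meantime.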

Invalid or unsupported zip format

There is an error happening with parsing today's fares which seems to be related to cthackers/adm-zip#268

Extracting /tmp/fares.zip to /tmp/dtd/fares/
Error: Invalid or unsupported zip format. No END header found
    at readMainHeader (/usr/local/lib/node_modules/dtd2mysql/node_modules/adm-zip/zipFile.js:107:10)
    at new module.exports (/usr/local/lib/node_modules/dtd2mysql/node_modules/adm-zip/zipFile.js:19:3)
    at new module.exports (/usr/local/lib/node_modules/dtd2mysql/node_modules/adm-zip/adm-zip.js:20:11)
    at ImportFeedCommand.doImport (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:52:9)
    at ImportFeedCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:39:24)
    at container.getCommand.then.c (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:7:18)
    at process.runNextTicks [as _tickCallback] (internal/process/task_queues.js:52:5)
    at Function.Module.runMain (internal/modules/cjs/loader.js:880:11)
    at internal/main/run_main_module.js:21:11

Version 6.4.2

"00:00" parsed from CIF into public_departure_time instead of NULL

Seems to be an issue with the fix for #16, as reverting 1ee1a61 fixes it.

Looks like this in CIF database:

  N26072  250761 N 2017-11-13 2017-11-13 1...... A YRK --:-- 00:35 TB
  N26072  250761 N 2017-11-13 2017-11-13 1...... A THI --:-- 00:00
  N26072  250761 N 2017-11-13 2017-11-13 1...... A NTR 01:06 00:00 D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A DAR 01:20 00:00 D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A DHM 01:38 00:00 D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A NCL 02:08 --:-- TF

(where "--:--" are NULL values)

Instead of this (with 1ee1a61 reverted):

  N26072  250761 N 2017-11-13 2017-11-13 1...... A YRK --:-- 00:35 TB
  N26072  250761 N 2017-11-13 2017-11-13 1...... A THI --:-- --:--
  N26072  250761 N 2017-11-13 2017-11-13 1...... A NTR 01:06 --:-- D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A DAR 01:20 --:-- D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A DHM 01:38 --:-- D
  N26072  250761 N 2017-11-13 2017-11-13 1...... A NCL 02:08 --:-- TF

Seems to be due to using a 5-char TimeField for all *_time fields, while only the scheduled_*_time fields have the optional "H" marker, not the public_*_time ones.
Hence the nullable/nullChars check on TimeField fails for public times when an extra stray character is captured from the next sibling field.

I guess a special TimeFieldWithHalfMinute could be introduced for the scheduled-time fields instead of extending TimeField everywhere.
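To illustrate the distinction, the two field types could be parsed like this. The function names are hypothetical, and treating public "0000" as NULL follows the behaviour described in this report:

```typescript
// Scheduled times occupy 5 chars: HHMM plus an optional trailing "H"
// meaning half a minute past. Public times are plain 4-char HHMM,
// with "0000" used as "no public time" (hence NULL) in this data.
function parseScheduledTime(raw: string): string | null {
  if (raw.trim() === "") return null;
  const hh = raw.slice(0, 2), mm = raw.slice(2, 4);
  const ss = raw[4] === "H" ? "30" : "00";
  return `${hh}:${mm}:${ss}`;
}

function parsePublicTime(raw: string): string | null {
  if (raw.trim() === "" || raw === "0000") return null;
  return `${raw.slice(0, 2)}:${raw.slice(2, 4)}:00`;
}
```

Keeping the public parser at 4 chars means it can never swallow a character from the sibling field, which is what trips the nullChars check today.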

Some public stops/stations are missing in GTFS as they're missing in physical_station table

For example, L35246 (and roughly 1.2k other entries, according to grep) appears to have the UPMNLT ("Upminster L.T.") station as its origin, with a public departure time.
This station is also present in the "TIPLOC Eastings and Northings.xlsx.gz" list, and seems to be a real, google-able thing.

It's also missing from the physical_station mysql table, which is why it doesn't make it into the GTFS, but it is greppable in the CIF MCA file (ttis653.zip data) as a TI (tiploc insert) entry and is parsed into a corresponding entry in the tiploc table:

                 id: 9998
        tiploc_code: UPMNLT
           capitals: 08
              nalco: 073600
nlc_check_character: N
    tps_description: UPMINSTER L.T.
             stanox: 51353
        po_mcp_code: 0
           crs_code: ZUM
        description: UPMINSTER UND

Maybe it'd make sense to add another JOIN to that massive query for CIF schedules and check both tiploc.crs_code and physical_station.crs_code, in case one or the other is missing?
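The fallback being proposed amounts to a COALESCE between the two sources. Sketched here as a standalone function with illustrative names, not the project's actual query logic:

```typescript
// Hypothetical fallback: prefer the crs_code from physical_station, but
// when a tiploc (like UPMNLT) has no physical_station row, fall back to
// the crs_code parsed from the TI record into the tiploc table, so the
// stop still reaches the GTFS output.
interface TiplocRow {
  tiploc_code: string;
  crs_code: string | null;
}

function crsForTiploc(
  tiploc: TiplocRow,
  physicalStations: Map<string, string>,  // tiploc_code -> crs_code
): string | null {
  return physicalStations.get(tiploc.tiploc_code) ?? tiploc.crs_code;
}

// UPMNLT has no physical_station row, so its tiploc crs_code "ZUM" is
// used; stations present in both tables are unaffected.
```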

--fares-clean error (not sure if vital)

Hi there,

Small error with --fares-clean, producing the following error:

Applied restriction dates
{ Error: Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'trains.flow.direction' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
    at PromisePool.query (/usr/local/lib/node_modules/dtd2mysql/node_modules/mysql2/promise.js:344:20)
    at CleanFaresCommand.queryWithRetry (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/CleanFaresCommand.js:111:27)
    at CleanFaresCommand.queryWithRetry (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/CleanFaresCommand.js:118:28)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
  message: 'Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column \'trains.flow.direction\' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by',
  code: 'ER_WRONG_FIELD_WITH_GROUP',
  errno: 1055,
  sqlState: '42000',
  sqlMessage: 'Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column \'trains.flow.direction\' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by' }
Removed old and irrelevant fares data

Regards,

Michael

Association stop has departure time shifted by +1 day incorrectly

As initially mentioned in #3, this stop sequence looks obviously wrong and non-sequential:

... - CDQ[19:04/19:06] - CDF[19:12/1+19:11] - CDQ[19:14/19:15] - ...

("1+19:11" means "19:11 on +1 day")
(full diff: https://gist.github.com/mk-fg/f3f7db87f84485b3aad1d6ab7af137de )

The problem seems to be that CDF is the stop where the "join" association happens: it has 19:12 as the public arrival time for the P76348 (joinee) assoc_uid, and 19:11 as the public departure time for the P75734 base_uid.

Skipping stops with N ("stop not advertised") activity flag

Mentioned this one earlier here: #14 (comment)

gtfs-webcheck occasionally stumbles upon these stops as non-existent for the corresponding trips in the serw api data.

Checking the data for all stops that have a public time and the "N" activity flag turns up schedules for ~200 train_uids; full list for ttis653.zip:
https://gist.github.com/mk-fg/5fd942d0ca0a880e06cbb239e7a7eae4

All but a few outliers at the very start/end are very similar and have their start/end at HVW (HEATHROW TERM 5), marked with "N" for some dates.
Some of these also have AML with "N" flag for some dates (e.g. P25417).
Including these stops doesn't match trips in gtfs-webcheck; adding e.g. --test-skip-stops 1-0 to skip stops that are marked as such returns a matching trip there.

The few remaining trips that can be checked in a similar way (some are too far in the future), like L02139 (whose starting BTN stop has that activity flag), also match in a similar fashion.

The discussion thread linked in the comment above also suggests that these stops are not actually public, from the initial comment (which gets confirmed) to Peter Hicks' suggestion of what these might mean:

Another examples of unadvertised stops are on the 0415 Northampton - London Euston at Queens Park (Main Line). My guess is the unadvertised call there is for staff purposes, maybe a 'grandfather rights' thing - but it's unadvertised because the train might not call there if it has to run on the fast lines.

https://groups.google.com/forum/#!topic/openraildata-talk/eBTiB1BxrRw

All of which seems to suggest to me that these should not be in the GTFS data as public stops.

When implementing processing for these flags, you seem to have omitted this one, despite it being mentioned in that issue, so I wanted to bring it up again separately in case it wasn't skipped intentionally.

A few strange outlier CIF schedules that have the flag look like this one:
https://gist.github.com/mk-fg/371a73b484cedf9efda5fa365f2a3886#file-gistfile1-txt-L28-L32
Which, given how weird it looks compared to others for the same train_uid, probably shouldn't make it into GTFS anyway, and is maybe some kind of staff-only run, as Peter suggested.

TypeError: Object.values is not a function

Hi Linus! I think I need some hand holding to get this running for the first time.

I'm using mariadb and node v6.11.1
I've created a database user 'dtd2mysql' and checked that they can log in without a password. I've created the 'fares' database.

Here's the error I'm getting upon first run:

Extracting ~/downloads/RJFAF499.ZIP to /tmp/dtd/fares/
TypeError: Object.values is not a function
    at ImportFeedCommand.get fileArray [as fileArray] (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:37:23)
    at ImportFeedCommand.<anonymous> (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:64:37)
    at next (native)
    at /usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:13:71
    at __awaiter (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:9:12)
    at ImportFeedCommand.doImport (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:61:16)
    at ImportFeedCommand.<anonymous> (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:45:28)
    at next (native)
    at /usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:13:71
    at __awaiter (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:9:12)
    at ImportFeedCommand.run (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:43:16)
    at container.getCommand.then.c (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:7:18)
    at process._tickCallback (internal/process/next_tick.js:109:7)
    at Module.runMain (module.js:606:11)
    at run (bootstrap_node.js:389:7)
    at startup (bootstrap_node.js:149:9)
Done

I'm not yet familiar with TypeScript and the error doesn't give me much of a clue as to whether the problem is with the db connection or elsewhere.

Any pointers much appreciated, cheers for making the project open!

"-U" stop activity flag seems to be incorrectly matched as "U " for a non-public stop

One of the schedules for G80225 has this sequence of stops in raw CIF file:

BSNG802251709031709100000001 PXX1A850252123540003 DMUE   100      B S T        O
...
LIERROLLC           1258 00000000                     1
LIDUNDCJ            1306H00000000
CRDUNDETB XX1A850252123540003 DMUE   090      B S T                SR025200
LIDUNDETB 1307H1311H     000000004        K -U
LICMPRDNJ           1314 00000000
...

Here "LIDUNDETB" is a location=DUNDETB crs=DEE stop, with 0000 for the public stop times and "K -U" for the activity flags, which gets translated into this GTFS trip:

  [81568] headsign=G80225 svc=275 svc_days=2:
    <TS ......7 [2017-09-03 2017-09-10] {}>
    stop sequence:
      GLQ 11:45 11:45 P-
      STG 12:11 12:11 PD
      DBL 12:17 12:18 PD
      GLE 12:29 12:29 PD
      PTH 12:45 12:46 PD
      DEE 13:07 13:11 P-
      ARB 13:28 13:28 PD
      ...

More info (this train has a lot of non-public stops):
https://gist.github.com/mk-fg/1659e402c7b98fead612a497c07a0e52

DEE 13:07 13:11 P- ("P-" = "pickup only") seems to be derived from the scheduled times, with pickup-only matched from "-U" ("stops to attach vehicles") as "U " ("take up passengers"), which are different activity flags.

This GTFS trip for this train_uid should not have the "DEE" stop at all, as it has neither passenger pickup nor dropoff.
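Since the activity field is a sequence of fixed-width 2-character codes, "-U" must be read as one code and never re-split so that a "U" lands at the start of a pair. Position-based parsing avoids the mismatch; a sketch with a hypothetical function name:

```typescript
// Hypothetical parser: read the CIF activity field as consecutive
// 2-char codes, so "-U" (attach vehicles) can never be confused with
// "U " (take up passengers).
function parseActivities(field: string): string[] {
  const codes: string[] = [];
  for (let i = 0; i + 2 <= field.length; i += 2) {
    const code = field.slice(i, i + 2);
    if (code.trim() !== "") codes.push(code);
  }
  return codes;
}

// parseActivities("K -U") yields ["K ", "-U"]: the stop attaches
// vehicles but takes up no passengers, so it is not a public call.
```

A downstream check for "passenger pickup" would then test for the exact code "U " (or "T "), not for any "U" appearing in the field.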

Incorrect schedules for ~1200 trips (associations)

Hi @linusnorton ,
I noticed that there is a trip in the stop_times of a resulting GTFS which consists of only one station.
Digging deeper, I found that the issue might be in trips which have an association by train uid.

The example train uids are: L07257 and L07493.
In the records of the mysql table "stop_time" for train_uid='43237' I can see that it is 100% correct and equal to what I see in the source .mca file. However, when the GTFS is generated based on this trip and its association with another train_uid, the result is a trip with trip_headsign='L07257_L07493' that consists of only one stop, the assoc_location=DONC.

Looks like the two trips, which are outbound and inbound according to the stops used in the source .mca file, are being merged in a strange way.
Since there are ~1200 trips in the resulting GTFS which have concatenated train uids in the trip_headsign field, I conclude that ~1200 trips are incorrectly converted. I also checked some others at random and the issue is the same as described above.

I attached the original file
https://drive.google.com/file/d/1eiRbPhwpN56ba98rQhTHm5DwZ84jStth/view?usp=sharing
and the one we converted (or better, you can convert it with your latest version):
https://drive.google.com/file/d/1nMV5kmpkMXN3YQJNtrpJzHAjMYXauYrs/view?usp=sharing

gtfs stop_times.txt missing arrival/departure times

I mentioned it before, but thought I'd open an issue so it doesn't get forgotten :)

Current stop_times.txt has stuff like this:

trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_type,shape_dist_traveled,timepoint
500001,,09:30:00,QXO,1,,0,1,,1
500001,23:59:00,,QXD,2,,1,0,,1

While https://developers.google.com/transit/gtfs/reference/stop_times-file says:

If you don't have separate times for arrival and departure at a stop,
enter the same value for arrival_time and departure_time.

I.e. should probably be:

500001,09:30:00,09:30:00,QXO,1,,0,1,,1
500001,23:59:00,23:59:00,QXD,2,,1,0,,1

Unless you want to auto-fill these with some projected values (which is probably better done in the feed parser, not in the GTFS output).
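The GTFS rule quoted above can be applied as a small post-processing step on each stop_times row. A sketch with a hypothetical helper name:

```typescript
// Hypothetical fix-up: when only one of arrival_time/departure_time is
// present, copy it into the other field, per the GTFS reference for
// stops without separate arrival and departure times.
interface StopTimeRow {
  arrival_time: string;    // "" when missing
  departure_time: string;  // "" when missing
}

function fillMissingTimes(row: StopTimeRow): StopTimeRow {
  return {
    ...row,
    arrival_time: row.arrival_time || row.departure_time,
    departure_time: row.departure_time || row.arrival_time,
  };
}

// { arrival_time: "", departure_time: "09:30:00" } becomes
// "09:30:00"/"09:30:00", matching the corrected example rows above.
```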

CR record in MCA file isn't handled

A CR record in an MCA file indicates a change in the service property en-route, for example, from an EMU to a DMU. It isn't currently handled in the importer.

For example, the service L65563 changes its properties twice en-route, including the addition of catering at Swansea and the power switch at Cardiff. I am looking to rewrite this timetable website using MySQL as the backend to improve performance, however that isn't currently possible.

RGK MODE_CODE incorrectly specified as not nullable?

I'm getting an error with an import of the test Routeing download file RJRG0281.RGK

The third line of data contains the following:
00003,D,I,SRA,N,,
with an empty 5th position in the CSV record.
From the feed spec (RSPS5047) this seems valid, as the ENTRY_TYPE is I ("The Mode Code must be specified where Entry Type is ‘L’ or ‘N’.")

Should false for nullable be changed to true here:
https://github.com/open-track/dtd2mysql/blob/master/config/routeing/file/RGK.ts#L24
?

I can create a pull request unless there's something I'm missing.
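For context, a sketch of the failure mode (parseField here is a hypothetical stand-in for the importer's field parsing, not the actual dtd2mysql class): an empty CSV value only becomes NULL if the field is declared nullable, otherwise parsing throws, which matches errors like "Non-nullable field received null value".

```typescript
// Illustration of why the nullable flag matters when loading a CSV record.
function parseField(value: string, nullable: boolean): string | null {
  if (value === "") {
    if (!nullable) {
      throw new Error("Non-nullable field received null value");
    }
    return null; // nullable fields map empty input to NULL
  }
  return value;
}
```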

Stops with weird arrival_time/departure_time combinations

Spotted one such for C74098, where one of the produced trips has this stop sequence:

GLC[23:15] - MTH[23:29/23:30] - **CRS[23:49/23:47]** -
CAR[1+01:39/1+01:44] - PRE[1+03:03/1+03:09] - WBQ[1+03:39/1+03:41] -
WFJ[1+06:43/1+06:44] - EUS[1+07:07]

Emphasis on "CRS[23:49/23:47]".

CIF data has no public_arrival_time, but has public_departure_time=23:47 and scheduled_arrival_time=23:49.

Didn't check how many such quirks are in the data, but I think it indicates that the correct logic for interpreting arr/dep times might be:

Either use scheduled-time (for non-public stops) or public-time
(for public ones), but don't mix them for same stop.

Based purely on this mismatch in CIF data though.
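Sketched as code, the rule proposed above would look roughly like this (field names follow the CIF schema; this is a suggestion, not the current implementation):

```typescript
// Take both output times from the same layer: public times for public stops,
// scheduled times otherwise. Never mix a scheduled arrival with a public departure.
interface CifStop {
  scheduled_arrival_time: string | null;
  scheduled_departure_time: string | null;
  public_arrival_time: string | null;
  public_departure_time: string | null;
}

function stopTimes(stop: CifStop): [string | null, string | null] {
  const isPublic =
    stop.public_arrival_time !== null || stop.public_departure_time !== null;
  return isPublic
    ? [
        stop.public_arrival_time ?? stop.public_departure_time,
        stop.public_departure_time ?? stop.public_arrival_time,
      ]
    : [stop.scheduled_arrival_time, stop.scheduled_departure_time];
}
```

For the CRS stop above this yields 23:47/23:47 rather than the 23:49/23:47 inversion.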

Setting environment variables.

Hi,

I'm having trouble setting the environment variables. I have run:

dtd2mysql DATABASE_NAME=fares
dtd2mysql DATABASE_USERNAME=root

dtd2mysql SET DATABASE_NAME=fares
dtd2mysql SET DATABASE_USERNAME=root

and the only response I get back is the default list of commands.

When I run

dtd2mysql --fares-clean

I get

Error: Please set the DATABASE_NAME environment variable.
    at Container.get databaseConfiguration [as databaseConfiguration] (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/Container.js:111:19)
    at Container.getDatabaseConnection (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/Container.js:104:82)
    at Container.descriptor.(anonymous function) (/usr/local/lib/node_modules/dtd2mysql/node_modules/memoized-class-decorator/index.js:28:42)
    at Container.getCleanFaresCommand (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/Container.js:60:69)
    at Container.descriptor.(anonymous function) (/usr/local/lib/node_modules/dtd2mysql/node_modules/memoized-class-decorator/index.js:28:42)
    at Container.getCommand (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/Container.js:29:47)
    at Container.descriptor.(anonymous function) (/usr/local/lib/node_modules/dtd2mysql/node_modules/memoized-class-decorator/index.js:28:42)
    at Object.<anonymous> (/usr/local/lib/node_modules/dtd2mysql/dist/src/index.js:6:6)
    at Module._compile (module.js:652:30)
    at Object.Module._extensions..js (module.js:663:10)

Could I get some assistance, please?
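For reference, these variables are set by the shell, not by dtd2mysql itself: they go before the command on the same line (or are exported for the session), they are not arguments. A sketch of the mechanism, using sh -c 'echo ...' as a stand-in for the real command:

```shell
# Inline form: the variables apply to that one command only.
DATABASE_NAME=fares DATABASE_USERNAME=root sh -c 'echo "$DATABASE_NAME / $DATABASE_USERNAME"'

# Exported form: the variables apply to every later command in the session.
export DATABASE_NAME=fares
export DATABASE_USERNAME=root

# The real invocation would then be, e.g.:
# dtd2mysql --fares-clean
```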

Replacement bus GTFS route_type

Not critical, but from what I can deduce, replacement bus services are given route_type 6 when converted to GTFS. The spec says 6 is for suspended cable cars, while 3 is for buses. The extended spec has a specific code for Rail Replacement Bus Service, 714, which is technically preferable here and makes it easier to filter regular from temporary bus routes within the rail data, but support for these extensions is not universal.
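A sketch of the suggested mapping (train category codes from the CIF spec; whether a consumer accepts extended route types would need to be a configuration choice, and the fallback here is an assumption):

```typescript
// Map a CIF train category to a GTFS route_type, preferring the extended
// code 714 (Rail Replacement Bus Service) for "BR", with the base bus
// code 3 as a fallback for consumers that only support the core spec.
function routeType(trainCategory: string, extendedTypes: boolean = true): number {
  switch (trainCategory) {
    case "BR":
      return extendedTypes ? 714 : 3; // rail replacement bus
    case "SS":
      return 4; // ship
    default:
      return 2; // rail (e.g. OO/XX)
  }
}
```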

npm install - looking for more instructions!

Some notes (first time with npm so forgive my ignorance!):

  • When in the cloned folder, running npm install -g dtd2mysql as instructed actually downloads and installs the package from npmjs.com. This is obviously npm's fault for overloading the meaning of 'install'; I presume it's intended to be executed from the parent folder of the project root folder (which feels weird!)

  • got an error npm WARN checkPermissions Missing write access to ~/projects/dtd2mysql/node_modules/minimist not sure why, but I fixed it with sudo npm i -g minimist

  • how is the dtd2mysql/dist/ folder generated? It looks to have installed correctly, but hangs when I run dtd2mysql -v ... running node /usr/lib/node_modules/dtd2mysql/bin/dtd2mysql.sh gives Error: Cannot find module '../dist/src/index.js', and indeed there is no dist folder in ~/projects/dtd2mysql/ am I missing a step?

    • Update: I saw the prepublishOnly under scripts in package.json and I tried running tsc -p ./ --outDir dist/ manually and found I was missing tsc. Running sudo apt install typescript as suggested by my shell was the wrong thing and I eventually got it with sudo npm install -g typescript. The dist folder still didn't get generated or give an error message after npm install -g dtd2mysql, but a manual run of tsc -p ./ --outDir dist/ outputs that directory now
  • Now running the global dtd2mysql still hangs with no error message, but I can run it manually with node /usr/lib/node_modules/dtd2mysql/bin/dtd2mysql.sh

Anyhow, thanks for your patience — I know it's onerous converting a project that just works on your own machine into something reproducible elsewhere; hopefully something that I missed from above can be added to the build instructions or package.json file.

2017-08-27 date for C10254

As mentioned in email, stumbled upon this quirk when comparing a specific train-uid/date from the current (or second-latest) ttis632.zip vs ticket.southeasternrailway.co.uk data:

% ./gtfs-webcheck.py -f - -d gtfs_ts -u C10254 -t 2017-08-27
[wqSH] API [serw] data mismatch for gtfs trip: GWCTestFailNoJourney
[wqSH] Trip: <Trip C10254 [- -] [TAU - BWT - HIG - WSM - WNM - WOR - YAT - NLS - BRI - FIT - STJ - NWP - CDF]>
[wqSH] Date/time: 2017-08-27 11:16:00
[wqSH] Diff details:
[wqSH]   [ <Jn [11:52 14:20] [P31905(11:52+4) - C10254(12:48+6) - V62937(13:40+2)]>,
[wqSH]     <Jn [12:50 15:20] [P30898(12:50+4) - P00240(13:47+7) - V62938(14:40+2)]>,
[wqSH]     <Jn [13:23 15:45] [P31787(13:23+6) - P00090(14:41+5) - K74283(15:15+2)]>,
[wqSH]     <Jn [11:36 14:20] [C10254(11:36+15) - V62937(13:40+2)]>,
[wqSH]     <Jn [12:00 14:45] [P31780(12:00+7) - P00087(13:39+5) - K74282(14:15+2)]>]

I.e. no direct trains for that date there.

CIF and produced GTFS data looks like this: https://gist.github.com/mk-fg/06dcbc6a2fb10f4f181040f7c16b8d50
And I don't really see off-hand why 2017-08-27 would be special there.

But somehow it is also special in python implementation results: https://gist.github.com/mk-fg/daadec018cf8246a0ce71ea7adca3130
So it looks like it should be special according to both the serw and py data, and the code here does the wrong thing, i.e. a bug.

Still looking into it myself, will post here once I figure out what happens there...

Out of memory

Fresh install using npm; when trying to launch dtd2mysql with any parameters it throws an out-of-memory error.

~$ dtd2mysql

<--- Last few GCs --->

 105 ms: Mark-sweep 1.4 (5.5) -> 1.4 (5.5) MB, 3.1 / 0.0 ms [allocation failure] [GC in old space requested].
 108 ms: Mark-sweep 1.4 (5.5) -> 1.4 (5.5) MB, 2.6 / 0.0 ms [allocation failure] [GC in old space requested].
 110 ms: Mark-sweep 1.4 (5.5) -> 1.3 (5.5) MB, 2.1 / 0.0 ms [allocation failure] [GC in old space requested].
 112 ms: Mark-sweep 1.3 (5.5) -> 1.3 (4.5) MB, 2.1 / 0.0 ms [last resort gc].
 114 ms: Mark-sweep 1.3 (4.5) -> 1.3 (4.5) MB, 2.1 / 0.0 ms [last resort gc].

Joins at the beginning of the route should not be processed

In the current December 2023 timetable, there is a "join" entry of base W04046 (Mo - Fr 18:22 Ramsgate - Dover - Charing Cross) and association W03086 (Mo - Fr 15:29 Charing Cross - Canterbury - Ramsgate) at Ramsgate.

These are separate services in the timetable confirmed in National Rail Enquiries and various other websites. However, dtd2mysql joins them together to form a "circular" service on the long way round Kent.

In my opinion, if the "join" location is at the beginning of the route for the base UID, it should not be processed; similarly, if the "split" location is at the end of the route, it should not be processed.
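The guard proposed above could be sketched like this (association categories per the CIF spec: JJ = join, VV = divide, NP = next; the types and function name are illustrative, not the importer's actual structures):

```typescript
interface Association {
  category: "JJ" | "VV" | "NP"; // join / divide / next
  location: string;
}

interface Schedule {
  stops: string[]; // TIPLOCs in calling order
}

// Skip a join at the base schedule's first stop and a split at its last stop,
// so two independent services are not stitched into one circular trip.
function shouldProcessAssociation(assoc: Association, base: Schedule): boolean {
  if (assoc.category === "JJ" && assoc.location === base.stops[0]) return false;
  if (assoc.category === "VV" && assoc.location === base.stops[base.stops.length - 1]) return false;
  return true;
}
```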

Add error handlers for non-interactive operation

I've kinda mentioned it before in #7 (comment) so apologies if it's a non-issue, feel free to close, won't mention it again.

Problem looks like this:

% ./node_modules/.bin/ts-node --max_old_space_size=8000 --max_new_space_size=8000 ./src/index.ts --fares _tmp/RJFAF499.ZIP
Extracting _tmp/RJFAF499.ZIP to /tmp/dtd/fares/
Finished RJFAF499.RCM
Finished RJFAF499.RLC
Finished RJFAF499.TAP
Finished RJFAF499.TOC
Finished RJFAF499.TVL
Finished RJFAF499.SUP
Finished RJFAF499.TSP
Finished RJFAF499.TTY
Finished RJFAF499.RTE
Finished RJFAF499.DIS
Finished RJFAF499.FSC
Finished RJFAF499.RST
Finished RJFAF499.LOC
Finished RJFAF499.FNS
{ Error: Duplicate entry 'C-5h' for key 'restriction_header_key'
    at PromisePool.query (dtd2mysql/node_modules/mysql2/promise.js:272:20)
    at MySQLTable.flush (dtd2mysql/src/database/MySQLTable.ts:34:29)
    at MySQLTable.close (dtd2mysql/src/database/MySQLTable.ts:44:10)
    at console.log.Promise.all.Object.values.map.t (dtd2mysql/src/cli/ImportFeedCommand.ts:101:128)
    at Array.map (native)
    at Interface.readStream.on (dtd2mysql/src/cli/ImportFeedCommand.ts:101:117)
    at emitNone (events.js:110:20)
    at Interface.emit (events.js:207:7)
    at Interface.close (readline.js:366:8)
    at ReadStream.onend (readline.js:146:10)
    at emitNone (events.js:110:20)
    at ReadStream.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1045:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)
  message: 'Duplicate entry \'C-5h\' for key \'restriction_header_key\'',
  code: 'ER_DUP_ENTRY',
  errno: 1062,
  sqlState: '#23000' }
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Pool is closed.
(node:16662) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 2): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 3): Error: Pool is closed.
Done
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 4): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 5): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 6): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 7): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 8): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 9): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 10): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 11): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 12): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 13): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 14): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 15): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 16): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 17): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 18): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 19): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 20): Error: Pool is closed.
(node:16662) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 21): Error: Pool is closed.
...

And so on for hundreds more lines, instead of an exit with an error code as soon as the actual fatal error happens.

This effectively prevents the tool from being used in a usual unix-y non-interactive fashion - e.g. a script that does --fares && --timetable && other stuff - requiring the user to "baby-sit" it in case it throws a bunch of errors like that instead of failing, which looks like a bug.
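One way to get that behaviour is to register process-level handlers before any async work, so the first fatal rejection exits non-zero instead of cascading; a minimal sketch (not how the CLI currently behaves):

```typescript
// Fail fast: log the first unhandled error and exit with a non-zero code so
// shell chains like `--fares && --timetable` stop at the failing step.
function installFailFastHandlers(): void {
  process.on("unhandledRejection", (err) => {
    console.error(err);
    process.exit(1);
  });
  process.on("uncaughtException", (err) => {
    console.error(err);
    process.exit(1);
  });
}

installFailFastHandlers();
```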

Critical problem with route_type. Bus and rail trips under one route

Hi,
I found a critical problem which has an impact on companies using ATOC rail data to create shapes, and in some other cases.

The problem is that the same route can have trips which belong to different route_types.

I looked through the .MCA file provided by http://data.atoc.org and I can see that trips of train category OO/XX (route_type=2) and BR (route_type=714) are linked to the same route, which has route_type=2. I checked the data by train_uid and retail_train_id.
The result of the issue is that in GTFS we get a mixed mode route.

Impact: companies using such GTFS files to generate shapes will have problems trying to generate rail shapes for bus trips; the stop locations of such replacement bus stops are far off the rail tracks.
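One possible fix, sketched: key the generated GTFS routes on both the retail train ID and the route_type, so bus legs land on a separate route (names and types here are illustrative, not the importer's actual structures):

```typescript
interface Trip {
  retailTrainId: string;
  routeType: number; // e.g. 2 = rail, 714 = rail replacement bus
}

// Group trips into routes such that a single route never mixes modes.
function groupTripsByRoute(trips: Trip[]): Map<string, Trip[]> {
  const routes = new Map<string, Trip[]>();
  for (const trip of trips) {
    const key = `${trip.retailTrainId}_${trip.routeType}`;
    const group = routes.get(key) ?? [];
    group.push(trip);
    routes.set(key, group);
  }
  return routes;
}
```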

Filename mismatch

The filenames for the files from ATOC seem to be ignored. They are:
Timetable RJTTC345.ZIP
Routeing RJRG0281.ZIP - works
Fares RJFAC280.ZIP

Line 64 of DownloadCommands.ts seems to be the culprit:
const lastRefresh = dir.findIndex(i => i.filename.charAt(4) === "F" || i.filename.startsWith("RJRG"));
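The mismatch can be seen by applying the quoted predicate to those filenames:

```typescript
// The predicate from DownloadCommand.ts line 64, applied to the filenames above.
const isLastRefresh = (filename: string): boolean =>
  filename.charAt(4) === "F" || filename.startsWith("RJRG");

console.log(isLastRefresh("RJTTC345.ZIP")); // false: timetable file not matched
console.log(isLastRefresh("RJRG0281.ZIP")); // true:  routeing file matched
console.log(isLastRefresh("RJFAC280.ZIP")); // false: fares file not matched
```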

Times for splitting train at split-association stop set from incorrect fields

Looking at W34496_W34952 association with pre-latest timetable data (ttis653.zip):
https://gist.github.com/mk-fg/93a5acda3e710d215665d2a7657dcef5

For date 2017-09-10:

cif schedules/stops:
  ...
  W34496  110466 O 2017-09-10 2017-09-10 ......7 A GRT 13:42 13:43 T
  W34496  110466 O 2017-09-10 2017-09-10 ......7 A SAL 13:56 13:57 T
  W34496  110466 O 2017-09-10 2017-09-10 ......7 A TIS 14:11 14:11 T
  ...
  W34952  110619 O 2017-09-10 2017-09-10 ......7 A SAL --:-- 13:55 TB
  W34952  110619 O 2017-09-10 2017-09-10 ......7 A WMN 14:14 14:15 T
  ...
cif associations:
  ...
  W34496 W34952 1023 N VV - SAL 2017-09-10 2017-09-10 ......7
  ...
gtfs trips:
  [110619] headsign=W34496_W34952 svc=284 svc_days=1:
    <TS ......7 [2017-09-10 2017-09-10] {}>
    stop sequence:
      ...
      GRT 13:42 13:43 PD
      SAL 13:56 13:56 PD
      WMN 14:14 14:15 PD
      ...

southeasternrailway.co.uk has (or had) 13:55 as the departure time for that SAL stop, which seems to make sense here - it looks like W34952 might split from W34496 before the latter makes its public stop, hence the public arrival for W34496 is after W34952's departure.

Which is why I think that, in the case of splitting trains, it would be more correct to use only the arrival/departure times of the splitting portion at that stop, completely ignoring the times of the other - potentially already separated - portion, rather than combining arrival/departure as is currently done.
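Sketched, the suggestion is roughly (field names illustrative, not the importer's actual types):

```typescript
interface PortionStop {
  arrival: string | null;
  departure: string | null;
}

// At a split-association stop, build the GTFS times purely from the splitting
// portion's own record, never borrowing the base portion's arrival.
function splitStopTimes(newPortion: PortionStop): PortionStop {
  return {
    arrival: newPortion.arrival ?? newPortion.departure,
    departure: newPortion.departure ?? newPortion.arrival,
  };
}
```

For the SAL stop above this gives 13:55/13:55 (from W34952's own TB record) instead of the 13:56/13:56 borrowed from W34496.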

dtd2mysql --timetable Error: Column count doesn't match value count at row 1

Hi there,

I've been looking for a CIF/DTD to GTFS converter and came across this code. I've managed to setup my SQL databases and extracted the fare data successfully; however, when trying to run the following:

DATABASE_USERNAME=root DATABASE_NAME=timetable dtd2mysql --timetable ttis926.ZIP

Using the latest timetable data from http://data.atoc.org/, I get the following error:

Extracting ttis926.ZIP to /tmp/dtd/timetable/
Finished ttisf926.alf
{ Error: Column count doesn't match value count at row 1
    at PromisePool.query (/usr/local/lib/node_modules/dtd2mysql/node_modules/mysql2/promise.js:344:20)
    at MySQLTable.query (/usr/local/lib/node_modules/dtd2mysql/dist/src/database/MySQLTable.js:68:32)
    at MySQLTable.queryWithRetry (/usr/local/lib/node_modules/dtd2mysql/dist/src/database/MySQLTable.js:53:24)
    at MySQLTable.flush (/usr/local/lib/node_modules/dtd2mysql/dist/src/database/MySQLTable.js:35:30)
    at MySQLTable.close (/usr/local/lib/node_modules/dtd2mysql/dist/src/database/MySQLTable.js:45:14)
    at Promise.all.Object.values.map.t (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:114:59)
    at Array.map (<anonymous>)
    at Interface.readStream.on (/usr/local/lib/node_modules/dtd2mysql/dist/src/cli/ImportFeedCommand.js:114:48)
    at emitNone (events.js:111:20)
    at Interface.emit (events.js:208:7)
  message: 'Column count doesn\'t match value count at row 1',
  code: 'ER_WRONG_VALUE_COUNT_ON_ROW',
  errno: 1136,
  sqlState: '21S01',
  sqlMessage: 'Column count doesn\'t match value count at row 1' }

The --gtfs-zip process subsequently fails.

Any help would be appreciated.

Thanks

Michael

Use of "activity" field flags for GTFS pickup/dropoff

Currently, the scheduled_arrival_time / public_arrival_time fields are used to set the GTFS flags, which seems to work, but looking at other CIF fields recently for some weird mismatches, I found that the "activity" field seems to be the one that should indicate these things.

Its values (up to 6 2-char codes) are described in Appendix A on page 35 of "CIF USER SPEC v29 FINAL.pdf":
http://www.atoc.org/download/clientfiles/files/RSPDocuments/CIF%20USER%20SPEC%20v29%20FINAL.pdf
(linked from http://nrodwiki.rockshore.net/index.php/Schedule_Records )
(btw, at the bottom of page 36 in the same Appendix A, there's a description of how to interpret the BHX "Bank Holiday" field values, all of which is missing from e.g. the RSPS5046 doc)

In the common case, the flags are set as printed in the rightmost column here:

cif schedules/stops:
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   BHM --:-- 10:12 TB
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   XOS --:-- --:--
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   XRA --:-- --:--
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   TAB 10:26 10:27 T
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   BSC --:-- --:--
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   WSL 10:34 10:35 T
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   BLX --:-- --:--
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   CAO 10:48 10:49 T
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   HNF 10:53 10:54 T
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   RGT 11:01 11:02 T
  P46355   49222 P 2017-05-27 2017-12-09 .....6.   RGL 11:07 --:-- TF
  P46355   72138 O 2017-08-19 2017-08-19 .....6.   BHM --:-- 10:12 TB
  P46355   72138 O 2017-08-19 2017-08-19 .....6.   XOS --:-- --:--
  P46355   72138 O 2017-08-19 2017-08-19 .....6.   XRA --:-- --:--
  P46355   72138 O 2017-08-19 2017-08-19 .....6.   TAB 10:26 --:-- TF
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   BHM --:-- 10:12 TB
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   XOS --:-- --:--
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   XRA --:-- --:--
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   TAB 10:26 10:26 T
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   BSC --:-- --:--
  P46355   92807 O 2017-08-26 2017-08-26 .....6.   WSL 10:35 --:-- TF

With "TB" for origin stop, "TF" for destination one, and U/D/T for pickup/dropoff/both stops, respectively, as well as "R" for "stop when required" and such.

Looking at whether these flags always line up with the current algorithm, I found that they seem to, and queries like select * from cif.stop_time where activity like '%D%' and activity not like '%-D%' and public_arrival_time is null give no results, so no stops are skipped in the GTFS output.

But for "R" flags, the GTFS drop_off_type isn't set correctly: it should be 3 (must coordinate with driver) instead of 0 (regularly scheduled).
And in general, I think it's somewhat flimsy to rely on NULL times instead of the flags dedicated to exactly this purpose.

So maybe it's worth using the "activity" field flags for setting the GTFS pickup_type/drop_off_type fields in stop_times.txt instead of the current approach?
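A sketch of such a mapping (the activity field is up to six 2-character codes; GTFS values: 0 = regularly scheduled, 1 = none, 3 = must coordinate with driver; the code precedence chosen here is an assumption, not from the spec):

```typescript
// Derive GTFS pickup_type / drop_off_type from a CIF activity string such as
// "TB", "TF", "T", "U", "D" or "R", splitting the field into 2-char codes.
function activityToGtfs(activity: string): { pickupType: number; dropOffType: number } {
  const codes: string[] = [];
  for (let i = 0; i < activity.length; i += 2) {
    codes.push(activity.substring(i, i + 2).trim());
  }
  if (codes.includes("TB")) return { pickupType: 0, dropOffType: 1 }; // train begins
  if (codes.includes("TF")) return { pickupType: 1, dropOffType: 0 }; // train finishes
  if (codes.includes("R"))  return { pickupType: 3, dropOffType: 3 }; // stops on request
  if (codes.includes("U"))  return { pickupType: 0, dropOffType: 1 }; // pick up only
  if (codes.includes("D"))  return { pickupType: 1, dropOffType: 0 }; // set down only
  return { pickupType: 0, dropOffType: 0 }; // normal "T" stop
}
```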
