Comments (6)
Hello @Htorne , thanks for filing this Bug report.
I do not know about the runneradmin path, but you can look at the referenced file here: https://github.com/pacman82/odbc-api/blob/d6d83449d2164d7e8a47dcb437071584782da663/odbc-api/src/handles/diagnostics.rs#L55
This bug must be fixed upstream in odbc-api, but that is also a crate I maintain.
I have a hypothesis on what is happening:
The rec_number variable is an i16 used to index a diagnostic record. My guess is that it overflows because an API call causes more than 32767 warnings. I further assume that this many warnings are likely generated by something which causes one warning per row. That would also fit neatly with your description of the tool working fine with the smaller tenant.
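To illustrate the hypothesis, here is a minimal Rust sketch of how a 1-based i16 record index runs out of range. It is not the actual odbc-api code; the function names `advance` and `reachable_records` are made up for this example.

```rust
/// Try to move a 1-based `i16` diagnostic record index to the next record.
/// Returns `None` when the index is already at `i16::MAX` (32767) and
/// cannot be incremented without overflowing.
/// (Hypothetical helper, not part of odbc-api.)
fn advance(rec_number: i16) -> Option<i16> {
    rec_number.checked_add(1)
}

/// Number of diagnostic records that can be addressed with an `i16`
/// index starting at 1, given `total_records` records exist.
/// (Hypothetical helper, not part of odbc-api.)
fn reachable_records(total_records: u32) -> u32 {
    total_records.min(i16::MAX as u32)
}
```

A plain `rec_number + 1` at 32767 panics in a debug build and wraps around to -32768 in a release build, so a loop that increments the index unconditionally would misbehave once a single call produces that many records.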
The fix I have in mind is a one-liner, but I still need to think about how best to reproduce the issue.
In the meantime, you could help me by verifying my assumptions. Please run the command with the -vvvv flag to get more verbose output. Do you see a lot of messages?
You also likely have a job at hand that you want to get done: reducing the --batch-size to something smaller than 32767 might solve your problem, provided only one warning per row is generated. Don't worry, it will still process the entire 60GB+ data set, just in smaller chunks. If this works, it would further support the diagnosis. Please let me know.
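The arithmetic behind this workaround can be sketched as follows. The row count used in the usage note is invented for illustration, and both function names are hypothetical; the sketch only assumes at most one warning per row, as stated above.

```rust
/// Largest value an `i16` record index can hold.
const MAX_REC_NUMBER: u64 = i16::MAX as u64; // 32767

/// Number of batches needed to process `total_rows` rows
/// (ceiling division). Panics if `batch_size` is zero.
fn batches_needed(total_rows: u64, batch_size: u64) -> u64 {
    assert!(batch_size > 0, "batch size must be positive");
    (total_rows + batch_size - 1) / batch_size
}

/// True if a batch that emits at most one warning per row stays
/// below the `i16` limit, i.e. the batch size is smaller than 32767.
fn worst_case_warnings_fit(batch_size: u64) -> bool {
    batch_size < MAX_REC_NUMBER
}
```

For a hypothetical 1,000,000 rows, a batch size of 100000 needs 10 batches but can exceed the limit within one batch, while a batch size of 30000 needs 34 batches and stays below it. All rows are processed either way.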
from odbc2parquet.
PS: If it turns out to be an issue with too many warnings, I would also like to know what the warning is about. Maybe there is another issue hiding underneath.
Thanks!
I released version 0.4.2, which includes the upstream fix. Please tell me if this solves your issue.
Thanks pacman82, it worked!
Mind telling me about the warnings you got?
2020-12-03T13:05:09+00:00 - WARN - State: 01000, Native error: 5701, Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Changed database context to 'master'.
2020-12-03T13:05:09+00:00 - WARN - State: 01000, Native error: 5703, Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Changed language setting to us_english.
2020-12-03T13:05:09+00:00 - INFO - Batch size set to 100000
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 1: ColumnDescription { name: [73, 100], data_type: Numeric { precision: 15, scale: 1 }, nullable: Nullable }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 1: BufferDescription { nullable: true, kind: Text { max_str_len: 17 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 2: ColumnDescription { name: [70, 105, 114, 115, 116, 78, 97, 109, 101], data_type: Other { data_type: SqlDataType(-9), column_size: 40, decimal_digits: 0 }, nullable: NoNulls }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 2: BufferDescription { nullable: true, kind: Text { max_str_len: 40 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 3: ColumnDescription { name: [76, 97, 115, 116, 78, 97, 109, 101], data_type: Other { data_type: SqlDataType(-9), column_size: 40, decimal_digits: 0 }, nullable: NoNulls }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 3: BufferDescription { nullable: true, kind: Text { max_str_len: 40 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 4: ColumnDescription { name: [67, 105, 116, 121], data_type: Other { data_type: SqlDataType(-9), column_size: 40, decimal_digits: 0 }, nullable: Nullable }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 4: BufferDescription { nullable: true, kind: Text { max_str_len: 40 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 5: ColumnDescription { name: [67, 111, 117, 110, 116, 114, 121], data_type: Other { data_type: SqlDataType(-9), column_size: 40, decimal_digits: 0 }, nullable: Nullable }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 5: BufferDescription { nullable: true, kind: Text { max_str_len: 40 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 6: ColumnDescription { name: [80, 104, 111, 110, 101], data_type: Other { data_type: SqlDataType(-9), column_size: 20, decimal_digits: 0 }, nullable: Nullable }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 6: BufferDescription { nullable: true, kind: Text { max_str_len: 20 } }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC column description for column 7: ColumnDescription { name: [71, 101, 111, 109, 101, 116, 114, 121], data_type: Other { data_type: SqlDataType(-151), column_size: 0, decimal_digits: 0 }, nullable: Nullable }
2020-12-03T13:05:09+00:00 - DEBUG - ODBC buffer description for column 7: BufferDescription { nullable: true, kind: Text { max_str_len: 0 } }
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
Then I get around 300,000 warnings like this:
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
2020-12-03T13:05:45+00:00 - WARN - State: 01004, Native error: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]String data, right truncation
My guess is that it has to do with the conversion of the MSSQL data type Geography to a UTF-8 byte array?
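The `name` fields in the debug output above are printed as lists of byte values. As a small aid for reading the log, here is a Rust sketch that decodes them, assuming the bytes are UTF-8 (the values shown are all plain ASCII). The helper name `decode_column_name` is made up for this example and is not part of odbc2parquet or odbc-api.

```rust
/// Decode a column name printed as raw bytes in the debug log.
/// Assumes UTF-8; invalid sequences are replaced rather than rejected.
/// (Hypothetical helper for reading the log output.)
fn decode_column_name(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Applied to the log, `[73, 100]` decodes to `Id` and `[71, 101, 111, 109, 101, 116, 114, 121]` (column 7) decodes to `Geometry`. Column 7 is also the one reported with `SqlDataType(-151)` and a text buffer of `max_str_len: 0`, which may be where the "String data, right truncation" warnings come from.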