I have a line with, say 15 columns, but I'm interested only in the first 5. Howeve

How to parse less columns than in the line when using `set_header()` about fast-cpp-csv-parser HOT 3 CLOSED

ben-strasser commented on July 2, 2024

from fast-cpp-csv-parser.

Comments (3)

pallavagarwal07 commented on July 2, 2024

+1 @ben-strasser I seem to need the opposite of this. If set_header specifies more columns (say n) than the CSV file has (say m), then the last n-m columns should get default values, as with ignore_missing_columns.

ben-strasser commented on July 2, 2024

Hi,

I originally considered adding a more general set_header but decided against it.

If you do not know the number of columns in the file, then how do you know which ones you need? You might say that you need the first x columns. But why not the last x, or every other column? When writing the parser you cannot know how someone will modify the file format. Where will they add a new column? Will they remove columns? All these questions can be handled transparently, and missing columns detected, when the CSV file has a header. If it does not, this is not possible. I therefore argue that if the CSV format changes and the file has no header, the programmer will in any case have to check manually whether the parsing code still works. Having a set_header with an ignore_policy that only reads the first x columns is therefore a bug in the making.
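To make the header argument concrete, here is a minimal sketch of name-based column lookup. It is not the library's implementation; `split_line` and `locate_columns` are hypothetical helpers, and the splitter assumes plain comma-separated fields with no quoting.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Split one line on commas. Assumption: no quoting or escaping.
std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = line.find(',', start);
        if (comma == std::string::npos) {
            fields.push_back(line.substr(start));
            return fields;
        }
        fields.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
}

// Find the position of each wanted column by name.
// Returns false as soon as a wanted column is missing from the header.
bool locate_columns(const std::string& header_line,
                    const std::vector<std::string>& wanted,
                    std::vector<std::size_t>& positions) {
    positions.clear();
    const std::vector<std::string> header = split_line(header_line);
    for (const std::string& name : wanted) {
        const auto it = std::find(header.begin(), header.end(), name);
        if (it == header.end()) return false;  // missing column detected
        positions.push_back(static_cast<std::size_t>(it - header.begin()));
    }
    return true;
}
```

With a header, `locate_columns("price,extra,id,name", {"price", "id"}, pos)` still finds both columns after someone inserts or reorders columns, and a removed column is reported instead of silently misread. Without a header there is nothing to look the names up in.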

If you know the CSV format and the number of columns but only want to read some of the columns, then you can use dummy char* variables. These pointers point directly into the memory buffer, so there is nearly no overhead. You can argue that the interface is ugly for this use case, and you are right. However, I think this use case is sufficiently rare that we can live with the current interface, especially since I do not see how to design an interface that is both flexible and elegant. A complicated interface is no prettier than the current situation.

Furthermore, an ugly interface for CSV files without a header has its use: it pushes people towards adding headers, which will help them down the line when the CSV file format is updated.

Best Regards
Ben Strasser

adishavit commented on July 2, 2024

I get what you're saying. For argument's sake, consider a C++ function with default argument values. You can specify only the first k<n arguments and the rest get the defaults.
There is no syntactic option for using just the last k or interleaving.
I could argue the same here. If you want just the first k columns, it is a valid use case. Otherwise, use dummy variables.
I ended up using dummies too.
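The default-argument analogy can be sketched as follows. `Record` and `make_record` are hypothetical names used only for illustration.

```cpp
#include <string>

struct Record {
    int id;
    std::string name;
    double price;
};

// Trailing parameters carry defaults, so a caller may pass only the
// first k arguments and the remaining ones take their default values.
Record make_record(int id, const std::string& name = "unknown",
                   double price = 0.0) {
    return Record{id, name, price};
}

// make_record(7)            -> name and price are defaulted
// make_record(7, "widget")  -> only price is defaulted
// make_record(7, , 3.5)     -> does not compile: arguments cannot be
//                              skipped in the middle
```

The language only supports the "first k" case, which is the asymmetry being pointed at: taking a prefix has direct syntax, while any other selection needs every position spelled out.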

