Comments (3)

pallavagarwal07 commented on July 2, 2024

+1 @ben-strasser I seem to need the opposite of this. If set_header specifies more columns (say n) than the CSV file has (say m), then the last n-m columns should get their default values, as with ignore_missing_column.

from fast-cpp-csv-parser.

ben-strasser commented on July 2, 2024

Hi,

I originally considered adding a more general set_header, but decided against it.

If you do not know the number of columns in the file, then how do you know which ones you need? You might say that you need the first x columns. But why not the last x, or every other column? When writing the parser you cannot know how someone will modify the file format. Where will they add a new column? Will they remove columns? All these questions can be handled transparently, and missing columns detected, when the CSV file has a header. If it does not, this is not possible. I therefore argue that if the CSV format changes and the file does not have a header, the programmer will in any case have to check manually whether the parsing code still works. Having a set_header with an ignore_policy that only reads the first x columns is therefore a bug in the making.
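To illustrate why a header makes format changes harmless, here is a minimal sketch that looks up columns by name. It uses only the standard library; `locate_columns` is a hypothetical helper written for this example and is not part of fast-cpp-csv-parser.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the library's API): returns, for each wanted
// column name, its position in the comma-separated header line, or -1 if
// the header does not contain that name.
std::vector<int> locate_columns(const std::string& header_line,
                                const std::vector<std::string>& wanted) {
    // Split the header line into column names.
    std::vector<std::string> names;
    std::istringstream in(header_line);
    std::string name;
    while (std::getline(in, name, ',')) names.push_back(name);

    // Look up every wanted name by value, not by position.
    std::vector<int> positions;
    for (const std::string& w : wanted) {
        int pos = -1;
        for (std::size_t i = 0; i < names.size(); ++i) {
            if (names[i] == w) { pos = static_cast<int>(i); break; }
        }
        positions.push_back(pos);
    }
    return positions;
}
```

If someone later inserts a column at the front or reorders the file, the lookup still finds `price` and `id`, and a removed column shows up as -1 instead of silently shifting the data. Without a header there is nothing to look up, which is the point made above.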

If you know the CSV format and the number of columns but only want to read some columns, then you can use dummy char* variables. These pointers point directly into the memory buffer, so there is nearly no overhead. You can argue that the interface is ugly for this use case, and you are right. However, I think this use case is sufficiently rare that we can live with the current interface, especially as I do not see how to design one that is both flexible and elegant. A complicated interface is no prettier than the current situation.
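The dummy-variable idea can be sketched without the library. The code below is a standard-library stand-in, not the parser's actual implementation: `read_fields`, `parse_field`, `Row` and `read_id_and_price` are hypothetical names invented for this example. It shows why a `char*` target is cheap: the field is never copied, the pointer is simply aimed into the line buffer.

```cpp
#include <cstdlib>
#include <cstring>

// Field converters for the stand-in reader below.
inline void parse_field(char* text, int& out)    { out = std::atoi(text); }
inline void parse_field(char* text, double& out) { out = std::atof(text); }
// A char* target is neither converted nor copied: it points into the buffer.
inline void parse_field(char* text, char*& out)  { out = text; }

// Base case: every target is filled; succeed only if no fields are left over.
inline bool read_fields(char* rest) { return rest == nullptr; }

// Splits the line in place (each ',' becomes '\0') and fills one target per
// field. Returns false if the field count does not match the target count.
template <class T, class... Rest>
bool read_fields(char* line, T& first, Rest&... rest) {
    if (line == nullptr) return false;        // fewer fields than targets
    char* comma = std::strchr(line, ',');
    if (comma != nullptr) *comma = '\0';      // terminate this field in place
    parse_field(line, first);
    return read_fields(comma ? comma + 1 : nullptr, rest...);
}

struct Row { int id; double price; };

// Reads columns 1 and 3 of a headerless 4-column row; columns 2 and 4 are
// routed into dummy pointers that are never looked at again.
bool read_id_and_price(char* line, Row& row) {
    char* dummy_name;
    char* dummy_note;
    return read_fields(line, row.id, dummy_name, row.price, dummy_note);
}
```

With the real library the pattern is the same in spirit: pass a `char*` for each column you do not care about. The sketch also rejects rows whose column count differs from the number of targets, since it has no header to tell it what changed.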

Furthermore, having an ugly interface for CSV files without a header has its use: it pushes people towards adding headers, which will help them down the line when the CSV file format is updated.

Best Regards
Ben Strasser

adishavit commented on July 2, 2024

I get what you're saying. For argument's sake, consider a C++ function with default arguments. You can specify only the first k<n arguments and the rest get their defaults.
There is no syntactic option for passing just the last k, or for interleaving.
I could argue the same here: wanting just the first k columns is a valid use case. Otherwise, use dummy variables.
I ended up using dummies too.
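For readers unfamiliar with the analogy, a minimal sketch of C++ default arguments follows. `Record` and `make_record` are made-up names for illustration only.

```cpp
struct Record { int id; int quantity; int priority; };

// Hypothetical function: the two trailing parameters have defaults.
Record make_record(int id, int quantity = 1, int priority = 0) {
    return Record{id, quantity, priority};
}

// Valid calls supply a prefix of the parameter list:
//   make_record(5)        -> quantity and priority take their defaults
//   make_record(5, 3)     -> only priority takes its default
//   make_record(5, 3, 2)  -> nothing is defaulted
// There is no syntax such as make_record(5, , 2) to skip a middle argument.
```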
