
Wrongly parsed hash (xsv issue, closed)

martijn avatar martijn commented on August 20, 2024
Wrongly parsed hash

from xsv.

Comments (5)

voxik avatar voxik commented on August 20, 2024

Of course this might fall into "and does not deal with most formatting or more advanced functionality." category ...


martijn avatar martijn commented on August 20, 2024

The problem in this file is that multiple cells in the header are empty (B1, G1, K1, N1, P1). Hash keys in Ruby are unique, so there can be only one 'nil' key, and that ends up with the value from P2. This is the unpredictable result the README mentions:

Be aware that hash mode will lead to unpredictable results if the worksheet has multiple columns with the same header.
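To illustrate the key collision in plain Ruby, here is a minimal sketch. The header and row values are made up for the example and no xsv code is involved; it only shows what happens when several header cells are nil and the pairs are turned into a hash.

```ruby
# Hypothetical header row with two empty cells, plus one data row.
headers = ["id", nil, "name", nil]
row     = [1, "b2", "Alice", "d2"]

# Pair each header with its value. Pairs with an equal key overwrite
# each other, so only the last nil column survives.
record = headers.zip(row).to_h

p record
```

The value "b2" is silently dropped because both empty headers map to the same nil key.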

As it stands, your best option is to use array mode (parse_headers: false) to read this file.

If anybody can come up with a pull request to ignore the columns with empty header cells in hash mode, I'd be open to merging that. I don't have time to work on that myself right now, unfortunately.


voxik avatar voxik commented on August 20, 2024

Right, thx for confirming my understanding.

Not sure if I'll have the motivation to dig into this myself ATM because, as you suggested, I have already gone with array mode and parsing the headers by hand for the moment (and I could probably also live with just indices).

Nevertheless, if I had time, what would be your preferred solution?
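For anyone taking the same route, a rough sketch of handling headers by hand might look like the following. It assumes the rows have already been read in array mode, so each row is a plain Ruby array. The name `rows_to_hashes` and the sample data are made up for illustration and are not part of xsv.

```ruby
# rows stands in for array-mode output: one array per worksheet row,
# with the header row first.
def rows_to_hashes(rows)
  header_row, *data_rows = rows

  # Indices of the columns that actually have a header.
  keep = header_row.each_index.reject { |i| header_row[i].nil? }

  # Build one hash per data row, using only the kept columns.
  data_rows.map do |row|
    keep.to_h { |i| [header_row[i], row[i]] }
  end
end

SAMPLE_ROWS = [
  ["id", nil, "name"],
  [1, "x", "Alice"],
  [2, "y", "Bob"]
].freeze

RECORDS = rows_to_hashes(SAMPLE_ROWS)
```

Columns without a header are dropped here, so their data is lost; that is a choice of this sketch, not something the library does.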

  1. Document this as expected behavior, because any other solution will likely impact performance, which seems to be a key goal of this project.
  2. Just detect that there are nils in the headers and fail.
  3. Ignore the columns with nil headers.
  4. Something else?


shkm avatar shkm commented on August 20, 2024

I think it's reasonable to throw an error in cases where there are duplicate headers in hash mode. I could imagine wanting to be alerted of this rather than proceeding with potentially unintended behavior.

My two cents.


martijn avatar martijn commented on August 20, 2024

Right, I guess it's fair to raise an exception in this situation since the resulting data is useless as illustrated by this issue. I'll leave this issue open until we implement at least that.
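As a rough idea of what such a check could look like, here is a standalone sketch. The `DuplicateHeaderError` class and the `check_headers!` method are hypothetical names for this example, not existing xsv API, and where the check would be called from is left open.

```ruby
# Hypothetical error class, not part of xsv.
class DuplicateHeaderError < StandardError; end

# Returns the headers unchanged, or raises if any header (including nil,
# i.e. an empty header cell) occurs more than once.
def check_headers!(headers)
  dupes = headers.tally.select { |_, count| count > 1 }.keys
  return headers if dupes.empty?

  raise DuplicateHeaderError,
        "duplicate headers: #{dupes.map(&:inspect).join(', ')}"
end
```

Since this runs once on the header row rather than per data row, the cost should be small, but that has not been measured here.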

Alternatively, I'd opt for option 3 out of @voxik's suggestions above: by implementing a col_skip array in SheetRowHandler we could skip the columns with empty headers. Any data in those columns would be lost. Or we could replace the empty headers with a generated header name ("Column B", "Column G") so the data can be salvaged. That is probably easier to implement as well (in Sheet#parse_headers).
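The generated-name idea could be sketched roughly like this. Both `column_letter` and `fill_empty_headers` are hypothetical helpers written for this example; they work on a plain array of header values and are not wired into Sheet#parse_headers.

```ruby
# Convert a zero-based column index to a spreadsheet-style letter
# (0 -> "A", 25 -> "Z", 26 -> "AA").
def column_letter(index)
  letters = +""
  n = index + 1
  while n > 0
    n, rem = (n - 1).divmod(26)
    letters.prepend(("A".ord + rem).chr)
  end
  letters
end

# Replace nil headers with "Column <letter>"; other headers are kept as-is.
def fill_empty_headers(headers)
  headers.each_with_index.map do |header, i|
    header.nil? ? "Column #{column_letter(i)}" : header
  end
end
```

For example, `fill_empty_headers(["id", nil, "name"])` gives `["id", "Column B", "name"]`. A generated name could still clash with a real header that happens to be called "Column B", so a duplicate check would remain useful.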

