Giter Site home page Giter Site logo

enpnetwork / simple_xlsx_reader Goto Github PK

View Code? Open in Web Editor NEW

This project forked from woahdae/simple_xlsx_reader

0.0 0.0 0.0 244 KB

Ruby xlsx reader to parse cell values into plain ruby primitives and dates/times.

License: MIT License

Ruby 100.00%

simple_xlsx_reader's Introduction

SimpleXlsxReader Build Status

An xlsx reader for Ruby that parses xlsx cell values into plain ruby primitives and dates/times.

This is not a rewrite of excel in Ruby. Font styles, for example, are parsed to determine whether a cell is a number or a date, then forgotten. We just want to get the data, and get out!

Usage

Summary:

doc = SimpleXlsxReader.open('/path/to/workbook.xlsx')
doc.sheets # => [<#SXR::Sheet>, ...]
doc.sheets.first.name # 'Sheet1'
doc.sheets.first.rows # [['Header 1', 'Header 2', ...]
                         ['foo', 2, ...]]

That's it!

Load Errors

By default, cell load errors (ex. if a date cell contains the string 'hello') result in a SimpleXlsxReader::CellLoadError.

If you would like to provide better error feedback to your users, you can set SimpleXlsxReader.configuration.catch_cell_load_errors = true, and load errors will instead be inserted into Sheet#load_errors keyed by [rownum, colnum].

More

Here's the totality of the public api, in code:

module SimpleXlsxReader
  def self.open(file_path)
    Document.new(file_path: file_path).tap(&:sheets)
  end

  def self.parse(string_or_io)
    Document.new(string_or_io: string_or_io).tap(&:sheets)
  end

  class Document
    attr_reader :string_or_io

    def initialize(legacy_file_path = nil, file_path: nil, string_or_io: nil)
      ((file_path || legacy_file_path).nil? ^ string_or_io.nil?) ||
        fail(ArgumentError, 'either file_path or string_or_io must be provided')

      @string_or_io = string_or_io || File.new(file_path || legacy_file_path)
    end

    def sheets
      @sheets ||= Mapper.new(xml).load_sheets
    end

    def to_hash
      sheets.inject({}) {|acc, sheet| acc[sheet.name] = sheet.rows; acc}
    end

    def xml
      Xml.load(string_or_io)
    end

    class Sheet < Struct.new(:name, :rows)
      def headers
        rows[0]
      end

      def data
        rows[1..-1]
      end

      # Load errors will be a hash of the form:
      # {
      #   [rownum, colnum] => '[error]'
      # }
      def load_errors
        @load_errors ||= {}
      end
    end
  end
end

Installation

Add this line to your application's Gemfile:

gem 'simple_xlsx_reader'

And then execute:

$ bundle

Or install it yourself as:

$ gem install simple_xlsx_reader

Versioning

This project follows semantic versioning 1.0

Contributing

Remember to write tests, think about edge cases, and run the existing suite.

Note that as of commit 665cbafdde, the most extreme end of the linear-time performance test, which is 10,000 rows (12 columns), runs in ~4 seconds on Ruby 2.1 on a 2012 MBP. If the linear time assertion fails or you're way off that, there is probably a performance regression in your code.

Then, the standard stuff:

  1. Fork this project
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

simple_xlsx_reader's People

Contributors

woahdae avatar bwlang avatar quoideneuf avatar til avatar bforma avatar marilynfranklin avatar kalsan avatar strnadj avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.