Giter Site home page Giter Site logo

szyhf / go-excel Goto Github PK

View Code? Open in Web Editor NEW
189.0 4.0 35.0 296 KB

A simple and light excel file reader to read a standard excel as a table faster | 一个轻量级的Excel数据读取库,用一种更`关系数据库`的方式解析Excel。

License: BSD 3-Clause "New" or "Revised" License

Go 100.00%
xlsx reader go golang

go-excel's Introduction

Intro | 简介

Go Report Card cover.run go GoDoc

Expect to create a reader library to read relate-db-like excel easily. Just like read a config.

This library can read all xlsx file correctly from our project now.

go get github.com/szyhf/go-excel

Example | 用例

Here is a simple example.

Assume you have a xlsx file like below:

ID NameOf Age Slice UnmarshalString
1 Andy 1 1|2 {"Foo":"Andy"}
2 Leo 2 2|3|4 {"Foo":"Leo"}
3 Ben 3 3|4|5|6 {"Foo":"Ben"}
4 Ming 4 1 {"Foo":"Ming"}
  • the first row is the title row.
  • other row is the data row.

All examples list in https://godoc.org/github.com/szyhf/go-excel#pkg-examples.

// defined a struct
type Standard struct {
	// use field name as default column name
	ID      int
	// column means to map the column name
	Name    string `xlsx:"column(NameOf)"`
	// you can map a column into more than one field
	NamePtr *string `xlsx:"column(NameOf)"`
	// omit `column` if only want to map to column name, it's equal to `column(AgeOf)`
	Age     int     `xlsx:"AgeOf"`
	// split means to split the string into slice by the `|`
	Slice   []int `xlsx:"split(|)"`
	// *Temp implement the `encoding.BinaryUnmarshaler`
	Temp    *Temp `xlsx:"column(UnmarshalString)"`
	// support default encoding of json
	TempEncoding *TempEncoding `xlsx:"column(UnmarshalString);encoding(json)"`
	// use '-' to ignore.
	Ignored string `xlsx:"-"`
}

// func (this Standard) GetXLSXSheetName() string {
// 	return "Some other sheet name if need"
// }

type Temp struct {
	Foo string
}

// self define a unmarshal interface to unmarshal string.
func (this *Temp) UnmarshalBinary(d []byte) error {
	return json.Unmarshal(d, this)
}

func simpleUsage() {
	// will assume the sheet name as "Standard" from the struct name.
	var stdList []Standard
	err := excel.UnmarshalXLSX("./testdata/simple.xlsx", &stdList)
	if err != nil {
		panic(err)
	}
}

func defaultUsage(){
	conn := excel.NewConnecter()
	err := conn.Open("./testdata/simple.xlsx")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Generate an new reader of a sheet
	// sheetNamer: if sheetNamer is string, will use sheet as sheet name.
	//             if sheetNamer is int, will i'th sheet in the workbook, be careful the hidden sheet is counted. i ∈ [1,+inf]
	//             if sheetNamer is a object implements `GetXLSXSheetName()string`, the return value will be used.
	//             otherwise, will use sheetNamer as struct and reflect for it's name.
	// 			   if sheetNamer is a slice, the type of element will be used to infer like before.
	rd, err := conn.NewReader(stdSheetName)
	if err != nil {
		panic(err)
	}
	defer rd.Close()

	for rd.Next() {
		var s Standard
		// Read a row into a struct.
		err:=rd.Read(&s)
		if err!=nil{
			panic(err)
		}
		fmt.Printf("%+v",s)
	}

	// Read all is also supported.
	// var stdList []Standard
	// err = rd.ReadAll(&stdList)
	// if err != nil {
	//   panic(err)
	//	 return
	// }
	// fmt.Printf("%+v",stdList)

	// map with string key is support, too.
	// if value is not string
	// will try to unmarshal to target type
	// but will skip if unmarshal failed.

	// var stdSliceList [][]string
	// err = rd.ReadAll(&stdSliceList)
	// if err != nil {
	//   panic(err)
	//	 return
	// }
	// fmt.Printf("%+v",stdSliceList)

	// var stdMapList []map[string]string
	// err = rd.ReadAll(&stdMapList)
	// if err != nil {
	//   panic(err)
	//	 return
	// }
	// fmt.Printf("%+v",stdMapList)

	// Using binary instead of file.
	// xlsxData, err := ioutil.ReadFile(filePath)
	// if err != nil {
	// 	log.Println(err)
	// 	return
	// }

	// conn := excel.NewConnecter()
	// err = conn.OpenBinary(xlsxData)
}

See the simple.xlsx.Standard in testdata and code in ./standard_test.go and ./standard_example_test.go for details.

While read into a []string, row of title can have duplicated titles, otherwise ErrDuplicatedTitles will be return.

Advance | 进阶用法

The advance usage can make more options.

Config | 配置

Using a config as "excel.Config":

type Config struct {
	// sheet: if sheet is string, will use sheet as sheet name.
	//        if sheet is a object implements `GetXLSXSheetName()string`, the return value will be used.
	//        otherwise, will use sheet as struct and reflect for it's name.
	// 		  if sheet is a slice, the type of element will be used to infer like before.
	Sheet interface{}
	// Use the index row as title, every row before title-row will be ignore, default is 0.
	TitleRowIndex int
	// Skip n row after title, default is 0 (not skip), empty row is not counted.
	Skip int
	// Auto prefix to sheet name.
	Prefix string
	// Auto suffix to sheet name.
	Suffix string
}

Tips:

  • Empty row will be skipped.
  • Column larger than len(TitleRow) will be skipped.
  • Only empty cell can fill with default value, if a cell can not parse into a field it will return an error.
  • Default value can be unmarshal by encoding.BinaryUnmarshaler, too.
  • If no title row privoded, the default column name in exce like 'A', 'B', 'C', 'D' ......, 'XFC', 'XFD' can be used as column name by 26-number-system.

For more details can see the code in ./test/advance_test.go and file in simple.xlsx.Advance.suffx sheet.

XLSX Tag | 标签使用

column

Map to field name in title row, by default will use the field name.

default

Set default value when no value is filled in excel cell, by default is 0 or "".

split

Split a string and convert them to a slice, it won't work if not set.

nil

Will not skip scan value in the cell equals to this 'nil value'

req

Will return error if clomun title not exist in excel.

encoding

Will decode the string value into the field can be unmarshal by the encoding type, currently, only json is supported, will use encoding/json package.

XLSX Field Config | 字段的解析配置

Sometimes it’s a bit cumbersome to deal with escape characters (exp. #6), so implement the interface of GetXLSXFieldConfigs() map[string]FieldConfig will take a high priority than tag to provide the field config, more info to see the test file.

RoadMap | 开发计划

  • Read xlsx file and got the expect xml. √
  • Prepare the shared string xml. √
  • Get the correct sheetX.xml. √
  • Read a row of a sheet. √
  • Read a cell of a row, fix the empty cell. √
  • Fill string cell with value of shared string xml. √
  • Can set the column name row, default is the first row. √
  • Read a row into a slice. √
  • Read a row to a struct by column name. √
  • Read a row into a map with string key. √
  • Read a row into a map by primary key.

Thinking | 随想

在复杂的系统中(例如游戏)

有时候为了便于非专业人员设置一些配置

会使用Excel作为一种轻量级的关系数据库或者配置文件

毕竟对于很多非开发人员来说

配个Excel要比写json或者yaml什么简单得多

这种场景下

读取特定格式(符合关系数据库特点的表格)的数据会比各种花式写入Excel的功能更重要

毕竟从编辑上来说微软提供的Excel本身功能就非常强大了

而现在我找到的Excel库的功能都过于强大了

用起来有点浪费

于是写了这个简化库

这个库的工作参考了tealeg/xlsx的部分实现和读取逻辑。

感谢tealeg

go-excel's People

Contributors

szyhf avatar yoavgur avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

go-excel's Issues

Is there a way to read sub column ?

I got a excel file to read, and it has a columns look like this:

Imgur

I want to read sub column a's and b's value ( both are 0.11)

Is there a way to do things like this ?

Can we have a New Method in interface<Reader>?

requesting a new Method in interface that can allow us to get Raw data

// Reader to read excel
type Reader interface {
	// RawRow return current row data
        RawRow() ([]string, error)
}

sheetName Unable to read properly

config Integer default 0 like :

&excel.Config{
		// Sheet name as string or sheet model as object or a slice of object.
		Sheet: 0,
		// Use the index row as title, every row before title-row will be ignore, default is 0.
		TitleRowIndex: 1,
		// Skip n row after title, default is 0 (not skip), empty row is not counted.
		Skip: 0,
		// Auto prefix to sheet name.
		Prefix: "",
		// Auto suffix to sheet name.
		Suffix: "",
	}

error : can not find worksheet named =

try set Sheet to 1

error : runtime error: index out of range [-1]

I have many files and sheetName is different. They all have only one sheet, How to Set the Default Read First?

相同标题会error

问题描述:

  1. xxx.xlsx 文件格式

image

2. 代码

image

3. 报错信息 `title row has duplicated key and can not read into a map or struct`

诉求:

  1. 提供 struct 时候 只需要严格按照 struct 字段定义去解析就行
  2. 缺少字段可以设置默认值或者 error 也可以
  3. 相同标题时,让用户自己决定是不是返回error
  4. 相同标题时,可以制定覆盖策略,比如前面覆盖后面,后面覆盖前面等

谢谢谢!!!

读取正常的xlsx文件报错

parse xlsx file failed: xl/sharedStringPaths.xml not exist,但是用github.com/tealeg/xlsx 这个框架可以正确读取

新版本读取xlsx报错

panic: title row has duplicated key and can not read into a map or struct

使用v1.4.3版本可以正常读取,使用最新的v1.5.1版本读取就出现上面的error,麻烦帮忙看一下

can we use column name as title?

can we use column names, such as A, B, AB, for some excel doesn't contain any title or the title may be updated from day to day by uploader.
such as can we define structs like:

type Demo struct {
	ID  string `xlsx:"A"`
	Sku string `xlsx:"B"`
}

found this:
If no title row privoded, the default column name in exce like 'A', 'B', 'C', 'D' ......, 'XFC', 'XFD' can be used as column name by 26-number-system.

what value should pass to TitleRowIndex

【建议】关于指定sheetName建议带上指针类型

不支持带指针的构造函数

func (s Scopus) GetXLSXSheetName() string {
	return "scopus_1500"
}

这样是可以的,但是带有指针是失败的,比如:下面是错误的

func (s *Scopus) GetXLSXSheetName() string {
	return "scopus_1500"
}

区别在于*Scopus

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.