fumiama / go-docx

One of the most functional libraries to partially read and write .docx files (a.k.a. Microsoft Word documents or ECMA-376 Office Open XML) in Go.

License: GNU Affero General Public License v3.0

Go 100.00%
docx openxml openxml-word word xml docx-converter docx-files docx-generator openxml-format openxml-sdk

go-docx's Introduction

Docx library

One of the most functional libraries to read and write .docx (a.k.a. Microsoft Word documents or ECMA-376 Office Open XML) files in Go.

This is a variant optimized and expanded by fumiama. The original repo is gonfva/docxlib.

Introduction

As part of my work for Basement Crowd and FromCounsel, we were in need of a basic library to manipulate (both read and write) Microsoft Word documents.

The difference with other projects is the following:

  • UniOffice is probably the most complete, but it is also commercial (you need to pay), and it is too much for my needs.
  • gingfrederik/docx only supports writing.

There are also a couple of other projects: kingzbauer/docx and nguyenthenguyen/docx.

gingfrederik/docx was a heavy influence (the original structures and the main method come from that project).

However, those original structures didn't handle reading, and extending them was particularly difficult because Go's XML parser is somewhat limited, including a six-year-old bug.

Additionally, my requirements go beyond the original structure and a hard fork seemed more sensible.

The plan is to evolve the library, so the API is likely to change according to my company's needs. But please do feel free to send patches, reports and PRs (or fork).

In the meantime, it is shared as an example in case somebody finds it useful.

The Introduction above is copied from the original repo. I have since evolved that repo further to fit my needs. Here are the functions supported now.

  • Parse and save document
  • Edit text (color, size, alignment, link, ...)
  • Edit picture
  • Edit table
  • Edit shape
  • Edit canvas
  • Edit group

Quick Start

go run cmd/main/main.go -u

You will then see two files generated in the current working directory, with the same contents as shown below.

[Images: p1, p2]

Use Package in your Project

go get -d github.com/fumiama/go-docx@latest

Generate Document

package main

import (
	"os"

	"github.com/fumiama/go-docx"
)

func main() {
	w := docx.New().WithDefaultTheme()
	// add new paragraph
	para1 := w.AddParagraph()
	// add text
	para1.AddText("test").AddTab()
	para1.AddText("size").Size("44").AddTab()
	// create the output file
	f, err := os.Create("generated.docx")
	if err != nil {
		panic(err)
	}
	// save to file
	_, err = w.WriteTo(f)
	if err != nil {
		panic(err)
	}
	err = f.Close()
	if err != nil {
		panic(err)
	}
}

Parse Document

package main

import (
	"fmt"
	"os"

	"github.com/fumiama/go-docx"
)

func main() {
	readFile, err := os.Open("file2parse.docx")
	if err != nil {
		panic(err)
	}
	fileinfo, err := readFile.Stat()
	if err != nil {
		panic(err)
	}
	size := fileinfo.Size()
	doc, err := docx.Parse(readFile, size)
	if err != nil {
		panic(err)
	}
	fmt.Println("Plain text:")
	for _, it := range doc.Document.Body.Items {
		switch it.(type) {
		case *docx.Paragraph, *docx.Table: // printable
			fmt.Println(it)
		}
	}
}

License

AGPL-3.0. See LICENSE

go-docx's People

Contributors

fumiama, github-actions[bot], gonfva, gonfva-bcl, junwen-k, mabiao0525, shadowmimosa, yangge2333

go-docx's Issues

.docx get corrupted after writing file

I'm not sure what's happening here. I have a perfectly functioning .docx as the input file, but after parsing it (without modifying anything) and writing it out as a new file, the output file is corrupted.

func OpenDocx(path string) *docx.Docx {
	readFile, err := os.Open(path)
	util.Panic(err)
	fileInfo, err := readFile.Stat()
	util.Panic(err)
	size := fileInfo.Size()
	doc, err := docx.Parse(readFile, size)
	util.Panic(err)
	return doc
}

func WriteDocx(doc *docx.Docx, path string) {
	f, err := os.Create(path)
	util.Panic(err)
	_, err = doc.WriteTo(f)
	util.Panic(err)
	err = f.Close()
	util.Panic(err)
}

and somewhere in my main.go

...
...
doc := docxLib.OpenDocx(inputFile)
docxLib.WriteDocx(doc, "copy.docx")
...
...

When I try to open copy.docx in Word, I cannot, because the file is corrupted.
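
A .docx file is a ZIP package of XML parts, so one way to narrow down a report like this is to compare which parts exist in the input and in the written copy. The sketch below uses only the standard library and makes no claim about the actual cause of the corruption; `entryNames` and `missingFrom` are illustrative helper names, not part of go-docx.

```go
package main

import (
	"archive/zip"
	"fmt"
	"os"
	"sort"
)

// entryNames returns the sorted names of all entries in the zip archive at path.
func entryNames(path string) ([]string, error) {
	r, err := zip.OpenReader(path)
	if err != nil {
		return nil, err
	}
	defer r.Close()
	names := make([]string, 0, len(r.File))
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	sort.Strings(names)
	return names, nil
}

// missingFrom returns the names present in a but absent from b.
func missingFrom(a, b []string) []string {
	seen := make(map[string]bool, len(b))
	for _, n := range b {
		seen[n] = true
	}
	var out []string
	for _, n := range a {
		if !seen[n] {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: partdiff input.docx copy.docx")
		return
	}
	in, err := entryNames(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	out, err := entryNames(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("only in input: ", missingFrom(in, out))
	fmt.Println("only in output:", missingFrom(out, in))
}
```

Parts that appear only in the input are a natural first place to look, since Word may refuse a package whose relationships point at parts that are no longer there.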

Still need help appending files...

Hello! I'll try my best to make this issue the last one, so I won't litter issues with my questions that much. Thanks for the help in advance!!!
code:

	readFile, err := os.Open(filename)
        if err != nil {
	        slog.Error("error while oppening docx file", "unit_guid", devInfo.UnitGuid, "err", err)
        }
        fileinfo, err := readFile.Stat()
        if err != nil {
	        slog.Error("error while reading docx file stat file", "unit_guid", devInfo.UnitGuid, "err", err)
        }
        size := fileinfo.Size()
        oldDoc, err := docx.Parse(readFile, size)
        if err != nil {
	        slog.Error("error while parsing docx file", "unit_guid", devInfo.UnitGuid, "err", err)
        }
        newDoc := docx.NewA4()
        p := newDoc.AddParagraph()
        p.AddText(text).Size("10")
        // ADD OLD FILE CONTENT HERE
        newDoc.AppendFile(oldDoc)
        if err != nil {
	        slog.Error("error while creating file", "unit_guid", devInfo.UnitGuid, "err", err)
        }
        _, err = newDoc.WriteTo(readFile)
        if err != nil {
	        slog.Error("error while writing file", "unit_guid", devInfo.UnitGuid, "err", err)
        }
        err = readFile.Close()
        if err != nil {
	        slog.Error("error while saving file", "unit_guid", devInfo.UnitGuid, "err", err)
        }

This whole block of code is executed about 3-4 times, and each run should add one line to the file.
However, after the program exits, the created file contains only one of the 3-4 lines it should.
I have tried everything I can think of at this point, but still can't get it to work...

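Can the page orientation be set to landscape?

Hello, I looked through the code and could not find anything related to setting the page orientation (portrait, landscape).

Is it possible to set the page orientation to landscape, or could support for it be considered?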
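
For context, in WordprocessingML the page orientation is stored on the `w:pgSz` element inside `w:sectPr`: landscape swaps the width and height and sets `w:orient="landscape"`. The sketch below only builds that element as a string. `pageSize` is a hypothetical helper and not part of go-docx, and whether the library exposes section properties is not established here.

```go
package main

import "fmt"

// A4 portrait dimensions in twentieths of a point (twips).
const (
	a4Width  = 11906
	a4Height = 16838
)

// pageSize (hypothetical helper) returns a w:pgSz element for the given
// portrait dimensions; width and height are swapped when landscape is set.
func pageSize(w, h int, landscape bool) string {
	if landscape {
		return fmt.Sprintf(`<w:pgSz w:w="%d" w:h="%d" w:orient="landscape"/>`, h, w)
	}
	return fmt.Sprintf(`<w:pgSz w:w="%d" w:h="%d"/>`, w, h)
}

func main() {
	fmt.Println(pageSize(a4Width, a4Height, false))
	fmt.Println(pageSize(a4Width, a4Height, true))
}
```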

how to parse math?

Docx documents can contain OMath and Equation.DSMT4 equations. How can they be extracted?
OMath looks like this:

<m:oMathPara>
<m:oMath>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>f</m:t>
</m:r>
<m:d>
<m:dPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:dPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>x</m:t>
</m:r>
</m:e>
</m:d>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>=</m:t>
</m:r>
<m:sSub>
<m:sSubPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:sSubPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>a</m:t>
</m:r>
</m:e>
<m:sub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>0</m:t>
</m:r>
</m:sub>
</m:sSub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>+</m:t>
</m:r>
<m:nary>
<m:naryPr>
<m:chr m:val="∑"/>
<m:grow m:val="1"/>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:naryPr>
<m:sub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>n=1</m:t>
</m:r>
</m:sub>
<m:sup>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>∞</m:t>
</m:r>
</m:sup>
<m:e>
<m:d>
<m:dPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:dPr>
<m:e>
<m:sSub>
<m:sSubPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:sSubPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>a</m:t>
</m:r>
</m:e>
<m:sub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>n</m:t>
</m:r>
</m:sub>
</m:sSub>
<m:func>
<m:funcPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:funcPr>
<m:fName>
<m:r>
<m:rPr>
<m:sty m:val="p"/>
</m:rPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>cos</m:t>
</m:r>
</m:fName>
<m:e>
<m:f>
<m:fPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:fPr>
<m:num>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>nπx</m:t>
</m:r>
</m:num>
<m:den>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>L</m:t>
</m:r>
</m:den>
</m:f>
</m:e>
</m:func>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>+</m:t>
</m:r>
<m:sSub>
<m:sSubPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:sSubPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>b</m:t>
</m:r>
</m:e>
<m:sub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>n</m:t>
</m:r>
</m:sub>
</m:sSub>
<m:func>
<m:funcPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:funcPr>
<m:fName>
<m:r>
<m:rPr>
<m:sty m:val="p"/>
</m:rPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>sin</m:t>
</m:r>
</m:fName>
<m:e>
<m:f>
<m:fPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
</m:ctrlPr>
</m:fPr>
<m:num>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>nπx</m:t>
</m:r>
</m:num>
<m:den>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
</w:rPr>
<m:t>L</m:t>
</m:r>
</m:den>
</m:f>
</m:e>
</m:func>
</m:e>
</m:d>
</m:e>
</m:nary>
</m:oMath>
</m:oMathPara>

An Equation.DSMT4 object looks like this:

<w:object w:dxaOrig="4320" w:dyaOrig="1520" w14:anchorId="194DA0B5">
<v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
<v:stroke joinstyle="miter"/>
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0"/>
<v:f eqn="sum @0 1 0"/>
<v:f eqn="sum 0 0 @1"/>
<v:f eqn="prod @2 1 2"/>
<v:f eqn="prod @3 21600 pixelWidth"/>
<v:f eqn="prod @3 21600 pixelHeight"/>
<v:f eqn="sum @0 0 1"/>
<v:f eqn="prod @6 1 2"/>
<v:f eqn="prod @7 21600 pixelWidth"/>
<v:f eqn="sum @8 21600 0"/>
<v:f eqn="prod @7 21600 pixelHeight"/>
<v:f eqn="sum @10 21600 0"/>
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
<o:lock v:ext="edit" aspectratio="t"/>
</v:shapetype>
<v:shape id="_x0000_i1025" type="#_x0000_t75" style="width:3in;height:76.4pt" o:ole="">
<v:imagedata r:id="rId9" o:title=""/>
</v:shape>
<o:OLEObject Type="Embed" ProgID="Equation.DSMT4" ShapeID="_x0000_i1025" DrawAspect="Content" ObjectID="_1753715041" r:id="rId10"/>
</w:object>

how to set table merge

I have a requirement to merge certain rows in a table. Do you have any plans to support this feature in the future?
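
For reference, WordprocessingML expresses merged cells through cell properties: `w:gridSpan` for a horizontal span, and `w:vMerge` for a vertical one (`w:val="restart"` on the first cell, a bare element on the cells that continue it). The sketch below only builds the `w:tcPr` fragment as a string; `cellProps` and the `vMerge` type are hypothetical and not a go-docx API.

```go
package main

import (
	"fmt"
	"strings"
)

// vMerge describes a cell's role in a vertical merge.
type vMerge int

const (
	vMergeNone     vMerge = iota // cell is not vertically merged
	vMergeRestart                // first cell of a vertical merge
	vMergeContinue               // cell continuing the merge above it
)

// cellProps (hypothetical helper) returns a w:tcPr element carrying the
// merge settings for one table cell.
func cellProps(gridSpan int, vm vMerge) string {
	var sb strings.Builder
	sb.WriteString("<w:tcPr>")
	if gridSpan > 1 {
		fmt.Fprintf(&sb, `<w:gridSpan w:val="%d"/>`, gridSpan)
	}
	switch vm {
	case vMergeRestart:
		sb.WriteString(`<w:vMerge w:val="restart"/>`)
	case vMergeContinue:
		sb.WriteString(`<w:vMerge/>`)
	}
	sb.WriteString("</w:tcPr>")
	return sb.String()
}

func main() {
	fmt.Println(cellProps(2, vMergeNone))     // spans two grid columns
	fmt.Println(cellProps(1, vMergeRestart))  // top cell of a vertical merge
	fmt.Println(cellProps(1, vMergeContinue)) // cell merged into the one above
}
```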

Can't figure out how to append to an existing file, need help!

I try to append new text to an existing docx file like this:

readFile, err := os.Open(filename)
if err != nil {
	// ...
}
fileinfo, err := readFile.Stat()
if err != nil {
	//...
}
size := fileinfo.Size()
oldDoc, err := docx.Parse(readFile, size)
if err != nil {
	//...
}
doc := docx.LoadBodyItems(oldDoc.Document.Body.Items, []docx.Media{})
p := doc.AddParagraph()
p.AddText(text).Size("10")
if err != nil {
	//...
}
_, err = doc.WriteTo(readFile)
if err != nil {
	//...
}
err = readFile.Close()
if err != nil {
	//...
}

but it just adds the new content and erases the old content of the file...

It also seems to ignore '\n' in the text: no new lines are made. How do I insert a new line?

Document Parse fails if Table has width with decimal value

Hi, thank you for writing up this library.

Currently, I am facing an issue when trying to parse an existing docx file that contains a table whose width has a decimal point, as follows:

...
<w:tblW w:w="11116.8" w:type="dxa" />
...
doc, err := docx.Parse(readFile, size)
if err != nil {
	panic(err)
}

And I got this error output:

strconv.ParseInt: parsing "11116.8": invalid syntax

I suspect this is due to this line:

https://github.com/fumiama/go-docx/blob/master/structtable.go#L280


I am no expert in working with docx files. Should the library support decimal values by using strconv.ParseFloat instead? This might also apply to other components, not just table width.

Thank you!
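
The failure is easy to reproduce with the standard library alone, since `strconv.ParseInt` rejects any fractional part. One possible approach, sketched below under assumptions, is to try the integer parse first and fall back to a rounded float. `parseWidth` is a hypothetical helper, not the library's code, and rounding to the nearest unit is a choice made for this example.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parseWidth (hypothetical helper) parses an integer measurement and falls
// back to a rounded float when the value has a fractional part.
func parseWidth(s string) (int64, error) {
	if n, err := strconv.ParseInt(s, 10, 64); err == nil {
		return n, nil
	}
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	// ParseFloat accepts "NaN" and "Inf", which have no integer equivalent.
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return 0, fmt.Errorf("parseWidth: %q is not a finite number", s)
	}
	return int64(math.Round(f)), nil
}

func main() {
	for _, s := range []string{"5000", "11116.8", "abc"} {
		n, err := parseWidth(s)
		fmt.Println(s, n, err)
	}
}
```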

Documentation: Add documentation for different use cases and examples

A good open source library cannot go without good documentation!

I suggest adding documentation for different use cases and examples.
A good starting point could be based on the existing features:

  1. Text
  2. Image
  3. Table
  4. Shape
  5. Canvas
  6. Group

Maybe it can also be linked back to Microsoft OpenXML's documentation for easier references.

The easiest way to get started is to write directly into the repository's README file, but it could get cluttered real quick.

Perhaps having a Nextra documentation site would be helpful.
https://nextra.site/docs/docs-theme/start

go-docx use

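Regarding go-docx's table design: can it support merging some of the cells within a table, i.e. something like a cell-merge feature?

Strikethrough is not supported

The <w:strike> child element is skipped.
A Strike member could be added to RunProperties,
along with a case "strike" at structrun.go:300.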
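
To illustrate how a strike toggle can be read, here is a minimal stdlib-only sketch that uses `encoding/xml` struct tags rather than the hand-written element switch the issue refers to. `runProps` and `onOff` are hypothetical, cut-down types and not the library's `RunProperties`. A present `<w:strike/>` counts as on unless `w:val` is an explicit off value.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// onOff models a WordprocessingML toggle element such as <w:strike/>.
type onOff struct {
	Val string `xml:"val,attr"`
}

// enabled reports whether the toggle is on: a missing element is off, and a
// present element is on unless w:val is an explicit "off" value.
func (o *onOff) enabled() bool {
	if o == nil {
		return false
	}
	switch o.Val {
	case "false", "0", "off":
		return false
	}
	return true
}

// runProps is a hypothetical, cut-down stand-in for a run-properties struct.
type runProps struct {
	Bold   *onOff `xml:"b"`
	Strike *onOff `xml:"strike"`
}

// parseRunProps decodes a single w:rPr element; unknown children are skipped.
func parseRunProps(s string) (runProps, error) {
	var rp runProps
	err := xml.Unmarshal([]byte(s), &rp)
	return rp, err
}

// wNS declares the WordprocessingML namespace for the w: prefix.
const wNS = `xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"`

func main() {
	rp, err := parseRunProps(`<w:rPr ` + wNS + `><w:b/><w:strike/></w:rPr>`)
	if err != nil {
		panic(err)
	}
	fmt.Println("bold:", rp.Bold.enabled(), "strike:", rp.Strike.enabled())
}
```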
