Comments (4)
Thank you for reporting this bug. I just reproduced this problem and the saved docx file does not work.
I need more time to investigate the root cause.
from docx.
Hi @satoryu! That's interesting - this was generated using Google Docs.
Ah... I confirmed that Google Docs generates a docx file with the problem I mentioned.
I'll try to fix this issue.
I have analyzed sample.docx, mainly document.xml.
FYI: a docx file is a zip file containing several XML documents, one of which is document.xml.
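As a minimal sketch of that point, the snippet below opens a docx as a zip archive and reads the document part using only Python's standard library. The helper names `read_document_xml` and `build_fake_docx` are hypothetical, and the part path `word/document.xml` is an assumption: it is the usual location, but some files store the main document under a different name.

```python
import io
import zipfile


def read_document_xml(docx):
    """Return the raw bytes of word/document.xml from a .docx path or file-like object."""
    # Assumption: the main document part lives at word/document.xml.
    with zipfile.ZipFile(docx) as archive:
        return archive.read("word/document.xml")


def build_fake_docx(document_xml):
    """Build an in-memory zip that mimics the layout of a docx (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    fake = build_fake_docx(b"<w:document/>")
    print(read_document_xml(fake))
```

Passing a real file path such as `"sample.docx"` to `read_document_xml` works the same way, since `zipfile.ZipFile` accepts both paths and file-like objects.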
Its document.xml looks strange:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
xmlns:sl="http://schemas.openxmlformats.org/schemaLibrary/2006/main"
xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture"
xmlns:c="http://schemas.openxmlformats.org/drawingml/2006/chart"
xmlns:lc="http://schemas.openxmlformats.org/drawingml/2006/lockedCanvas"
xmlns:dgm="http://schemas.openxmlformats.org/drawingml/2006/diagram"
xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup"
xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml">
<w:background w:color="FFFFFF"/>
<w:body>
<w:bookmarkStart w:colFirst="0" w:colLast="0" w:name="cypsldshvlxb" w:id="0"/>
<w:bookmarkEnd w:id="0"/>
<w:p w:rsidR="00000000" w:rsidDel="00000000" w:rsidP="00000000" w:rsidRDefault="00000000" w:rsidRPr="00000000" w14:paraId="00000001">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:cs="Arial" w:eastAsia="Arial" w:hAnsi="Arial"/>
<w:b w:val="1"/>
</w:rPr>
</w:pPr>
<w:r w:rsidDel="00000000" w:rsidR="00000000" w:rsidRPr="00000000">
<w:rPr>
<w:rFonts w:ascii="Arial" w:cs="Arial" w:eastAsia="Arial" w:hAnsi="Arial"/>
<w:b w:val="1"/>
<w:rtl w:val="0"/>
</w:rPr>
<w:t xml:space="preserve">bookmark_1</w:t>
</w:r>
</w:p>
The w:bookmarkStart and w:bookmarkEnd tags are not inside the paragraph's w:p tag.
According to some articles, these tags should be inside a w:p tag.
And this gem expects bookmarkStart to be inside w:p.
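One way to check a file for this layout is sketched below: it lists bookmarks that are direct children of w:body instead of being nested in a w:p. This is only an illustration in Python's standard library, not how the gem itself works; the function name `body_level_bookmarks` and the two sample strings are hypothetical, trimmed down from the XML above.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def body_level_bookmarks(document_xml):
    """Return names of w:bookmarkStart elements that sit directly under w:body."""
    root = ET.fromstring(document_xml)
    body = root.find("w:body", NS)
    if body is None:
        return []
    name_attr = f"{{{W_NS}}}name"
    # findall() with a plain tag path matches direct children only,
    # so bookmarks nested inside w:p are not reported.
    return [el.get(name_attr) for el in body.findall("w:bookmarkStart", NS)]


# Bookmark placed directly under w:body, as in the file from this issue.
BODY_LEVEL = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:bookmarkStart w:name="cypsldshvlxb" w:id="0"/>
    <w:bookmarkEnd w:id="0"/>
    <w:p><w:r><w:t>bookmark_1</w:t></w:r></w:p>
  </w:body>
</w:document>"""

# Bookmark nested inside w:p, the layout the gem expects.
IN_PARAGRAPH = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:bookmarkStart w:name="cypsldshvlxb" w:id="0"/>
      <w:bookmarkEnd w:id="0"/>
      <w:r><w:t>bookmark_1</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""

if __name__ == "__main__":
    print(body_level_bookmarks(BODY_LEVEL))
    print(body_level_bookmarks(IN_PARAGRAPH))
```

A non-empty result means the document has bookmarks outside any paragraph, which is the case reported here.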
I have one question: How is this docx file generated?
If it was generated by popular software, this gem should support it.