staale / py-xlsx
Tiny python code for parsing data from an Office Open XML Spreadsheet - xlsx
The py-xlsx 0.4 package contains an unresolved merge conflict, so installation fails with a SyntaxError:
Downloading/unpacking py-xlsx
Downloading py-xlsx-0.4.tar.gz
Running setup.py (path:/tmp/pip_build_leexiaolan/py-xlsx/setup.py) egg_info for package py-xlsx
Installing collected packages: py-xlsx
Running setup.py install for py-xlsx
File "/home/leexiaolan/lib/python2.7/site-packages/xlsx/__init__.py", line 223
<<<<<<< HEAD
^
SyntaxError: invalid syntax
A sheet's rows are empty until the sheet is loaded explicitly. To reproduce:
>>> s = xlsx.Workbook('myfile.xlsx')[0]
>>> s.rows
{}
If you manually load the sheet, it works.
>>> s._Sheet__load()
>>> s.rows # returns rows.
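One way to avoid the manual call is to load the data the first time rows is read. The sketch below is not the library's code: LazySheet and its loader argument are hypothetical names used only to show the pattern.

```python
class LazySheet(object):
    """Hypothetical stand-in for a sheet that parses its data on first access."""

    def __init__(self, loader):
        self._loader = loader   # callable returning {row_number: [cells]}
        self._rows = None       # None means "not loaded yet"

    @property
    def rows(self):
        # Load once, on the first read, then reuse the cached result.
        if self._rows is None:
            self._rows = self._loader()
        return self._rows


calls = []

def fake_loader():
    # Stands in for the real XML parsing step.
    calls.append(1)
    return {1: ["a", "b"], 2: ["c", "d"]}

sheet = LazySheet(fake_loader)
first = sheet.rows    # triggers the load
second = sheet.rows   # served from the cache
```

With this pattern, reading rows never returns an empty dict just because nothing has forced a load yet.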
For example, here's an attempt to open a non-existent file:
>>> import xlsx
>>> z = xlsx.DomZip('foo.txt')
Exception AttributeError: "'DomZip' object has no attribute 'ziphandle'" in <bound method DomZip.__del__ of <xlsx.DomZip object at 0x7f1ff4b76f10>> ignored
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "xlsx/__init__.py", line 25, in __init__
self.ziphandle = zipfile.ZipFile(filename, 'r')
File "/usr/lib/python2.7/zipfile.py", line 756, in __init__
self.fp = open(file, modeDict[mode])
IOError: [Errno 2] No such file or directory: 'foo.txt'
Depending on when the object gets cleaned up, you'll see:
Exception AttributeError: "'DomZip' object has no attribute 'ziphandle'" in <bound method DomZip.__del__ of <xlsx.DomZip object at 0x7f1ff4b76f10>> ignored
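The secondary AttributeError appears because __del__ runs on an object whose __init__ never finished. A minimal sketch of a guard follows; SafeDomZip and its close method are hypothetical, not the library's actual class.

```python
import zipfile


class SafeDomZip(object):
    """Hypothetical variant of DomZip whose destructor tolerates a failed __init__."""

    def __init__(self, filename):
        # Set the attribute before anything can raise, so __del__ always finds it.
        self.ziphandle = None
        self.ziphandle = zipfile.ZipFile(filename, "r")

    def close(self):
        handle = getattr(self, "ziphandle", None)
        if handle is not None:
            handle.close()
            self.ziphandle = None

    def __del__(self):
        self.close()


try:
    SafeDomZip("definitely-missing-file.xlsx")
    error = None
except IOError as exc:
    # On Python 3, IOError is an alias of OSError, which covers FileNotFoundError.
    error = exc
```

The original IOError still propagates to the caller; only the noise from the destructor goes away.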
With commit 976cb84, workbooks without a modified date property raise an IndexError when a Workbook object is instantiated.
For example, an xlsx file I tested that was created using LibreOffice didn't have any dcterms:modified element:
docPropsCoreDoc.firstChild.getElementsByTagName("dcterms:modified")
[]
Which causes:
self.dcterms_modified = docPropsCoreDoc.firstChild.getElementsByTagName("dcterms:modified")[0].childNodes[0].data
to fail with an IndexError.
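A possible fix is to treat the element as optional. This is a sketch using xml.dom.minidom: first_text is a hypothetical helper, and the sample XML is a made-up core-properties fragment without dcterms:modified.

```python
from xml.dom import minidom


def first_text(doc, tag_name, default=None):
    """Return the text of the first matching element, or default if absent or empty."""
    nodes = doc.getElementsByTagName(tag_name)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.data


# Made-up document resembling docProps/core.xml with no dcterms:modified element.
CORE_WITHOUT_MODIFIED = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dcterms:created>2012-01-01T00:00:00Z</dcterms:created>'
    '</cp:coreProperties>'
)

doc = minidom.parseString(CORE_WITHOUT_MODIFIED)
modified = first_text(doc, "dcterms:modified")   # missing element: falls back to None
created = first_text(doc, "dcterms:created")
```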
This is a big one. If you load a sheet into memory, there's no way to remove the sheet data from memory - even after deleting the sheet. To recreate:
>>> s = xlsx.Workbook('myfile.xlsx')[0]
>>> s._Sheet__load() # see http://github.com/staale/python-xlsx/issues#issue/1
>>> s.rows # returns rows
>>> del s
>>> s = xlsx.Workbook('anotherfile.xlsx')[0]
>>> s._Sheet__load()
>>> s.rows # returns data from the first file loaded
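I have not confirmed the cause in py-xlsx, but a mutable attribute defined on the class rather than on the instance produces exactly this symptom in Python. The sketch below uses two hypothetical classes to show the difference.

```python
class SharedStateSheet(object):
    rows = {}   # class attribute: one dict shared by every instance

    def load(self, data):
        self.rows.update(data)   # mutates the shared dict


class IsolatedSheet(object):
    def __init__(self):
        self.rows = {}   # instance attribute: a fresh dict per sheet

    def load(self, data):
        self.rows.update(data)


a = SharedStateSheet()
a.load({1: ["first file"]})
del a
b = SharedStateSheet()
leaked = dict(b.rows)    # still holds the data loaded through `a`

c = IsolatedSheet()
c.load({1: ["first file"]})
del c
d = IsolatedSheet()
clean = dict(d.rows)     # empty, as expected
```

If the library does keep its row storage at class level, moving the initialization into __init__ would be the fix.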
If I'm not mistaken, xldate_as_tuple supports the (dreaded) datemode, but the Workbook class has no means of obtaining it. Is there a workaround, or is this project unusable if you need multi-platform support?
/edit: xlrd has this (http://www.lexicon.net/sjmachin/xlrd.html#xlrd.Book-class), plus the same xldate_as_tuple. Can someone explain to me how these projects "interact"?
Thanks.
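As a possible workaround: in an xlsx file the date system is recorded as the date1904 attribute of the workbookPr element in xl/workbook.xml. The sketch below reads it with the standard library only. read_datemode and serial_to_datetime are hypothetical helpers, not part of py-xlsx or xlrd, and the code assumes the element is written without a namespace prefix.

```python
import datetime
from xml.dom import minidom


def read_datemode(workbook_xml):
    """Return 1 if the workbook uses the 1904 date system, else 0."""
    doc = minidom.parseString(workbook_xml)
    for node in doc.getElementsByTagName("workbookPr"):
        if node.getAttribute("date1904") in ("1", "true"):
            return 1
    return 0


def serial_to_datetime(serial, datemode):
    """Convert a date serial to a datetime.

    For the 1900 system this is only valid for serials of 61 or more,
    because Excel counts a nonexistent 1900-02-29.
    """
    if datemode:
        epoch = datetime.datetime(1904, 1, 1)
    else:
        epoch = datetime.datetime(1899, 12, 30)
    return epoch + datetime.timedelta(days=serial)


NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xml_1904 = '<workbook xmlns="%s"><workbookPr date1904="1"/></workbook>' % NS
xml_1900 = '<workbook xmlns="%s"><workbookPr/></workbook>' % NS

mode_1904 = read_datemode(xml_1904)
mode_1900 = read_datemode(xml_1900)
```

The returned value follows the same 0/1 convention as the datemode argument mentioned above, so it could be passed straight to xldate_as_tuple.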
Hi! I am a novice in Python. I found your great tool and use it to parse various XLSX files and compare them with template files. The aim of the comparison is to confirm that the formulas in both files are unchanged. After finishing the implementation, during testing we found that the package does not read all formulas from one of the files, even though both files have identical formulas in the same cells. Could you please help me find a solution to this issue? If you need more information, please contact me and I will provide the full details.
Thank you in advance.
Regards,
David Razmadze
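One thing worth checking, as a guess rather than a diagnosis: spreadsheet applications may store a copied formula as a shared formula, where only the first cell carries the formula text and the others hold an empty f element with t="shared" and a matching si index. A reader that only takes the text of f would see those cells as having no formula. The sketch below lists both kinds from a sheet's XML. extract_formulas is a hypothetical helper, and it assumes the cell holding the text appears before the cells that reference it.

```python
from xml.dom import minidom


def extract_formulas(sheet_xml):
    """Return (formulas, followers).

    formulas maps cell reference -> formula text for cells that carry text.
    followers maps cell reference -> reference of the cell holding the
    shared formula text it depends on.
    """
    doc = minidom.parseString(sheet_xml)
    masters = {}     # shared index -> reference of the cell with the text
    formulas = {}
    followers = {}
    for cell in doc.getElementsByTagName("c"):
        f_nodes = cell.getElementsByTagName("f")
        if not f_nodes:
            continue  # plain value, no formula
        f = f_nodes[0]
        ref = cell.getAttribute("r")
        text = "".join(n.data for n in f.childNodes
                       if n.nodeType == n.TEXT_NODE)
        if text:
            formulas[ref] = text
            if f.getAttribute("t") == "shared":
                masters[f.getAttribute("si")] = ref
        elif f.getAttribute("t") == "shared":
            followers[ref] = masters.get(f.getAttribute("si"))
    return formulas, followers


# Made-up sheet fragment: A1 holds a shared formula, A2 reuses it.
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData>'
    '<row r="1"><c r="A1"><f t="shared" ref="A1:A2" si="0">B1*2</f><v>4</v></c></row>'
    '<row r="2"><c r="A2"><f t="shared" si="0"/><v>6</v></c>'
    '<c r="B2"><f>SUM(A1:A2)</f><v>10</v></c><c r="C2"><v>1</v></c></row>'
    '</sheetData></worksheet>'
)

formulas, followers = extract_formulas(SHEET)
```

The sketch deliberately does not rewrite the shared text for the dependent cells, since their effective formulas have shifted cell references.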
Can we push this package to PyPI? It would be nice to install this package via pip in a requirements.txt file instead of the clone/install two-step.
FYI when I try to open an xlsx file, I get the following error:
Traceback (most recent call last):
File "<pyshell#9>", line 1, in <module>
book = Workbook(path)
File "C:\Python27\lib\site-packages\xlsx\__init__.py", line 54, in __init__
self.domzip["xl/sharedStrings.xml"])
File "C:\Python27\lib\site-packages\xlsx\__init__.py", line 94, in __init__
self.__getIfInline(text))
File "C:\Python27\lib\site-packages\xlsx\__init__.py", line 101, in __getIfInline
for node in nodes])
AttributeError: 'NoneType' object has no attribute 'nodeValue'
I have verified that the file is in the specified path (the library throws an IOError if the path is wrong).
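A likely trigger, though I have not seen the file, is a shared string whose t element is empty: it has no child node, so reading nodeValue from its first child fails. The sketch below collects the text defensively. shared_string_text is a hypothetical helper and the sample XML is made up.

```python
from xml.dom import minidom


def shared_string_text(si_node):
    """Concatenate the text of every <t> under one <si>, tolerating empty <t/>."""
    parts = []
    for t in si_node.getElementsByTagName("t"):
        if t.parentNode.tagName == "rPh":
            continue  # skip phonetic-run text, which is not part of the string
        for child in t.childNodes:
            if child.nodeType == child.TEXT_NODE:
                parts.append(child.data)
    return "".join(parts)


# Made-up sharedStrings.xml: a plain string, an empty one, and a rich-text one.
SST = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<si><t>plain</t></si>'
    '<si><t/></si>'
    '<si><r><t>rich </t></r><r><t>text</t></r></si>'
    '</sst>'
)

doc = minidom.parseString(SST)
strings = [shared_string_text(si) for si in doc.getElementsByTagName("si")]
```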