Giter Site home page Giter Site logo

pyexcel-xlsx's Introduction

pyexcel-xlsx - Let you focus on data, instead of xlsx format

https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png https://pepy.tech/badge/pyexcel-xlsx/month https://img.shields.io/static/v1?label=continuous%20templating&message=%E6%A8%A1%E7%89%88%E6%9B%B4%E6%96%B0&color=blue&style=flat-square https://img.shields.io/static/v1?label=coding%20style&message=black&color=black&style=flat-square

pyexcel-xlsx is a tiny wrapper library to read, manipulate and write data in xlsx and xlsm format using read_only mode reader, write_only mode writer from openpyxl. You are likely to use it with pyexcel.

Please note:

  1. auto_detect_int flag will not take effect because openpyxl detect integer in python 3 by default.
  2. skip_hidden_row_and_column will get a penalty where read_only mode cannot be used.

Support the project

If your company has embedded pyexcel and its components into a revenue generating product, please support me on github, patreon or bounty source to maintain the project and develop it further.

If you are an individual, you are welcome to support me too and for however long you feel like. As my backer, you will receive early access to pyexcel related contents.

And your issues will get prioritized if you would like to become my patreon as pyexcel pro user.

With your financial support, I will be able to invest a little bit more time in coding, documentation and writing interesting posts.

Known constraints

Fonts, colors and charts are not supported.

Nor to read password protected xls, xlsx and ods files.

Installation

You can install pyexcel-xlsx via pip:

$ pip install pyexcel-xlsx

or clone it and install it:

$ git clone https://github.com/pyexcel/pyexcel-xlsx.git
$ cd pyexcel-xlsx
$ python setup.py install

Usage

As a standalone library

.. testcode::
   :hide:

    >>> import os
    >>> import sys
    >>> if sys.version_info[0] < 3:
    ...     from StringIO import StringIO
    ... else:
    ...     from io import BytesIO as StringIO
    >>> PY2 = sys.version_info[0] == 2
    >>> if PY2 and sys.version_info[1] < 7:
    ...      from ordereddict import OrderedDict
    ... else:
    ...     from collections import OrderedDict


Write to an xlsx file

Here's the sample code to write a dictionary to an xlsx file:

>>> from pyexcel_xlsx import save_data
>>> data = OrderedDict() # from collections import OrderedDict
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
>>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
>>> save_data("your_file.xlsx", data)

Read from an xlsx file

Here's the sample code:

>>> from pyexcel_xlsx import get_data
>>> data = get_data("your_file.xlsx")
>>> import json
>>> print(json.dumps(data))
{"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [["row 1", "row 2", "row 3"]]}

Write an xlsx to memory

Here's the sample code to write a dictionary to an xlsx file:

>>> from pyexcel_xlsx import save_data
>>> data = OrderedDict()
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
>>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
>>> io = StringIO()
>>> save_data(io, data)
>>> # do something with the io
>>> # In reality, you might give it to your http response
>>> # object for downloading

Read from an xlsx from memory

Continue from previous example:

>>> # This is just an illustration
>>> # In reality, you might deal with xlsx file upload
>>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
>>> data = get_data(io)
>>> print(json.dumps(data))
{"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [[7, 8, 9], [10, 11, 12]]}

Pagination feature

Let's assume the following file is a huge xlsx file:

>>> huge_data = [
...     [1, 21, 31],
...     [2, 22, 32],
...     [3, 23, 33],
...     [4, 24, 34],
...     [5, 25, 35],
...     [6, 26, 36]
... ]
>>> sheetx = {
...     "huge": huge_data
... }
>>> save_data("huge_file.xlsx", sheetx)

And let's pretend to read partial data:

>>> partial_data = get_data("huge_file.xlsx", start_row=2, row_limit=3)
>>> print(json.dumps(partial_data))
{"huge": [[3, 23, 33], [4, 24, 34], [5, 25, 35]]}

And you could as well do the same for columns:

>>> partial_data = get_data("huge_file.xlsx", start_column=1, column_limit=2)
>>> print(json.dumps(partial_data))
{"huge": [[21, 31], [22, 32], [23, 33], [24, 34], [25, 35], [26, 36]]}

Obvious, you could do both at the same time:

>>> partial_data = get_data("huge_file.xlsx",
...     start_row=2, row_limit=3,
...     start_column=1, column_limit=2)
>>> print(json.dumps(partial_data))
{"huge": [[23, 33], [24, 34], [25, 35]]}
.. testcode::
   :hide:

   >>> os.unlink("huge_file.xlsx")


As a pyexcel plugin

No longer, explicit import is needed since pyexcel version 0.2.2. Instead, this library is auto-loaded. So if you want to read data in xlsx format, installing it is enough.

Reading from an xlsx file

Here is the sample code:

>>> import pyexcel as pe
>>> sheet = pe.get_book(file_name="your_file.xlsx")
>>> sheet
Sheet 1:
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
Sheet 2:
+-------+-------+-------+
| row 1 | row 2 | row 3 |
+-------+-------+-------+

Writing to an xlsx file

Here is the sample code:

>>> sheet.save_as("another_file.xlsx")

Reading from a IO instance

You got to wrap the binary content with stream to get xlsx working:

>>> # This is just an illustration
>>> # In reality, you might deal with xlsx file upload
>>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
>>> xlsxfile = "another_file.xlsx"
>>> with open(xlsxfile, "rb") as f:
...     content = f.read()
...     r = pe.get_book(file_type="xlsx", file_content=content)
...     print(r)
...
Sheet 1:
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
Sheet 2:
+-------+-------+-------+
| row 1 | row 2 | row 3 |
+-------+-------+-------+

Writing to a StringIO instance

You need to pass a StringIO instance to Writer:

>>> data = [
...     [1, 2, 3],
...     [4, 5, 6]
... ]
>>> io = StringIO()
>>> sheet = pe.Sheet(data)
>>> io = sheet.save_to_memory("xlsx", io)
>>> # then do something with io
>>> # In reality, you might give it to your http response
>>> # object for downloading

License

New BSD License

Developer guide

Development steps for code changes

  1. git clone https://github.com/pyexcel/pyexcel-xlsx.git
  2. cd pyexcel-xlsx

Upgrade your setup tools and pip. They are needed for development and testing only:

  1. pip install --upgrade setuptools pip

Then install relevant development requirements:

  1. pip install -r rnd_requirements.txt # if such a file exists
  2. pip install -r requirements.txt
  3. pip install -r tests/requirements.txt

Once you have finished your changes, please provide test case(s), relevant documentation and update CHANGELOG.rst.

Note

As to rnd_requirements.txt, usually, it is created when a dependent library is not released. Once the dependecy is installed (will be released), the future version of the dependency in the requirements.txt will be valid.

How to test your contribution

Although nose and doctest are both used in code testing, it is adviable that unit tests are put in tests. doctest is incorporated only to make sure the code examples in documentation remain valid across different development releases.

On Linux/Unix systems, please launch your tests like this:

$ make

On Windows systems, please issue this command:

> test.bat

Before you commit

Please run:

$ make format

so as to beautify your code otherwise travis-ci may fail your unit test.

.. testcode::
   :hide:

   >>> import os
   >>> os.unlink("your_file.xlsx")
   >>> os.unlink("another_file.xlsx")

pyexcel-xlsx's People

Contributors

benoit-pierre avatar chfw avatar fuhrysteve avatar jayvdb avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

pyexcel-xlsx's Issues

v0.5.8 specifc error when saving data to xlsx: StopIteration

I encountered an error when writing to xlsx with the latest version 0.5.8. In the previous version, everything works fine.

When executing following example:

from pyexcel import Book, get_book
test = {
    'Sheet 1':
        [
            [1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0]
        ]
}

book = get_book(bookdict=test)
book.save_as("test.xlsx")

I receive:

Traceback (most recent call last):
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\io.py", line 148, in store_data       
    writer.write(data)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\book.py", line 220, in write
    sheet_writer.write_array(incoming_dict[sheet_name])
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\sheet.py", line 174, in write_array   
    self.write_row(row)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_xlsx\xlsxw.py", line 29, in write_row    
    self._native_sheet.append(array)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_write_only.py", line 113, in 
append
    self._get_writer()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_write_only.py", line 82, in _get_writer
    self._writer.write_top()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_writer.py", line 98, in write_top
    self.write_properties()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_writer.py", line 60, in write_properties
    self.xf.send(props.to_tree())
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_writer.py", line 294, in get_stream
    xf.write(el)
  File "src\lxml\serializer.pxi", line 1652, in lxml.etree._IncrementalFileWriter.write
TypeError: got invalid input value of type <class 'xml.etree.ElementTree.Element'>, expected string or Element

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel\internal\meta.py", line 327, in save_as  
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel\internal\core.py", line 56, in save_book 
    return _save_any(a_source, book)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel\internal\core.py", line 60, in _save_any 
    a_source.write_data(instance)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel\plugins\sources\file_output.py", line 44, in write_data
    self._file_name, book, **self._keywords
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel\plugins\renderers\excel.py", line 31, in 
render_book_to_file
    save_data(file_name, book.to_dict(), **keywords)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\io.py", line 131, in save_data        
    **keywords
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\io.py", line 148, in store_data       
    writer.write(data)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_io\book.py", line 57, in __exit__        
    self.close()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pyexcel_xlsx\xlsxw.py", line 55, in close        
    self._native_book.save(filename=self._file_alike_object)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\workbook\workbook.py", line 408, in save
    save_workbook(self, filename)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\writer\excel.py", line 293, in save_workbook
    writer.save()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\writer\excel.py", line 275, in save     
    self.write_data()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\writer\excel.py", line 75, in write_data
    self._write_worksheets()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\writer\excel.py", line 215, in _write_worksheets
    self.write_worksheet(ws)
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\writer\excel.py", line 196, in write_worksheet
    ws.close()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_write_only.py", line 92, in close
    self._writer.write_rows()
  File "C:\Users\kerst\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\worksheet\_writer.py", line 120, in write_rows
    xf = self.xf.send(True)
StopIteration

It would be very important for us to fix this issue. Thanks a lot and best wishes

'ReadOnlyWorksheet' object has no attribute 'merged_cells'

Opening xlsx file with rs = pyexcel.iget_book(file_name='./some_file.xlsx', skip_hidden_row_and_column=False, detect_merged_cells=True) causes an error 'ReadOnlyWorksheet' object has no attribute 'merged_cells'.
May be at

if self.skip_hidden_row_and_column:
should be

if self.skip_hidden_row_and_column or self.detect_merged_cells:

not

if self.skip_hidden_row_and_column:

?
Thank you for the excellent library, by the way :-)

Indexerror when loading Excel sheet

I process cost reports every month, but something seems to have changed in the generated Excel sheets, hence:

Traceback (most recent call last):
  File "check-duplicates.py", line 13, in <module>
    sheets = get_data(sys.argv[1])
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_xlsx\__init__.py", line 41, in get_data
    return read_data(afile, file_type=file_type, **keywords)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_io\io.py", line 73, in get_data
    afile, file_type=file_type, streaming=False, **keywords
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_io\io.py", line 91, in _get_data
    return load_data(**keywords)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_io\io.py", line 204, in load_data
    reader.open(file_name, **keywords)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_xlsx\xlsxr.py", line 145, in open
    self._load_the_excel_file(file_name)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyexcel_xlsx\xlsxr.py", line 209, in _load_the_excel_file
    read_only=read_only_flag,
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\reader\excel.py", line 196, in load_workbook
    apply_stylesheet(archive, wb) # bind styles to workbook
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\stylesheet.py", line 180, in apply_stylesheet
    stylesheet = Stylesheet.from_tree(node)
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\stylesheet.py", line 102, in from_tree
    return super(Stylesheet, cls).from_tree(node)
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 100, in from_tree
    return cls(**attrib)
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\stylesheet.py", line 93, in __init__
    self.named_styles = self._merge_named_styles()
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\stylesheet.py", line 113, in _merge_named_styles
    self._expand_named_style(style)
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\stylesheet.py", line 123, in _expand_named_style
    xf = self.cellStyleXfs[named_style.xfId]
  File "C:\ProgramData\Miniconda3\lib\site-packages\openpyxl\styles\cell_style.py", line 186, in __getitem__
    return self.xf[idx]
IndexError: list index out of range

I don't know what could be the reason, maybe you can understand better.

sheet order

Excellent tool, I like it very much.

And I saw the result of get_data() is OrderedDict, but the order doesn't exactly match the sheet order in excel. Is it a bug? Thank you.

Issues with Pyinstaller

I am having issues getting PyInstaller to load in the plugins for pyexcel (specifically pyexcel-xlsx). I have looked at both pyexcel and pyexcel-io pages that explain what I need to add as the --hidden-import. PyInstaller does successfully create an executable, but no matter what I put as my --hidden-import I end up with the error pyexcel_io.exceptions.SupportingPluginAvailableButNotInstalled: Please install one of these plugins for read data in 'xlsx': pyexcel-xls,pyexcel-xlsx. I found a question asking about this same problem on stackoverflow and they answered their own question by just importing the plugins directly. This worked for me as well and I just added import pyexcel_xlsx to my python file. I was wondering if their is a bug here as I noticed this has been asked before across the repositories, but people seemed to have figured it out, so maybe I am missing something.
Thanks for the help.

Support for images

Support for images within a sheet.

example: cell D3 contains an image, when requesting the value of this cell then one of the following should be returned:

  • image object
  • or reference to image that could be used to open image with other python library

string argument expected, got 'bytes' error when using StringIO

The below code does not work on Python 3.9.6. Returns following error. Changing to BytesIO solved the issue. Perhaps docs should be updated.

from pyexcel_xlsx import save_data  
data = OrderedDict()  
data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})  
data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})  
io = StringIO()  
save_data(io, data) 

"string argument expected, got 'bytes'"

"Server error: Traceback (most recent call last):\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\flask\\app.py\", line 1516, in full_dispatch_request\n rv = self.dispatch_request()\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\flask\\app.py\", line 1502, in dispatch_request\n return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)\n File \"C:\\cvd-prevent\\cvd-prevent-api\\cvd_prevent_api\\indicator_route.py\", line 366, in get_indicator_meta_data_excel\n save_data(io, data)\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\pyexcel_xlsx\\__init__.py\", line 39, in save_data\n write_data(afile, data, file_type=file_type, **keywords)\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\pyexcel_io\\io.py\", line 143, in save_data\n writer.write(to_store)\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\pyexcel_io\\writer.py\", line 51, in __exit__\n self.close()\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\pyexcel_io\\writer.py\", line 45, in close\n self.writer.close()\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\pyexcel_xlsx\\xlsxw.py\", line 50, in close\n self._native_book.save(filename=self._file_alike_object)\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\openpyxl\\workbook\\workbook.py\", line 407, in save\n save_workbook(self, filename)\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\openpyxl\\writer\\excel.py\", line 293, in save_workbook\n writer.save()\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\openpyxl\\writer\\excel.py\", line 275, in save\n self.write_data()\n File \"c:\\cvd-prevent\\cvd-prevent-api\\venv\\lib\\site-packages\\openpyxl\\writer\\excel.py\", line 67, in write_data\n archive.writestr(ARC_APP, tostring(props.to_tree()))\n File \"C:\\Program Files\\Python39\\lib\\zipfile.py\", line 1802, in writestr\n with self.open(zinfo, mode='w') as dest:\n File \"C:\\Program Files\\Python39\\lib\\zipfile.py\", line 1505, in open\n return self._open_to_write(zinfo, force_zip64=force_zip64)\n File \"C:\\Program Files\\Python39\\lib\\zipfile.py\", line 1600, in _open_to_write\n self.fp.write(zinfo.FileHeader(zip64))\nTypeError: string argument expected, got 'bytes'\n"

Python 2.7.18:No handlers could be found for logger "lml.utils"

C:\Python27\Scripts>pip2 install pyexcel pyexcel-xls pyexcel-xlsx xlrd xlwt openpyxl
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Requirement already satisfied: pyexcel in c:\python27\lib\site-packages (0.5.15)
Requirement already satisfied: pyexcel-xls in c:\python27\lib\site-packages (0.6.0)
Requirement already satisfied: pyexcel-xlsx in c:\python27\lib\site-packages (0.5.8)
Requirement already satisfied: xlrd in c:\python27\lib\site-packages (1.2.0)
Requirement already satisfied: xlwt in c:\python27\lib\site-packages (1.3.0)
Requirement already satisfied: openpyxl in c:\python27\lib\site-packages (2.6.4)
Requirement already satisfied: texttable>=0.8.2 in c:\python27\lib\site-packages (from pyexcel) (1.6.3)
Requirement already satisfied: pyexcel-io>=0.5.19 in c:\python27\lib\site-packages (from pyexcel) (0.5.20)
Requirement already satisfied: lml>=0.0.4 in c:\python27\lib\site-packages (from pyexcel) (0.0.9)
Requirement already satisfied: jdcal in c:\python27\lib\site-packages (from openpyxl) (1.4.1)
Requirement already satisfied: et_xmlfile in c:\python27\lib\site-packages (from openpyxl) (1.0.1)

import pyexcel
No handlers could be found for logger "lml.utils"

Possible incomplete hidden columns detection

If we have multiple hidden columns in a row only first column letter will be stored in column_dimensions by openpyxl, and pyexcel checks the hidden column state by it's letter, thus skipping the subsequent columns:

if self._native_sheet.column_dimensions[letter].hidden is False:

Shouldn't they also be checked for hidden state with something like:

    def is_column_hidden(self, column_index):
        letter = openpyxl.utils.get_column_letter(column_index)
        for l, d in self._native_sheet.column_dimensions.items():
            if l > letter:
                break
            if d.hidden is True and (l == letter or (column_index >= d.min and column_index <= d.max)):
                return True

        return False

?
Then

if self._native_sheet.column_dimensions[letter].hidden is False:

can be replaced with

if not self.is_column_hidden(column_index):

get_data() mem leak

if col active at XFD get_data() mem leak

my sheet i use field at col E. but field active at XFD in cell is null.

i use get_data() and view task manager is function run mem push many and leak.

solution
if i del F - XFD is Work!

Packaging with PyInstaller not work even following the doc

I add below command in PyInstaller:

--hidden-import pyexcel_xlsx
--hidden-import pyexcel_xlsx.xlsxr
--hidden-import pyexcel_xlsx.xlsxw

Here is the PyInstaller output:

3446 INFO: Analyzing hidden import 'pyexcel_xlsx'
3599 INFO: Analyzing hidden import 'pyexcel_xlsx.xlsxr'
5194 INFO: Analyzing hidden import 'pyexcel_xlsx.xlsxw'

But when I run, I still get
Please install one of these plugins for read data in 'xlsx': pyexcel-xls,pyexcel-xlsx

Mac OS
Python3.6.4
pyexcel (0.5.13)
pyexcel-io (0.5.16)
pyexcel-webio (0.1.4)
pyexcel-xlsx (0.5.7)
PyInstaller (3.4)

ImportError: No module named 'texttable'

Hey I want to merge multiple csv files into one excel file.
I used this code.

from pyexcel.cookbook import merge_all_to_a_book
import pyexcel.ext.xlsx # needed to support xlsx format, pip install pyexcel-xlsx
import glob

merge_all_to_a_book(glob.glob("*.csv"), "output.xlsx")


And I am getting this error

from pyexcel.cookbook import merge_all_to_a_book

File "C:\Python34\lib\site-packages\pyexcel__init__.py", line 12, in
from .book import Book
File "C:\Python34\lib\site-packages\pyexcel\book.py", line 11, in
from .sheets import Sheet, SheetStream
File "C:\Python34\lib\site-packages\pyexcel\sheets__init__.py", line 10, in
from .sheet import (
File "C:\Python34\lib\site-packages\pyexcel\sheets\sheet.py", line 11, in
from .nominablesheet import NominableSheet
File "C:\Python34\lib\site-packages\pyexcel\sheets\nominablesheet.py", line 10, in
from texttable import Texttable
ImportError: No module named 'texttable'

Python 2.7 Support Issue

$ pip install pyexcel-xlsx

Downloading/unpacking pyexcel-xlsx
Downloading pyexcel-xlsx-0.2.1.zip
Running setup.py (path:/tmp/pip_build_root/pyexcel-xlsx/setup.py) egg_info for package pyexcel-xlsx

Downloading/unpacking openpyxl>=2.2.2 (from pyexcel-xlsx)
Downloading openpyxl-2.3.5.tar.gz (141kB): 141kB downloaded
Running setup.py (path:/tmp/pip_build_root/openpyxl/setup.py) egg_info for package openpyxl

no previously-included directories found matching 'openpyxl/tests'
no previously-included directories found matching 'openpyxl/sample'
no previously-included directories found matching 'openpyxl/benchmarks'
no previously-included directories found matching 'openpyxl/develop'
warning: no previously-included files matching 'test_*.py' found under directory 'openpyxl'
warning: no previously-included files matching 'tests/*.py' found under directory 'openpyxl'

Downloading/unpacking pyexcel-io>=0.1.0 (from pyexcel-xlsx)
Downloading pyexcel-io-0.2.1.zip
Running setup.py (path:/tmp/pip_build_root/pyexcel-io/setup.py) egg_info for package pyexcel-io
error in pyexcel-io setup command: Invalid environment marker: python_version<"2.7"
Complete output from command python setup.py egg_info:
error in pyexcel-io setup command: Invalid environment marker: python_version<"2.7"


file descriptors are not closed

For example, when running the following code on Linux:

import os

import pyexcel

pyexcel.get_book_dict(file_name='test.xlsx')

fd_dir = '/proc/%u/fd' % os.getpid()
for fd_name in os.listdir(fd_dir):
    print(fd_name, '-> ', end='')
    try:
        print(os.readlink('%s/%s' % (fd_dir, fd_name)))
    except FileNotFoundError:
        print()

The last file descriptor open point to test.xlsx.

Beside the file descriptor leak, this is really problematic in Windows, as it make it impossible to concurrently modify the spreadsheet in Office when it has been read in another (still running) application.

mechanism of dynamic load module, confilict with pyinstaller/cx_freeze's execution

Hi, I‘ve test an pyqt GUI app with pyexcel_xlsx, my application is finally packaged as an exectuion file by pyinstaller OR cx_freeze,

pyinstaller.exe  --distpath dist_pyinstaller  -F mainc.py 

and get dist_pyinstaller\mainc.exe binary file.
the application can start well. but will fail when it's time to use pyexcel_xlsx functionility.

Traceback (most recent call last):
  File "pyexcel2csv_gui\winmain\winmain.py", line 72, in  on_btnSelectExcelFileAndExport2Csv_clicked
  File "site-packages\pyexcel_xlsx\__init__.py", line 38, in get_data
  File "site-packages\pyexcel_io\io.py", line 65, in get_data
  File "site-packages\pyexcel_io\io.py", line 83, in _get_data
  File "site-packages\pyexcel_io\io.py", line 173, in load_data
  File "site-packages\pyexcel_io\plugins.py", line 78, in get_a_plugin
  File "site-packages\lml\plugin.py", line 275, in load_me_now
  File "site-packages\lml\plugin.py", line 303, in dynamic_load_library
  File "site-packages\lml\utils.py", line 60, in do_import_class
ImportError: No module named xlsxr

anlyse the source code

def do_import_class(plugin_class):
    """dynamically import a class"""
    try:
        plugin_module_name = plugin_class.rsplit('.', 1)[0]
        #print plugin_module_name,   plugin_class 
        #shows "pyexcel_xlsx.xlsxr"    , shows  "pyexcel_xlsx.xlsxr.XLSXBook"
        plugin_module = __import__(plugin_module_name)
        modules = plugin_class.split('.')
        for module in modules[1:]:
            plugin_module = getattr(plugin_module, module)
        return plugin_module
    except ImportError:
        log.exception("Failed to import %s", plugin_module_name)
        raise

pyexcel_xlsx has been installed by pip. and

site-packages\pyexcel-xlsx\xlsxr.py

file exists.

if directory call by CLI way

python.exe  mainc.py

to run same mainc.py script, ALL works ok.
ENV is python 2.7.9.

so , seems the dynamic loading not works well in pyinstaller OR cx_freeze situation ?

Detect an empty cell

I have imported my .xlsx file.
The empty cells are imported as 0.
Is it possible to detect these empty cells?

Thanks

Compatibility issue with openpyxl 3.1.0 - read_sheet - TypeError: 'set' object is not subscriptable

Recent update of openpyxl to 3.1.0 breaks read_sheet.

Below is the traceback

Traceback (most recent call last):
  File "/code/base/tests/tests.py", line 3318, in test_sheet
    sheet = pe.get_sheet(
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel/core.py", line 37, in get_sheet
    named_content = sources.get_sheet_stream(**keywords)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel/internal/core.py", line 21, in get_sheet_stream
    sheets = a_source.get_data()
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel/plugins/sources/memory_input.py", line 40, in get_data
    sheets = self.__parser.parse_file_content(
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel/plugins/parsers/excel.py", line 27, in parse_file_content
    return self._parse_any(
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel/plugins/parsers/excel.py", line 40, in _parse_any
    sheets = get_data(anything, file_type=file_type, **keywords)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_io/io.py", line 86, in get_data
    data, _ = _get_data(
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_io/io.py", line 105, in _get_data
    return load_data(**keywords)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_io/io.py", line 205, in load_data
    result = reader.read_all()
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_io/reader.py", line 95, in read_all
    content_dict = self.read_sheet_by_index(sheet_index)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_io/reader.py", line 84, in read_sheet_by_index
    sheet_reader = self.reader.read_sheet(sheet_index)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_xlsx/xlsxr.py", line 148, in read_sheet
    sheet = SlowSheet(native_sheet, **self.keywords)
  File "/opt/project/.venv/lib/python3.10/site-packages/pyexcel_xlsx/xlsxr.py", line 72, in __init__
    for ranges in sheet.merged_cells.ranges[:]:

TypeError: 'set' object is not subscriptable

Pinning version of openpyxl to 3.0.10 is a temporary workaround.

openpyxl leaks file handle in pypy

$ python test.py
number of open file handles: 0
number of open file handles: 2
[popenfile(path='/Users/github/pyexcel-xlsx/test.xlsx', fd=5), popenfile(path='/Users/github/pyexcel-xlsx/test.xlsx', fd=6)]
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    assert len(open_files_l1) == len(open_files_l2)
AssertionError
$ python
Python 2.7.13 (1aa2d8e03cdf, Mar 31 2017, 10:20:53)
[PyPy 5.7.1 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

reference: #14

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.