unoconv is a command line tool to convert any document format that LibreOffice can import to any document format that LibreOffice can export. It makes use of the LibreOffice’s UNO bindings for non-interactive conversion of documents.
For practical reasons we mention LibreOffice, but OpenOffice is supported by unoconv as well.
unoconv starts its own office instance (if it cannot find an existing listener) that it then uses. There are some challenges to do this correctly, but in general this works fine.
Typically you would convert an ODT document to PDF by running:
unoconv -f pdf some-file.odt
However, you can always start an instance yourself at the default port 2002 (or specify another port with -p/--port) and then use unoconv until you’re finished using it and then stop it.
unoconv --listener & sleep 20 unoconv -f pdf *.odt unoconv -f doc *.odt unoconv -f html *.odt kill -15 %-
It is also possible to use a listener or LibreOffice instance that accepts connections on another system and use it from unoconv remotely. This way the conversion tasks are performed on a dedicated system instead of on the client system.
Since there is no feedback mechanism to see if the newly started ooffice is in fact ready to communicate, we don’t know exactly when we can use the UNO api. By default unoconv waits for 3 seconds, but you can specify another delay with -T/--timeout. If ooffice is not ready, you may get a random error.
Also beware that the pyuno python module needs to be compiled with the exact same version of python that you are using to load it. A lot of people that run into problems loading pyuno are actually using a precompiled LibreOffice that they downloaded somewhere and is incompatible with their python version. (Often the LibreOffice comes with its own python build that works flawlessly with their own pyuno)
The most recent unoconv works around this issue by automatically detecting incompatibilities, and restarting itself using a compatible python (the one that ships with LibreOffice).
You can influence the automatic detection by setting the UNO_PATH environment variable to point to an alternative LibreOffice installation, e.g.:
UNO_PATH=/opt/libreoffice3.4 unoconv some-file.odt
Since OpenOffice 2.3 you do not need an X display for starting ooffice. However you may need the openoffice.org-headless package from your distribution. Since LibreOffice 2.4 nothing special is needed, running in headless mode does not require X.
For any older OpenOffice releases, remember that ooffice requires an X display, even when using it in headless mode. One solution is to use Xvfb to create a headless X display for ooffice.
Some people have had difficulties using unoconv through webservices. Here is a list of probable causes:
-
X display issues (may require the headless subpackage)
-
Permission issues (unable to run or connect to LibreOffice or read/write)
-
SELinux issues (SELinux prevents you from connecting to LibreOffice)
If you encounter problems converting files, it often helps to try again. If you are using a listener, restarting the listener may help as well.
The reason for conversion failures are unclear, and they are not deterministic. unoconv is not the only project to have noticed problems with import and export filters using PyUNO. We assume these are related to internal state or timing issues that under certain conditions fail to correctly work.
We are looking into this with the LibreOffice developers to:
-
Collaborate closer to find, report and fix unexpected failures
-
Allow end-users to increase debugging and reporting to the project
Other tools that are useful or similar in operation:
-
Text based document generation: http://www.methods.co.nz/asciidoc/
-
DocBook to OpenDocument XSLT: http://open.comsultia.com/docbook2odf/
-
Simple (and stupid) converter from OpenDocument Text to plain text: http://stosberg.net/odt2txt/
-
Another python tool to aid in converting files using UNO: http://www.artofsolving.com/files/DocumentConverter.py http://www.artofsolving.com/opensource/pyodconverter