shuchkin / simplexlsx Goto Github PK
View Code? Open in Web Editor NEWParse and retrieve data from Excel XLSx files
License: MIT License
Parse and retrieve data from Excel XLSx files
License: MIT License
Hi!
There is a "lost" echo command in the function "rows".
Thanks!
Great library! I've noticed however if you edit or create a spreadsheet using Apple Numbers and save as an xlsx it doesn't appear to be able to read the spreadsheet. For whatever reason it causes a fatal error in PHP exhausting available memory (set to 1024MB in PHP which is huge!).
I've tested this and can reproduce the issue. I'm guessing that it's Numbers not converting the binary data correctly and causing the spreadsheet to be "unreadable" and hits an infinite loop.
Function returns all data in all sheets as if worksheetIndex not defined.
PHP7
tabs2spaces
remove file operations from __constructor
update: parseFile & parseData helpers ( added 0.8.13 )
updated: + Namespaces
updated: + xlsx comment reader (example below )
hey, make suggestions
Hi,
I have an xlsx file that is not able to open using your lib. If I open this file on excel and save it again, something is fixed internally and your library can open the file.
Can I send you this file to you take a look?
Thank you,
Inácio.
Hi,
I want to read some data from excel files (xlsx), analyze and insert into database on a WordPress site.
I don't know how to include SimpleXlsx without use composer ?
Hope give me some advises.
Thanks !
if numbers in rows more than 15 number its will be change to E number:like this 12345678910E+10
dont return full number
Array
(
[0] => Array
(
[0] => 3.5946708082419E+14
[1] => 111111
[2] =>
)
[1] => Array
(
[0] => 3.5946308078233E+14
[1] => 111112
[2] =>
)
[2] => Array
(
[0] => 3.5583108883883E+14
[1] => 111116
[2] =>
)
Parsing error found: value of data[1][0] must be at data[2][0]. Fix it, please.
Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 528384 bytes) in C:\xampp\htdocs\redes3\SimpleXLSX.php on line 691
Hi,
I tried to import 20000 rows by excels to mysql database but there are some issues during import :
A PHP Error was encountered
Severity: Warning
Message: gmdate() expects parameter 2 to be integer, float given
Filename: controllers/simplexlsx.class.php
Line Number: 725
So after a bit of debugging.
I have determined that the sheetNames()
function returns ids and names. However using those ids to get the rows can return the incorrect sheet. if the spreadsheet has the sheets reordered. Asking for rows is not grabbing the matching sheet id but the sheet in the order of how they are sorted.
So if I made
Sheet 1
Sheet 2
Sheet 3
Then I reordered them before saving to
Sheet 2
Sheet 3
Sheet 1
Asking for sheet names with SimpleXLSX would return
2 => Sheet 2
3 => Sheet 3
1 => Sheet 1
But when I asked for ->rows(1)
which is the id for Sheet 1
I am getting back values from Sheet 2 since it is the first in the array.
First I want to give thanks for such a wonderful tool.
I would like to make a small comment about the installation of the library by means of composer.
composer require shuchkin/simplexlsx
The previous line does not work properly, for it to work it must be installed as follows:
composer require shuchkin/simplexlsx:dev-master
Is it possible to fix that?
Thank you for your time and dedication.
Hi,
I guess I found an issue about the lib. I'm reading .xlsx file with a beginning line given (here its 15) because there isn't data before line 15. And when i'm reading the file with the lib, the first data read (which is on line 15 when I open the file manually) are detected on line 12 when debugging.
I can reproduce the issue. This is when you insert lines (with Excel, right-click on the index column at left and Insert).
I can add tons of lines before, the data will always be read at line 12.
Is there an issue to this issue ? Or if not, can you fix it ?
Thanks & thanks for this weightless lib !
I have downloaded a report from google analytics and it seems to be having issues reading the sheet names. It only returns the last sheet name and assigns the index as 0.
There are three sheets and I am only able to read the first one.
If I delete the first sheet, it will read the other to with ->rows(1) and ->rows(2) but still will not read the sheet names properly.
I have attached the file as it was downloaded.
I had previously uploaded an edited version of the spreadsheet. This is the unedited version.
hi
this lib have an error 'to red
i have two file and i attach them
one of them which have a same rows, its OK
but one of them, 3 rows in the first have not correct , its not OK and when i read this file and wana to show or save in database dos not work
i attach two files
this is correct.xlsx
this is not correct.xlsx
thanks
I had to run :
$filtered = array_filter($r);
if (!empty($filtered)) {
Just to remove the empty rows, there is a way to get this inside the dimension function as part of the core?
Is there a minimum PHP version? I can't seem to get it working, even using the examples.
Hi. Just wondering if I can use this in a commercial project? But I found no license information. Is it LGPL or similar or just free? Best if this would be added to the source-code (some header comment information).
dimension() does not report correct size for all files from this source (Neustar Port PS).
Example file provided.
Hi, Your library is excelent!. Congratulations to you. Works grate.
I found this issue:
When I try to load an xlsx file saved with libreoffice, got an error,
so, I edit the worksheet function
// sheets numeration: 1,2,3....
public function rows( $worksheet_id = 1 ) {
//AND ADD THIS LINES, BEACOUSE; $this->sheets[$index] $index starts on 2
and $this->sheets[1] was not defined.
//fix libreoffice error
if(!isset($this->sheets[1]) && isset($this->sheets[2])){
$this->sheets[1] = $this->sheets[2];
}
The libreoffice I'm using:
Version: 4.3.3.2
Build ID: 430m0(Build:2)
and save file as Microsoft Excel 2007/2010/2013 (xlsx)
Excelente trabajo!
Saludos!
You should create a composer package (on packagist).
The package created by @Jakeroid is not updated (synced) with your version.
Also, better versioning in github, with releases.
Just suggestions. Nice work, by the way!!
Hi,
I use your class to read an xlsx file. I can read all sheets, but the sheetName() gives no name of the sheet return. I use this $_xlsx->sheetName(1); to read the name
Any idea where my problem is?
Andreas
Hello, I am having this problem, when wanting to read data of percentages, is returned to me divided by 100, for example if I have a cell with 17.34% I am receiving 0.17 as data read.
How could I get the real value of the cell? when it is of type percentage.
Is not a solution the multiplication by 100 but, because some extra data is missed, for exemple de float data.
Thanks.
Is there a way to count valid columns or I need to know the total columns at the beginning?
First row is the header in my excel... so I need to increment a counter until first row and $index_columns in empty? Or is there a better way?
Sergey, great tool.
Have this strange issue with it on a XLSX file that I am getting from a 3rd party by email, nightly.
I have written PHP wrapper around your tool, that gets the email from an IMAP account, then downloads the attachment and uses your tool to extract the data.
Two of the fields use a Custom Number Format ###0, and the output for these fields is in a Date/time format.
[GHIN Number] => 8231-10-07 00:00:00
[Local Number] => 1900-01-11 00:00:00
[Prefix] =>
[First Name] => David
The number should be:
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[r] => A19
[s] => 9
)
[v] => 2312631.0
)
Unfortunately I don't have any access to the formatting at the source.
I have attached the file but are worried it won't be very useful, since when I save it on my PC and send it to myself and run the PHP it works just fine.
So I started digging around a bit.
I found some of you debug echo and turned them on. Found this
[numFmt] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[numFmtId] => 50
[formatCode] => ####0
)
)
And of course numFmtId =50 looks like it is a date time format. Fortunately this gave me a work around for now. Just edited the id 50 to '0';
If you can't duplicate this on your end would be happy to help more, just push me in the right direction.
I have a XLSX that have some cell with accented chars, like:
Città
When I retrieve cell with SimpleXLSX and print result with var_dump I get wrong text:
"cittã " //<---- wrong ã with final space
I've already insert this at the beginning of the php page without result:
header('Content-Type: text/html; charset=utf-8');
I have an XLSX file I need to convert to CSV via the code below, however, at times this file contains chinese, japanese and korean characters which get lost in the conversion process.
How can I convert to CSV but keep the UTF8 characters intact?
`
// convert from XLSX to CSV
print ("XLSX file found, starting conversion to CSV
");
include '../../plugins/simplexlsx/simplexlsx.class.php';
$file_csv = $newfilename.'.csv'; // imported filename + .csv
$xlsx = new SimpleXLSX('./attachments/temp/'.$newfilename);
// convert file and move to property attachment folder
$fp = fopen('./attachments/'.$hotel_code.'/'.$file_csv, 'w');
foreach( $xlsx->rows() as $fields ) {
fputcsv( $fp, $fields);
}
fclose($fp);
print ("XLSX file has been converted to CSV and moved to property attachment folder.
");`
Hi, SimpleXLSX::parse
with "skip_empty_rows = true" return arrays with empty values.
Hey!
First of all thank you very much for the great work.
I have installed the package through composer and it is stating that v.0.8.8 was installed,
but the SimpleXLSX class has still version v.0.8.7...
composer show shuchkin/simplexlsx states:
You can also reproduce this also when you download the .ZIP for release 0.8.8. The .ZIP also contains SimpleXLSX 0.8.7.
Best regards
Hi Sergey,
First of all, I love this little piece of software that you created. The other day I was looking for a PHP solution to read Excel files and found yours, easy to use, and it just works.
I'd love to do some further work on it with you if you agree. I've got a few potential improvement ideas in mind. No major refactoring for now, but some modernisation if you will. Namely:
What do you say, am I okay to proceed with these? I am thinking of doing the updates in relatively smaller chunks and commits, so you can review, merge, or reject them more easily.
Or if you have a different method, I'm up for it. Let me know what you think.
Hi , very nice library. A small fix is needed in example 03-sheets.php : the comments say "output worksheet 1" and "output worksheet 2" but in fact it outputs the worksheets 2 and 3, with the titles of the sheets 1 and 2.
I get after use the Version 0.7.4
the following output:
xlsx error: xl/../customXml/item1.xml not found!
I did not understand this. Is that the message it can not found the file?
hoping you can help me figure something out for a client
they have a column in xlsx file that contains both date and string but in excel the column is formatted as date so appears when open in excel as
6 weeks
8/20/2019
12 weeks
8 weeks
when i import using
$xlsx = new SimpleXLSX('QuoteUpdate.xlsx');
list($num_cols, $num_rows) = $xlsx->dimension();
$f = 0;
foreach ($xlsx->rows() as $r) {...
else {
if ($i == 10) {
$date_due = $r[$i];
} }
it changes the text ( 6 weeks) into date format (2019-1-24) i tried $date_due2=(string)($date_due); but still gives me the date ..thoughts. the file comes from an outside source so i cant have them change the formatting
Hi Sergey!
First of all, thanks for the great job with SimpleXLSX.
The PHP team released the new PHP 7.4.0 Alpha 1.
I was reading the "Backward Incompatible Changes" item, and I think there is a point of concern for SimpleXLSX.
Please access the link below:
https://github.com/php/php-src/blob/php-7.4.0alpha1/UPGRADING
The "1. Backward Incompatible Changes" -> "- Core:" ->
" . Passing the result of a (non-reference) list() assignment by reference is
consistently disallowed now. Previously this worked if the right hand side
was a simple (CV) variable and did not occur as part of the list()."
Do you think is there any impact in SimpleXLSX code?
Thanks!
Hi, a small question about time fields in the excel file.
In the file there are cells formatted as date (dd/mm/yyyy) and cells formatted as time (hh:mm).
Returning the date is no problem. Returning the time is not good. The class just returns '01/01/1970' which is a excel default date.
Is there a way to work around this issue ?
Thanks in advance.
Thanks for making this, definitely saved the day!
I don't know if it's possible, but it would be nice if you could make it parse the rows of the file incrementally, so you could cycle threw it line by line till it's empty instead of the giant parse at one time. Probably better for memory issues on potentially too large files
Thanks!
Try this file with your module. In my case it do not extract data and skip_empty_rows option not work. This file was created with Excell, but later was modified with LibreOffice. Problems appeared after modification. Source file was read correctly.
ФЕДЕРАЛЬНЫЕ ЕДИНИЧНЫЕ РАСЦЕНКИ НА КАПИТАЛЬНЫЙ РЕМОНТ ОБОРУДОВАНИЯ. Работы.xlsx
Hello Sergey,
SimpleXLSX is an impressive class - easy to use and very fast.
Benchmarking this class for large .xlsx reads (3MB file) against Box's Spout library - I found SimpleXLSX to be approximately 18 times faster (PHP7, Linux laptop).
Functionality that may be worth adding to the class is extracting column data.
Below is my attempt at this.
require('simplexlsx.class.php');
$xlsx = new SimpleXLSX('data.xlsx');
$sheet = $xlsx->rows(); # first sheet
unset($xlsx); # free object memory when continuing to process large files
$columns = extractColumns($sheet, [0, 1, 2, 5]); # variable number and selection of columns
var_dump($columns);
#####
function extractColumns(array &$aInput, array $aColumns) {
/**
* Extract multiple array columns.
* Martin Latter, 06/05/17
*
* @param array $aInput, input array
* @param array $aColumns, array of integers corresponding to column index/position
* @return indexed array
*/
$aCols = [];
foreach ($aColumns as $i) {
$s = '$c' . $i;
$aCols[$s] = $i;
}
return array_map(function($aRow) use($aCols) {
$aOutput = [];
foreach ($aCols as $c) {
$aOutput[] = $aRow[$c];
}
return $aOutput;
}, $aInput);
}
https://github.com/shuchkin/simplexlsx#xlsxparse-memory-data
Missing closing brace on that if statement.
Fixed version:
if ($xlsx = SimpleXLSX::parse($data, true)) {
print_r($xlsx->rows());
} else {
echo SimpleXLSX::parseError();
}
Even I set the sheet number some of xlsx files that have more than 25000 records give this error.
public function worksheet( $worksheet_id ) {
if ( isset( $this->sheets[ $worksheet_id ] ) ) {
$ws = $this->sheets[ $worksheet_id ];
if (isset($ws->hyperlinks)) {
$this->hyperlinks = array();
foreach( $ws->hyperlinks->hyperlink as $hyperlink ) {
$this->hyperlinks[ (string) $hyperlink['ref'] ] = (string) $hyperlink['display'];
}
}
return $ws;
} else {
**$this->error( 'Worksheet '.$worksheet_id.' not found.');**
return false;
}
}
I have download the new version 0.7.3, but it did not read the xlsx and give some errors:
if I use the old style:
$_xlsx = new SimpleXLSX('öbsv_results.xlsx');
if($_xlsx->success()){
}else{
echo 'xlsx error: '.$_xslx->error();
}
this error are shown:
Notice: Undefined variable: _xslx in C:\xampp\htdocs\vhost\support.a-timing.wien\importParaResults.php on line 111
Fatal error: Uncaught Error: Call to a member function error() on null in C:\xampp\htdocs\vhost\support.a-timing.wien\importParaResults.php:111 Stack trace: #0 {main} thrown in C:\xampp\htdocs\vhost\support.a-timing.wien\importParaResults.php on line 111
if I use the new one
$_xlsx = SimpleXLSX::parse('öbsv_results.xlsx');
print_r($_xlsx->rows(2));
Fatal error: Uncaught Error: Call to a member function rows() on boolean in C:\xampp\htdocs\vhost\support.a-timing.wien\importParaResults.php:28 Stack trace: #0 {main} thrown in C:\xampp\htdocs\vhost\support.a-timing.wien\importParaResults.php on line 28
did you have any idea where my problem is?
I have this issue with the import of an Excel 2016 Sheet again.
Warning: gmdate() expects parameter 2 to be integer, float given in ...\SimpleXLSX.php on line 868
Warning: gmdate() expects parameter 2 to be integer, float given in ...\SimpleXLSX.php on line 868
The warning comes two times. I'am not sure why this warning comes again.
use PHP 7.2 and simpleXLSX 0.8.3
Andreas
If you try to read a file with this Content-Type it does not work:
application/vnd.ms-excel
It´s a Excel create with MsExcel 97.
I have another issue with the import of an Excel 2016 Sheet. I get following warning:
Warning: gmdate() expects parameter 2 to be integer, float given in C:\xampp\htdocs\vhosts\tcup\lib\simplexlsx.class.php on line 418
I use the following code to import it:
$_xlsx = new SimpleXLSX("finswimming_Cup.xlsx");
if ($_xlsx->success()){
$_noSheets = $_xlsx->sheetsCount();
list($cols, $rows) = $_xlsx->dimension(10);
print"<pre>";print_r( $_xlsx->rows(10) );print "</pre>";
}
Great code, thanks a lot!
const SCHEMA_OFFICEDOCUMENT = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument';
const SCHEMA_RELATIONSHIP = 'http://schemas.openxmlformats.org/package/2006/relationships';
const SCHEMA_SHAREDSTRINGS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings';
const SCHEMA_WORKSHEETRELATION = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet';
The website used to get those constants values seems not to be working anymore.
Hi, I'm getting a strange 500 internal server error while using your library.
I have several files with the data from a system, all of them in the same format. Some of them open without any problem, others return error 500.
Can I send the files to you? Or you can just download for example here http://np04-slow-control.web.cern.ch/np04-slow-control/app/data/Monitoring_18_08_08_Merge.XLSX which is giving the error. The file Monitoring_18_08_10_Merge.XLSX is read without any problem.
Thanks in advance
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.