Giter Site home page Giter Site logo

jmcnamara / rust_xlsxwriter Goto Github PK

View Code? Open in Web Editor NEW
221.0 221.0 20.0 10.87 MB

A Rust library for creating Excel XLSX files.

Home Page: https://crates.io/crates/rust_xlsxwriter

License: Apache License 2.0

Rust 100.00%
libxlsxwriter rust xlsx xlsxwriter

rust_xlsxwriter's People

Contributors

adriandelgado avatar dodomorandi avatar fight-for-food avatar guillaumepressiat avatar jmcnamara avatar lucatrv avatar sebastian2010 avatar zjp-cn avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

rust_xlsxwriter's Issues

Don't depend on tempfile

Feature Request

As can be seen here, rust_xlsxwriter currently depends on the tempfile crate, which in turn means that a /tmp directory (or equivalent) needs to be available. This isn't the case in Kubernetes environments by default, which is how I noticed this. I guess wasm environments would also have trouble with this.

Wouldn't it make more sense to default to in-memory storage, and make tempfile an optional feature to the crate? Then with that feature enabled, one could opt in to the file-creating behavior using an associated function, e.g. Workbook::new_tempfile_backed().

Format a table column header

Feature Request

Using xlsxwriter, it's possible to use the following field to set header formatting:

// xlsxwriter
TableColumn {
    header_format: Option<xlsxwriter::Format>,
    // ...
}

It would be nice to be able to do something similar with this library, e.g.:

// rust_xlsxwriter (hypothetical)
TableColumn::new()
    .set_header_format(rust_xlsxwriter::Format)

Is it already possible to accomplish this in another way or is this something that will be implemented?

Best regards

feature request: Could serialize_chrono_naive_to_excel() accept Option<T>?

Feature Request

Hi!!!

Please see:
https://github.com/claudiofsr/read_xml/blob/master/src/excel.rs
https://github.com/claudiofsr/read_xml/blob/master/src/xml_structs/efinanceira.rs

I have:

#[derive(Debug, Default, Serialize, Deserialize, Clone)]
pub struct InfoEFinanceira {
   ...
    #[serde(rename = "Ano/Mês Caixa Reportado")]
    pub ano_mes_caixa: Option<NaiveDate>,
   ...
    #[serde(rename = "Balanço da Conta: Total de Débitos")]
    pub tot_debitos: f64,
}

I try use

#[serde(rename = "Ano/Mês Caixa Reportado", serialize_with = "serialize_chrono_naive_to_excel")]

but I have ano_mes_caixa: Option<NaiveDate>.

Could serialize_chrono_naive_to_excel() accept Option<T>?

Thanks.

question: Vec<CustomStruct> for headers and data?

Question

Is there a way to serialize a vec to a xlsx data table? I was looking through the documentation but it seems only to be possible from arrays and things that are iters, I imagine there should be a way to serialize a vec into a table but it's evading me.

feature request: manually flush contents to disk

Feature Request

Thanks for this amazing crate!

Can we have an API that allows user to manually save current memory data into disk? For two reasons:

  • Currently the xlsx file is saved in the end, if there is a XlsxError occurred in between, all the data that has been written into memory have lost, some times we also like to access the half-complete file even if the process is aborted.
  • For manually memory management, differs from constant memory in the roadmap, user can flush the data into a file at any place they think is suitable. For instance, flush once a worksheet is completed or number of rows have been reached.

pseudo code is like:

let xpath = "/xx/xxx.xlsx"
let mut xlsx = Workbook::new().set_save_path(xpath);
// create a worksheet and write data in memory
let sheet = xlsx.add_worksheet();
sheet.set_name("A");
sheet.write_number_only(0, 0, 0)?;
// ...
// flush memory data into disk
xlsx.flush_to_disk();
// maybe we can still get sheet A?
let sheet = xlsx.worksheet_from_name("A")?;
sheet.write_number_only(1, 0, 1)?;
// save all to disk
xlsx.save_to_path(xx)?;

Wonder if it's implementable in current architecture?

Ideas for perf. improvements

I spent the weekend benchmarking, refactoring and optimizing the crate without changing the public API and got improvements of around 10%.
However I didn't split the changes into several commits so in the course of the following week I'm going to send PRs that are easier to review. Below are some of the changes I made. You can give feedback on any of them.

Throughout several files

  • Change vec! to static arrays whenever possible
  • Use the Entry API whenever posible. This avoids the .get(key) -> .insert(key,value) pattern which needs to double check the map.
  • If a value's size is less than or equal to 8 bytes, it's cheaper to copy than to reference. So I removed those references.

xmlwriter.rs

  • Changed the parameter type of attributes from &Vec<(&str, String)> to &[(&str, &str)], this implied a LOT of changes thoughout the codebase and a bit of borrow checker fighting but it removed a lot of unnecessary allocations in the form of .to_string(), vec! and attributes.push().
  • Slight refactor of the escape_string function to use str::char_indices instead of .chars().enumerate().
  • Removed some unnecessary allocations inside the functions.

worksheet.rs

  • Changed Worksheet::table from a HashMap to a BTreeMap for noticeable performance improvements and needed almost no other changes to the rest of the codebase.
  • Using the Entry API inside Worksheet::col_to_name had noticeable perf. improvements.
  • Changed use of HashSet to .sort_unstable() -> .dedup() inside process_pagebreaks.
  • Refactor the Worksheet::get_cache_data function. This didn't have perf. implications but it reduced the heavy indentation and the lines of code without sacrificing clarity.
  • Refactor the Worksheet::write_data_table and Worksheet::calculate_spans. We don't need to index the BTreeMaps because they are ordered already. We can cheaply iterate. This had very noticeable perf. improvements.
  • Change Option<&String> to Option<&str> in the Worksheet::write_row function.
  • Removed some unnecesary allocations inside the write_*_cell functions.
  • Bribed the borrow checker to remove some of the .clone().iter() and just directly iterate.
  • Changed the CellType enum. Now it uses Box<str> instead of String. Before: 80 bytes, 8 alignment. After: 56 bytes 6 alignment. This improves cache locality and allows the compiler to better optimize its use. This had noticeable perf. improvements and needed almost no other changes to the rest of the codebase.

format.rs

  • Changed the output type of XlsxColor::chart_scheme from (String, u32, u32) to (&'static str, u32, u32)

filter.rs

  • Changed the output type of FilterCriteria::operator from String to &'static str

chart.rs

  • Added an as_str method that outputs an &'static str to ChartMarkerType, ChartAxisPosition, ChartGrouping, ChartLegendPosition, ChartLineDashType, ChartPatternFillType. Then I refactored the body of the ToString::to_string implementations to self.as_str().to_string(). I think it would be preferable to delete the ToString::to_string implementations but technically those are part of the public API.

Other ideas

Using the deflate-zlib feature flag has huge performance benefits at the cost of not being pure Rust.

Possible bug with sheet name apostrophes in ChartRange

Current behavior

"'" is only stripped from sheet name in ChartRange if both start and beginning are apostrophes

Expected behavior

Either ' in the beginning or the end of the sheet name in a Chart range would be stripped, since the excel standard requires them to be removed independently

Sample code to reproduce

use rust_xlsxwriter::{Workbook, XlsxError};

fn main() -> Result<(), XlsxError> {
    // This apostrophe would not be stripped since the code only strips if both end and beginning
    // of the chart name are apostrophes
    let chart = ChartRange::new_from_string("\'SheetName!A1:B2");
}

Environment

- rust_xlsxwriter version:
- rustc version:
- Excel version:
- OS:

Any other information

No response

WIP and RFC: Initial support for Serde serialization

I've added initial support for serialization of Serde structures to rust_xlsxwriter main under the serde feature flag.

The support works like this:

  1. Prepare a standard Serde #[derive(Serialize)] struct.
  2. Serialize the header to a location in the worksheet. The fields of the struct will become headers.
  3. Call worksheet.serialize() repeatedly to write data, without having to specify the row/col position.

For example:

use rust_xlsxwriter::{Format, Workbook, XlsxError};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Add a simple format for the headers.
    let format = Format::new().set_bold();

    // Create a serializable test struct.
    #[derive(Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: &'static str,
        cost: f64,
    }

    // Create some data instances.
    let item1 = Produce {
        fruit: "Peach",
        cost: 1.05,
    };
    let item2 = Produce {
        fruit: "Plum",
        cost: 0.15,
    };
    let item3 = Produce {
        fruit: "Pear",
        cost: 0.75,
    };

    // Set up the start location and headers of the data to be serialized using
    // any temporary or valid instance.
    worksheet.serialize_headers_with_format(0, 0, &item1, &format)?;

    // Serialize the data.
    worksheet.serialize(&item1)?;
    worksheet.serialize(&item2)?;
    worksheet.serialize(&item3)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Which would give this output:
worksheet_serialize

The output can be positioned anywhere in the worksheet. For example if we change this line in the previous example:

    worksheet.serialize_headers_with_format(1, 3, &item1, &format)?;

We get this output:

screenshot

It will also serialize vectors in the struct:

    // Create a serializable test struct with vector fields.
    #[derive(Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: Vec<&'static str>,
        cost: Vec<f64>,
    }

    // Create some data instances.
    let item = Produce {
        fruit: vec!["Peach", "Plum", "Pear"],
        cost: vec![1.05, 0.15, 0.75],
    };

    // Set up the start location and headers of the data to be serialized using
    // any temporary or valid instance.
    worksheet.serialize_headers_with_format(0, 0, &item, &format)?;

    // Serialize the data.
    worksheet.serialize(&item)?;

This gives the same output as the first example.

From the docs:

This method can be used, with some limitations, to serialize (i.e., convert automatically) structs that are serializable by Serde into cells on a worksheet.

The limitations are that the primary data type to be serialized must be a struct and its fields must be either primitive types (strings, chars, numbers, booleans) or vector/array types. Compound types such as enums, tuples or maps aren't supported. The reason for this is that the output data must fit in the 2D cell format of an Excel worksheet. (Note: this limitation is open for debate if it makes sense, see below.)

In order to serialize an instance of a data structure you must first define the fields/headers and worksheet location that the serialization will refer to. You can do this with the Worksheet::serialize_headers() or Worksheet::serialize_headers_with_format() methods. Any subsequent call to serialize() will write the serialized data below the headers and below any previously serialized data.

I am looking for feedback on the workability of this technique for any serialization use case that people may have. In particular I'd appreciate feedback from trying to make it work with existing serialize structures (within reason of what could be applied to a worksheet). Please leave comments below.

The code is on main. You will need to enable it in a local project with the following or similar:

cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git -F serde

There is some ongoing work to add support for ExcelDateTime and Chrono date/times. I will also be adding a method for adding formatting to each field value, and options to ignore fields apart from Serde #[serde(skip_serializing)], and also to reorder the fields. I may also change the error handling to just ignore unknown structs/fields.

rust_xlsxwriter Roadmap

rust_xlsxwriter is a Rust library that can be used to write text, numbers, dates and formulas to multiple worksheets in an Excel 2007+ XLSX file.

It is a port of the XlsxWriter Python module by the same author. I also actively maintain a C version libxlsxwriter and a Perl version Excel::Writer::XLSX. The Rust version will also try to address some limitations and frequently requested features of the previous versions, such as the separation of formatting and data writing.

The overall focus of rust_xlsxwriter is on performance, on testing, on documentation, and on fidelity with the file format created by Excel. Reading or modifying files is outside the scope of this library.

The major functionality will be added in approximately this order:

  • Charts - ongoing.
  • Tables
  • Conditional formatting
  • Data validation

Phase 1: Basic functionality

Phase 2: Medium functionality

  • Image support. Done: Image
  • Images in headers. Done: Headers and Footers
  • URLs in images.
  • Image/object positioning. Done
  • Embedded images. Done
  • Autofit - simulated: Done
  • Defined names. Done
  • Low memory mode. Example
  • In memory mode. Done: workbook.save_to_buffer()
  • Tempfile directory support. Example
  • Rich strings. Done
  • Dynamic functions . Done
  • Features/Optional dependencies.
  • Cell formatting separate from data writing
  • Border formatting for a range
  • Format merging

Phase 3: Larger functionality

  • Autofilters. Done
  • Charts. Example Under development. Charts have their own roadmap: #19
  • Tables. Done.
  • Conditional formatting. Done
  • Data Validation. Example
  • Cell comments. Example
  • Sparklines. Done.
  • Serde serialization support Done
  • Other XlsxWriter features

Feature Request: add `write_datetime_with_format`, while `write_datetime` should not receive a format

Feature Request

I set a column format with

worksheet.set_column_format(col, &rust_xlsxwriter::Format::new().set_num_format("dd/mm/yyyy"))?;

So IMHO I should be able to write dates to that column with:

worksheet.write_datetime(row, col, &date)?;

instead of having to repeat the same format with:

worksheet.write_datetime(row, col, &date, &rust_xlsxwriter::Format::new().set_num_format("dd/mm/yyyy"))?;

In other words, the plain write_datetime method should not receive a format, while a separate write_datetime_with_format method should be added for those cases where the date format has not already been applied to the cell.

Help wanted: ideas for improving the test run time

Help wanted

One of the goals of rust_xlsxwriter is fidelity with the xlsx file format generated by Excel. This is achieved using integration tests that take files created in Excel 2007 and compares them file by file and element by element with files created using rust_xlsxwriter.

Here is a typical test file, the associated xlsx file and the test runner code.

This approach has a number of advantages from an maintenance point of view:

  • It allows incremental test-drive development of Excel features.
  • It allows bug reports to be replicated quickly in Excel and compared with rust_xlsxwriter
  • It avoids subjective arguments about whether rust_xlsxwriter or some other third party Excel reading software is correct in its implementation/interpretation of the XLSX file specification since it uses Excel as the standard.

For the end user the benefits of having output files that are effectively identical to files produced by Excel means the maximum possible interoperability with Excel and applications that read XLSX files.

The test suite contains an individual test for each file (although there is sometimes more than one test against the same input file). Each of these tests in compiled into and run as a crate which means the test suite is slow. For usability reasons I don't want to test more than one xlsx file per file/crate (apart from maybe the grouping scheme outlined below).

There are currently ~540 test files and it takes 8+ minutes to run on a 3.2 GHz 6-Core Intel Core i7 with 32GB of fast RAM:

$ time cargo test

real	8m36.340s
user	30m34.062s
sys	9m0.802s

In the GitHub Actions CI this is currently taking around 18 minutes.

There will eventually be around 800 test files so the runtime will be ~50% longer.

nextest is bit faster but not significantly so. The timing also doesn't include the doc tests:

$ time cargo nextest run

real	7m45.029s
user	26m44.624s
sys	6m59.271s

A few months ago when the test suite took around 4 minutes I tried to consolidate the tests into one crate using the advice in this article on Delete Cargo Integration Tests. This was significantly faster by around 5-10x but didn't allow me to run individual tests (I'm 99% sure). I tried to replicate that again to redo the performance testing and verify the running of individual tests but failed for some reasons related to test refactoring since then.

For comparison the Python bytes test suite runs 1600 integration and unit tests in 18 seconds. The Perl test suite takes around 3 minutes and the C test suite takes 5 minutes.

Anyway to the help wanted: if anyone has any ideas how the test runtime might be improved or if you can get the above "Delete Cargo Integration Tests" approach to work again for comparison let me know. I might be able to come up with a hybrid approach where the tests under development or debug are in their own crates and moved back to an overall test crate/folder afterwards.

`worksheet.autofit()` is not aware of merged cells

Current behavior

Situation:

Merging multiple cells with merge_range to form a "title" area, setting string content and a format (e.g. a large font size).

Adding data columns, then calling autofit on the worksheet.

Result:

The string content of the multi-cell title is used to autofit the first column, although the string content spans multiple columns (merged range).

The increased font size is not taken into account.

The width of the first column is calculated

  • as if the title was placed only in the first column, instead of spanning multiple columns
  • as if the title was not formatted with a larger font size.

image

Note how the first column could be as narrow as "Manager" & "cargo" but is as wide as the title would be if it was written into the first cell in default font size.

Expected behavior

The expected behavior would be closer to the un-merged situation:

image

The first column is only calculated by the actual contents.

Note that I am using "actual" here not in a technical sense (since technically the content of a merged range is in fact written to the first cell) but in the sense of what a viewer perceives as the "visual reality".

Sample code to reproduce

// Write full project name with namespace into first row, merged across 4 columns. (big & bold)
        worksheet.set_row_height(0, 25)?;
        worksheet.merge_range(
            0,
            0,
            0,
            3,
            &self.project.name_with_namespace,
            &Format::new().set_bold().set_font_size(20),
        )?;

Environment

- rust_xlsxwriter version: 0.36.1
- rustc version: 1.69.0 (84c898d65 2023-04-16)
- Excel version: LibreOffice 7.5.2.2 (X86_64)
- OS: OpenSUSE Tumbleweed Linux

Any other information

No response

Looking for feedback on rust_xlsxwriter error handling

Request for Feedback

Currently the rust_xlsxwriter error generation and handling is a bit restrictive. For background the error conditions are listed in the XlsxError docs: https://docs.rs/rust_xlsxwriter/latest/rust_xlsxwriter/enum.XlsxError.html

There are 2 levels of error that can affect an Excel file:

  • Errors that would lead to a corrupt or broken file. Issues in that category would be merged ranges overlapping, which creates a corrupt-ish Excel file or File IO error or a ZipWriter errors. These would still need to be Results/Errors but could mainly be restricted to the save() methods.
  • Non-corrupting errors that could be ignored with some data loss and a warning. Issues in this category would be row/column errors for out of bounds writing. For example it is a relative common issue that users try to write past Excel's row limit of ~ 1M rows. In these case the out of bounds writes could be raised as a STDERR warning but a valid file would still be produced. This is how the Python/C/Perl versions of the library work. Python also has exception handling but that isn't available in Rust.

I am looking for feedback from people using the library on their experience with the APIs and error handling.

feature request: `autofit_height()`

Feature Request

autofit() is only concerned with cell width, but I have a use case with multiline text cells which I would like to display fully. Are you interested in having a method that adjusts height? I can take a shot at implementing it

Bug: Several clippy::pedantic warnings

Current behavior

Running cargo clippy -- -W clippy::pedantic shows hundreds of warnings, most of them are easily fixable.

Expected behavior

cargo clippy -- -W clippy::pedantic shouldn't give any results

Sample code to reproduce

cargo clippy -- -W clippy::pedantic

Environment

- rust_xlsxwriter version: 0.32.0
- rustc version: 1.68.2
- Excel version:
- OS: Linux (Pop os)

Any other information

I will send a PR in a couple of minutes.

Set the Worksheet name before adding it to the Workbook

Feature Request

It is possible something like that:

  let mut worksheet = Worksheet::new();
  worksheet.set_name("Name XYZ")?;

or

  let worksheet = Worksheet::new("Name XYZ");

And then:

  workbook.push_worksheet(worksheet);

I don not want to use:

  let _worksheet3 = workbook.add_worksheet().set_name("Data");

I would like to create multiple worksheets with previous names:

 let mut worksheets: Vec<Worksheet> = Vec::new();

 // Split a vector into smaller vectors of size N
 vector.chunks(SPLIT_NUMBER).enumerate.for_each(|(index, data)| {
    let name = format("Name {}", index);
    let worksheet = Worksheet::new(name);
    ...
    worksheets.push(worksheet);
 });

 ...

 worksheets.for_each(|worksheet| {
    workbook.push_worksheet(worksheet);
 }

Thanks!

feature request: Change `Rc` to `Arc`

Feature Request

In a previous PR (#27), I made the mistake to change the Strings inside CellType to Box<str> and Rc<str>. However, Rc is not Send. This means that if we write worksheets on different theads, we can't send them to a single thread to use Workbook::push_worksheet. This shouldn't change the performance characteristics of the crate.

WIP: Support for worksheet tables is in progress

I've started work on supporting worksheet tables.

Tables in Excel are a way of grouping a range of cells into a single entity that has common formatting or that can be referenced from formulas. Tables can have column headers, autofilters, total rows, column formulas and default formatting.

See the Python docs for details: https://xlsxwriter.readthedocs.io/working_with_tables.html

Here is an example of one of the tables generated by the example program below.

image

There is still work to do on parameter validation, documentation and some minor features. It also doesn't currently support cell formatting. That will have to wait for a subsequent release.

question: editing existing files planned?

Question

Hello!

I have got a question concerning both the roadmap (#1) and the README of this project. The README states that editing existing files is not supported, yet. The roadmap does not mention whether such a feature would be scheduled for future releases.

Thus, I was wondering whether editing existing files is planned or if such functionalities would be beyond this project's scope.

feature request: Can use multithreading?

Feature Request

First of all, I would like to thank the author for providing a very useful library.
Is it possible to speed up the export if using multithreading

feature request: impl IntoExcelData for bool

Feature Request

Hi and thanks for the crate!

I'm currently writing an sql export tool from MSSQL to xlsx.

I would love to be able to write bools directly with sheet.write(row, col, true).

Excel should have some kind of representation for logic, maybe just the text TRUE and FALSE, but potentially some better values or cell types.

feature request: Implement `IntoExcelData` for `serde_json::Value`

Feature Request

It would be pretty nice to be able to do sheet.write(serde_json::Value), pretty what needs to be done is just implement for all types of serde_json::Value. Should perhaps be put behind a feature flag.

Or even better, maybe IntoExcelData could even be implemented for anything that derives Serialize, or maybe a macro

More ideas to improve performance

Using a real world app (used in my job), I identified some functions that can be optimized. These problems don't show up in the current benchmark because it doesn't use these parts of the codebase.

*_key functions

image

A way of optimizing those functions cloud be to remove them and then I following the compiler errors. The Format struct can be refactored to group the "alignment" fields into a separate struct, same with "borders" and "font". Then we can implement the Hash trait. On those structs, including the Format struct, we can then directly use those structs as keys of HashMaps. The current system needs to format and allocate huge strings composed of other huge strings.

I did it all of this on my local fork but I know this is something that you would want to do personally because it needs changes across several files. It also required some manual deriving of Hash because of an f64. Some parts of the ordered-float crate can be vendored to solve that issue.

Worksheet::set_name

image

Constructing the regex takes a lot of time. This is a very quick fix, we can use lazy_static to "memoize" the regex. However, an even better change would be to remove the regex altogether. We can use something like:

let invalid_characters = br"[]:*?/\";
if name
    .as_bytes()
    .iter()
    .any(|ch| invalid_characters.contains(ch))
{
    // ...
}

Maybe invalid_characters should be a constant, that's an stylistic choice.

Support for custom paths

Feature Request

How do I set the save path?
c:\aa\cell_formats.xlsx
workbook.save("c:\aa\cell_formats.xlsx")?;

Announce: major API change in the next release

In the next release (0.9.0) there is a major change to the constructor and destructor methods.

The API will change from having the filename in the constructor like this:

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new("workbook.xlsx");

    _ = workbook.add_worksheet();

    workbook.close()?;

    Ok(())
}

To having a workbook::new() without filename and a workbook.save() with a filename instead of a workbook.close(). LIke this:

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    _ = workbook.add_worksheet();

    workbook.save("workbook.xlsx")?;

    Ok(())
}

There will also be workbook.save_to_buffer() and workbook.save_to_path() methods.

This simplifies the constructor/destructor sequence and introduces a save() option that was often requested in the Python version but which I never implemented.

This is quite an invasive change on previous versions and I apologies for the inconvenience, but there is still a relatively small number of active users so it is best to make this change now.

Proposal: generic write() method

It would be possible to have a generic write() method using a trait that could call the type specific write_*() method such as write_number(). This is a feature in the Python version of the library.

I'll leave this here as a reminder to myself and to see if it gets an +1s for usefulness or desirability.

Bug: Using leader_lines leads to corrupted Excel

Current behavior

When adding leader_lines than there is a left angle bracket too much in showLeaderLines tag which leads to a corrupted Excel file
<<c15:showLeaderLines val="1"/>

Expected behavior

Valid Excel when using leader_lines

Sample code to reproduce

use rust_xlsxwriter::{
    Chart, ChartDataLabel, ChartDataLabelPosition, ChartFont, ChartType, Workbook, XlsxError,
};

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();
    let worksheet = workbook.add_worksheet();

    worksheet.write(0, 0, 10)?;
    worksheet.write(1, 0, 60)?;
    worksheet.write(2, 0, 30)?;
    worksheet.write(3, 0, 10)?;
    worksheet.write(4, 0, 50)?;

    let mut chart = Chart::new(ChartType::Column);

    chart
        .add_series()
        .set_values("Sheet1!$A$1:$A$5")
        .set_data_label(
            ChartDataLabel::new()
                .show_value()
                .set_position(ChartDataLabelPosition::OutsideEnd)
                .show_leader_lines()
                .set_font(ChartFont::new().set_name("Arial").set_size(8)),
        );

    worksheet.insert_chart(0, 2, &chart)?;

    workbook.save("chart.xlsx")?;

    Ok(())
}

Environment

- rust_xlsxwriter version: 0.51.0

Any other information

No response

rust_xlsxwriter roadmap: Charts

The Chart feature has a roadmap of it own since it is has a lot of properties and methods (see the following for details on the features: https://xlsxwriter.readthedocs.io/chart.html).

I've added initial support for Charts via the Chart class and the worksheet.insert_chart() method. See also the Chart Examples in the user guide.

All Excel first generation chart types such as Area, Bar, Column, Line, Scatter, Pie, Doughnut and Radar are supported. Newer Excel 365 style charts such as Waterfall probably won't be supported (at least not until other major features have been completed).

The planned and completed features are.

  • Initial support for all first generation chart types.

Format options:

  • line: Set the properties of the series line type such as color and width.
  • border: Set the border properties of the series such as color and style.
  • fill: Set the solid fill properties of the series such as color.
  • pattern: Set the pattern fill properties of the series.
  • gradient: Set the gradient fill properties of the series.

Series option:

  • name: Set the name for the series.
  • format
  • marker: Set the properties of the series marker such as style and color.
  • trendline: Set the properties of the series trendline.
  • smooth: Set the smooth property of a line series.
  • y_error_bars: Set vertical error bounds for a chart series.
  • x_error_bars: Set horizontal error bounds for a chart series.
  • data_labels: Set data labels for the series.
  • points: Set properties for individual points in a series.
  • invert_if_negative: Invert the fill color for negative values.
  • invert_if_negative_color: Invert the fill color for negative values.
  • overlap: Set the overlap between series in a Bar/Column chart.
  • gap: Set the gap between series in a Bar/Column chart.

X-Axis `chart.set_x_axis() sub options:

  • name
  • name_font
  • name_layout
  • num_font
  • num_format
  • format
  • min
  • max
  • minor_unit
  • major_unit
  • interval_unit
  • interval_tick
  • crossing
  • position_axis
  • reverse
  • log_base
  • label_position
  • label_align
  • major_gridlines
  • minor_gridlines
  • hidden
  • date_axis
  • text_axis
  • minor_unit_type
  • major_unit_type
  • minor_tick_mark
  • major_tick_mark
  • display_units
  • display_units_visible

Other Chart options:

  • chart.set_name()
  • chart.set_y_axis()
  • chart.set_x2_axis()
  • chart.set_y2_axis()
  • chart.combine()
  • chart.set_size()
  • chart.set_title()
  • chart.set_legend()
  • chart.set_legend() - delete entries
  • chart.set_chartarea()
  • chart.set_plotarea()
  • chart.set_style()
  • chart.set_table()
  • chart.set_up_down_bars()
  • chart.set_drop_lines()
  • chart.set_high_low_lines()
  • chart.show_blanks_as()
  • chart.show_na_as_empty()
  • chart.show_hidden_data()
  • chart.set_rotation()
  • chart.set_hole_size()
  • Chartsheets

Help wanted: create a Wasm example/tutorial using rust_xlsxwriter

Help wanted

It would be great if someone could create a Wasm example/tutorial using rust_xlsxwriter.

Ideally I could add this as a section in or link from the Working with the rust_xlsxwriter library user guide.

An example like one in the SheetJS tutorial would be good but don't copy it exactly.

It would be good to use the write_row_matrix() or write_column_matrix() methods that are currently on main (#16) and which will be in v0.39.0.

It would also be nice to use whatever is the current Rust/Wasm best practices.

Feature request: Centralize worksheet name validation

Feature Request

Since I've looked at the code here for the first time I want to suggest that Worksheet name validation gets centralized into one String wrapper struct that can be reused throughout the program.

Since the appropriate errors are already in the XlsxError that wouldn't add any additional overhead and would allow to reuse that validation for ChartRange or Range string validation.

question: The generated xlsx file with a large number of images cannot be opened properly with Excel

Question

Hello, I tried to use this crate to generate a basic-full.xlsx file with a lot of images, but the file did not open properly with Excel. The Excel error message is "We found a problem with some content".

Then I tried to generate two files basic-1.xlsx and basic-2.xlsx, each containing half the number of images, which could be opened normally with Excel. I then manually merged the two separately generated files into a single file basic-merge.xlsx, which also opened normally.

I don't know why this crate doesn't generate basic-full.xlsx correctly, can you help me?

basic-full.xlsx
basic-1.xlsx
basic-2.xlsx
basic-merge.xlsx

feature request: methods `Worksheet::serialize_headers*` should get headers directly from the struct / enum type instead of from an instance

Feature Request

In almost all practical Worksheet::serialize use cases that I can think of, instances of the defined struct / enum are going to be created within a loop. It is therefore inconvenient to have to treat the first instance differently from all the others, for the only purpose to be able to serialize headers with one of the Worksheet::serialize_headers* methods.
A better approach would be to rely on the serde_aux::serde_introspection::serde_introspect method to get the struct field names directly from the defined type. For this to work rust_xlsxwriter would need to depend on the serde-aux crate when the serde feature is enabled.

feature request: support wasm

Feature Request

Currently it fails when creating a worksheet:

panicked at library/std/src/sys/wasm/../unsupported/time.rs:31:9:
time not implemented on this platform

Stack:

__wbg_get_imports/imports.wbg.__wbg_new_abda76e883ba8a5f@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214.js:325:21
console_error_panic_hook::hook::h79ac961f5aab2ea7@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[2176]:0x2db1f8
core::ops::function::Fn::call::hb648b073a046a4d9@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[7370]:0x3d7f87
std::panicking::rust_panic_with_hook::h964f5352fba2016e@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[4196]:0x3956cf
std::panicking::begin_panic_handler::{{closure}}::h716a005a10b672c4@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[4626]:0x3ab139
std::sys_common::backtrace::__rust_end_short_backtrace::haac356ff17e1eee4@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[7362]:0x3d7f42
rust_begin_unwind@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[5496]:0x3c6720
core::panicking::panic_fmt::h8f8677da57cb02fb@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[5916]:0x3ce379
std::time::SystemTime::now::h0483490d67579cfa@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[5914]:0x3ce2f8
rust_xlsxwriter::datetime::ExcelDateTime::utc_now::hcaed1ef12b7c4468@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[604]:0x1c0439
rust_xlsxwriter::workbook::Workbook::new::h7ccd5d2417379d72@http://127.0.0.1:8080/wingman_ui-ab1b952e4ac40214_bg.wasm:wasm-function[1984]:0x2c0989
...

Replacing std::time::SystemTime with chrono will probably fix the problem as chrono works with web assembly.

feature request: `impl<I: IntoExcelData> IntoExcelData for Option<I>`

Feature Request

Hi. I have a use case with writing a lot of optional values. Currently, I had to write manual wrappers for handling Options like so:

match opt_value {
  Some(value) => worksheet.write(row, col, value)?,
  None => Ok(worksheet),
};

match opt_value {
  Some(value) => worksheet.write_with_format(row, col, value, format)?,
  None => worksheet.write_blank(row, col, format)?,
};

What do you think about upstreaming it as a trait impl, so that standard write() and write_with_format() can work with Options directly? I only started using this crate yesterday, so I don't know if this would be intuitive (not doing any writes in write()) or consistent with other languages. For me as a newcomer, it would be very intuitive. If you greenlight this, I can attempt implementing this myself.

Alternative

If you don't find this behavior intuitive, maybe we can add separate helper methods for handling Options, something like write_or_blank()?

How to write unit data based on vec or json values?

Question

How to write unit data based on vec or json values?

 let mut compyinfo = c1
            .map(|row: MssqlRow| DetailJson {
                areaDesc: area_Desc.to_string(),
                // areaDesc: row.get::<String,&str>("areaDesc").parse::<f64>().unwrap(),
                streetDesc: row.get::<String, &str>("streetDesc"),
                creditCode: row.get("creditCode"),
                companyName: row.get("companyName"),
                assetsTotalCurrentYear: row
                    .get::<String, &str>("assetsTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                assetsOverSamePeriodLastYear: row
                    .get::<String, &str>("assetsOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                assetsBankDepositsCurrentYear: row
                    .get::<String, &str>("assetsBankDepositsCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                assetsBankOverSamePeriodLastYear: row
                    .get::<String, &str>("assetsBankOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                debtTotalCurrentYear: row
                    .get::<String, &str>("debtTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                debtOverSamePeriodLastYear: row
                    .get::<String, &str>("debtOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                debtLoanFinancingCurrentYear: row
                    .get::<String, &str>("debtLoanFinancingCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                debtLoanOverSamePeriodLastYear: row
                    .get::<String, &str>("debtLoanOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                netAssetTotalCurrentYear: row
                    .get::<String, &str>("netAssetTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                netAssetOverSamePeriodLastYear: row
                    .get::<String, &str>("netAssetOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                dabrCurrentYear: row
                    .get::<String, &str>("dabrCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                dabrOverSamePeriodLastYear: row
                    .get::<String, &str>("dabrOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                businessIncomeTotalCurrentYear: row
                    .get::<String, &str>("businessIncomeTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                businessIncomeRentOverSamePeriodLastYear: row
                    .get::<String, &str>("businessIncomeRentOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                businessIncomeRentCurrentYear: row
                    .get::<String, &str>("businessIncomeRentCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                businessIncomeOverSamePeriodLastYear: row
                    .get::<String, &str>("businessIncomeOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                profitTotalCurrentYear: row
                    .get::<String, &str>("profitTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                profitOverSamePeriodLastYear: row
                    .get::<String, &str>("profitOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                netProfitTotalCurrentYear: row
                    .get::<String, &str>("netProfitTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                netProfitOverSamePeriodLastYear: row
                    .get::<String, &str>("netProfitOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                payingTaxTotalCurrentYear: row
                    .get::<String, &str>("payingTaxTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                payingTaxOverSamePeriodLastYear: row
                    .get::<String, &str>("payingTaxOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
                stockholderBonusTotalCurrentYear: row
                    .get::<String, &str>("stockholderBonusTotalCurrentYear")
                    .parse::<f64>()
                    .unwrap(),
                stockholderBonusOverSamePeriodLastYear: row
                    .get::<String, &str>("stockholderBonusOverSamePeriodLastYear")
                    .parse::<f64>()
                    .unwrap(),
            })
            .fetch_all(&pool)
            .await?;
        pool.close();
        detailJson1.append(&mut compyinfo);

Bug: autofit too small on datetime columns with custom format

Current behavior

image

Expected behavior

image

Sample code to reproduce

    let mut xls = Workbook::new();
    let sheet1 = xls.add_worksheet();
    sheet1.write(0, 0, "date")?;

    let format = Format::new().set_num_format("dd\\.mm\\.yyyy hh:mm:ss");

    let now = SystemTime::now();
    let now: DateTime<Utc> = now.into();
    let now = now.naive_local();

    sheet1.write_with_format(1, 0, &now, &format)?;

    sheet1.set_freeze_panes(1, 0)?;
    sheet1.autofit();
    xls.save(filename)?;

Environment

on latest main

Example: write Polars Dataframe to Excel

Sample code

Here is a simple example of writing a Polars rust dataframe to Excel using rust_xlsxwriter >= 0.40.0:

screenshot

And here is the code:

use chrono::prelude::*;
use polars::export::arrow::temporal_conversions::*;
use polars::prelude::*;
use rust_xlsxwriter::{Format, Table, TableColumn, Workbook, XlsxError};

fn main() {
    let df: DataFrame = df!(
        "String" => &["North", "South", "East", "West", "All"],
        "Integer" => &[1, 2, 3, 4, 5],
        "Datetime" => &[
            NaiveDate::from_ymd_opt(2022, 1, 1).unwrap().and_hms_opt(1, 0, 0).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 2).unwrap().and_hms_opt(2, 0, 0).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 3).unwrap().and_hms_opt(3, 0, 0).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 4).unwrap().and_hms_opt(4, 0, 0).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 5).unwrap().and_hms_opt(5, 0, 0).unwrap(),
        ],
        "Date" => &[
            NaiveDate::from_ymd_opt(2022, 1, 1).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 2).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 3).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 4).unwrap(),
            NaiveDate::from_ymd_opt(2022, 1, 5).unwrap(),
        ],
        "Time" => &[
            NaiveTime::from_hms_milli_opt(2, 59, 3, 456).unwrap(),
            NaiveTime::from_hms_milli_opt(2, 59, 3, 456).unwrap(),
            NaiveTime::from_hms_milli_opt(2, 59, 3, 456).unwrap(),
            NaiveTime::from_hms_milli_opt(2, 59, 3, 456).unwrap(),
            NaiveTime::from_hms_milli_opt(2, 59, 3, 456).unwrap(),
        ],
        "Float" => &[4.0, 5.0, 6.0, 7.0, 8.0],
    )
    .expect("should not fail");

    write_xlsx(df, "dataframe.xlsx").unwrap();
}

// Simplified Polars dataframe to Excel file converter. See the actual
// `write_xlsx()` method in Polars for a much more complete and flexible
// interface.
//
// https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.DataFrame.write_excel.html#polars.DataFrame.write_excel
//
fn write_xlsx(df: DataFrame, filename: &str) -> Result<(), XlsxError> {
    // Create a new Excel file object.
    let mut workbook = Workbook::new();
    let mut headers = vec![];

    // Create some formats for the dataframe.
    let datetime_format = Format::new().set_num_format("yyyy\\-mm\\-dd\\ hh:mm:ss");
    let date_format = Format::new().set_num_format("yyyy\\-mm\\-dd;@");
    let time_format = Format::new().set_num_format("hh:mm:ss;@");

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Iterate through the dataframe column by column.
    for (col_num, column) in df.get_columns().iter().enumerate() {
        let col_num = col_num as u16;

        // Store the column names for use as table headers.
        headers.push(column.name().to_string());

        // Write the row data for each column/type.
        for (row_num, data) in column.iter().enumerate() {
            let row_num = 1 + row_num as u32;

            // Map the Polars Series AnyValue types to Excel/rust_xlsxwriter
            // types.
            match data {
                AnyValue::Int8(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::UInt8(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::Int16(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::UInt16(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::Int32(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::UInt32(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::Float32(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::Float64(value) => {
                    worksheet.write_number(row_num, col_num, value)?;
                }
                AnyValue::Utf8(value) => {
                    worksheet.write_string(row_num, col_num, value)?;
                }
                AnyValue::Boolean(value) => {
                    worksheet.write_boolean(row_num, col_num, value)?;
                }
                AnyValue::Null => {
                    // Treat Null as blank for now.
                }
                AnyValue::Datetime(value, time_units, _) => {
                    let datetime = match time_units {
                        TimeUnit::Nanoseconds => timestamp_ns_to_datetime(value),
                        TimeUnit::Microseconds => timestamp_us_to_datetime(value),
                        TimeUnit::Milliseconds => timestamp_ms_to_datetime(value),
                    };
                    worksheet.write_datetime(row_num, col_num, &datetime, &datetime_format)?;
                    worksheet.set_column_width(col_num, 18)?;
                }
                AnyValue::Date(value) => {
                    let date = date32_to_date(value);
                    worksheet.write_date(row_num, col_num, &date, &date_format)?;
                    worksheet.set_column_width(col_num, 10)?;
                }
                AnyValue::Time(value) => {
                    let time = time64ns_to_time(value);
                    worksheet.write_time(row_num, col_num, &time, &time_format)?;
                }
                _ => {
                    println!(
                        "WARNING: AnyValue data type '{}' is not supported",
                        data.dtype()
                    );
                    break;
                }
            }
        }
    }

    // Create a table for the dataframe range.
    let (max_row, max_col) = df.shape();
    let mut table = Table::new();
    let columns: Vec<TableColumn> = headers
        .into_iter()
        .map(|x| TableColumn::new().set_header(x))
        .collect();
    table.set_columns(&columns);

    // Add the table to the worksheet.
    worksheet.add_table(0, 0, max_row as u32, max_col as u16 - 1, &table)?;

    // Autofit the columns.
    worksheet.autofit();

    // Save the file to disk.
    workbook.save(filename)?;

    Ok(())
}

The Cargo.toml file for this is:

[package]
name = "dataframe_test"
version = "0.1.0"
edition = "2021"

[dependencies]
chrono = "0.4.24"
polars = { version = "0.29.0", features = ["lazy"] }
rust_xlsxwriter = "0.40.0"

Which was set up as follows:

cargo add polars -F lazy
cargo add chrono
cargo add rust_xlsxwriter

Note, this is for demonstration/testing purposes only. Polars already has a really nice and very configurable dataframe.write_excel() API that uses the Python based XlsxWriter. I even wrote some docs on Working with Polars and XlsxWriter.

`impl IntoExcelData for String` (and optionally `&String`)

Feature Request

Examples:

                // Assume `path: PathBuf`
                worksheet.write(
                    0,
                    0,
                    path.display().to_string(),
                )?;

                worksheet.write(
                    0,
                    0,
                    format!("Something {formatted:#?}"),
                )?;

-> "the trait IntoExcelData is not implemented for std::string::String"

Must work around with &* (as impl IntoExcelData for &str is the only stringy type available), which is quite cumbersome.

https://docs.rs/rust_xlsxwriter/latest/rust_xlsxwriter/trait.IntoExcelData.html#foreign-impls

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.