
Comments (11)

han1548772930 commented on June 10, 2024

I tried using multithreading to handle this, but I found it to be slower than single threading.

use std::{
    sync::{Arc, Mutex},
    thread,
    time::{SystemTime, UNIX_EPOCH},
};

use rust_xlsxwriter::*;

fn main() {
    let workbook = Workbook::new();
    let workbook_arc = Arc::new(Mutex::new(workbook));
    workbook_arc.lock().unwrap().add_worksheet();

    let mut time = timestamp1();
    println!("start{:?}", time);
    let mut handles = vec![];
    for i in 1..6 {
        let workbook_clone = workbook_arc.clone();
        let handle = thread::spawn(move || {
            // The workbook mutex is locked for the whole write loop, so the
            // threads end up running one after the other.
            let mut workbook = workbook_clone.lock().unwrap();
            let sheet = workbook.worksheet_from_index(0).unwrap();
            for j in (i - 1) * 209715..i * 209715 {
                sheet.write_string(j, 1, "Hello, World!").unwrap();
                sheet.write_string(j, 2, "Hello, World!").unwrap();
                sheet.write_string(j, 3, "Hello, World!").unwrap();
                sheet.write_string(j, 4, "Hello, World!").unwrap();
                sheet.write_string(j, 5, "Hello, World!").unwrap();
                sheet.write_string(j, 6, "Hello, World!").unwrap();
                sheet.write_string(j, 7, "Hello, World!").unwrap();
                sheet.write_string(j, 8, "Hello, World!").unwrap();
                sheet.write_string(j, 9, "Hello, World!").unwrap();
                sheet.write_string(j, 10, "Hello, World!").unwrap();
                sheet.write_string(j, 11, "Hello, World!").unwrap();
                sheet.write_string(j, 12, "Hello, World!").unwrap();
            }
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    let mut workbook = workbook_arc.lock().unwrap();
    workbook.save("demo.xlsx").unwrap();
    time = timestamp1();
    println!("end{:?}", time);
}

fn timestamp1() -> i64 {
    let start = SystemTime::now();
    let since_the_epoch = start
        .duration_since(UNIX_EPOCH)
        .expect("Time went backwards");
    let ms = since_the_epoch.as_secs() as i64 * 1000i64
        + (since_the_epoch.subsec_nanos() as f64 / 1_000_000.0) as i64;
    ms
}


jmcnamara commented on June 10, 2024

I'll add multi-threading to the back end in the next release or two.

The library is probably I/O bound rather than CPU bound, so multi-threading may not give a linear benefit. Nonetheless, I'll implement it to get whatever benefit is possible.

@adriandelgado Any suggestions to the OP on multi-threading in the front end/user app?


adriandelgado commented on June 10, 2024

Multithreading is only useful for massive worksheets.

I also recommend not using a Mutex. You can generate each Worksheet on a separate thread and then join them together using push_worksheet(), as in the sketch below.
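
A minimal sketch of that approach (assuming a Worksheet can be built on a worker thread and sent back when finished, with the Workbook only touched on the main thread):

use std::thread;

use rust_xlsxwriter::{Workbook, Worksheet, XlsxError};

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Build each worksheet on its own thread. No Mutex is needed because the
    // threads never share the Workbook or each other's Worksheet.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| -> Result<Worksheet, XlsxError> {
                let mut sheet = Worksheet::new();
                for row in 0..1_000_000u32 {
                    sheet.write_string(row, 0, "Hello, World!")?;
                }
                Ok(sheet)
            })
        })
        .collect();

    // Join the threads and push the finished worksheets into the workbook.
    for handle in handles {
        let sheet = handle.join().unwrap()?;
        workbook.push_worksheet(sheet);
    }

    workbook.save("demo.xlsx")?;
    Ok(())
}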


han1548772930 commented on June 10, 2024

I tried both of these approaches and still got performance similar to single threading.

// First attempt: one Mutex-wrapped Workbook shared between the threads.
use std::{
    sync::{Arc, Mutex},
    thread,
    time::{SystemTime, UNIX_EPOCH},
};

use rust_xlsxwriter::*;

fn main() {
    let workbook = Workbook::new();
    let workbook_arc = Arc::new(Mutex::new(workbook));
    // workbook_arc.lock().unwrap().add_worksheet();
    // workbook_arc.lock().unwrap().add_worksheet();
    // workbook_arc.lock().unwrap().add_worksheet();
    // workbook_arc.lock().unwrap().add_worksheet();
    
    let mut time = timestamp1();
    println!("start{:?}", time);
    let mut handles = vec![];
    for i in 0..4 {
        let workbook_clone = workbook_arc.clone();
        let handle = thread::spawn(move || {
            // The workbook mutex is locked before the worksheet is built, so
            // the threads still end up running one after the other.
            let mut workbook = workbook_clone.lock().unwrap();
            // let sheet: &mut Worksheet = workbook.worksheet_from_index(i).unwrap();
            let mut sheet: Worksheet = Worksheet::new();
            for j in 0..1048576 {
                sheet.write_string(j, 1, "Hello, World!").unwrap();
                sheet.write_string(j, 2, "Hello, World!").unwrap();
                sheet.write_string(j, 3, "Hello, World!").unwrap();
                sheet.write_string(j, 4, "Hello, World!").unwrap();
                sheet.write_string(j, 5, "Hello, World!").unwrap();
                sheet.write_string(j, 6, "Hello, World!").unwrap();
                sheet.write_string(j, 7, "Hello, World!").unwrap();
                sheet.write_string(j, 8, "Hello, World!").unwrap();
                sheet.write_string(j, 9, "Hello, World!").unwrap();
                sheet.write_string(j, 10, "Hello, World!").unwrap();
                sheet.write_string(j, 11, "Hello, World!").unwrap();
                sheet.write_string(j, 12, "Hello, World!").unwrap();
            }
            workbook.push_worksheet(sheet);
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    let mut workbook = workbook_arc.lock().unwrap();
    workbook.save("demo.xlsx").unwrap();
    time = timestamp1();
    println!("end{:?}", time);
}

fn timestamp1() -> i64 {
    let start = SystemTime::now();
    let since_the_epoch = start
        .duration_since(UNIX_EPOCH)
        .expect("Time went backwards");
    let ms = since_the_epoch.as_secs() as i64 * 1000i64
        + (since_the_epoch.subsec_nanos() as f64 / 1_000_000.0) as i64;
    ms
}

// Second attempt: build the worksheets as async futures. (Imports added for
// completeness; this assumes the async-std and futures crates.)
use std::time::{SystemTime, UNIX_EPOCH};

use async_std::task;
use rust_xlsxwriter::{Workbook, Worksheet};

fn main() {
    task::block_on(async {
        let mut time = timestamp1();
        println!("start:{:?}", time);
        let mut workbook: Workbook = Workbook::new();

        let res = async_main().await;
        workbook.push_worksheet(res.0);
        workbook.push_worksheet(res.1);
        workbook.push_worksheet(res.2);
        workbook.push_worksheet(res.3);
        workbook.save("demo.xlsx").unwrap();
        time = timestamp1();
        println!("end:{:?}", time);
    });
}
async fn async_main() -> (Worksheet, Worksheet, Worksheet, Worksheet) {
    let f1 = write_data();
    let f2 = write_data();
    let f3 = write_data();
    let f4 = write_data();
    // Note: futures::join! polls these futures on the current thread, and
    // write_data() has no await points, so they run one after another.
    let res: (Worksheet, Worksheet, Worksheet, Worksheet) = futures::join!(f1, f2, f3, f4);
    res
}
fn timestamp1() -> i64 {
    let start = SystemTime::now();
    let since_the_epoch = start
        .duration_since(UNIX_EPOCH)
        .expect("Time went backwards");
    let ms = since_the_epoch.as_secs() as i64 * 1000i64
        + (since_the_epoch.subsec_nanos() as f64 / 1_000_000.0) as i64;
    ms
}
async fn write_data() -> Worksheet {
    let mut sheet: Worksheet = Worksheet::new();
    for j in 1..1048576 {
        sheet.write_string(j, 0, "Hello, World!").unwrap();
        sheet.write_string(j, 1, "Hello, World!").unwrap();
        sheet.write_string(j, 2, "Hello, World!").unwrap();
        sheet.write_string(j, 3, "Hello, World!").unwrap();
        sheet.write_string(j, 4, "Hello, World!").unwrap();
        sheet.write_string(j, 5, "Hello, World!").unwrap();
        sheet.write_string(j, 6, "Hello, World!").unwrap();
        sheet.write_string(j, 7, "Hello, World!").unwrap();
        sheet.write_string(j, 8, "Hello, World!").unwrap();
        sheet.write_string(j, 9, "Hello, World!").unwrap();
        sheet.write_string(j, 10, "Hello, World!").unwrap();
        sheet.write_string(j, 11, "Hello, World!").unwrap();
    }
    sheet
}


han1548772930 commented on June 10, 2024

After some testing, I found that writing the data with write_string() is very fast, but saving takes a long time.
Is it possible to make save_internal asynchronous?
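
For reference, one way to see that split is to time the two phases separately, for example with std::time::Instant in place of the manual timestamp helper (a small sketch):

use std::time::Instant;

use rust_xlsxwriter::{Workbook, XlsxError};

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();
    let worksheet = workbook.add_worksheet();

    // Phase 1: populate the worksheet in memory.
    let write_start = Instant::now();
    for row in 0..1_000_000u32 {
        worksheet.write_string(row, 0, "Hello, World!")?;
    }
    println!("write phase: {:?}", write_start.elapsed());

    // Phase 2: serialize and zip the workbook to disk.
    let save_start = Instant::now();
    workbook.save("demo.xlsx")?;
    println!("save phase:  {:?}", save_start.elapsed());

    Ok(())
}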


jmcnamara commented on June 10, 2024

Is it possible to make save_internal asynchronous?

That is the plan.

I think the highest value bottleneck for parallelism would be the worksheet writing loop in packager.rs:

https://github.com/jmcnamara/rust_xlsxwriter/blob/main/src/packager.rs#L104-L110

        let mut string_table = SharedStringsTable::new();
        for (index, worksheet) in workbook.worksheets.iter_mut().enumerate() {
            self.write_worksheet_file(worksheet, index + 1, &mut string_table)?;
            if worksheet.has_relationships() {
                self.write_worksheet_rels_file(worksheet, index + 1)?;
            }
        }

The tricky(?) part would be mutex-locked (or some other scheme) updates to the shared string table (which maps strings to index values using Excel's scheme); a rough sketch of the idea is below.

The self.write_worksheet_rels_file() part could probably move to a non-threaded loop.
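
For illustration only, a mutex-guarded table of that shape might look something like this (a hypothetical stand-in, not the actual SharedStringsTable API):

use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-in for the shared string table: it maps each unique
// string to the index that the worksheet XML will reference.
struct SharedStrings {
    table: Mutex<HashMap<String, u32>>,
}

impl SharedStrings {
    fn new() -> Self {
        SharedStrings {
            table: Mutex::new(HashMap::new()),
        }
    }

    // Return the existing index for a string, or insert it with the next free
    // index. The lock is held only for the duration of the lookup/insert.
    fn shared_string_index(&self, string: &str) -> u32 {
        let mut table = self.table.lock().unwrap();
        let next_index = table.len() as u32;
        *table.entry(string.to_string()).or_insert(next_index)
    }
}

fn main() {
    let sst = SharedStrings::new();
    assert_eq!(sst.shared_string_index("Hello"), 0);
    assert_eq!(sst.shared_string_index("World"), 1);
    assert_eq!(sst.shared_string_index("Hello"), 0);
}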

@adriandelgado pointed out in #29 that there could be a lot of value in parallelising the zip writing. I don't know if that will be possible using the current zip crate.


jmcnamara commented on June 10, 2024

I've made a first pass at introducing threading into the back end of rust_xlsxwriter. The preliminary work is on the threaded1 branch. Some notes on this:

  • I've used thread::scope instead of thread::spawn since that makes it easier to work with the lifetimes and "self escapes the method body here" warnings (see the sketch after this list).
  • Access to the shared string table (SST) is mutex-locked (as it needs to be, since the string table is meant to have unique entries for each repeated string).
  • It is prototype code only and not meant for production testing.
  • Comparison tests are failing due to string ordering. I'll fix those when this is a little less "work in progress".
  • I've kept the threading to the rust_xlsxwriter parts for now and not the zip parts to make the direct effects of the worksheet writing more obvious.
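
In outline, the scoped-thread scheme from the first note looks something like this simplified sketch (placeholder types, not the actual packager.rs code):

use std::sync::Mutex;
use std::thread;

// Placeholder types for the sketch.
struct Worksheet;
struct SharedStringsTable;

fn write_worksheet_files(worksheets: &mut [Worksheet]) {
    let string_table = Mutex::new(SharedStringsTable);

    // thread::scope lets the spawned threads borrow `worksheets` and
    // `string_table` without 'static lifetime problems, and joins all of the
    // threads automatically at the end of the scope.
    thread::scope(|scope| {
        let string_table = &string_table;
        for worksheet in worksheets.iter_mut() {
            scope.spawn(move || {
                // Each thread serializes one worksheet, taking the SST lock
                // only when it needs to look up or insert a string.
                write_worksheet_file(worksheet, string_table);
            });
        }
    });
}

fn write_worksheet_file(_worksheet: &mut Worksheet, _sst: &Mutex<SharedStringsTable>) {
    // Placeholder for the real per-worksheet XML writing.
}

fn main() {
    let mut worksheets = vec![Worksheet, Worksheet, Worksheet, Worksheet];
    write_worksheet_files(&mut worksheets);
}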

On the threaded1 branch there are 3 test cases:

  1. examples/app_perf_test: Single worksheet with mixed string and number values.
  2. examples/app_perf_test2: 4 worksheets of string data only.
  3. examples/app_perf_test3: 4 worksheets of number data only.

From this I get mixed results:

$ hyperfine target/release/examples/app_perf_test_threaded target/release/examples/app_perf_test_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test_threaded
  Time (mean ± σ):     244.8 ms ±  12.4 ms    [User: 221.3 ms, System: 16.9 ms]
  Range (min … max):   238.2 ms … 280.0 ms    12 runs

Benchmark 2: target/release/examples/app_perf_test_unthreaded
  Time (mean ± σ):     237.3 ms ±   1.1 ms    [User: 218.9 ms, System: 16.8 ms]
  Range (min … max):   235.6 ms … 239.5 ms    12 runs

Summary
  'target/release/examples/app_perf_test_unthreaded' ran
    1.03 ± 0.05 times faster than 'target/release/examples/app_perf_test_threaded'

$ hyperfine target/release/examples/app_perf_test2_threaded target/release/examples/app_perf_test2_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test2_threaded
  Time (mean ± σ):      1.261 s ±  0.011 s    [User: 1.184 s, System: 0.905 s]
  Range (min … max):    1.247 s …  1.283 s    10 runs

Benchmark 2: target/release/examples/app_perf_test2_unthreaded
  Time (mean ± σ):     986.1 ms ±   6.9 ms    [User: 916.6 ms, System: 66.1 ms]
  Range (min … max):   977.7 ms … 997.0 ms    10 runs

Summary
  'target/release/examples/app_perf_test2_unthreaded' ran
    1.28 ± 0.01 times faster than 'target/release/examples/app_perf_test2_threaded'

$ hyperfine target/release/examples/app_perf_test3_threaded target/release/examples/app_perf_test3_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test3_threaded
  Time (mean ± σ):     778.6 ms ±  20.2 ms    [User: 837.8 ms, System: 54.2 ms]
  Range (min … max):   766.5 ms … 832.6 ms    10 runs

Benchmark 2: target/release/examples/app_perf_test3_unthreaded
  Time (mean ± σ):     889.2 ms ±   4.1 ms    [User: 834.7 ms, System: 52.0 ms]
  Range (min … max):   884.7 ms … 895.8 ms    10 runs

Summary
  'target/release/examples/app_perf_test3_threaded' ran
    1.14 ± 0.03 times faster than 'target/release/examples/app_perf_test3_unthreaded'

Some observations from this:

  • The single worksheet case is ~ the same speed threaded and unthreaded as the mutex is uncontended.
  • The 4 x string worksheet threaded case is slower (~30%) due to the mutex contention (most likely - I need to do more testing).
  • The 4 x number worksheet threaded case is ~15% faster since the mutex is again uncontended.

There are some options to remove the mutex lock and contention:

  1. Do a separate non-threaded pass of all the worksheet string data to build up the SST table.
  2. Ignore the mutex and do non-atomic updates to the SST. This could lead to duplicates in the SST table but that isn't an error in Excel and would probably only happen in a very small number of cases anyway. But it is poor engineering.
  3. Move to an RwLock and do an initial shared (read-locked) lookup to see if the string exists in the SST, only taking the write lock if it doesn't (see the sketch after this list).
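
For option 3, a rough sketch of the read-then-write pattern (over a hypothetical HashMap-backed table rather than the real SST type):

use std::collections::HashMap;
use std::sync::RwLock;

// Look up a string's index, inserting it if it is not already in the table.
fn shared_string_index(table: &RwLock<HashMap<String, u32>>, string: &str) -> u32 {
    // Fast path: a shared read lock is enough when the string already exists.
    if let Some(&index) = table.read().unwrap().get(string) {
        return index;
    }

    // Slow path: take the exclusive write lock. Re-check under the lock
    // because another thread may have inserted the string in the meantime.
    let mut table = table.write().unwrap();
    let next_index = table.len() as u32;
    *table.entry(string.to_string()).or_insert(next_index)
}

fn main() {
    let table = RwLock::new(HashMap::new());
    assert_eq!(shared_string_index(&table, "Hello"), 0);
    assert_eq!(shared_string_index(&table, "Hello"), 0);
    assert_eq!(shared_string_index(&table, "World"), 1);
}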

I'll look into some of these options in the next few days and I'll post some updates as I go.


jmcnamara commented on June 10, 2024

There are some options to remove the mutex lock and contention:

  • Option 2 isn't possible, as far as I can see, in Rust. Probably for good reasons.
  • Option 3 I haven't looked at yet. It would be better for cases with repeated strings but probably a bit worse for cases with a lot of unique string data.

So for now I've gone with Option 1, "Do a separate non-threaded pass of all the worksheet string data to build up the SST table". I've added a second prototype for this on the threaded2 branch.
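
Conceptually that is a two-pass scheme: a single-threaded pass registers every worksheet string in the SST, and the threaded pass then only does read-only lookups. A rough sketch with placeholder types (not the actual threaded2 code):

use std::collections::HashMap;
use std::thread;

// Placeholder worksheet type for the sketch.
struct Worksheet {
    strings: Vec<String>,
}

fn write_worksheet_files(worksheets: &[Worksheet]) {
    // Pass 1 (single-threaded): build the complete shared string table up
    // front so that no locking is needed later.
    let mut sst: HashMap<&str, u32> = HashMap::new();
    for worksheet in worksheets {
        for string in &worksheet.strings {
            let next_index = sst.len() as u32;
            sst.entry(string.as_str()).or_insert(next_index);
        }
    }

    // Pass 2 (threaded): each thread serializes one worksheet and only does
    // read-only lookups into the finished table.
    let sst = &sst;
    thread::scope(|scope| {
        for worksheet in worksheets {
            scope.spawn(move || {
                for string in &worksheet.strings {
                    let _index = sst[string.as_str()];
                    // ... write the worksheet XML using `_index` ...
                }
            });
        }
    });
}

fn main() {
    let worksheets = vec![
        Worksheet { strings: vec!["Hello".into(), "World".into()] },
        Worksheet { strings: vec!["Hello".into()] },
    ];
    write_worksheet_files(&worksheets);
}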

Overall the results are good:

$ hyperfine target/release/examples/app_perf_test target/release/examples/app_perf_test_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test
  Time (mean ± σ):     238.3 ms ±   2.5 ms    [User: 221.8 ms, System: 15.3 ms]
  Range (min … max):   234.8 ms … 244.2 ms    12 runs

Benchmark 2: target/release/examples/app_perf_test_unthreaded
  Time (mean ± σ):     236.4 ms ±   2.5 ms    [User: 220.0 ms, System: 15.0 ms]
  Range (min … max):   233.3 ms … 241.2 ms    12 runs

Summary
  'target/release/examples/app_perf_test_unthreaded' ran
    1.01 ± 0.02 times faster than 'target/release/examples/app_perf_test'


$ hyperfine target/release/examples/app_perf_test2 target/release/examples/app_perf_test2_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test2
  Time (mean ± σ):     919.2 ms ±  14.0 ms    [User: 924.3 ms, System: 63.5 ms]
  Range (min … max):   901.7 ms … 949.2 ms    10 runs

Benchmark 2: target/release/examples/app_perf_test2_unthreaded
  Time (mean ± σ):     980.1 ms ±  11.3 ms    [User: 915.7 ms, System: 61.1 ms]
  Range (min … max):   964.3 ms … 1000.4 ms    10 runs

Summary
  'target/release/examples/app_perf_test2' ran
    1.07 ± 0.02 times faster than 'target/release/examples/app_perf_test2_unthreaded'

$ hyperfine target/release/examples/app_perf_test3 target/release/examples/app_perf_test3_unthreaded --warmup 3
Benchmark 1: target/release/examples/app_perf_test3
  Time (mean ± σ):     794.1 ms ±  14.5 ms    [User: 856.9 ms, System: 50.8 ms]
  Range (min … max):   781.7 ms … 832.8 ms    10 runs

Benchmark 2: target/release/examples/app_perf_test3_unthreaded
  Time (mean ± σ):     887.7 ms ±   5.7 ms    [User: 837.8 ms, System: 46.9 ms]
  Range (min … max):   876.2 ms … 898.0 ms    10 runs

Summary
  'target/release/examples/app_perf_test3' ran
    1.12 ± 0.02 times faster than 'target/release/examples/app_perf_test3_unthreaded'


Summary:

  • The single worksheet case is ~1% slower but within the margin of error.
  • The 4 x string worksheet threaded case is ~ 7% faster.
  • The 4 x number worksheet threaded case is ~12% faster.

Not amazing but I'll take a 10% increase for the amount of work involved. If anyone could try the threaded2 branch against real code I'd be interested to see the results.

I'll move on to see what can be done with the zip writer parts.


han1548772930 commented on June 10, 2024

Wow, that's great!


jmcnamara commented on June 10, 2024

I'm going to merge the second option (threaded2) onto main. I think it is the best I can do for now. There are still potential gains to be had from parallelizing the zipping but after an initial look I'm going to leave that to another time/person.


jmcnamara commented on June 10, 2024

I've pushed these changes to crates.io in v0.44.0. It is the best I can do for now. Hopefully it will inspire some other analysis/contributions.

Closing.

