Giter Site home page Giter Site logo

Batching support about rust-oracle HOT 7 CLOSED

kubo avatar kubo commented on September 27, 2024
Batching support

from rust-oracle.

Comments (7)

kubo avatar kubo commented on September 27, 2024 3

Sorry for not replying for so long time.

Batching is efficient when the network distance between the client and the server is long.
When a network round trip requires 1ms, inserting 10k rows requires at least 10s excluding time spent in the server side.
If 1000 rows are sent in a batch, it decreases to 10ms.

Well, I'll implement it in two or three weeks.

from rust-oracle.

kubo avatar kubo commented on September 27, 2024 3

Rust-oracle 0.5.1 was published. It includes batching support.

from rust-oracle.

kubo avatar kubo commented on September 27, 2024 2

I added batch support. See the following doc comment. I'll release it after I add notice that batch errors and row counts are available only when both the client and the server are Oracle 12.1 or upper.

rust-oracle/src/batch.rs

Lines 227 to 406 in 0f62963

/// Statement batch, which inserts, updates or deletes more than one row at once
///
/// Batching is efficient when the network distance between the client and
/// the server is long. When a network round trip requires 1ms, inserting
/// 10k rows using [`Statement`] consumes at least 10s excluding time spent
/// in the client and the server. If 1000 rows are sent in a batch, it
/// decreases to 10ms.
///
/// # Usage
///
/// 1. [`conn.batch(sql_stmt, batch_size).build()`][Connection#method.batch] to create [`Batch`].
/// 2. [`append_row()`](#method.append_row) for each row. Rows in the batch are sent to
/// the server when the number of appended rows reaches the batch size.
/// **Note:** The "batch errors" option mentioned later changes this behavior.
/// 3. [`execute()`](#method.execute) in the end to send rows which
/// have not been sent by `append_rows()`.
///
/// ```
/// # use oracle::Error;
/// # use oracle::test_util;
/// # let conn = test_util::connect()?;
/// # conn.execute("delete from TestTempTable", &[])?;
/// let sql_stmt = "insert into TestTempTable values(:1, :2)";
/// let batch_size = 100;
/// let mut batch = conn.batch(sql_stmt, batch_size).build()?;
/// for i in 0..1234 {
/// batch.append_row(&[&i, &format!("value {}", i)])?;
/// }
/// batch.execute()?;
/// // Check the number of inserted rows.
/// assert_eq!(conn.query_row_as::<i32>("select count(*) from TestTempTable", &[])?, 1234);
/// # Ok::<(), Error>(())
/// ```
///
/// # Error Handling
///
/// There are two modes when invalid data are in a batch.
///
/// 1. Stop executions at the first failure and return the error information.
/// 2. Execute all rows in the batch and return an array of the error information.
///
/// ## Default Erorr Handling
///
/// `append_row()` and `execute()` stop executions at the first failure and return
/// the error information. There are no ways to know which row fails.
///
/// ```
/// # use oracle::Error;
/// # use oracle::test_util;
/// # let conn = test_util::connect()?;
/// # conn.execute("delete from TestTempTable", &[])?;
/// let sql_stmt = "insert into TestTempTable values(:1, :2)";
/// let batch_size = 10;
/// let mut batch = conn.batch(sql_stmt, batch_size).build()?;
/// batch.append_row(&[&1, &"first row"])?;
/// batch.append_row(&[&2, &"second row"])?;
/// batch.append_row(&[&1, &"first row again"])?; // -> ORA-00001: unique constraint violated.
/// batch.append_row(&[&3, &"third row ".repeat(11)])?; // -> ORA-12899: value too large for column
/// batch.append_row(&[&4, &"fourth row"])?;
/// let result = batch.execute();
/// match result {
/// Err(Error::OciError(dberr)) => {
/// assert_eq!(dberr.code(), 1);
/// assert!(dberr.message().starts_with("ORA-00001: "));
/// }
/// _ => panic!("Unexpected batch result: {:?}", result),
/// }
///
/// // Check the inserted rows.
/// let mut stmt = conn.prepare("select count(*) from TestTempTable where intCol = :1", &[])?;
/// assert_eq!(stmt.query_row_as::<i32>(&[&1])?, 1);
/// assert_eq!(stmt.query_row_as::<i32>(&[&2])?, 1);
/// assert_eq!(stmt.query_row_as::<i32>(&[&3])?, 0);
/// assert_eq!(stmt.query_row_as::<i32>(&[&4])?, 0);
/// # Ok::<(), Error>(())
/// ```
///
/// ## Error Handling with batch errors
///
/// [`BatchBuilder.with_batch_errors`][] changes
/// the behavior of `Batch` as follows:
/// * `execute()` executes all rows in the batch and return an array of the error information
/// with row positions in the batch when the errors are caused by invalid data.
/// * `append_row()` doesn't send rows internally when the number of appended rows reaches
/// the batch size. It returns an error when the number exceeds the size instead.
///
/// ```
/// # use oracle::Error;
/// # use oracle::test_util;
/// # let conn = test_util::connect()?;
/// # conn.execute("delete from TestTempTable", &[])?;
/// let sql_stmt = "insert into TestTempTable values(:1, :2)";
/// let batch_size = 10;
/// let mut batch = conn.batch(sql_stmt, batch_size).with_batch_errors().build()?;
/// batch.append_row(&[&1, &"first row"])?;
/// batch.append_row(&[&2, &"second row"])?;
/// batch.append_row(&[&1, &"first row again"])?; // -> ORA-00001: unique constraint violated.
/// batch.append_row(&[&3, &"third row ".repeat(11)])?; // -> ORA-12899: value too large for column
/// batch.append_row(&[&4, &"fourth row"])?;
/// let result = batch.execute();
/// match result {
/// Err(Error::BatchErrors(mut errs)) => {
/// // sort by position because errs may not preserve order.
/// errs.sort_by(|a, b| a.offset().cmp(&b.offset()));
/// assert_eq!(errs.len(), 2);
/// assert_eq!(errs[0].code(), 1);
/// assert_eq!(errs[1].code(), 12899);
/// assert_eq!(errs[0].offset(), 2); // position of `[&1, &"first row again"]`
/// assert_eq!(errs[1].offset(), 3); // position of `[&3, &"third row ".repeat(11)]`
/// assert!(errs[0].message().starts_with("ORA-00001: "));
/// assert!(errs[1].message().starts_with("ORA-12899: "));
/// }
/// _ => panic!("Unexpected batch result: {:?}", result),
/// }
///
/// // Check the inserted rows.
/// let mut stmt = conn.prepare("select count(*) from TestTempTable where intCol = :1", &[])?;
/// assert_eq!(stmt.query_row_as::<i32>(&[&1])?, 1);
/// assert_eq!(stmt.query_row_as::<i32>(&[&2])?, 1);
/// assert_eq!(stmt.query_row_as::<i32>(&[&3])?, 0); // value too large for column
/// assert_eq!(stmt.query_row_as::<i32>(&[&4])?, 1);
/// # Ok::<(), Error>(())
/// ```
///
/// # Affected Rows
///
/// Use [`BatchBuilder.with_row_counts`][] and [`Batch.row_counts`][] to get affected rows
/// for each input row.
///
/// ```
/// # use oracle::Error;
/// # use oracle::sql_type::OracleType;
/// # use oracle::test_util;
/// # let conn = test_util::connect()?;
/// # conn.execute("delete from TestTempTable", &[])?;
/// # let sql_stmt = "insert into TestTempTable values(:1, :2)";
/// # let batch_size = 10;
/// # let mut batch = conn.batch(sql_stmt, batch_size).build()?;
/// # batch.set_type(1, &OracleType::Int64)?;
/// # batch.set_type(2, &OracleType::Varchar2(1))?;
/// # for i in 0..10 {
/// # batch.append_row(&[&i]);
/// # }
/// # batch.execute()?;
/// let sql_stmt = "update TestTempTable set stringCol = :stringCol where intCol >= :intCol";
/// let mut batch = conn.batch(sql_stmt, 3).with_row_counts().build()?;
/// batch.append_row_named(&[("stringCol", &"a"), ("intCol", &9)])?; // update 1 row
/// batch.append_row_named(&[("stringCol", &"b"), ("intCol", &7)])?; // update 3 rows
/// batch.append_row_named(&[("stringCol", &"c"), ("intCol", &5)])?; // update 5 rows
/// batch.execute()?;
/// assert_eq!(batch.row_counts()?, &[1, 3, 5]);
/// # Ok::<(), Error>(())
/// ```
///
/// # Bind Parameter Types
///
/// Parameter types are decided by the value of [`Batch.append_row`][], [`Batch.append_row_named`][]
/// or [`Batch.set`][]; or by the type specified by [`Batch.set_type`][]. Once the
/// type is determined, there are no ways to change it except the following case.
///
/// For user's convenience, when the length of character data types is too short,
/// the length is extended automatically. For example:
/// ```no_run
/// # use oracle::Error;
/// # use oracle::sql_type::OracleType;
/// # use oracle::test_util;
/// # let conn = test_util::connect()?;
/// # let sql_stmt = "dummy";
/// # let batch_size = 10;
/// let mut batch = conn.batch(sql_stmt, batch_size).build()?;
/// batch.append_row(&[&"first row"])?; // allocate 64 bytes for each row
/// batch.append_row(&[&"second row"])?;
/// //....
/// // The following line extends the internal buffer length for each row.
/// batch.append_row(&[&"assume that data lenght is over 64 bytes"])?;
/// # Ok::<(), Error>(())
/// ```
/// Note that extending the internal buffer needs memory copy from existing buffer
/// to newly allocated buffer. If you know the maximum data length, it is better
/// to set the size by [`Batch.set_type`][].

from rust-oracle.

rgutierrez2004 avatar rgutierrez2004 commented on September 27, 2024 1

The StmtParam::FetchArraySize option is for queries, using it for updates will only increase the amount of memory your program need.

The batch support works, I personally tested the latest version of rust-oracle with a 19c database and performance is quite good,
furthermore, if you put all the logic inside a SQL block and send it with rust-oracle similar to what you would do with a SQL tool where you run your SQL scripts will notice that performance with rust-oracle is better than with a Java thin driver which most SQL tools use.

If summary your code should look like this:

let ids: Vec<i32> = (0..10000).map(|v| v + 1).collect();
let now = Instant::now();
let mut stmt = conn.prepare("UPDATE table SET field='something' WHERE id = :1", &[])?;
for id in &ids {
  stmt.execute(&[id])?;
}
println!("Time for {} updates: {}", ids.len(), now.elapsed().as_millis());
conn.commit()

Some numbers: running the above code using two VMs in the same machine, one runing the rust program and the other the DB in a container with 10k records, got 1970 ms, now putting all the logic inside a SQL string block and running it with the same stmt.execute got only 104 ms which is fater than my 145 ms using a SQL tool with Java thin driver.

Hope this help,
R

from rust-oracle.

aesteve avatar aesteve commented on September 27, 2024

Thanks for your comment.
I'll give it another try, I must have been doing something wrong with batching.

from rust-oracle.

IdemenB avatar IdemenB commented on September 27, 2024

Hi @rgutierrez2004 , could you give a bit more detail on "putting all the logic inside a SQL string block" please? Do you mean also including the ID and time stamp generation into the SQL script? If yes, can you share the final script as an example?

Thanks in advance!

Some numbers: running the above code using two VMs in the same machine, one runing the rust program and the other the DB in a container with 10k records, got 1970 ms, now putting all the logic inside a SQL string block and running it with the same stmt.execute got only 104 ms which is fater than my 145 ms using a SQL tool with Java thin driver.

Hope this help,
R

from rust-oracle.

aesteve avatar aesteve commented on September 27, 2024

Thanks a lot @kubo should I close the issue or let you do it?

from rust-oracle.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.