asleekgeek / cosmos-typescript-bulk-import-throughput-optimizer

This project forked from azure-samples/cosmos-typescript-bulk-import-throughput-optimizer

License: MIT License

page_type: sample
languages: typescript, javascript, nodejs
name: Cosmos DB bulk import using TypeScript
description: A sample that shows different ways to do bulk import of items into Cosmos DB.
products: azure, azure-cosmos-db
urlFragment: cosmos-typescript-bulk-import-throughput-optimizer

Cosmos DB bulk import using TypeScript

Introduction

This repository implements in TypeScript the same logic that https://github.com/Azure-Samples/cosmos-dotnet-bulk-import-throughput-optimizer implements in .NET.

The primary purpose is to find optimal ways to bulk insert in TypeScript / JavaScript and provide recommendations related to different approaches.

A secondary purpose is to compare the bulk insert performance of the two SDKs, because they use different server-side APIs of Cosmos DB.

Prerequisites

Setup

Set the environment variables as follows, replacing each <...> placeholder (including the < and > characters) with your own value:

export ENDPOINT_URL="https://<your-account-URI>.documents.azure.com:443/"
export AUTHORIZATION_KEY="<your-account-PRIMARY-KEY>"
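
As a rough illustration of how a program might consume these two variables, the sketch below reads them from an environment object and fails early if either is missing or still contains the < and > placeholder markers. The names `CosmosConfig` and `readConfig` are hypothetical and are not taken from this repository.

```typescript
// Hypothetical helper, not part of this repository: validates the two
// environment variables described in the Setup section.
interface CosmosConfig {
  endpoint: string;
  key: string;
}

function readConfig(env: Record<string, string | undefined>): CosmosConfig {
  const endpoint = env["ENDPOINT_URL"];
  const key = env["AUTHORIZATION_KEY"];

  if (!endpoint || !key) {
    throw new Error("ENDPOINT_URL and AUTHORIZATION_KEY must both be set");
  }

  // Catch the common mistake of leaving the <...> placeholders in place.
  if (/[<>]/.test(endpoint) || /[<>]/.test(key)) {
    throw new Error("Remove the < and > placeholder markers from the values");
  }

  return { endpoint, key };
}
```

In a Node.js program this could be called as `readConfig(process.env)` before any client is created, so that a misconfigured shell shows up as a clear error rather than a failed request.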

Build

npm install
npm run build

Run locally

node dist/main.js

Run tests locally

npm run test

Key concepts

Test results

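| Import mechanism | Import method | Consumed RU | Items per second |
|------------------|---------------|-------------|------------------|
| Stored Procedure | Create        | 25648       | 1512             |
| Stored Procedure | Upsert        | 25689       | 1365             |
| Bulk Operations  | Create        | 27619       | 1265             |
| Bulk Operations  | Upsert        | 27619       | 1252             |
| Parallel         | Create        | 27600       | 798              |
| Parallel         | Upsert        | 27600       | 954              |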

The results above were obtained by running the optimizer on a Standard F2s VM in the same Azure region as the Cosmos DB account. The provisioned throughput was set to 10000 RU/s and 5000 items were inserted.
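
To make the numbers easier to compare, the sketch below derives a few per-run figures from the table: RU per item, elapsed time, and average RU consumed per second. It assumes that "Consumed RU" is the total for all 5000 items of a run, which the source does not state explicitly. The names `Result`, `results`, and `derive` are illustrative only.

```typescript
// Figures copied from the test results table. Assumption: consumedRu is the
// total request-unit charge for the whole run of ITEM_COUNT items.
interface Result {
  mechanism: string;
  method: string;
  consumedRu: number;
  itemsPerSecond: number;
}

const ITEM_COUNT = 5000;

const results: Result[] = [
  { mechanism: "Stored Procedure", method: "Create", consumedRu: 25648, itemsPerSecond: 1512 },
  { mechanism: "Stored Procedure", method: "Upsert", consumedRu: 25689, itemsPerSecond: 1365 },
  { mechanism: "Bulk Operations", method: "Create", consumedRu: 27619, itemsPerSecond: 1265 },
  { mechanism: "Bulk Operations", method: "Upsert", consumedRu: 27619, itemsPerSecond: 1252 },
  { mechanism: "Parallel", method: "Create", consumedRu: 27600, itemsPerSecond: 798 },
  { mechanism: "Parallel", method: "Upsert", consumedRu: 27600, itemsPerSecond: 954 },
];

function derive(r: Result, itemCount: number) {
  const elapsedSeconds = itemCount / r.itemsPerSecond;
  return {
    ruPerItem: r.consumedRu / itemCount,         // average charge per inserted item
    elapsedSeconds,                              // wall-clock time implied by the rate
    ruPerSecond: r.consumedRu / elapsedSeconds,  // average RU/s drawn during the run
  };
}

for (const r of results) {
  const d = derive(r, ITEM_COUNT);
  console.log(
    `${r.mechanism} ${r.method}: ${d.ruPerItem.toFixed(2)} RU/item, ` +
      `${d.elapsedSeconds.toFixed(2)} s, ${d.ruPerSecond.toFixed(0)} RU/s`
  );
}
```

Under that assumption, the Stored Procedure Create run works out to about 5.1 RU per item and about 3.3 seconds, and the average RU/s of every run stays below the 10000 RU/s that was provisioned.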

Parallel vs Stored Procedure

As the results above show, Stored Procedure gives better performance than Parallel import. Parallel import was also observed to be heavy on CPU. The relative advantage of Stored Procedure grows with the distance between client and server, because far fewer requests are sent. For example, when run on a laptop outside of Azure, Parallel import achieved only ~100 items per second while Stored Procedure achieved ~1000 items per second.
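
For readers unfamiliar with the parallel approach, here is a minimal sketch of issuing one request per item with a cap on how many are in flight at once. It is a generic pattern under stated assumptions, not this repository's implementation; `runWithConcurrency` and the `worker` callback are hypothetical names, and in a real import the worker would wrap a single-item create or upsert call.

```typescript
// Runs worker(item) for every item, keeping at most `limit` calls in flight.
// Each "lane" pulls the next unprocessed index until the list is exhausted.
async function runWithConcurrency<T>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<void>
): Promise<void> {
  let next = 0;

  async function lane(): Promise<void> {
    while (next < items.length) {
      const item = items[next++];
      await worker(item);
    }
  }

  // Never start more lanes than there are items, and always at least one.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
}
```

Because every item still costs one round trip in this pattern, network latency is paid once per item, which is consistent with the gap reported for the laptop run.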

The limitation of Stored Procedure is that it is scoped to a single Partition Key value, so this kind of bulk import works only if all items in a call share the same Partition Key value.
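
One way to work within this limitation, sketched here as an assumption rather than something this repository is known to do, is to group the items by partition key value first and then send each group in its own stored procedure call. The helper name `groupByPartitionKey` and the `partitionKeyOf` callback are hypothetical.

```typescript
// Groups items by partition key value so that each group can be handed to a
// separate stored procedure call. Insertion order within a group is preserved.
function groupByPartitionKey<T>(
  items: T[],
  partitionKeyOf: (item: T) => string
): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const item of items) {
    const key = partitionKeyOf(item);
    const group = groups.get(key);
    if (group) {
      group.push(item);
    } else {
      groups.set(key, [item]);
    }
  }
  return groups;
}
```

With many distinct partition key values the groups become small, and the request-count advantage of the stored procedure shrinks accordingly.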

Attribution

The original author of the bulk import with stored procedures is Patrick Schuler, and the main parts of the related code have been taken from here.
