bruin-data / ingestr
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Home Page: https://bruin-data.github.io/ingestr/
License: MIT License
OS: macOS 14.3.1
Python: 3.11.8
Installed following the instructions, with pip install ingestr.
I'm getting "ModuleNotFoundError: No module named 'MySQLdb'" when I try mysql -> csv.
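A possible workaround, assuming ingestr hands the URI to SQLAlchemy (whose bare mysql:// scheme resolves to the MySQLdb driver by default): install a pure-Python driver such as PyMySQL (pip install pymysql) and name it in the scheme. The URI below is a placeholder.

```python
# Sketch of a workaround (assumption: the URI is passed to SQLAlchemy, where
# a bare "mysql://" scheme defaults to the MySQLdb driver). Naming an
# installed pure-Python driver in the scheme avoids the missing module:
#   pip install pymysql
uri = "mysql://user:pass@localhost:3306/mydb"  # placeholder credentials
uri_pymysql = uri.replace("mysql://", "mysql+pymysql://", 1)
print(uri_pymysql)  # mysql+pymysql://user:pass@localhost:3306/mydb
```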
Will love to have support for mongodb
I get an "Invalid IPv6 URL" error when using a valid Postgres URL. Any idea how to bypass it? The error stems from the urllib package.
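One way this error can arise (an assumption about this particular report): if the password contains a "[", Python's urllib parser expects an IPv6 host literal and raises "Invalid IPv6 URL". Percent-encoding the password sidesteps it:

```python
from urllib.parse import quote, urlparse

# Hypothetical reproduction: a "[" in the password makes urllib's parser
# look for an IPv6 host literal, so parsing raises "Invalid IPv6 URL".
try:
    urlparse("postgresql://user:pa[ss@localhost:5432/db")
except ValueError as e:
    print(e)  # Invalid IPv6 URL

# Percent-encoding the credentials keeps the parser happy.
encoded = quote("pa[ss", safe="")
print(urlparse(f"postgresql://user:{encoded}@localhost:5432/db").hostname)  # localhost
```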
Looks like a schema must be included in the table name (schema_name.table_name) when sourcing from a SQLite table, but SQLite does not have a schema unless a database is attached.
raise ValueError("Table name must be in the format schema.<table>")
ValueError: Table name must be in the format schema.<table>
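If the schema.<table> requirement also applies to SQLite sources, SQLite's implicit main schema may satisfy it, e.g. --source-table 'main.my_table' (an untested assumption about ingestr's behavior). SQLite always exposes the primary database under that name, even with nothing attached:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER)")
# The primary database is always addressable via the implicit "main" schema,
# even when no other database has been ATTACHed.
rows = con.execute("SELECT * FROM main.users").fetchall()
print(rows)  # []
```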
Thanks for the neat tool!
I tried to load tables from one MS SQL database into another. I got an error message saying that an "alter table" cannot add columns without a default value unless they are nullable.
This is my Test:
.\ingestr ingest --source-uri 'mssql://:@localhost:1433/WideWorldImporters?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes' --source-table 'Sales.Orders' --dest-uri 'mssql://:@localhost:1433/Staging?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes' --dest-table 'raw_wwi.Orders'
I have run a quick test, moving a 2M rows table from Snowflake to Snowflake.
It took ~4 min locally, with 3 min being spent at the "normalization" stage.
Since the data is already in a table, couldn't we request the schema from the database directly and then use it when calling dlt?
I also tried to replicate the same tables with Sling and it took ~1 min to run.
If the normalization time can be cut down, the performance would be quite close to Sling's.
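The suggestion above can be illustrated with SQLite as a stand-in for Snowflake: when the source is already a table, its column types can be read from the database catalog up front instead of being inferred row-by-row during normalization (a sketch of the idea, not ingestr's actual code path):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, amount REAL, placed_at TEXT)")
# Ask the database for the schema directly (PRAGMA table_info) rather than
# letting the loader infer types by scanning the rows themselves.
schema = {name: col_type for _, name, col_type, *_ in con.execute("PRAGMA table_info(orders)")}
print(schema)  # {'id': 'INTEGER', 'amount': 'REAL', 'placed_at': 'TEXT'}
```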
Curious what was breaking the SQLite destination.
Any love for Parquet files as source or destination?
Hi,
I wanted to try ingestr with Snowflake because it looks really cool. I tried to find the connection-handling part in the repository but couldn't, so I haven't ruled out that I'm the one causing the error. However, when I connect to Snowflake using SQLAlchemy directly, I have no issues.
I managed to write the input following the documentation, but if the password contains the character "@", it throws a 403 error. My understanding is that when ingestr parses the connection string, it treats the first "@" it finds as the separator.
From the docs, the Snowflake connection string should follow this format:
snowflake://user:password@account/dbname?warehouse=COMPUTE_WH
so ingestr gets confused when the password has this structure:
snowflake://user:first_half_of_password@second_half_of_password@account/dbname?warehouse=COMPUTE_WH
because the error says: Failed to connect to DB: second_half_of_password@my_account.snowflakecomputing.com:443. Bad request; operation not supported.
(Background on this error at: https://sqlalche.me/e/14/4xp6)
This is the full command that I input:
ingestr ingest --source-uri "snowflake://user:first_half_of_password@second_half_of_password@my_account/my_db?warehouse=my_wh"
--source-table "my_schema.INGESTER_SRC"
--dest-uri "snowflake://user:first_half_of_password@second_half_of_password@my_account/my_db?warehouse=my_wh"
--dest-table "my_schema.INGESTER_DEST"
When I run it, I see ingestr asking if I would like to continue, so that part is ok.
I understand that you are under no obligation to fix it or to reply to this issue, just reporting it because I would like to use ingestr.
Thank you.
A.
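A likely workaround for the report above, assuming ingestr hands the URI to a standard URL parser: percent-encode the password so the extra "@" becomes %40 and only the real host separator remains (placeholder values throughout):

```python
from urllib.parse import quote, urlparse

password = "first_half_of_password@second_half_of_password"  # hypothetical
encoded = quote(password, safe="")  # "@" -> "%40"
uri = f"snowflake://user:{encoded}@my_account/my_db?warehouse=my_wh"
# With the password encoded, the URI is unambiguous for any parser.
print(urlparse(uri).hostname)  # my_account
```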
It would be great if your tool added support for ClickHouse.
In main.py there is this code:
total: int | None = None,
message: str | None = None,
But for Python 3.9 it needs to be
total: Optional[int] = None,
message: Optional[str] = None,
otherwise the tool crashes on Python 3.9.
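For reference, a minimal 3.9-compatible sketch (the function name and body are illustrative, not ingestr's actual code). Note that the alternative fix, from __future__ import annotations, only helps when the annotations are never evaluated at runtime, which CLI frameworks that inspect signatures may do:

```python
from typing import Optional

# Python 3.9 cannot evaluate "int | None" in annotations (PEP 604 syntax
# arrived in 3.10), so typing.Optional is the portable spelling.
def progress(total: Optional[int] = None, message: Optional[str] = None) -> str:
    # Illustrative body; the real signature lives in ingestr's main.py.
    return f"{message or 'progress'}: {total if total is not None else '?'}"

print(progress(42, "rows copied"))  # rows copied: 42
```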
A MySQL destination would be nice. I see there is a source already; any particular reason the destination is missing, other than lack of time?
It would be great to have an Elasticsearch destination. I would try to use this today if it had support.
We could even implement readers for third-party configurations.
Personally, I implemented a reader for dbt profiles and could contribute it to this project.
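A sketch of what such a reader could look like, assuming the profiles.yml target has already been parsed into a dict (the helper name and mapping are hypothetical, not an existing ingestr API):

```python
# Hypothetical helper: map a parsed dbt profile target to an ingestr-style URI.
def profile_target_to_uri(target: dict) -> str:
    if target["type"] == "postgres":
        return (
            f"postgresql://{target['user']}:{target['password']}"
            f"@{target['host']}:{target['port']}/{target['dbname']}"
        )
    raise NotImplementedError(f"unsupported profile type: {target['type']}")

target = {"type": "postgres", "user": "u", "password": "p",
          "host": "localhost", "port": 5432, "dbname": "analytics"}
print(profile_target_to_uri(target))  # postgresql://u:p@localhost:5432/analytics
```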