Comments (7)
Try adding the rewriteBatchedStatements=true parameter to your JDBC URL.
from seatunnel.
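Before tuning further, it can help to confirm that the flag is really present in the URL the job uses. The following is a minimal Python sketch, assuming a MySQL-style JDBC URL whose options follow a `?` as `key=value` pairs joined by `&`. The helper name `jdbc_url_params` is hypothetical and is not part of SeaTunnel or the MySQL driver.

```python
from urllib.parse import parse_qsl


def jdbc_url_params(url: str) -> dict:
    """Return the key=value options that follow '?' in a JDBC URL."""
    # Everything after the first '?' is the option string; no '?' means no options.
    _, _, query = url.partition("?")
    return dict(parse_qsl(query, keep_blank_values=True))


# Placeholder host and database, mirroring the masked URL style used in this thread.
url = (
    "jdbc:mysql://xxx:3306/xxx?useUnicode=true"
    "&rewriteBatchedStatements=true&useSSL=false"
)
params = jdbc_url_params(url)
print(params.get("rewriteBatchedStatements"))
```

A missing key comes back as `None`, which makes a typo in the parameter name easy to spot.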
Thank you, but this parameter was already in my URL, and the write speed still did not meet expectations.
plugin_name = jdbc
user = xxxxx
url = "jdbc:mysql://xxxxxx/xxxxxxx?allowMultiQueries=true&useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true&useSSL=false"
enable_upsert = true
generate_sink_sql = true
database = db_name
table = table_name
primary_keys = [xxx,xxx]
Try this config. It generates the insert SQL automatically. I use it, and the write speed is good.
`env {
  execution.parallelism = 10
  job.mode = "BATCH"
}

source {
  Hive {
    table_name = ""
    metastore_uri = ""
    result_table_name = "Table_test"
    hdfs_site_path = "/home/hadoop/hadoop-3.2.2/etc/hadoop/hdfs-site.xml"
    hive_site_path = "/home/hadoop/hive-2.3.9/conf/hive-site.xml"
  }
}

transform {
  sql {
    source_table_name = "Table_test"
    query = "select xxx,xxx from Table_test"
    result_table_name = "Table_test2"
  }
}

sink {
  Jdbc {
    url = "jdbc:mysql://xxx:3306/xxx?allowMultiQueries=true&useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true&useSSL=false"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    enable_upsert = true
    generate_sink_sql = true
    password = "xxx"
    database = "xxx"
    primary_keys = [xxx,xxx,xxx]
    table = "xxx"
  }
}`
I am using version 2.3.1 with the configuration shown above, but the extraction speed is the same as before. Is there something wrong with my configuration?
Throughput is approximately 12,000 rows per second.
I set primary_keys to the grain-level columns of the Hive table, but those columns are not defined as a primary key in the MySQL table. Does this have any impact?
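On the primary-key question: MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` only updates a row when the insert would collide with a PRIMARY KEY or UNIQUE index on the target table, so without such an index every row is simply inserted. The Python sketch below builds a statement of that shape to make the dependency visible. It assumes this is the kind of upsert SQL generated for MySQL when `enable_upsert = true`; the function `build_upsert_sql` and the column names are hypothetical and are not SeaTunnel's actual code.

```python
from typing import Sequence


def build_upsert_sql(table: str, columns: Sequence[str]) -> str:
    """Build a MySQL-style upsert statement with '?' placeholders."""
    col_list = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Conflicts are detected by indexes defined in MySQL, not by anything in this SQL text.
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in columns)
    return (
        f"INSERT INTO `{table}` ({col_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Hypothetical table and column names, for illustration only.
print(build_upsert_sql("table_name", ["id", "dt", "amount"]))
```

Note that the statement never names the key columns. If the grain-level columns are meant to identify rows, the MySQL table would likely need a UNIQUE index or primary key on them for the update branch to ever run.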
Hello. In seatunnel-2.3.5, with the same configuration, the parallelism parameter does not take effect.
However, after adding read_limit.rows_per_second=10000 in seatunnel-2.3.5, the parallelism parameter does take effect and the extraction speed improves significantly. Do you know the reason?
This issue has been automatically marked as stale because it has had no activity in the last 30 days. It will be closed in the next 7 days if no further activity occurs.
I had the same issue in version 2.3.5: the parallelism parameter does not work, and only one parallel task runs.