Giter Site home page Giter Site logo

Comments (11)

ddcw avatar ddcw commented on August 13, 2024

Traceback (most recent call last): File "/root/ibd2sql-main/main.py", line 169, in ldata = innodb_index.index(bdata,pk,columns) File "/root/ibd2sql-main/innodb_index.py", line 127, in index data_list.append(read_row(columns,pk,bdata,offset)) File "/root/ibd2sql-main/innodb_index.py", line 101, in read_row return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_index.py", line 101, in return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_type.py", line 205, in transdata return bdata.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 0: invalid continuation byte

从报错来看, 是varchar数据类型做decode报的, 是第一行数据就报错了, 还是解析了一部分再报错的呢?
原因大概如下:

  1. 未获取到正确的varchar数据
  2. 表结构中含有不支持的数据类型
  3. 其它/BUG

可否提供下脱敏后的相关DDL呢.
python3 main.py --ddl xxxx.ibd

from ibd2sql.

ruan121212 avatar ruan121212 commented on August 13, 2024

Traceback (most recent call last): File "/root/ibd2sql-main/main.py", line 169, in ldata = innodb_index.index(bdata,pk,columns) File "/root/ibd2sql-main/innodb_index.py", line 127, in index data_list.append(read_row(columns,pk,bdata,offset)) File "/root/ibd2sql-main/innodb_index.py", line 101, in read_row return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_index.py", line 101, in return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_type.py", line 205, in transdata return bdata.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 0: invalid continuation byte

从报错来看, 是varchar数据类型做decode报的, 是第一行数据就报错了, 还是解析了一部分再报错的呢? 原因大概如下:

  1. 未获取到正确的varchar数据
  2. 表结构中含有不支持的数据类型
  3. 其它/BUG

可否提供下脱敏后的相关DDL呢. python3 main.py --ddl xxxx.ibd

有执行出了一些insert后报错的

python3 main.py --ddl /root/res_resource_history.ibd

CREATE Table IF NOT EXISTS xxx.xxx(
id bigint NOT NULL COMMENT '主键ID' ,
oid bigint NOT NULL COMMENT '资源初始ID' ,
project_id bigint DEFAULT NULL COMMENT '所属项目ID' ,
uri varchar(255) DEFAULT NULL COMMENT '资源统一标识' ,
extension varchar(64) DEFAULT NULL COMMENT '资源扩展名' ,
type_code varchar(255) DEFAULT NULL COMMENT '类型编号' ,
deleted tinyint(1) unsigned zerofill NOT NULL DEFAULT 0 COMMENT '是否删除,0表示不删除,1表示已删除' ,
version int DEFAULT NULL DEFAULT 0 COMMENT '版本号' ,
creator_id bigint DEFAULT NULL COMMENT '创建者id' ,
creator varchar(255) DEFAULT NULL COMMENT '创建者名称' ,
create_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间' ,
updator varchar(255) DEFAULT NULL COMMENT '更新者名称' ,
updator_id bigint DEFAULT NULL COMMENT '更新者ID' ,
update_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '最后更新时间' ,
PRIMARY KEY (id)
)ENGINE=InnoDB ;

from ibd2sql.

ddcw avatar ddcw commented on August 13, 2024

Traceback (most recent call last): File "/root/ibd2sql-main/main.py", line 169, in ldata = innodb_index.index(bdata,pk,columns) File "/root/ibd2sql-main/innodb_index.py", line 127, in index data_list.append(read_row(columns,pk,bdata,offset)) File "/root/ibd2sql-main/innodb_index.py", line 101, in read_row return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_index.py", line 101, in return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换 File "/root/ibd2sql-main/innodb_type.py", line 205, in transdata return bdata.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 0: invalid continuation byte

从报错来看, 是varchar数据类型做decode报的, 是第一行数据就报错了, 还是解析了一部分再报错的呢? 原因大概如下:

  1. 未获取到正确的varchar数据
  2. 表结构中含有不支持的数据类型
  3. 其它/BUG

可否提供下脱敏后的相关DDL呢. python3 main.py --ddl xxxx.ibd

有执行出了一些insert后报错的

python3 main.py --ddl /root/res_resource_history.ibd

CREATE Table IF NOT EXISTS xxx.xxx( id bigint NOT NULL COMMENT '主键ID' , oid bigint NOT NULL COMMENT '资源初始ID' , project_id bigint DEFAULT NULL COMMENT '所属项目ID' , uri varchar(255) DEFAULT NULL COMMENT '资源统一标识' , extension varchar(64) DEFAULT NULL COMMENT '资源扩展名' , type_code varchar(255) DEFAULT NULL COMMENT '类型编号' , deleted tinyint(1) unsigned zerofill NOT NULL DEFAULT 0 COMMENT '是否删除,0表示不删除,1表示已删除' , version int DEFAULT NULL DEFAULT 0 COMMENT '版本号' , creator_id bigint DEFAULT NULL COMMENT '创建者id' , creator varchar(255) DEFAULT NULL COMMENT '创建者名称' , create_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间' , updator varchar(255) DEFAULT NULL COMMENT '更新者名称' , updator_id bigint DEFAULT NULL COMMENT '更新者ID' , update_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '最后更新时间' , PRIMARY KEY (id) )ENGINE=InnoDB ;

数据类型均是支持的, 我在我环境测试了下该表结构(10W数据) 是能正常解析并导入原表的. 如下为测试过程.

([email protected]) [db1]> checksum table t20230926;
+---------------+------------+
| Table         | Checksum   |
+---------------+------------+
| db1.t20230926 | 3979957364 |
+---------------+------------+
1 row in set (0.08 sec)

([email protected]) [db1]> drop table t20230926;
Query OK, 0 rows affected (0.01 sec)

([email protected]) [db1]> checksum table t20230926;
+---------------+------------+
| Table         | Checksum   |
+---------------+------------+
| db1.t20230926 | 3979957364 |
+---------------+------------+
1 row in set (0.08 sec)

([email protected]) [db1]>

(venv) 12:56:24 [root@ddcw21 ibd2sql_v0.2]#python main.py /data/mysql_3314/mysqldata/db1/t20230926.ibd --ddl --sql > t20230926.sql
(venv) 12:56:33 [root@ddcw21 ibd2sql_v0.2]#tail ^C
(venv) 12:57:13 [root@ddcw21 ibd2sql_v0.2]#wc -l t20230926.sql 
100019 t20230926.sql
(venv) 12:57:16 [root@ddcw21 ibd2sql_v0.2]#tail t20230926.sql 
(venv) 12:57:19 [root@ddcw21 ibd2sql_v0.2]#
(venv) 12:57:21 [root@ddcw21 ibd2sql_v0.2]#
(venv) 12:57:21 [root@ddcw21 ibd2sql_v0.2]#mysql -h127.0.0.1 -P3314 -uroot -pxxxx -Ddb1 < t20230926.sql 
mysql: [Warning] Using a password on the command line interface can be insecure.
(venv) 12:58:43 [root@ddcw21 ibd2sql_v0.2]#

该ibd文件是否有损坏呢, 如果有损坏的话, 需要跳过改行/页数据
比如跳过该页, 修改main.py 169行左右如下代码

try:
	if parser.DELETED:
		ldata = innodb_index.index_deleted(bdata,pk,columns)
	else:
		ldata = innodb_index.index(bdata,pk,columns)
except Exception as e:
	#print(e)
	continue

from ibd2sql.

ruan121212 avatar ruan121212 commented on August 13, 2024

我现在最新的版本ibd2sql,数据库版本是8.0.30,创建了一个表,插入了一行数据,在执行main.py --ddl可以正常,但是--sql就报错。以下是数据
DROP TABLE IF EXISTS xxl_job_lock;
/*!40101 SET @saved_cs_client = @@character_set_client /;
/
!50503 SET character_set_client = utf8mb4 /;
CREATE TABLE xxl_job_lock (
lock_name varchar(50) NOT NULL COMMENT '锁名称',
PRIMARY KEY (lock_name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
/
!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table xxl_job_lock

LOCK TABLES xxl_job_lock WRITE;
/*!40000 ALTER TABLE xxl_job_lock DISABLE KEYS /;
INSERT INTO xxl_job_lock VALUES ('schedule_lock');
/
!40000 ALTER TABLE xxl_job_lock ENABLE KEYS */;
UNLOCK TABLES;

报错是
1

from ibd2sql.

ruan121212 avatar ruan121212 commented on August 13, 2024

在帮忙看下这个问题,谢谢

from ibd2sql.

ddcw avatar ddcw commented on August 13, 2024

我现在最新的版本ibd2sql,数据库版本是8.0.30,创建了一个表,插入了一行数据,在执行main.py --ddl可以正常,但是--sql就报错。以下是数据 DROP TABLE IF EXISTS xxl_job_lock; /*!40101 SET @saved_cs_client = @@character_set_client /; /!50503 SET character_set_client = utf8mb4 /; CREATE TABLE xxl_job_lock ( lock_name varchar(50) NOT NULL COMMENT '锁名称', PRIMARY KEY (lock_name) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci; /!40101 SET character_set_client = @saved_cs_client */;

--

-- Dumping data for table xxl_job_lock
LOCK TABLES xxl_job_lock WRITE; /*!40000 ALTER TABLE xxl_job_lock DISABLE KEYS /; INSERT INTO xxl_job_lock VALUES ('schedule_lock'); /!40000 ALTER TABLE xxl_job_lock ENABLE KEYS */; UNLOCK TABLES;

报错是 1

对于只有一个主键的情况, 解析存在BUG, 已修复. 可按如下步骤处理, 也可以重新下载
编辑 innodb_index.py 在data_offset = offset后面添加一行如下代码即可

var_offset += 1 if len_column == 1 else 0

例子

(venv) 15:42:48 [root@ddcw21 ibd2sql_v0.4]#python main.py  /data/mysql_3314/mysqldata/db1/t20231213_3.ibd --ddl --sql

 CREATE Table IF NOT EXISTS `db1`.`t20231213_3`(
`id` varchar(20) NOT NULL ,
PRIMARY KEY (`id`)
)ENGINE=InnoDB ; 

INSERT INTO `db1`.`t20231213_3` VALUES('321d的asd');

from ibd2sql.

ruan121212 avatar ruan121212 commented on August 13, 2024

新下载的那个可以解决上面的问题了,但是还有一个问题在帮忙看下,感谢。
image

数据在文本里面
sql.txt

from ibd2sql.

ddcw avatar ddcw commented on August 13, 2024

新下载的那个可以解决上面的问题了,但是还有一个问题在帮忙看下,感谢。 image

数据在文本里面 sql.txt

又是BUG .... 看来得找时间重写了 -_-
修改 innodb_index.py 或者重新下载

var_offset +=1 if null_bitmask > 0 else 0
bitmaskvar = 0
..........
        for x in range(len_column):
                col = columns[x]
                bitmaskvar += 1 if col['isvar'] else 0
                if tdata[x] is not None or null_bitmask&(1<<bitmaskvar):

from ibd2sql.

ruan121212 avatar ruan121212 commented on August 13, 2024

可以了,非常感谢

from ibd2sql.

kumv-net avatar kumv-net commented on August 13, 2024

--ddl

 CREATE Table IF NOT EXISTS `kudongchaowan`.`address`(
`created_at` datetime(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ,
`updated_at` datetime(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ,
`delete_at` datetime(6) DEFAULT NULL ,
`id` varchar(36) NOT NULL ,
`user_id` int NOT NULL ,
`code` varchar(255) NOT NULL ,
`city` varchar(255) NOT NULL ,
`address` varchar(255) NOT NULL ,
`name` varchar(255) NOT NULL ,
`phone` varchar(255) NOT NULL ,
`default` tinyint NOT NULL DEFAULT 0 ,
PRIMARY KEY (`id`),
KEY `FK_35cd6c3fafec0bb5d072e24ea20` (`user_id`),
CONSTRAINT `FK_35cd6c3fafec0bb5d072e24ea20` FOREIGN KEY (`user_id`) REFERENCES `kudongchaowan`.`sys_user` (`id`)
)ENGINE=InnoDB ;

加--sql报错

Traceback (most recent call last):
  File "D:\hby\project\github\ibd2sql\main.py", line 169, in <module>
    ldata = innodb_index.index(bdata,pk,columns)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_index.py", line 130, in index
    data_list.append(read_row(columns,pk,bdata,offset))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_index.py", line 104, in read_row
    return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_type.py", line 205, in transdata
    return bdata.decode()
           ^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 0: invalid start byte

加--sql报错 原数据库类型是utf8mb4

from ibd2sql.

ddcw avatar ddcw commented on August 13, 2024

--ddl

 CREATE Table IF NOT EXISTS `kudongchaowan`.`address`(
`created_at` datetime(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ,
`updated_at` datetime(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ,
`delete_at` datetime(6) DEFAULT NULL ,
`id` varchar(36) NOT NULL ,
`user_id` int NOT NULL ,
`code` varchar(255) NOT NULL ,
`city` varchar(255) NOT NULL ,
`address` varchar(255) NOT NULL ,
`name` varchar(255) NOT NULL ,
`phone` varchar(255) NOT NULL ,
`default` tinyint NOT NULL DEFAULT 0 ,
PRIMARY KEY (`id`),
KEY `FK_35cd6c3fafec0bb5d072e24ea20` (`user_id`),
CONSTRAINT `FK_35cd6c3fafec0bb5d072e24ea20` FOREIGN KEY (`user_id`) REFERENCES `kudongchaowan`.`sys_user` (`id`)
)ENGINE=InnoDB ;

加--sql报错

Traceback (most recent call last):
  File "D:\hby\project\github\ibd2sql\main.py", line 169, in <module>
    ldata = innodb_index.index(bdata,pk,columns)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_index.py", line 130, in index
    data_list.append(read_row(columns,pk,bdata,offset))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_index.py", line 104, in read_row
    return [ innodb_type.transdata(columns[x]['dtype'],tdata[x],columns[x]['is_unsigned'],columns[x]['extra']) if tdata[x] is not None else '' for x in range(len_column)] #数据类型转换
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\hby\project\github\ibd2sql\innodb_type.py", line 205, in transdata
    return bdata.decode()
           ^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 0: invalid start byte

加--sql报错 原数据库类型是utf8mb4

我测试了下那个表结构是可以解析的. 从报错来看是字符串解析有问题, 也可能是读字段的时候有问题. 也可能是其它

  1. 跳过报错, 编辑main.py的169行为如下
					try:
						ldata = innodb_index.index(bdata,pk,columns)
					except:
						continue
  1. 整理下表
optimize table `kudongchaowan`.`address`;
  1. 等下个版本, 我加debug功能后再调试

from ibd2sql.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.