Comparing Data Import Between MySQL and PostgreSQL
2020-11-09 19:18:05  Editor: 小采

I benchmarked the various LOAD FILE methods of MySQL and PostgreSQL on a virtual machine and timed each one. Since the tests ran in a VM, treat the times as rough reference points rather than precise measurements.
MySQL tools:
1. The bundled mysqlimport utility.
2. The command-line statement load data infile ...
3. A script written with the mysql-connector-python driver.
PostgreSQL tools:
1. The third-party tool pgloader.
2. The command-line statement copy ... from ...
3. A Python script written with psycopg2.
Test table structure:
mysql> desc t1;
+----------+-----------+------+-----+-------------------+-------+
| Field    | Type      | Null | Key | Default           | Extra |
+----------+-----------+------+-----+-------------------+-------+
| id       | int(11)   | NO   | PRI | NULL              |       |
| rank     | int(11)   | NO   |     | NULL              |       |
| log_time | timestamp | YES  |     | CURRENT_TIMESTAMP |       |
+----------+-----------+------+-----+-------------------+-------+
3 rows in set (0.00 sec)

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (6.80 sec)
Test CSV file:
t1.csv
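The article doesn't show how t1.csv was produced. As a point of reference, here is a hypothetical generator for a compatible file; the rank distribution and the fixed log_time value are assumptions, and the \r\n line endings match the load options used below.

# Hypothetical generator for t1.csv; the article does not show how the test
# file was created, so the values below are illustrative assumptions.
import csv
import random
from datetime import datetime

now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')

with open('/tmp/t1.csv', 'w', newline='') as f:
    # \r\n line endings to match "lines terminated by '\r\n'" in the loads below
    w = csv.writer(f, lineterminator='\r\n')
    for i in range(1, 1000001):
        w.writerow([i, random.randint(1, 1000000), now])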
MySQL's own loader (time: 24 seconds):
mysql> load data infile '/tmp/t1.csv' into table t1 fields terminated by ',' enclosed by '"' lines terminated by '\r\n';
Query OK, 1000000 rows affected (24.21 sec)
Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0
MySQL Python script (time: 23 seconds):
>>> Running 23.2 Seconds
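The script itself isn't included in the article, so the following is only a sketch of what a mysql-connector-python loader might look like; the batch size, column list, and credentials are assumptions, and the original could just as well have issued LOAD DATA LOCAL INFILE through the driver.

# A sketch of a mysql-connector-python loader; the article's actual script is
# not shown, so table and credential details here are assumptions.
import csv
import time

import mysql.connector

BATCH_SIZE = 10000  # assumed; trades memory for fewer round trips

start = time.time()
conn = mysql.connector.connect(host='127.0.0.1', user='root',
                               password='root', database='t_girl')
cur = conn.cursor()

with open('/tmp/t1.csv') as f:
    batch = []
    for row in csv.reader(f):
        batch.append(tuple(row))
        if len(batch) == BATCH_SIZE:
            # executemany lets the driver rewrite this into multi-row INSERTs
            cur.executemany(
                "insert into t1 (id, `rank`, log_time) values (%s, %s, %s)",
                batch)
            batch = []
    if batch:
        cur.executemany(
            "insert into t1 (id, `rank`, log_time) values (%s, %s, %s)",
            batch)

conn.commit()
cur.close()
conn.close()
print("Running %.1f Seconds" % (time.time() - start))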

MySQL's bundled mysqlimport (time: 23 seconds):
[root@mysql56-master ~]# time mysqlimport t_girl '/tmp/t1.csv' --fields-terminated-by=',' --fields-enclosed-by='"' --lines-terminated-by='\r\n' --use-threads=2 -uroot -proot
t_girl.t1: Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0

real    0m23.6s
user    0m0.016s
sys     0m0.037s
PostgreSQL's own COPY (time: 7 seconds):
t_girl=# copy t1 from '/tmp/t1.csv' with delimiter ',';
COPY 1000000
Time: 7700.332 ms
Psycopg2 driver's copy_from method (time: 6 seconds):
[root@postgresql-instance scripts]# python load_data.py
Running 5.969 Seconds.
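load_data.py isn't shown either. In psycopg2 the load direction goes through copy_from (copy_to exports a table to a file); a minimal sketch, with assumed connection parameters:

# A sketch of what load_data.py might look like; the article does not include
# the script, so the connection parameters here are assumptions.
import time

import psycopg2

start = time.time()
conn = psycopg2.connect(host='127.0.0.1', port=5432, dbname='t_girl',
                        user='t_girl', password='t_girl')
cur = conn.cursor()

with open('/tmp/t1.csv') as f:
    # copy_from streams the file through the COPY protocol, which is why the
    # driver-side load is nearly as fast as the server-side COPY above.
    cur.copy_from(f, 't1', sep=',')

conn.commit()
cur.close()
conn.close()
print("Running %.3f Seconds." % (time.time() - start))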
pgloader importing the CSV (time: 33 seconds):
[root@postgresql-instance ytt]# pgloader commands.load
             table name       read   imported     errors       time
                 ytt.t1    1000000    1000000          0    33.514s
-----------------------  ---------  ---------  ---------  ---------
      Total import time    1000000    1000000          0    33.514s
pgloader pulling data directly from MySQL (time: 51 seconds):
[root@postgresql-instance ytt]# pgloader commands.mysql
             table name       read   imported     errors       time
        fetch meta data          2          2          0     0.138s
                     t1    1000000    1000000          0    51.136s
-----------------------  ---------  ---------  ---------  ---------
      Total import time    1000000    1000000          0    51.274s
The commands.load and commands.mysql files used above:
commands.load:

LOAD CSV
    FROM '/tmp/ytt.csv' WITH ENCODING UTF-8
    (
        id, rank, log_time
    )
    INTO postgresql://t_girl:t_girl@127.0.0.1:5432/t_girl?ytt.t1
    WITH skip header = 0,
         fields optionally enclosed by '"',
         fields escaped by backslash-quote,
         fields terminated by ','
    SET work_mem to '32 MB', maintenance_work_mem to ' MB';

commands.mysql:

LOAD DATABASE
    FROM mysql://python_user:python_user@192.168.1.131:3306/t_girl?t1
    INTO postgresql://t_girl:t_girl@127.0.0.1:5432/t_girl?ytt.t1
    with data only
    SET maintenance_work_mem to 'MB',
        work_mem to '3MB',
        search_path to 'ytt';

pgloader manual: http://pgloader.io/howto/pgloader.1.html
