無序uuid對數據庫的影響
由于最近在做超大表的性能測試,在該過程中發(fā)現了無序uuid做主鍵對表插入性能有一定影響。結合實際情況發(fā)現當表的數據量越大,對表插入性能的影響也就越大。
測試環(huán)境
PostgreSQL創(chuàng)建插入腳本,測試各種情況的tps。
數據庫版本:PostgreSQL 10.4 (ArteryBase 5.0.0, Thunisoft)
操作系統配置:CentOS Linux release 7 ,32GB內存,8 cpu
測試參數:pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
空表,1000w數據,5000w數據,一億數據的各種主鍵測試。
測試無序的uuid,有序的uuid,序列,有普通btree,有唯一索引和沒有主鍵的情況
測試
1.創(chuàng)建表
--無序的uuid
pgbenchdb=# create table test_uuid_v4(id char(32) primary key);
CREATE TABLE
--有序的uuid
pgbenchdb=# create table test_time_nextval(id char(32) primary key);
CREATE TABLE
--遞增序列
pgbenchdb=# create table test_seq_bigint(id int8 primary key);
CREATE TABLE
--創(chuàng)建序列
create sequence test_seq start with 1 ;
2.測試腳本
--測試無序uuid腳本
vi pgbench_uuid_v4.sql
insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--測試有序uuid腳本
vi pgbench_time_nextval.sql
insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
--測試序列腳本
vi pgbench_seq_bigint.sql
insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));
無序uuid,無數據情況
磁盤使用情況
avg-cpu: %user %nice %system %iowait %steal %idle
0.76 0.00 0.38 4.67 0.00 94.19
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 96.00 0.00 2048.00 42.67 1.02 10.67 0.00 10.67 10.33 99.20
dm-0 0.00 0.00 0.00 96.00 0.00 2048.00 42.67 1.02 10.66 0.00 10.66 10.32 99.10
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
tps:
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 53494
latency average = 8.974 ms
tps = 891.495404 (including connections establishing)
tps = 891.588967 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
9.006 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
無數據情況下,tps
類別 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
無序uuid | 919 | 907 | 891 | 906 | 99.2% | 10.66
有序uuid | 985 | 882 | 932 | 933 | 98.7% | 4.4
序列 | 1311 | 1277 | 1280 | 1289 | 97.5% | 3.4
向表里面初始化100w數據
pgbenchdb=# insert into test_uuid_v4 (id) select replace(uuid_generate_v4()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 43389.817 ms (00:43.390)
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 30585.134 ms (00:30.585)
pgbenchdb=# insert into test_seq_bigint select generate_series (1,1000000);
INSERT 0 1000000
Time: 9818.639 ms (00:09.819)
無序uuid插入100w需要43s,有序需要30s,序列需要10s。
插入一百萬數據后的tps
類別 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
無序uuid | 355 | 440 | 302 | 365 | 98.8% | 13
有序uuid | 948 | 964 | 870 | 927 | 97.2% | 4.0
序列 | 1159 | 1234 | 1115 | 1169 | 96.6% | 3.5
插入一千萬數據后的tps
類別 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
無序uuid | 260 | 292 | 227 | 260 | 99.2% | 16.8
有序uuid | 817 | 960 | 883 | 870 | 97.7% | 3.9
序列 | 1305 | 1261 | 1270 | 1278 | 96.8% | 3.0
插入五千萬數據后
向表中插入5kw數據,并且添加主鍵
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 453985.318 ms (07:33.985)
pgbenchdb=# insert into test_seq_bigint select generate_series (1,50000000);
INSERT 0 50000000
Time: 352206.160 ms (05:52.206)
pgbenchdb=# insert into test_uuid_v4 (id) select replace(uuid_generate_v4()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 1159689.338 ms (00:19:19.689)
在無主鍵情況下,插入五千萬數據,有序uuid耗時7分鐘,序列耗時6分鐘,而無序uuid耗時接近20分鐘。
pgbenchdb=# alter table test_uuid_v4 add primary key ("id");
ALTER TABLE
Time: 845199.296 ms (14:05.199)
pgbenchdb=# alter table test_time_nextval add primary key ("id");
ALTER TABLE
Time: 932151.103 ms (15:32.151)
pgbenchdb=# alter table test_seq_bigint add primary key ("id");
ALTER TABLE
Time: 148138.871 ms (02:28.139)
pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_uuid_v4'));
pg_size_pretty
----------------
6072 MB
(1 row)
Time: 0.861 ms
pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_time_nextval'));
pg_size_pretty
----------------
6072 MB
(1 row)
Time: 0.942 ms
pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_seq_bigint'));
pg_size_pretty
----------------
2800 MB
(1 row)
Time: 0.699 ms
插入5kw后
類別 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
無序uuid | 162 | 163 | 163 | 163 | 99.6% | 18.4
有序uuid | 738 | 933 | 979 | 883 | 97.7% | 3.9
序列 | 1132 | 1264 | 1265 | 1220 | 96.8% | 3.5
插入1億條數據后
類別 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
無序uuid | 121 | 131 | 143 | 131 | 99.6% | 28.2
有序uuid | 819 | 795 | 888 | 834 | 99.2% | 28.7
序列 | 1193 | 1115 | 1109 | 1139 | 96.8% | 11.3
普通btree索引
上面測了無序uuid,1kw情況下,有主鍵的tps是260,無主鍵的tps是1234。嘗試測試普通的索引,和唯一索引tps
--創(chuàng)建普通索引
pgbenchdb=# create index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 316367.010 ms (05:16.367)
--創(chuàng)建普通索引后
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13308
latency average = 36.080 ms
tps = 221.727391 (including connections establishing)
tps = 221.749660 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
38.512 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--創(chuàng)建唯一索引
pgbenchdb=# drop index i_test_uuid_v4_id;
DROP INDEX
Time: 267.451 ms
pgbenchdb=# create unique index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 153372.622 ms (02:33.373)
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
^[[3~transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13847
latency average = 34.693 ms
tps = 230.593988 (including connections establishing)
tps = 230.620469 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
36.410 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
無論是普通btree索引和唯一索引,都會影響插入的效率。
刪除所有的主鍵索引
--刪除所有主鍵
alter table test_uuid_v4 drop constraint "test_uuid_v4_pkey";
alter table test_time_nextval drop constraint "test_time_nextval_pkey" ;
alter table test_seq_bigint drop constraint "test_seq_bigint_pkey";
1,--無序uuid:測試pgbench_uuid_v4.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74109
latency average = 6.479 ms
tps = 1234.842229 (including connections establishing)
tps = 1235.042674 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
6.112 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
2、--有序uuid,測試pgbench_time_nextval.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_time_nextval.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_time_nextval.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74027
latency average = 6.486 ms
tps = 1233.364360 (including connections establishing)
tps = 1233.482292 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
6.186 insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
3、--序列,測試pgbench_seq_bigint.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_seq_bigint.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_seq_bigint.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 76312
latency average = 6.290 ms
tps = 1271.832907 (including connections establishing)
tps = 1272.124397 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
5.916 insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));
刪除主鍵約束后,三種情況下tps非常接近,都達到了1200+。
Btree索引,插入操作的平均tps對比
類別/平均tps | 無數據 | 一千萬 | 五千萬 | 一億 |
---------------+---------+---------+---------+---------+
無序uuid | 960 | 260 | 163 | 131 |
有序uuid | 933 | 870 | 883 | 834 |
序列 | 1289 | 1278 | 1220 | 1139 |
根據測試數據可以看出無序的uuid在數據到達1kw后插入數據的tps下降的非常厲害,而有序的uuid和遞增序列下降的比較少。到一億數據的tps有序uuid是無序的6倍,序列是無序uuid的9倍。
創(chuàng)建單獨的表空間用來存儲索引信息
如果有多快磁盤那么可以將索引和數據分開存儲,以此來加快寫入的速度。
創(chuàng)建單獨的索引空間:
create tablespace indx_test owner sa location '/home/tablespace/index_test';
指定索引存儲目錄:
create index i_test_uuid_v4_id on test_uuid_v4 using btree(id) tablespace indx_test;
關于有序uuid
測試使用的sequential-uuids插件,生成的有序uuid。
有序uuid的結構為(block ID; random data),實際上就是把數據拆成兩部分,一部分自增,一部分隨機。
sequential-uuids
sequential-uuids-git
提供了兩種算法:
1.uuid_sequence_nextval(sequence regclass, block_size int default 65536, block_count int default 65536)
前綴為自增序列,如果塊ID使用2字節(jié)存儲,一個索引BLOCK里面可以存儲256條記錄(假設8K的BLOCK,一條記錄包括uuid VALUE(16字節(jié))以及ctid(6字節(jié)),所以一個索引頁約存儲363條記錄(8000 /(16 + 6)))
2.uuid_time_nextval(interval_length int default 60, interval_count int default 65536) RETURNS uuid
默認每60秒內的數據的前綴是一樣的,前綴遞增1,到65535后循環(huán)。
使用uuid_time_nextval生成的有序uuid
pgbenchdb=# select id from test_time_nextval;
id
----------------------------------
a18b7dd0ca92b0b5c1844a402f9c6999
a18b540b8bbe0ddb2b6d0189b2e393c6
a18b83eb7320b0a90e625185421e065e
a18bade4ff15e05dab81ecd3f4c2dee4
a18b79e41c3bc8d2d4ba4b70447e6b29
a18bdad18d9e0d2fa1d9d675bc7129f0
a18b13723ec7be9a2f1a3aec5345a88b
a18bd9d866047aec69a064d30e9493d2
a18bd76e8c787c7464479502f381e6d7
a18ba5c0c966f81cfdbeff866618da8d
......
有序uuid前四位有序,后面的隨機生成。
結語
1.關于有序的uuid,前4位是有序的,后面都是隨機生成的。
2.在該環(huán)境中發(fā)現,無序uuid隨著數據量的不斷增大,tps下滑比較厲害。
3.由于btree索引的存在,無序的uuid會導致大量的離散io。導致磁盤使用率高。進而影響插入效率。隨著表數據量的增大更加明顯。
4.該測試是在普通的磁盤上面測試,并未在ssd上面測試。
5.如果要使用有序uuid,有多種實現方式,還需要考慮分布式情況下生成全局有序uuid。
以上就是postgresql無序uuid性能測試的詳細內容,更多關于postgresql無序uuid性能測試的資料請關注腳本之家其它相關文章!
您可能感興趣的文章:- 使用Ruby on Rails和PostgreSQL自動生成UUID的教程
- PostgreSQL 主備數據宕機恢復測試方案
- 使用Postgresql 實現快速插入測試數據
- 基于postgresql行級鎖for update測試
- 一個提升PostgreSQL性能的小技巧
- PostgreSQL 數據庫性能提升的幾個方面