LIKE query optimization approaches in PostgreSQL

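When a table holds a large amount of data, fuzzy (LIKE) queries become slow. To improve query performance, the following approaches were tried and their results compared.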

I. Test setup

1. Data volume: 1,000,000 rows

2. SQL executed:

explain analyze SELECT
 c_patent,
 c_applyissno,
 d_applyissdate,
 d_applydate,
 c_patenttype_dimn,
 c_newlawstatus,
 c_abstract 
FROM
 public.t_knowl_patent_zlxx_temp 
WHERE
 c_applicant LIKE '%本溪滿族自治縣連山關鎮安平安養殖場%';

II. Results

1. Execution plan without an index:

"Gather (cost=1000.00..83803.53 rows=92 width=1278) (actual time=217.264..217.264 rows=0 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Parallel Seq Scan on t_knowl_patent_zlxx (cost=0.00..82794.33 rows=38 width=1278) (actual time=212.355..212.355 rows=0 loops=3)
  Filter: ((c_applicant)::text ~~ '%本溪滿族自治縣連山關鎮安平安養殖場%'::text)
  Rows Removed by Filter: 333333
Planning time: 0.272 ms
Execution time: 228.116 ms"

2. B-tree index

Index creation statement

CREATE INDEX idx_public_t_knowl_patent_zlxx_applicant ON public.t_knowl_patent_zlxx(c_applicant varchar_pattern_ops);

Execution plan

"Gather (cost=1000.00..83803.53 rows=92 width=1278) (actual time=208.253..208.253 rows=0 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Parallel Seq Scan on t_knowl_patent_zlxx (cost=0.00..82794.33 rows=38 width=1278) (actual time=203.573..203.573 rows=0 loops=3)
  Filter: ((c_applicant)::text ~~ '%本溪滿族自治縣連山關鎮安平安養殖場%'::text)
  Rows Removed by Filter: 333333
Planning time: 0.116 ms
Execution time: 218.189 ms"

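The plan is unchanged: with a leading % the B-tree index is not used. However, if the query is changed slightly by removing the leading % from the LIKE pattern, the plan becomes: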

Index Scan using idx_public_t_knowl_patent_zlxx_applicant on t_knowl_patent_zlxx_temp (cost=0.55..8.57 rows=92 width=1278) (actual time=0.292..0.292 rows=0 loops=1)
 Index Cond: (((c_applicant)::text ~>=~ '本溪滿族自治縣連山關鎮安平安養殖場'::text) AND ((c_applicant)::text ~<~ '本溪滿族自治縣連山關鎮安平安養殖圻'::text))
 Filter: ((c_applicant)::text ~~ '本溪滿族自治縣連山關鎮安平安養殖場%'::text)
Planning time: 0.710 ms
Execution time: 0.378 ms
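
The Index Cond in this plan shows how the planner handles a prefix pattern: it turns the fixed prefix into a range (greater than or equal to the prefix, less than the prefix with its last character incremented), which a B-tree can scan directly. A minimal sketch of the same idea follows. The table demo_applicant, the index name, and the sample values are hypothetical and not part of the test above, and a text column with text_pattern_ops stands in for the varchar column and varchar_pattern_ops used there.

```sql
-- Hypothetical demo table; names and data are illustrative only.
CREATE TABLE demo_applicant (
    id          serial PRIMARY KEY,
    c_applicant text NOT NULL
);

INSERT INTO demo_applicant (c_applicant)
VALUES ('abc farm'), ('abd farm'), ('xabc farm');

-- text_pattern_ops compares strings character by character and ignores the
-- collation, which lets a B-tree serve prefix LIKE patterns even when the
-- database collation is not C.
CREATE INDEX idx_demo_applicant_pattern
    ON demo_applicant (c_applicant text_pattern_ops);

-- Prefix pattern: eligible for an index range scan.
EXPLAIN SELECT * FROM demo_applicant WHERE c_applicant LIKE 'abc%';

-- Roughly the range derived from the prefix 'abc':
-- everything >= 'abc' and < 'abd' (last character incremented).
SELECT * FROM demo_applicant
WHERE c_applicant ~>=~ 'abc'
  AND c_applicant ~<~ 'abd';

-- Leading wildcard: there is no fixed prefix, so no range can be derived.
EXPLAIN SELECT * FROM demo_applicant WHERE c_applicant LIKE '%abc%';
```

On a table this small the planner will most likely still pick a sequential scan; the difference only shows up at sizes like the 1,000,000 rows used in the test.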

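3. GIN index

Index creation statements (stated requirement: PostgreSQL 9.6 or later, although pg_trgm has supported indexed LIKE searches since 9.1). The index name is the same as the B-tree index above, so that index has to be dropped first.

create extension pg_trgm;
CREATE INDEX idx_public_t_knowl_patent_zlxx_applicant ON public.t_knowl_patent_zlxx USING gin (c_applicant gin_trgm_ops);

Execution plan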

Bitmap Heap Scan on t_knowl_patent_zlxx (cost=244.71..600.42 rows=91 width=1268) (actual time=0.649..0.649 rows=0 loops=1)
 Recheck Cond: ((c_applicant)::text ~~ '%本溪滿族自治縣連山關鎮安平安養殖場%'::text)
 -> Bitmap Index Scan on idx_public_t_knowl_patent_zlxx_applicant (cost=0.00..244.69 rows=91 width=0) (actual time=0.647..0.647 rows=0 loops=1)
  Index Cond: ((c_applicant)::text ~~ '%本溪滿族自治縣連山關鎮安平安養殖場%'::text)
Planning time: 0.673 ms
Execution time: 0.740 ms

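III. Conclusion

A B-tree index (created with varchar_pattern_ops, as above) lets trailing-wildcard patterns such as "abc%" use the index, and GIN + pg_trgm lets patterns with wildcards on both sides, such as "%abc%", use the index. GIN indexes have drawbacks too; the index may not be used in the following cases:

When the search term is shorter than 3 characters, the index is not used. This follows from the way pg_trgm splits text into trigrams rather than from GIN as such.

When the search term is very long, for example when searching email addresses, the index may also not be used. The cause is not yet known.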
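
Since the two index types cover different pattern shapes, they can be kept side by side. The statements above reuse one index name for both, so the second CREATE INDEX would fail unless the first index is dropped. A sketch with distinct names follows; both index names here are hypothetical and chosen only for illustration.

```sql
-- pg_trgm provides the gin_trgm_ops operator class.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical name: B-tree for prefix patterns such as LIKE 'abc%'.
CREATE INDEX idx_zlxx_applicant_prefix
    ON public.t_knowl_patent_zlxx (c_applicant varchar_pattern_ops);

-- Hypothetical name: GIN trigram index for patterns such as LIKE '%abc%'.
CREATE INDEX idx_zlxx_applicant_trgm
    ON public.t_knowl_patent_zlxx USING gin (c_applicant gin_trgm_ops);

-- Compare what the planner picks for each pattern shape.
EXPLAIN SELECT c_patent FROM public.t_knowl_patent_zlxx
WHERE c_applicant LIKE 'abc%';

EXPLAIN SELECT c_patent FROM public.t_knowl_patent_zlxx
WHERE c_applicant LIKE '%abc%';
```

Keeping both indexes costs extra disk space and slows writes, so whether that trade-off is worthwhile depends on the workload.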

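Supplement: an experiment on improving PostgreSQL LIKE query performance

I. Query performance without an index

As a baseline, first test the queries without any index.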

EXPLAIN ANALYZE select * from gallery_map where author = '曹志耘';
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=1025 width=621) (actual time=0.011..39.753 rows=1031 loops=1)
 Filter: ((author)::text = '曹志耘'::text)
 Rows Removed by Filter: 71315
 Planning time: 0.194 ms
 Execution time: 39.879 ms
(5 rows)
 
Time: 40.599 ms
EXPLAIN ANALYZE select * from gallery_map where author like '曹志耘';
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=1025 width=621) (actual time=0.017..41.513 rows=1031 loops=1)
 Filter: ((author)::text ~~ '曹志耘'::text)
 Rows Removed by Filter: 71315
 Planning time: 0.188 ms
 Execution time: 41.669 ms
(5 rows)
 
Time: 42.457 ms
 
EXPLAIN ANALYZE select * from gallery_map where author like '曹志耘%';
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.017..41.492 rows=1031 loops=1)
 Filter: ((author)::text ~~ '曹志耘%'::text)
 Rows Removed by Filter: 71315
 Planning time: 0.307 ms
 Execution time: 41.633 ms
(5 rows)
 
Time: 42.676 ms

Clearly, all three queries do a full table scan.

II. Creating a B-tree index

B-tree is PostgreSQL's default index type.

CREATE INDEX ix_gallery_map_author ON gallery_map (author);
 
EXPLAIN ANALYZE select * from gallery_map where author = '曹志耘';  
                QUERY PLAN                
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on gallery_map (cost=36.36..2715.37 rows=1025 width=621) (actual time=0.457..1.312 rows=1031 loops=1)
 Recheck Cond: ((author)::text = '曹志耘'::text)
 Heap Blocks: exact=438
 -> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..36.10 rows=1025 width=0) (actual time=0.358..0.358 rows=1031 loops=1)
   Index Cond: ((author)::text = '曹志耘'::text)
 Planning time: 0.416 ms
 Execution time: 1.422 ms
(7 rows)
 
Time: 2.462 ms
 
EXPLAIN ANALYZE select * from gallery_map where author like '曹志耘';
                QUERY PLAN                
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on gallery_map (cost=36.36..2715.37 rows=1025 width=621) (actual time=0.752..2.119 rows=1031 loops=1)
 Filter: ((author)::text ~~ '曹志耘'::text)
 Heap Blocks: exact=438
 -> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..36.10 rows=1025 width=0) (actual time=0.560..0.560 rows=1031 loops=1)
   Index Cond: ((author)::text = '曹志耘'::text)
 Planning time: 0.270 ms
 Execution time: 2.295 ms
(7 rows)
 
Time: 3.444 ms
EXPLAIN ANALYZE select * from gallery_map where author like '曹志耘%';
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.015..41.389 rows=1031 loops=1)
 Filter: ((author)::text ~~ '曹志耘%'::text)
 Rows Removed by Filter: 71315
 Planning time: 0.260 ms
 Execution time: 41.518 ms
(5 rows)
 
Time: 42.430 ms
EXPLAIN ANALYZE select * from gallery_map where author like '%研究室';
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=2282 width=621) (actual time=0.064..52.824 rows=2152 loops=1)
 Filter: ((author)::text ~~ '%研究室'::text)
 Rows Removed by Filter: 70194
 Planning time: 0.254 ms
 Execution time: 53.064 ms
(5 rows)
 
Time: 53.954 ms

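As the plans show, equality and LIKE without wildcards (an exact match) use the index, while LIKE with wildcards still does a full table scan.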
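
A likely reason the prefix pattern '曹志耘%' still scans the whole table is the collation: a default B-tree index can serve LIKE prefixes only under the C collation, and the database listing at the end of this article shows en_US.UTF-8 (assuming it is the same database). A sketch of the usual workaround follows. The index name is hypothetical, and varchar_pattern_ops assumes author is a varchar column, which the casts in the plans suggest.

```sql
-- Hypothetical index name. Use text_pattern_ops instead if author is text.
CREATE INDEX ix_gallery_map_author_pattern
    ON gallery_map (author varchar_pattern_ops);

-- With the pattern-ops index in place, a prefix pattern can be planned
-- as an index range scan instead of a sequential scan.
EXPLAIN ANALYZE SELECT * FROM gallery_map WHERE author LIKE '曹志耘%';

-- A suffix pattern has no fixed prefix, so this index does not help it.
EXPLAIN ANALYZE SELECT * FROM gallery_map WHERE author LIKE '%研究室';
```

The default index is still worth keeping if the column is also used with ordinary comparisons such as < or ORDER BY, because the pattern-ops index does not support those under a non-C collation.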

III. Creating a GIN index

CREATE EXTENSION pg_trgm;
 
CREATE INDEX ix_gallery_map_author ON gallery_map USING gin (author gin_trgm_ops);
EXPLAIN ANALYZE select * from gallery_map where author like '曹%'; 
                QUERY PLAN                
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on gallery_map (cost=19.96..2705.69 rows=1028 width=621) (actual time=0.419..1.771 rows=1031 loops=1)
 Recheck Cond: ((author)::text ~~ '曹%'::text)
 Heap Blocks: exact=438
 -> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..19.71 rows=1028 width=0) (actual time=0.312..0.312 rows=1031 loops=1)
   Index Cond: ((author)::text ~~ '曹%'::text)
 Planning time: 0.358 ms
 Execution time: 1.916 ms
(7 rows)
 
Time: 2.843 ms
EXPLAIN ANALYZE select * from gallery_map where author like '%耘%'; 
             QUERY PLAN             
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.015..51.641 rows=1031 loops=1)
 Filter: ((author)::text ~~ '%耘%'::text)
 Rows Removed by Filter: 71315
 Planning time: 0.268 ms
 Execution time: 51.957 ms
(5 rows)
 
Time: 52.899 ms
EXPLAIN ANALYZE select * from gallery_map where author like '%研究室%';
                QUERY PLAN                
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on gallery_map (cost=31.83..4788.42 rows=2559 width=621) (actual time=0.914..4.195 rows=2402 loops=1)
 Recheck Cond: ((author)::text ~~ '%研究室%'::text)
 Heap Blocks: exact=868
 -> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..31.19 rows=2559 width=0) (actual time=0.694..0.694 rows=2402 loops=1)
   Index Cond: ((author)::text ~~ '%研究室%'::text)
 Planning time: 0.306 ms
 Execution time: 4.403 ms
(7 rows)
 
Time: 5.227 ms

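The gin_trgm index performs much better.

A pg_trgm index splits each string into trigrams (3-character sequences) and matches on those. As a result, for search terms shorter than 3 characters (Chinese characters included), only prefix matches use the gin_trgm index.

btree_gin was also tested; it behaves the same as btree.

Note:

gin_trgm requires the database to use UTF-8 encoding.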
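
The trigram splitting can be inspected directly with show_trgm(), a function shipped with pg_trgm. The sketch below uses ASCII strings to stay independent of locale settings and assumes the GIN trigram index on gallery_map.author from this section exists. The comments about LIKE patterns describe pg_trgm's behavior as I understand it and are worth confirming with EXPLAIN on your own version.

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- show_trgm() lists the trigrams extracted from a plain string. Each word
-- is padded with two spaces in front and one behind before it is split.
SELECT show_trgm('abc');
SELECT show_trgm('ab');

-- In a LIKE pattern, padding is only applied on a side that is not next to
-- a wildcard. 'ab%' keeps the leading padding, so trigrams are available
-- and the index can be used.
EXPLAIN SELECT * FROM gallery_map WHERE author LIKE 'ab%';

-- '%ab%' has wildcards on both sides, so no complete trigram can be
-- extracted from two characters and the index cannot narrow the search.
EXPLAIN SELECT * FROM gallery_map WHERE author LIKE '%ab%';

-- Three or more characters between the wildcards yield at least one trigram.
EXPLAIN SELECT * FROM gallery_map WHERE author LIKE '%abc%';
```

This matches the plans above, where '曹%' used the index and '%耘%' fell back to a sequential scan.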

demo_v1 # \l demo_v1
        List of databases
 Name | Owner | Encoding | Collate | Ctype | Access privileges
---------+-----------+----------+-------------+-------------+-------------------
 demo_v1 | wmpp_user | UTF8  | en_US.UTF-8 | en_US.UTF-8 |
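
The same information can be read with a query instead of the psql \l command. The sketch below also adds a quick trigram check. It rests on an assumption: it is commonly reported that pg_trgm extracts no trigrams from non-ASCII text when the database's Ctype is C, so Ctype may matter as much as the encoding.

```sql
-- Encoding, collation and ctype of the current database.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database
WHERE datname = current_database();

-- Sanity check (requires the pg_trgm extension): if this returns an empty
-- array, trigram indexes cannot help with searches for such text in this
-- database.
SELECT show_trgm('研究室');
```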
 
