近日,在對(duì)一張百萬(wàn)數(shù)據(jù)的業(yè)務(wù)表進(jìn)行去重時(shí),去重操作竟然夯住了。下面就來(lái)簡(jiǎn)單回憶一下。
1、查詢業(yè)務(wù)表數(shù)據(jù)量,查看到總共有200多w條
SQL> select count(*) from tb_bj_banker_etl;
2552381
2、查詢表內(nèi)應(yīng)該去掉的重復(fù)數(shù)據(jù)量,共80多w條
SQL> select count(*) from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
830099
3、于是,在晚上下班前,執(zhí)行了下面的語(yǔ)句腳本,為了去重
SQL> delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
SQL> commit;
4、第二天,到達(dá)現(xiàn)場(chǎng)時(shí),發(fā)現(xiàn)PL/SQL Developer工具中昨天晚上執(zhí)行的語(yǔ)句仍在執(zhí)行中
首先察覺(jué),80多w的去重?cái)?shù)據(jù)跑了一個(gè)晚上也沒(méi)跑完?這肯定是哪里出了問(wèn)題?
懷疑有鎖表。
于是查詢是否有鎖表的用戶。
SELECT
A.OWNER, --OBJECT所屬用戶
A.OBJECT_NAME, --OBJECT名稱
B.XIDUSN,
B.XIDSLOT,
B.XIDSQN,
B.SESSION_ID, --鎖表用戶的session
B.ORACLE_USERNAME, --鎖表用戶的Oracle用戶名
B.OS_USER_NAME, --鎖表用戶的操作系統(tǒng)登陸用戶名
B.PROCESS,
B.LOCKED_MODE,
C.MACHINE, --鎖表用戶的計(jì)算機(jī)名稱
C.STATUS, --鎖表狀態(tài)
C.SERVER,
C.SID,
C.SERIAL#,
C.PROGRAM --鎖表用戶所用的數(shù)據(jù)庫(kù)管理工具
FROM
ALL_OBJECTS A,
V$LOCKED_OBJECT B,
SYS.GV_$SESSION C
WHERE
A.OBJECT_ID = B.OBJECT_ID
AND B.PROCESS = C.PROCESS
ORDER BY 1,2
在下面結(jié)果中可以看到,鎖表的只是去重語(yǔ)句的發(fā)起會(huì)話,并沒(méi)有其它用戶造成鎖表,這說(shuō)明語(yǔ)句仍然在執(zhí)行嘛?帶著疑問(wèn),開始嘗試解決。
1 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\BACKDB ACTIVE DEDICATED 913 3381 plsqldev.exe
2 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\BACKDB INACTIVE DEDICATED 649 41791 plsqldev.exe
3 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\BACKDB INACTIVE DEDICATED 817 27777 plsqldev.exe
4 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\BACKDB INACTIVE DEDICATED 841 1981 plsqldev.exe
5、采用分批次,解決去重夯住問(wèn)題
由于直接去重?zé)o法順利進(jìn)行,于是想到了分批次去重的方法,試一下。
第一次:
delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1) and rownum=100000;
commit;
第二次:
delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1) and rownum=100000;
commit;
。。。。。。。
。。。。。。。
。。。。。。。
第八次:
delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
commit;
結(jié)果:通過(guò)將80多萬(wàn)數(shù)據(jù)劃分成以10w數(shù)據(jù)為單次進(jìn)行去重操作,總共用時(shí)140多秒,完成了去重80萬(wàn)數(shù)據(jù)的目的。但為何直接處理出現(xiàn)夯死情況,有待后續(xù)跟蹤分析。
以上就是臨時(shí)處理去重80w數(shù)據(jù)時(shí)夯死現(xiàn)象的全部過(guò)程,希望可以幫到大家。
您可能感興趣的文章:- SQL學(xué)習(xí)筆記五去重,給新加字段賦值的方法
- SQL分組排序去重復(fù)的小實(shí)例
- 淺談sql數(shù)據(jù)庫(kù)去重