Oracle – How to Fix Very Slow Insert Statement

oracle, oracle-11g-r2

Oracle 11.2.0.4 on OEL 6.5

Our app admin ran an update script from the app vendor;
it has been running for several hours and no one knows when it will finish. The database is also generating a lot of archive logs.
I ran SQL trace for 15 minutes and found that almost all of the time is spent on a single query.

INSERT INTO house_t  
VALUES
 (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15, :16, 
 :17, :18, :19, :20, :21, :22, :23)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       70      0.00       0.00          0          0          0           0
Execute     71     23.86     633.03      83564      41375    4597750      355735
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      141     23.87     633.03      83564      41375    4597750      355735

Why does one simple insert statement need such a huge number of blocks in current mode (4597750 / 71 ≈ 64757 per execution)?

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103  
Number of plan statistics captured: 3

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
     0          0          0  LOAD TABLE CONVENTIONAL  (cr=584 pr=1443 pw=0 time=11388214 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
SQL*Net more data from client                9523        0.00          0.08
db file sequential read                     82252        0.60        603.94
SQL*Net message to client                      70        0.00          0.00
SQL*Net message from client                    70        0.00          0.15
log file switch completion                      4        0.08          0.18
undo segment extension                          4        0.00          0.01
log file sync                                   1        0.00          0.00

Table has simple structure:

SQL> desc house_t
Name                 Null?    Type
-------------------- -------- --------------
POSTALCODE                  VARCHAR2(6)
IFNSFL                      VARCHAR2(4)
TERRIFNSFL                  VARCHAR2(4)
IFNSUL                      VARCHAR2(4)
TERRIFNSUL                  VARCHAR2(4)
OKATO                       VARCHAR2(11)
OKTMO                       VARCHAR2(11)
UPDATEDATE                  DATE
HOUSENUM                    VARCHAR2(20)
ESTSTATUS                   NUMBER(5)
BUILDNUM                    VARCHAR2(10)
STRUCNUM                    VARCHAR2(10)
STRSTATUS                   NUMBER(5)
HOUSEID                     VARCHAR2(36)
HOUSEGUID                   VARCHAR2(36)
AOGUID                      VARCHAR2(36)
STARTDATE                   DATE
ENDDATE                     DATE
STATSTATUS                  NUMBER(5)
NORMDOC                     VARCHAR2(36)
COUNTER                     NUMBER(10)
CADNUM                      VARCHAR2(100)
DIVTYPE                     NUMBER(5)

There are no triggers on this table, no mviews, no defined fga policies, no fgac policies.

Also table has 3 indexes

CREATE UNIQUE INDEX HOUSE_T_IDX0 ON HOUSE_T (HOUSEID);
CREATE INDEX HOUSE_T_IDX1 ON HOUSE_T (HOUSEGUID);
CREATE INDEX HOUSE_T_IDX2 ON HOUSE_T (AOGUID);

The raw trace file is full of 'db file sequential read' waits on these 3 indexes and on undo blocks.

The table now has about 50M records and is about 12 GB in size.
The indexes are about 3-4 GB each and have blevel=3 in dba_indexes.

I would expect each execution of this insert to touch maybe 20-30 blocks, so why thousands?

Best Answer

Steps to reproduce:

create table house_t
(
POSTALCODE                  VARCHAR2(6),
IFNSFL                      VARCHAR2(4),
TERRIFNSFL                  VARCHAR2(4),
IFNSUL                      VARCHAR2(4),
TERRIFNSUL                  VARCHAR2(4),
OKATO                       VARCHAR2(11),
OKTMO                       VARCHAR2(11),
UPDATEDATE                  DATE,
HOUSENUM                    VARCHAR2(20),
ESTSTATUS                   NUMBER(5),
BUILDNUM                    VARCHAR2(10),
STRUCNUM                    VARCHAR2(10),
STRSTATUS                   NUMBER(5),
HOUSEID                     VARCHAR2(36),
HOUSEGUID                   VARCHAR2(36),
AOGUID                      VARCHAR2(36),
STARTDATE                   DATE,
ENDDATE                     DATE,
STATSTATUS                  NUMBER(5),
NORMDOC                     VARCHAR2(36),
COUNTER                     NUMBER(10),
CADNUM                      VARCHAR2(100),
DIVTYPE                     NUMBER(5)
);

Sample data:

insert into house_t (HOUSEID, HOUSEGUID, AOGUID)
with g as (select * from dual connect by level <= 1000)
select sys_guid(), sys_guid(), sys_guid() from g,g,g
where rownum <= 50000000;
commit;

SQL> select count(*) from house_t;

  COUNT(*)
----------
  50000000

CREATE UNIQUE INDEX HOUSE_T_IDX0 ON HOUSE_T(HOUSEID);
CREATE INDEX  HOUSE_T_IDX1 ON HOUSE_T (HOUSEGUID);
CREATE INDEX HOUSE_T_IDX2 ON HOUSE_T (AOGUID);

SQL> select segment_name, bytes/1024/1024 from user_segments;

SEGMENT_NAME                   BYTES/1024/1024
------------------------------ ---------------
HOUSE_T                                   6336
HOUSE_T_IDX0                              2368
HOUSE_T_IDX1                              2432
HOUSE_T_IDX2                              2432

Now to simulate your script:

SQL> alter session set tracefile_identifier=BALAZS1;
SQL> exec dbms_session.session_trace_enable(true, false);

declare
  TYPE T_HOUSE_T_ARRAY IS TABLE OF VARCHAR2(36);
  L_DATA T_HOUSE_T_ARRAY;
  CURSOR c IS SELECT sys_guid() from dual
    connect by level <= 50000 order by dbms_random.value;
BEGIN
  for i in 1..70
  loop
    begin
      OPEN c;
      LOOP
      FETCH c BULK COLLECT INTO L_DATA;
      FORALL i IN 1..L_DATA.COUNT
      INSERT INTO house_t (HOUSEID, HOUSEGUID, AOGUID)
        VALUES (L_DATA(i), L_DATA(i), L_DATA(i));
      EXIT WHEN c%NOTFOUND;
      END LOOP;
      CLOSE c;
  commit;
end;
  end loop;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:01:58.97
SQL> exec dbms_session.session_trace_disable;

And the tkprof output:

SQL ID: 4h4byjv3yd590 Plan Hash: 0

INSERT INTO HOUSE_T (HOUSEID, HOUSEGUID, AOGUID)
VALUES
 (:B1 , :B2 , :B3 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     70     82.17     105.82      23693     359178   43342331     3500000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       71     82.17     105.82      23693     359178   43342331     3500000

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                     24243        0.01         11.82
  ...

And an enormous amount of leaf node splits:

select sn.name, ss.value from v$statname sn join v$sesstat ss 
  on (sn.statistic# = ss.statistic#) 
where sid = sys_context('userenv', 'sid') 
  and name in ('leaf node splits', 'leaf node 90-10 splits', 
    'branch node splits', 'root node splits');

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
leaf node splits                                                      82981
leaf node 90-10 splits                                                   18
branch node splits                                                      363
root node splits                                                          0

1) I am using an SSD here, which is why my demo took much less time: disk response times are much lower. You should look at the number of current-mode blocks, not the time spent.

2) I am inserting data in randomized order, and this is what causes leaf block (50-50) splits. Leaf node 90-10 splits occur when data is inserted into the rightmost part of the index in sequentially increasing order: a new block is simply added at the "right edge" of the index and the highest value is copied into it. (This is sometimes unofficially called a 99-1 split rather than 90-10.) A 50-50 leaf block split means adding a new block in the "middle" of the index and copying and rearranging data across several blocks, which is much more work.

3) Performance-wise, I find it a very bad idea to use GUIDs as identifiers. They consume several times more storage space than simple NUMBER identifiers.

This is the same, but with NUMBER types:

SEGMENT_NAME                   BYTES/1024/1024
------------------------------ ---------------
HOUSE_T2                                  2432
HOUSE_T_IDX02                             1088
HOUSE_T_IDX12                             1088
HOUSE_T_IDX22                             1088
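The answer does not show the exact script behind the HOUSE_T2 numbers, but it could be reconstructed roughly like this (a sketch, not the original script: the table/index names match the segment listing above, and the population query simply mirrors the GUID version with rownum-based NUMBER keys):

```sql
-- Hypothetical reconstruction of the NUMBER-based comparison.
-- Only the three indexed columns matter for the segment-size comparison.
create table house_t2
(
  HOUSEID   NUMBER,
  HOUSEGUID NUMBER,
  AOGUID    NUMBER
);

insert into house_t2 (HOUSEID, HOUSEGUID, AOGUID)
with g as (select * from dual connect by level <= 1000)
select rownum, rownum, rownum from g, g, g
where rownum <= 50000000;
commit;

CREATE UNIQUE INDEX HOUSE_T_IDX02 ON HOUSE_T2 (HOUSEID);
CREATE INDEX HOUSE_T_IDX12 ON HOUSE_T2 (HOUSEGUID);
CREATE INDEX HOUSE_T_IDX22 ON HOUSE_T2 (AOGUID);
```

With NUMBER keys the indexes come out at less than half the size of their GUID counterparts, which directly reduces the number of blocks each insert has to visit.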

Also: Limitations of the Oracle Cost Based Optimizer (Doc ID 212809.1)

  • Limitations of Histogram on Character Columns

Histograms only store the first 32 characters of a character string (5 characters pre 8.1.6 See Bug:598799 ). If histograms are gathered on character columns that have data that is longer than 31 characters and the first 31 characters are identical then column histograms may not reflect the cardinality of these columns correctly as these values will all be treated as if they are identical. There are also similar limits with numeric data which is normalised to 15 digits in histogram endpoints.

As of 12c, the maximum possible number of characters considered for a histogram on a string column increases from thirty-two to sixty-four.

This has caused some serious cardinality misestimates with histograms on GUID columns.
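The truncation effect can be sketched as follows (a hypothetical demo table and names; on 11g, all 1000 values below share their first 32 characters, so the histogram may treat them as a single value):

```sql
-- Hypothetical demo of the 32-character histogram limit.
create table histo_demo (v varchar2(100));

-- 1000 rows, 10 distinct values, differing only at character 33
insert into histo_demo
select rpad('x', 32, 'x') || to_char(mod(level, 10))
from dual connect by level <= 1000;
commit;

exec dbms_stats.gather_table_stats(user, 'HISTO_DEMO', -
  method_opt => 'for columns v size 254');

-- On 11g all endpoints collapse to one value; on 12c+ (64 chars) they do not
select endpoint_actual_value
from user_tab_histograms
where table_name = 'HISTO_DEMO';
```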

This is how it behaves when the values are inserted in increasing order (the order by dbms_random.value was removed from the cursor):

INSERT INTO HOUSE_T (HOUSEID, HOUSEGUID, AOGUID)
VALUES
 (:B1 , :B2 , :B3 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     70     21.13      23.39       3266     167674    2496874     3500000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       71     21.13      23.39       3266     167674    2496874     3500000

As you can see, it is less than 1 current block per row.
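For reference, the only change to the PL/SQL reproduction script above was the cursor definition, which now returns the GUIDs in their generated (roughly increasing) order:

```sql
-- Same cursor as before, minus the randomizing ORDER BY:
CURSOR c IS SELECT sys_guid() FROM dual
  CONNECT BY LEVEL <= 50000;
```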