Friday, 3 June 2016

[Maria/MySQL] Upsert(Merge) Statement

1. Insert ignoring duplication

ignore duplication row(s). skip duplicate data and insert only non-duplicate data.


INSERT IGNORE INTO <SOURCE_TABLE> VALUES (VALUE1, VALUE2...);

INSERT IGNORE INTO <SOURCE_TABLE> SELECT * FROM <TARGET_TABLE>;

2. Inserting new rows and updating existing rows

2.1. Using Standard SQL(OuterJoin)

It needs two-step process (insert and update). The benefit to this approach is standards compliance. But it is poor efficiency.

INSERT INTO <SOURCE_TABLE> (KEY_COL1, COL2, COL3)
    SELECT T.COL1, T.COL2, T.COL3
      FROM <TARGET_TABLE> T
     WHERE LEFT OUTER JOIN <SOURCE_TABLE> AS S
                        ON T.KEY_COL1 = S.KEY_COL1
WHERE S.KEY_COL1 IS NULL
;

UPDATE <TARGET_TABLE> AS  T
     INNER JOIN <SOURCE_TABLE> AS S             ON T.KEY_COL1 = S.KEY_COL1
    SET T.COL2 = S.COL2
      , T.COL3 = S.COL3
;

2.2. Using Standard SQL(OuterJoin)

This is far more efficient, but it still might be very bad. It could lock the tables for a long time with large datasets, and like all two-step processes, it is not transactional.


INSERT INTO <TARGET_TABLE> (KEY_COL1, COL2, COL3)
    SELECT S.COL1, S.COL2, S.COL3
      FROM <SOURCE_TABLE> S
     WHERE LEFT OUTER JOIN <SOURCE_TABLE> AS S
                        ON S.KEY_COL1 = T.KEY_COL1
WHERE T.KEY_COL1 IS NULL;
UPDATE <TARGET_TABLE> AS  T
     INNER JOIN <SOURCE_TABLE> AS S
             ON T.KEY_COL1 = S.KEY_COL1
    SET T.COL2 = S.COL2
      , T.COL3 = S.COL3
;

UPDATE <TARGET_TABLE> AS T
    INNER JOIN <SOURCE_TABLE> AS S
            ON T.KEY_COL1 = S.KEY_COL1
 SET T.COL2 = S.COL2
   , T.COL3 = S.COL3
 ;

INSERT INTO <TARGETTABLE> (KEY_COL1, COL2, COL3)
    SELECT S.KEY_COL1, S.COL2, S.COL3
      FROM <SOURCE_TABLE> AS S
       LEFT OUTER JOIN <TARGET_TABLE> AS T
          ON S.KEY_COL1 = T.KEY_COL1
      WHERE T.KEY_COL1 IS NULL
;

2.3. Using Non-standard ON DUPLICATE KEY UPDATE Statement

Recommend this way.

INSERT INTO <TARGET_TABLE> ( KEY_COL1, COL2, COL3)
  SELECT KEY_COL1, COL2, COL3 FROM <SOURCE_TABLE>
 ON DUPLICATE KEY UPDATE COL2 = <VALUE2>
                       , COL3 = <VALUE3>
; 

No comments:

Post a Comment