1. Insert ignoring duplication
ignore duplication row(s). skip duplicate data and insert only non-duplicate data.
INSERT IGNORE INTO <SOURCE_TABLE> VALUES (VALUE1, VALUE2...);
INSERT IGNORE INTO <SOURCE_TABLE> SELECT * FROM <TARGET_TABLE>;
 
2. Inserting new rows and updating existing rows
2.1. Using Standard SQL(OuterJoin)
It needs two-step process (insert and update). The benefit to this approach is standards compliance. But it is poor efficiency.
INSERT INTO <SOURCE_TABLE> (KEY_COL1, COL2, COL3)
    SELECT T.COL1, T.COL2, T.COL3
      FROM <TARGET_TABLE> T
     WHERE LEFT OUTER JOIN <SOURCE_TABLE> AS S
                        ON T.KEY_COL1 = S.KEY_COL1
WHERE S.KEY_COL1 IS NULL
;
UPDATE <TARGET_TABLE> AS  T
     INNER JOIN <SOURCE_TABLE> AS S             ON T.KEY_COL1 = S.KEY_COL1
    SET T.COL2 = S.COL2
      , T.COL3 = S.COL3
;
 
2.2. Using Standard SQL(OuterJoin)
This is far more efficient, but it still might be very bad. It could lock the tables for a long time with large datasets, and like all two-step processes, it is not transactional.
INSERT INTO <TARGET_TABLE> (KEY_COL1, COL2, COL3)
    SELECT S.COL1, S.COL2, S.COL3
      FROM <SOURCE_TABLE> S
     WHERE LEFT OUTER JOIN <SOURCE_TABLE> AS S
                        ON S.KEY_COL1 = T.KEY_COL1
WHERE T.KEY_COL1 IS NULL;
UPDATE <TARGET_TABLE> AS  T
     INNER JOIN <SOURCE_TABLE> AS S
             ON T.KEY_COL1 = S.KEY_COL1
    SET T.COL2 = S.COL2
      , T.COL3 = S.COL3
;
UPDATE <TARGET_TABLE> AS T
    INNER JOIN <SOURCE_TABLE> AS S
            ON T.KEY_COL1 = S.KEY_COL1
 SET T.COL2 = S.COL2
   , T.COL3 = S.COL3
 ;
INSERT INTO <TARGETTABLE> (KEY_COL1, COL2, COL3)
    SELECT S.KEY_COL1, S.COL2, S.COL3
      FROM <SOURCE_TABLE> AS S
       LEFT OUTER JOIN <TARGET_TABLE> AS T
          ON S.KEY_COL1 = T.KEY_COL1
      WHERE T.KEY_COL1 IS NULL
;
 
2.3. Using Non-standard ON DUPLICATE KEY UPDATE Statement
Recommend this way.
INSERT INTO <TARGET_TABLE> ( KEY_COL1, COL2, COL3)
  SELECT KEY_COL1, COL2, COL3 FROM <SOURCE_TABLE>
 ON DUPLICATE KEY UPDATE COL2 = <VALUE2>
                       , COL3 = <VALUE3>
; 
 
 
No comments:
Post a Comment