AI and DS Skills

 View Only

Checking for Differences between Tables and Making Changes

  • 1.  Checking for Differences between Tables and Making Changes

    Posted Tue May 28, 2024 09:24 AM

    I have these two tables:

        CREATE TABLE old_table 
        (
            name1 VARCHAR(20),
            name2 VARCHAR(20),
            origin_date DATE,
            var1 VARCHAR(10),
            end_date DATE,
            status VARCHAR(10)
        );
        
        INSERT INTO old_table(name1, name2, origin_date, var1, end_date, status) 
        VALUES
        ('red_1', 'red', '2010-01-01', 'aaa', NULL, 'active'),
        ('red_2', 'red', '2011-01-01', 'bbb', '2020-01-01', 'inactive'),
        ('blue_1', 'blue', '2005-01-01', 'ccc', '2020-01-01', 'inactive'),
        ('green_1', 'green', '2005-01-01', 'ddd', '2020-01-01', 'inactive'),
        ('purple_1', 'purple', '2001-01-01', 'fff', NULL, 'active'),
        ('pink_1', 'pink', '2002-01-01', 'ggg', NULL, 'active');
        
       
        CREATE TABLE new_table 
        (
            name1 VARCHAR(20),
            name2 VARCHAR(20),
            origin_date DATE,
            var1 VARCHAR(10),
            today DATE
        );
        
        INSERT INTO new_table (name1, name2, origin_date, var1, today) 
        VALUES
        ('red_1', 'red', '2010-01-01', 'aaa', '2020-01-03'),
        ('orange_1', 'orange', '2012-01-01', 'zzz', '2020-01-01');

    When comparing the `new_table` to the `old_table`:

    - pink and purple have no longer survived (`end_date = new_table.today, status = inactive`)
    - red has still survived (`end_date = NULL, status = active`)
    - orange has now appeared  (`end_date = NULL, status = active`)

    The final result should look like this:

    | name1    | name2  | origin_date | var1 | end_date   | status  |
    |----------|--------|-------------|------|------------|---------|
    | red_1    | red    | 2010-01-01  | aaa  | NULL       | active  |
    | red_2    | red    | 2011-01-01  | bbb  | 2020-01-01 | inactive|
    | blue_1   | blue   | 2005-01-01  | ccc  | 2020-01-01 | inactive|
    | green_1  | green  | 2005-01-01  | ddd  | 2020-01-01 | inactive|
    | purple_1 | purple | 2001-01-01  | fff  | 2020-01-03 | inactive|
    | pink_1   | pink   | 2002-01-01  | ggg  | 2020-01-03 | inactive|
    | orange_1 | orange | 2012-01-01  | zzz  | NULL       | active  |

    I tried writing SQL code to reflect this requirements:

        SELECT 
            o.name1,
            o.name2,
            o.origin_date,
            o.var1,
            CASE 
                WHEN n.name1 IS NULL THEN o.end_date 
                ELSE NULL 
            END AS end_date,
            CASE 
                WHEN n.name1 IS NULL THEN 'inactive' 
                ELSE 'active' 
            END AS status
        FROM 
            old_table o
        LEFT JOIN 
            new_table n ON o.name1 = n.name1

        UNION ALL

        SELECT 
            n.name1,
            n.name2,
            n.origin_date,
            n.var1,
            CASE 
                WHEN o.name1 IS NULL THEN NULL 
                ELSE n.today 
            END AS end_date,
            'active' AS status
        FROM 
            new_table n
        LEFT JOIN 
            old_table o ON n.name1 = o.name1
        WHERE 
            o.name1 IS NULL;

    **Problem**: The `end_date` for `purple_1` and `pink_1` are `NULL` when they should be non-null:

             name1  name2 origin_date var1   end_date   status
            red_1    red  2010-01-01  aaa       <NA>   active
            red_2    red  2011-01-01  bbb 2020-01-01 inactive
           blue_1   blue  2005-01-01  ccc 2020-01-01 inactive
          green_1  green  2005-01-01  ddd 2020-01-01 inactive
         purple_1 purple  2001-01-01  fff       NULL inactive
           pink_1   pink  2002-01-01  ggg       NULL inactive
         orange_1 orange  2012-01-01  zzz       <NA>   active

    Can someone please show me how to correct this?



    ------------------------------
    sasha s
    ------------------------------