Wednesday, August 5, 2009

SQL puzzle - Solve it!!

Here is an interesting SQL puzzle for you:

Lets assume that we have a table with the schema : Attr 1, Attr 2, ... Attr n, timestamp. Timestamp is an unique key in the table and the rows in the table are arranged based on this key. Assume the table has got a huge volume of data.

Sample Data

* Ti+1 > Ti

Write an optimized SQL query to find out the updated entries over time from that table. i.e., the entry which differs from its previous entry in any of those n attributes . Here, in the sample data, consider the entry for timestamp T4 and T5 , Attr 2 has changed from 'width' to 'height' and so the entry T5 needs to be selected. Feel free to contact me if anything is unclear.

Expected Output

Siva was the one who came to me with this problem and we were able to solve it (not optimal though). Thanks Siva.

Another interesting one on the similar lines, Minimum sequential access problem

-- Varun

2 comments:

  1. Say table name is tbl .. we need to join tbl with itself on just previous state of T .. what i mean is, for row with T5, we need to join with T4, for row with T6 we need to join with T5 etc ..

    SELECT t1.*
    FROM tbl t1
    LEFT OUTER JOIN tbl t2 ON (t2.T < t1.T)
    LEFT OUTER JOIN tbl t3 ON (t3.T < t1.T AND t3.T > t2.T)
    WHERE t3.T IS NULL
    AND (t2.attr1 IS NULL OR t2.attr1 != t1.attr1)

    ReplyDelete
  2. @kekeke: It doesnt seem to be selecting the correct entries. Also, try it out with a single join statement.

    ReplyDelete