Assume we have a table table_a with two string and one int fields:
CREATE TABLE `YOUR_DATABASE.table_a` (
`A1` STRING,
`A2` INT64,
`A3` STRING,
);
table_a includes some data:
INSERT `YOUR_DATABASE.table_a` (A1, A2, A3)
VALUES("X111", 10, "2022-02-07 08:19:00"),
("Y222", 15, "2017-05-08 10:19:00"),
("X111", 0,"2011-08-02 02:30:00"),
("X111", 12, "2019-12-20 11:25:00"),
("Y222", 258, "2001-10-14 07:00:00"),
("Z333", 4, "2015-03-08 03:25:12")
We have another table, table_b, which is kind of identical to table_a. In fact, table_b has the exact same fields and values as in table_a and it also has an extra DATETIME field. This extra DATETIME field, B4, is essentially same as A3 but converted to a standard DATETIME format.
CREATE TABLE `YOUR_DATABASE.table_b` (
`B1` STRING,
`B2` INT64,
`B3` STRING,
`B4` DATETIME,
);
And the values in table_b are:
INSERT `YOUR_DATABASE.table_b` (B1, B2, B3, B4)
VALUES("X111", 10, "2022-02-07 08:19:00", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2022-02-07 08:19:00"),"America/Los_Angeles")),
("Y222", 15, "2017-05-08 10:19:00", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2017-05-08 10:19:00"),"America/Los_Angeles")),
("X111", 0, "2011-08-02 02:30:00", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2011-08-02 02:30:00"),"America/Los_Angeles")),
("X111", 12, "2019-12-20 11:25:00", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2019-12-20 11:25:00"),"America/Los_Angeles")),
("Y222", 258, "2001-10-14 07:00:00", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2001-10-14 07:00:00"),"America/Los_Angeles")),
("Z333", 4, "2015-03-08 03:25:12", DATETIME(PARSE_TIMESTAMP("%Y-%m-%y %H:%M:%S","2015-03-08 03:25:12"),"America/Los_Angeles"))
Note, table_a and table_b may or may not have a primary key. How can I check if table_a and table_b include the same information? I have a couple of table_a in my database and I have created equivalent table_b for each of them; now I want to double check and make sure the tables include same information.
CodePudding user response:
If you only want to know whether the tables contain the same data, then you can use CHECKSUM(). This basically calculates the cumulative hash for all values in the column and generates a value. If both values are the same, your tables have the same data in those columns.
SELECT
CHECKSUM(A1), CHECKSUM(A2), CHECKSUM(A3)
FROM
YOUR_DATABASE.table_a
SELECT
CHECKSUM(A1), CHECKSUM(A2), CHECKSUM(A3)
FROM
YOUR_DATABASE.table_b

