Returning only the rows that have a count greater than N in a table that contains multiple versions of data rows
Build intelligent, mission-critical applications using a scalable, hybrid database platform that has everything built in—from in-memory performance and advanced security to in-database analytics.
What do you do if we have a table that contains multiple version of a data row and we want to return only the rows that have a count greater than N? The solution is to group by column id and then count number of rows per the grouped column. When we have the number of rows per each individual group, we can further reduce the count based on the having statement.
For this example, we will create a sample table that will hold multiple versions of user data:
1 create 2 table table_name ( 3 id int, 4 version int, 5 desc_text varchar(100), 6 primary key (id, version) 7 ); 8 9 10 INSERT INTO table_name (id, version, desc_text) 11 VALUES (1, 1, 'row 1, version 1'), 12 (1, 2, 'row 1, version 2'), 13 (1, 3, 'row 1, version 3'), 14 (1, 4, 'row 1, version 4'), 15 (2, 1, 'row 2, version 1'), 16 (2, 2, 'row 2, version 2'), 17 (3, 4, 'row 3, version 4'), 18 (3, 5, 'row 3, version 5'), 19 (3, 6, 'row 3, version 6');
We can now group by he column, id to get the row count per grouped items. Once we have the number or rows per each group, we can further reduce the count using the having statement.
1 select id, 2 'row_count' = count(*) 3 4 from 5 table_name 6 group by 7 id 8 having 9 count(*) > 2 10 order by 11 id;