How use asterisk to select while grouping

keykey · September 26, 2016, 11:44am

i have a table A (wiht an integer PK named Id) with a lot of columns (> 100), and a "dependent" table B with this signature
Table B
{
Integer ParentId (FK to table A)
varchar User
}

i need to select all the rows of table A with at least 2 dependents records in Table B.
if i need to select only the id fields of table A i can do this query
select ID
from TableA join TableB
on TableA.ID = TableB.ParentId
group by ID
having count(*) > 1

but i need to get all the columns of TableA, so i need this query
select ID, column1, column2, ..., colum100
from TableA join TableB
on TableA.ID = TableB.ParentId
group by ID, column1, column2, ..., column100
having count(*) > 1

is it possible to use * so i can have the same results without enumerating every column names?

thanks

JamesK · September 26, 2016, 1:07pm

What you described - namely listing all the columns in the GROUP BY expression is the most reliable and that is what I would recommend. You don't have to type all the columns manually. If you right-click on the table name in SSMS, it gives you options to generate queries for selecting, inserting etc.

An alternate way is to use BINARY_CHECKSUM like this:

;WITH cte AS
(
	SELECT COUNT(*) OVER (PARTITION BY BINARY_CHECKSUM(*)) AS BCS,
		a.*
	FROM
		TableA AS a
		JOIN TableB AS b
			ON a.ID = b.ParentId
)
SELECT * FROM cte WHERE BCS > 1;

However, the problem with this is that there is a small possibility that two rows that are not exactly identical may produce the same checksum. So I would not recommend this.

keykey · September 26, 2016, 2:56pm

it's a query used to manually extract some reports data so it's ok if it's not reliable.
i have found this solution:

select a.*
from TableA a
join
(select b.ParentId
from Table b
group by b.ParentId
having count(*) > 2) as groupedB
on a.Id = groupedB.ParentId

My only doubt is about performance, i don't know how many rows have the 2 tables and if this query works good with large number

Kristen · October 1, 2016, 9:11am

Not really thought about possible performance issues, but this should let you select anything from TableA with 2+ corresponding rows in TableB

select ID, column1, column2, ..., colum100
from TableA 
WHERE TableA.ID IN
(
    SELECT TableB.ParentId
    FROM   TableB
    group by TableB.ParentId
    having count(*) >= 2
)

If you need some sort of value from [User] in [TableB] then you'd need a CTE or JOIN instead of the "IN"