Morning all,
Been banging my head on the wall bout this for a few days now.
I have about 133 DB in Availability groups and a few stand alone, the T-SQL below without the DB "size" INNER JOIN works great and returns what I have primary, secondary and standalone (341 rows of data returned).
But when I add in the INNER JOIN to master.sys.database_files to get the DB size for each matched record, all of a sudden thousands of rows are returned.....
I have not been able to get my head around why that would happen...help please point me in the right direction.
use Master
SELECT
AG.name AS [AvailabilityGroupName],
dbcs.database_name AS [DatabaseName],
Ad.size , -- DB size
CASE
WHEN dbrs.synchronization_state =0 THEN 'Not synchronizing'
WHEN dbrs.synchronization_state =1 THEN 'Synchronizing'
WHEN dbrs.synchronization_state =2 THEN 'Synchronized'
WHEN dbrs.synchronization_state =3 THEN 'Reverting'
WHEN dbrs.synchronization_state =4 THEN 'Initializing'
END AS AGState,
CASE
WHEN arstates.role =1 THEN 'Primary'
WHEN arstates.role =2 THEN 'Secondary'
END AS ROLE,
ar.failover_mode_desc,
ar.availability_mode_desc,
ISNULL(dbrs.is_suspended, 0) AS [IsSuspended]
FROM master.sys.availability_groups AS AG
LEFT OUTER JOIN master.sys.dm_hadr_availability_group_states as agstates ON AG.group_id = agstates.group_id
INNER JOIN master.sys.availability_replicas AS AR ON AG.group_id = AR.group_id
INNER JOIN master.sys.dm_hadr_availability_replica_states AS arstates ON AR.replica_id = arstates.replica_id AND arstates.is_local = 1
INNER JOIN master.sys.dm_hadr_database_replica_cluster_states AS dbcs ON arstates.replica_id = dbcs.replica_id
LEFT OUTER JOIN master.sys.dm_hadr_database_replica_states AS dbrs ON dbcs.replica_id = dbrs.replica_id AND dbcs.group_database_id = dbrs.group_database_id
-- The inner join below causes a plethora of rows to be generated
INNER JOIN master.sys.master_files AS Ad ON arstates.replica_id = dbcs.replica_id AND arstates.is_local =1
-- Above the problem code
ORDER BY DatabaseName ASC
INNER JOIN master.sys.master_files AS Ad ON ad.database_id=dbcs.database_id
Your original JOIN did not qualify any rows to a specific database, therefore you got a CROSS JOIN to all database files in all databases.
Hello Robert,
Thank you for your input. I thought the same, but if I try to INNER JOIN on the database_id of master.sys.dm_hadr_database_replica_cluster_states the editor reports that is an invalid table column....so I was left trying to match on the replica_id.
Thanks again for looking at my issue
It's a good sanity check to include a JOIN to sys.databases whenever you're writing queries to list metadata about DBs. Also good to include the shortest JOIN from sys.databases to whatever other DMVs you're querying.
Microsoft doesn't always have the best naming conventions, and they've goofed a few column names in the system views.
So, thanks to Robert's input; I modified my query to below. One weird thing is happening that makes no sense, 13 out of my 133 DB (in 8 servers and 8 AG groups) don't show up in the resulting report.
Expanded the query to include NULL values in size and voila they are there.....why would the value be null? If I query the master.sys.database_files by itself multiple size values for each are retuned.
use Master
SELECT DISTINCT
dbcs.database_name AS [DatabaseName],
AG.name AS [AvailabilityGroupName],
CAST(Ad.size AS bigint) * 8 / 1024 AS 'Space MB',
CASE
WHEN dbrs.synchronization_state =0 THEN 'Not synchronizing'
WHEN dbrs.synchronization_state =1 THEN 'Synchronizing'
WHEN dbrs.synchronization_state =2 THEN 'Synchronized'
WHEN dbrs.synchronization_state =3 THEN 'Reverting'
WHEN dbrs.synchronization_state =4 THEN 'Initializing'
END AS AGState,
CASE
WHEN arstates.role =1 THEN 'Primary'
WHEN arstates.role =2 THEN 'Secondary'
END AS ROLE,
ISNULL(dbrs.is_suspended, 0) AS [IsSuspended]
FROM master.sys.availability_groups AS AG
LEFT OUTER JOIN master.sys.dm_hadr_availability_group_states as agstates ON AG.group_id = agstates.group_id
INNER JOIN master.sys.availability_replicas AS AR ON AG.group_id = AR.group_id
INNER JOIN master.sys.dm_hadr_availability_replica_states AS arstates ON AR.replica_id = arstates.replica_id AND arstates.is_local = 1
INNER JOIN master.sys.dm_hadr_database_replica_cluster_states AS dbcs ON arstates.replica_id = dbcs.replica_id
LEFT OUTER JOIN master.sys.dm_hadr_database_replica_states AS dbrs ON dbcs.replica_id = dbrs.replica_id AND dbcs.group_database_id = dbrs.group_database_id
INNER JOIN master.sys.master_files AS Ad ON ( (arstates.role = 1 OR arstates.role = 2) AND ( Ad.name = dbcs.database_name) )
ORDER BY DatabaseName ASC
Apologies, I didn't express myself clearly. I meant to say that if the database size is the important item, get that first, then JOIN to the AG DMVs. The foundation of these JOINs begins with database_id from sys.databases. Something like this:
;WITH DB_sizes AS (SELECT d.database_id, d.name database_name
,SUM(CAST(f.size AS BIGINT))/128 size_mb
FROM sys.databases d
INNER JOIN sys.master_files f ON d.database_id=f.database_id
GROUP BY d.database_id, d.name)
SELECT DS.* --, other columns
FROM DB_sizes DS
INNER JOIN sys.dm_hadr_database_replica_states dbrs ON DS.database_id=dbrs.database_id
...-- continue JOINs to other AG related DMVs
The final comment about the JOINs means that you'd need to re-order your existing query, making it DB-centric, rather than AG and replica centric like it is now. And if you want to include databases that are NOT in an AG, just use a LEFT JOIN from DB_sizes, and the AG-related columns will simply be NULL.
The true size of the database is the sum of its component file sizes, without any other joins. This is what the CTE is doing. The discrepancies you're seeing in the sizes is likely due to multiple files per database, plus multiple replicas per database, which will duplicate the size value each time you JOIN.
FYI, sys.master_files is not the most accurate source of certain file information, it will not include sizes for FILESTREAM groups, for example. This shouldn't matter unless you're using in-memory tables or actual FILESTREAM data in your databases.
Edit: corrected format and SUM()