一尘不染

A single SQL query to find NULL values in every column of a database

sql

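I want to identify the number of NULL values for each column in all of my tables. My database consists of around 250 tables, and most of them are in use. The problem is that almost all of the tables contain unwanted columns that were created for some short-term use. We now want to identify the columns that hold NULL values, across all tables. Because the table count is large and time is short, I would like to know the simplest way to get the NULL record count for each table, column by column.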

I tried this query, which I found on the internet, but with it I have to specify the name of each table manually.

DECLARE @cols1 NVARCHAR(MAX);
DECLARE @sql NVARCHAR(MAX);

SELECT @cols1 = STUFF((
    SELECT ', COUNT(CASE WHEN ISNULL(CONVERT(NVARCHAR(MAX), [' + t1.NAME + ']),'''' ) = '''' THEN 1 END) AS ' + t1.name
    FROM sys.columns AS t1
    WHERE t1.object_id = OBJECT_ID('Area')
    -- ORDER BY ', COUNT([' + t1.name + ']) AS ' + t1.name
    FOR XML PATH('')
), 1, 2, '');

SET @sql = '
SELECT ' + @cols1 + '
FROM Area
'
EXEC(@sql)

Please help me with an improved query that gets this result.

Thank you.


2021-05-16

1 answer

一尘不染

This is a mess, but it works:

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

CREATE TABLE #NullCounts (SchemaName sysname,
                          TableName sysname,
                          ColumnName sysname,
                          NULLCount bigint);

DECLARE @Delimiter nchar(3) = ',' +@CRLF;

SET @SQL = STUFF((SELECT @CRLF + @CRLF +
                         N'WITH Counts AS(' + @CRLF +
                         N'    SELECT N' + QUOTENAME(s.[name],'''') +N' AS SchemaName,' + @CRLF +
                         N'           N' + QUOTENAME(t.[name],'''') +N' AS TableName,' + @CRLF +
                         STRING_AGG(CONVERT(nvarchar(MAX), N'           COUNT_BIG(CASE WHEN T.' + QUOTENAME(c.[name]) + N' IS NULL THEN 1 END) AS ' + QUOTENAME(c.[name])),@Delimiter) WITHIN GROUP(ORDER BY c.column_id) + @CRLF +
                         N'    FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]) + N' T)' + @CRLF +
                         N'INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)' + @CRLF +
                         N'SELECT SchemaName,' + @CRLF +
                         N'       TableName,' + @CRLF +
                         N'       V.ColumnName,' + @CRLF +
                         N'       V.NULLCount' + @CRLF +
                         N'FROM Counts C' + @CRLF +
                         N'     CROSS APPLY (VALUES' +
                         STUFF(STRING_AGG(CONVERT(nvarchar(MAX), N'                        (N' + QUOTENAME(c.[name], '''') + N', C.' + QUOTENAME(c.[name]) + N')'),@Delimiter) WITHIN GROUP (ORDER BY c.column_id),1,24,N'') + N')V(ColumnName,NULLCount);'
                  FROM sys.schemas s
                       JOIN sys.tables t ON s.schema_id = t.schema_id
                       JOIN sys.columns c ON t.object_id = c.object_id
                  GROUP BY s.[name], t.[name]
                  FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,4,N'');

--PRINT @SQL; --This is going to be far longer than 4,000 characters and PRINT truncates it, so use SELECT @SQL to inspect it

EXEC sys.sp_executesql @SQL;

GO

SELECT *
FROM #NullCounts
ORDER BY SchemaName,
         TableName,
         ColumnName;

GO

DROP TABLE #NullCounts;

Yes, I mixed STRING_AGG and FOR XML PATH, and yes, that is ugly, but the PRINTed (SELECTed) SQL comes out as some pretty nice statements. See below:

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'PerformanceTest' AS TableName,
           COUNT_BIG(CASE WHEN T.[TestID] IS NULL THEN 1 END) AS [TestID],
           COUNT_BIG(CASE WHEN T.[TestTarget] IS NULL THEN 1 END) AS [TestTarget],
           COUNT_BIG(CASE WHEN T.[TestName] IS NULL THEN 1 END) AS [TestName],
           COUNT_BIG(CASE WHEN T.[TimeStart] IS NULL THEN 1 END) AS [TimeStart],
           COUNT_BIG(CASE WHEN T.[TimeEnd] IS NULL THEN 1 END) AS [TimeEnd],
           COUNT_BIG(CASE WHEN T.[TimeTaken_ms] IS NULL THEN 1 END) AS [TimeTaken_ms],
           COUNT_BIG(CASE WHEN T.[TotalRows] IS NULL THEN 1 END) AS [TotalRows],
           COUNT_BIG(CASE WHEN T.[RowSets] IS NULL THEN 1 END) AS [RowSets],
           COUNT_BIG(CASE WHEN T.[AvgRowsPerSet] IS NULL THEN 1 END) AS [AvgRowsPerSet]
    FROM [dbo].[PerformanceTest] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'TestID', C.[TestID]),
                        (N'TestTarget', C.[TestTarget]),
                        (N'TestName', C.[TestName]),
                        (N'TimeStart', C.[TimeStart]),
                        (N'TimeEnd', C.[TimeEnd]),
                        (N'TimeTaken_ms', C.[TimeTaken_ms]),
                        (N'TotalRows', C.[TotalRows]),
                        (N'RowSets', C.[RowSets]),
                        (N'AvgRowsPerSet', C.[AvgRowsPerSet]))V(ColumnName,NULLCount);

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'someTable' AS TableName,
           COUNT_BIG(CASE WHEN T.[id] IS NULL THEN 1 END) AS [id],
           COUNT_BIG(CASE WHEN T.[SomeCol] IS NULL THEN 1 END) AS [SomeCol]
    FROM [dbo].[someTable] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'id', C.[id]),
                        (N'SomeCol', C.[SomeCol]))V(ColumnName,NULLCount);
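
Once statements like these have run, #NullCounts holds one row per column. As a minimal sketch (not part of the original script), the result could be narrowed to the columns that actually contain NULLs. It assumes the query runs in the same session, after the generated batch has executed and before #NullCounts is dropped:

```sql
-- Sketch: show only the columns that contain at least one NULL,
-- with the highest counts first. Assumes #NullCounts is populated.
SELECT NC.SchemaName,
       NC.TableName,
       NC.ColumnName,
       NC.NULLCount
FROM #NullCounts NC
WHERE NC.NULLCount > 0
ORDER BY NC.NULLCount DESC,
         NC.SchemaName,
         NC.TableName,
         NC.ColumnName;
```

A column that does not appear here either has no NULLs or belongs to an empty table, so the count alone does not say whether a column is unused.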

Yes, I really did spend the last 45 minutes writing all of that…

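Honestly, this is not entry level; if you don't understand it, don't use it. However, I very much doubt you'll find a different solution that is both entry level and performs this well. A CURSOR, for example, might be easier to understand, but it would be really slow.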
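
For comparison, here is a rough sketch of what a cursor-based version might look like. It is not the answer's method, only an illustration of the slower alternative mentioned above. The cursor name ColumnCursor and the variable names are made up for this example, and the temp table definition simply mirrors the one in the script:

```sql
-- Sketch only: one dynamic statement per column, so every table is
-- scanned once for each of its columns. Easier to read, much slower.
IF OBJECT_ID(N'tempdb..#NullCounts') IS NULL
    CREATE TABLE #NullCounts (SchemaName sysname,
                              TableName sysname,
                              ColumnName sysname,
                              NULLCount bigint);

DECLARE @SchemaName sysname,
        @TableName sysname,
        @ColumnName sysname,
        @SQL nvarchar(MAX);

-- ColumnCursor is a hypothetical name chosen for this example.
DECLARE ColumnCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.[name], t.[name], c.[name]
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id;

OPEN ColumnCursor;
FETCH NEXT FROM ColumnCursor INTO @SchemaName, @TableName, @ColumnName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Object names must be injected with QUOTENAME; the values stored
    -- in #NullCounts are passed as parameters instead.
    SET @SQL = N'INSERT INTO #NullCounts (SchemaName, TableName, ColumnName, NULLCount) ' +
               N'SELECT @SchemaName, @TableName, @ColumnName, COUNT_BIG(*) ' +
               N'FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' ' +
               N'WHERE ' + QUOTENAME(@ColumnName) + N' IS NULL;';

    EXEC sys.sp_executesql @SQL,
                           N'@SchemaName sysname, @TableName sysname, @ColumnName sysname',
                           @SchemaName, @TableName, @ColumnName;

    FETCH NEXT FROM ColumnCursor INTO @SchemaName, @TableName, @ColumnName;
END;

CLOSE ColumnCursor;
DEALLOCATE ColumnCursor;
```

The set-based script reads each table once and counts all of its columns in that single pass, which is where most of the difference in speed would come from.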

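Warning:
If you have any deprecated data types in your database (i.e. text), this will fail. If that is the case, you need to make sure you exclude those columns by adding a WHERE to the query. However, I suggest you fix your data types instead (text, for example, has been deprecated for 15 years).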
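
To see whether the warning applies, a query along these lines could list the affected columns first. This is a sketch that assumes text, ntext and image are the deprecated types of interest; the join to sys.types is an addition and does not appear in the original script:

```sql
-- Sketch: list columns whose base type is text, ntext or image.
-- Joining system_type_id to user_type_id resolves alias types
-- to their underlying system type.
SELECT s.[name] AS SchemaName,
       t.[name] AS TableName,
       c.[name] AS ColumnName,
       ty.[name] AS TypeName
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
     JOIN sys.columns c ON t.object_id = c.object_id
     JOIN sys.types ty ON c.system_type_id = ty.user_type_id
WHERE ty.[name] IN (N'text', N'ntext', N'image')
ORDER BY SchemaName,
         TableName,
         c.column_id;
```

If this returns rows, the same join with a NOT IN predicate, placed before the GROUP BY of the main query, would be one way to leave those columns out of the generated statements.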

2021-05-16