There are two ways to perform a PIVOT. Before SQL Server 2005, when PIVOT was introduced, most people did it like this:
SELECT RateID,
SUM(CASE WHEN RateItemTypeID = 1 THEN UnitPrice ELSE 0 END),
SUM(CASE WHEN RateItemTypeID = 2 THEN UnitPrice ELSE 0 END),
SUM(CASE WHEN RateItemTypeID = 3 THEN UnitPrice ELSE 0 END)
FROM rate_item WHERE _WhereClause_
GROUP BY RateID
Later, when SQL Server 2005 introduced PIVOT, it became this:
SELECT RateID, [1], [2], [3]
FROM PertinentRates -- PertinentRates is a CTE with WHERE clause applied
PIVOT (SUM(UnitPrice) FOR RateItemTypeID IN ([1], [2], [3])) PVT
On SQL Server 2005, 2008 R2, 2012, and 2014 (the versions of SQL Server I have worked with that implement PIVOT), in my experience this has always been faster than SUM(CASE), or in a few cases equally fast. Are there any examples where PIVOT is slower?
I can't give the DDL because it's an example from my work, but the table is fairly simple. In the PIVOT example, it draws from a CTE, while the SUM(CASE) draws directly from the table. The SUM(CASE) performs the same when it draws from the CTE, though.
In my work example, the PIVOT comes back in 10 seconds while the SUM(CASE) comes back in 14. Clearly it must be doing something different under the covers. The plans are the same, 50% of the total each. PIVOT is converted to SUM(CASE) in the query analyzer. Yet SUM(CASE) never comes back in less than 13 seconds and PIVOT never comes back in more than 11 seconds.
I've tried running them back and forth; the order in which they run doesn't matter. If I run both from a cold cache, they both take longer, but PIVOT is still faster, 12 vs 17 seconds. I couldn't reproduce this on a second server, but that one is considerably better; it's 5 seconds each there, with minor variations. PIVOT is a bit better, but percentage-wise it doesn't have the same edge as on the first server.
The IO statistics, like the query plan, are identical between the two. That's odd; I was expecting to see different IO statistics, although I had never looked at them for this particular example.
Are there any examples where PIVOT is slower?
It's unlikely in simple cases. As Itzik Ben-Gan notes in his SQL Server Pro article, Pivoting Data, when examining the plan for a PIVOT query (emphasis added):
Figure 3 shows the plan for the PIVOT query. As you can see, this plan is very similar to that of the standard solution, so much so that if you look at the properties of the Aggregate operator, under Defined Values, you'll find that SQL Server constructed CASE expressions behind the scenes:…
[Expr1022] = Scalar Operator(SUM(CASE WHEN [InsideTSQL2008].[Sales].[Orders].[shipcity]=N'Barcelona' THEN [InsideTSQL2008].[Sales].[Orders].[freight] ELSE NULL END))
…With this in mind, you shouldn't expect the solution based on the PIVOT operator to perform better than the standard solution. The main benefit of the PIVOT operator at the moment is that it's less verbose.
For more advanced pivoting requirements that the (nonstandard) PIVOT syntax does not directly support, workarounds are needed. These may or may not lead to worse performance compared with CASE, depending on various factors, including the skill level of the implementer.
Examples of these problematic cases are covered in Itzik's article, and are also explained well in Robert Sheldon's Simple Talk article, Questions About Pivoting Data in SQL Server You Were Too Shy to Ask.
In my experience, PIVOT and Agg(CASE... generate extremely similar plans with extremely close performance characteristics when both are written optimally. My usual advice is to write queries using the syntax that seems most natural to you, and to try rewriting only if performance is unacceptable.
The SQL Server query processor has a built-in logical Pivot operator (LogOp_Pivot), so it is perhaps not entirely correct to say that SQL Server rewrites pivots to aggregates and case expressions, at least if we are talking about the parsing and compilation activity that takes place before cost-based optimization (trivial plans are not available for pivot queries).
On the other hand, it is true that the only way the optimizer can implement a query tree containing LogOp_Pivot is via the exploration rule ExpandPivot. This rule expands LogOp_Pivot into a regular grouped aggregate (LogOp_GbAgg) with the associated scalar expressions. When this rule is disabled, pivot queries fail to compile.
In practice, then, we can say that pivots are always (eventually) "rewritten" as aggregates and scalar expressions before an executable plan can be produced.
In any case, the result of the rewrite to LogOp_GbAgg is converted to the physical operators needed for an executable plan by the standard grouped aggregate implementation rules GbAggToHS (hash) or GbAggToStrm (stream).
As a side note, the reason manual pivots (aggregates over case expressions) have an extra Compute Scalar below the aggregate is that the case expressions are pushed down toward the leaf level of the query tree during project normalization (an early stage of compilation, before cost-based optimization).
Queries that use the PIVOT syntax do not have this, because the expressions are not created until ExpandPivot runs during cost-based optimization. At the (earlier) time project normalization runs, the query tree still contains LogOp_Pivot elements, so there are no projections to push down, and the case expressions generally end up inside the hash or stream aggregate.
There is generally no advantage to avoiding the Compute Scalar, because from SQL Server 2005 onward, expression evaluation is normally deferred until the result is required by a later operator. In this case, evaluation of the case expressions is deferred until the (hash or stream) aggregate requires it.
Repeating the tests from Cross Tabs and Pivots, Part 1 - Converting Rows to Columns - By Jeff Moden, 2010/08/06 (First Published: 2008/08/19) on rextester
Unfortunately, I can't access IO statistics, time statistics, or execution plans on rextester, but it has the unique advantage of being a common test environment that anyone here can tinker with and examine. I realize this still leaves a lot to be desired for digging in and investigating exactly what is happening, but I'd say that being able to share a test environment is an important aspect of this discussion.
rextester: http://rextester.com/BAZMGJ69528
This one was added for @MartinSmith; although the queries are taken from the same article, it was not part of the original tests in this form:
create table #timer (what varchar(64), ended datetime);
insert into #timer values ('Start',getdate());
go
SELECT TOP 400000 --<<Look! Change this number for testing different size tables
RowNum = IDENTITY(INT,1,1),
Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65),
Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),
Quantity = ABS(CHECKSUM(NEWID()))%50000+1,
Date = CAST(Rand(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
Year = CAST(NULL AS SMALLINT),
Quarter = CAST(NULL AS TINYINT)
INTO #SomeTable3
FROM Master.sys.SysColumns t1
CROSS JOIN
Master.sys.SysColumns t2
--===== Fill in the Year and Quarter columns from the Date column
UPDATE #SomeTable3
SET Year = DATEPART(yy,Date),
Quarter = DATEPART(qq,Date)
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #SomeTable3
ADD PRIMARY KEY CLUSTERED (RowNum)
CREATE NONCLUSTERED INDEX IX_#SomeTable3_CoverYear
ON dbo.#SomeTable3 (Year)
INCLUDE (Amount, Quantity, Quarter)
create statistics syear on #sometable3(year) with fullscan, norecompute;
create statistics syearquarter on #sometable3(year,quarter) with fullscan, norecompute;
GO
insert into #timer values ('Finished Loading Test Data',getdate());
go
--===== Simple Pivot
SELECT Year,
COALESCE([1],0) AS [1st Qtr],
COALESCE([2],0) AS [2nd Qtr],
COALESCE([3],0) AS [3rd Qtr],
COALESCE([4],0) AS [4th Qtr],
COALESCE([1],0) + COALESCE([2] ,0) + COALESCE([3],0) + COALESCE([4],0) AS Total
into #SimplePivot_prep
FROM (SELECT Year, Quarter,Amount FROM #SomeTable3) AS src
PIVOT (SUM(Amount) FOR Quarter IN ([1],[2],[3],[4])) AS pvt
go
--===== Simple Cross Tab
SELECT Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS [1st Qtr],
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS [2nd Qtr],
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS [3rd Qtr],
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS [4th Qtr],
SUM(Amount) AS Total
into #simpleCrossTab_prep
FROM #SomeTable3
GROUP BY Year
go
--insert into #timer values ('Simple Cross Tab',getdate());
go
--=====--
insert into #timer values ('Finished Prep',getdate());
go
--=====--
--===== Simple Pivot
SELECT Year,
COALESCE([1],0) AS [1st Qtr],
COALESCE([2],0) AS [2nd Qtr],
COALESCE([3],0) AS [3rd Qtr],
COALESCE([4],0) AS [4th Qtr],
COALESCE([1],0) + COALESCE([2] ,0) + COALESCE([3],0) + COALESCE([4],0) AS Total
into #SimplePivot
FROM (SELECT Year, Quarter,Amount FROM #SomeTable3) AS src
PIVOT (SUM(Amount) FOR Quarter IN ([1],[2],[3],[4])) AS pvt
go
insert into #timer values ('Simple Pivot',getdate());
go
--=====--
--===== Simple Cross Tab
SELECT Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS [1st Qtr],
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS [2nd Qtr],
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS [3rd Qtr],
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS [4th Qtr],
SUM(Amount) AS Total
into #simpleCrossTab
FROM #SomeTable3
GROUP BY Year
go
insert into #timer values ('Simple Cross Tab',getdate());
go
--=====--
select
o.what
, started=isnull(convert(varchar(30),x.ended),o.ended)
, ended=convert(varchar(30),o.ended)
, DurationInMs=datediff(millisecond,x.ended,o.ended)
from #timer o
outer apply (select top 1 ended from #timer i where i.ended < o.ended order by i.ended desc) as x
returns:
+----------------------------+---------------------+---------------------+--------------+
| what | started | ended | DurationInMs |
+----------------------------+---------------------+---------------------+--------------+
| Start | Feb 19 2017 7:13PM | Feb 19 2017 7:13PM | NULL |
| Finished Loading Test Data | Feb 19 2017 7:13PM | Feb 19 2017 7:13PM | 7210 |
| Finished Prep | Feb 19 2017 7:13PM | Feb 19 2017 7:13PM | 700 |
| Simple Pivot | Feb 19 2017 7:13PM | Feb 19 2017 7:13PM | 340 |
| Simple Cross Tab | Feb 19 2017 7:13PM | Feb 19 2017 7:13PM | 386 |
+----------------------------+---------------------+---------------------+--------------+
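The timing table comes from the final query over #timer, which pairs each row with the latest strictly earlier timestamp and reports the gap. As a rough illustration of that differencing step only, here is a minimal Python sketch. The start time and cumulative offsets are assumptions reconstructed from the durations reported above, and `durations` is a hypothetical helper, not part of the test script.

```python
from datetime import datetime, timedelta

# Illustrative timestamps: the start time is assumed, and the cumulative
# offsets are derived from the durations reported in the table above.
start = datetime(2017, 2, 19, 19, 13, 0)
offsets_ms = [
    ("Start", 0),
    ("Finished Loading Test Data", 7210),
    ("Finished Prep", 7910),
    ("Simple Pivot", 8250),
    ("Simple Cross Tab", 8636),
]
timer = [(what, start + timedelta(milliseconds=ms)) for what, ms in offsets_ms]


def durations(rows):
    """Pair each row with the gap (in ms) to the latest strictly earlier row."""
    result = []
    for what, ended in rows:
        earlier = [e for _, e in rows if e < ended]
        gap = (ended - max(earlier)) // timedelta(milliseconds=1) if earlier else None
        result.append((what, gap))
    return result


timings = durations(timer)
for what, ms in timings:
    print(what, ms)
```

Because each duration is the gap to the previous marker, a step is only timed correctly if a marker row is inserted immediately after it, which is why the prep runs share a single 'Finished Prep' marker.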
The rest all test limitations in the pivot syntax, where a single cross tab query can accomplish what would otherwise require multiple pivots.
rextester: http://rextester.com/UVZE879
create table #timer (what varchar(64), ended datetime);
insert into #timer values ('Start',getdate());
go
SELECT TOP 300000 --<<Look! Change this number for testing different size tables
RowNum = IDENTITY(INT,1,1),
Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65),
Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),
Quantity = ABS(CHECKSUM(NEWID()))%50000+1,
Date = CAST(Rand(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
Year = CAST(NULL AS SMALLINT),
Quarter = CAST(NULL AS TINYINT)
INTO #SomeTable3
FROM Master.sys.SysColumns t1
CROSS JOIN
Master.sys.SysColumns t2
--===== Fill in the Year and Quarter columns from the Date column
UPDATE #SomeTable3
SET Year = DATEPART(yy,Date),
Quarter = DATEPART(qq,Date)
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #SomeTable3
ADD PRIMARY KEY CLUSTERED (RowNum)
CREATE NONCLUSTERED INDEX IX_#SomeTable3_Cover1
ON dbo.#SomeTable3 (Company, Year)
INCLUDE (Amount, Quantity, Quarter)
create statistics scompanyyear on #sometable3(company, year) with fullscan, norecompute;
GO
insert into #timer values ('Finished Loading Test Data',getdate());
go
--=====--
--===== "Normal" Pivot
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #NormalPivot_prep
FROM (SELECT Company, Year, Quarter, Amount FROM #SomeTable3) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM #SomeTable3) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
--insert into #timer values ('Finished Normal Pivot',getdate());
go
--=====--
--===== "Normal" Cross Tab
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #NormalCrossTab_prep
FROM #SomeTable3
GROUP BY Company, Year
ORDER BY Company, Year
go
--insert into #timer values ('Finished Normal Cross Tab',getdate());
insert into #timer values ('Finished Prep',getdate());
go
--=====--
--===== "Normal" Pivot
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #NormalPivot
FROM (SELECT Company, Year, Quarter, Amount FROM #SomeTable3) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM #SomeTable3) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
insert into #timer values ('Finished Normal Pivot',getdate());
go
--=====--
--===== "Normal" Cross Tab
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #NormalCrossTab
FROM #SomeTable3
GROUP BY Company, Year
ORDER BY Company, Year
go
insert into #timer values ('Finished Normal Cross Tab',getdate());
go
--=====--
select
o.what
, started=isnull(convert(varchar(30),x.ended),o.ended)
, ended=convert(varchar(30),o.ended)
, DurationInMs=datediff(millisecond,x.ended,o.ended)
from #timer o
outer apply (select top 1 ended from #timer i where i.ended < o.ended order by i.ended desc) as x
returns:
+----------------------------+---------------------+---------------------+--------------+
| what | started | ended | DurationInMs |
+----------------------------+---------------------+---------------------+--------------+
| Start | Feb 19 2017 7:19PM | Feb 19 2017 7:19PM | NULL |
| Finished Loading Test Data | Feb 19 2017 7:19PM | Feb 19 2017 7:19PM | 5260 |
| Finished Prep | Feb 19 2017 7:19PM | Feb 19 2017 7:19PM | 1003 |
| Finished Normal Pivot | Feb 19 2017 7:19PM | Feb 19 2017 7:19PM | 550 |
| Finished Normal Cross Tab | Feb 19 2017 7:19PM | Feb 19 2017 7:19PM | 513 |
+----------------------------+---------------------+---------------------+--------------+
rextester: http://rextester.com/WBGUYR51251
create table #timer (what varchar(64), ended datetime);
insert into #timer values ('Start',getdate());
go
SELECT TOP 300000 --<<Look! Change this number for testing different size tables
RowNum = IDENTITY(INT,1,1),
Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65),
Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),
Quantity = ABS(CHECKSUM(NEWID()))%50000+1,
Date = CAST(Rand(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
Year = CAST(NULL AS SMALLINT),
Quarter = CAST(NULL AS TINYINT)
INTO #SomeTable3
FROM Master.sys.SysColumns t1
CROSS JOIN
Master.sys.SysColumns t2
--===== Fill in the Year and Quarter columns from the Date column
UPDATE #SomeTable3
SET Year = DATEPART(yy,Date),
Quarter = DATEPART(qq,Date)
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #SomeTable3
ADD PRIMARY KEY CLUSTERED (RowNum)
CREATE NONCLUSTERED INDEX IX_#SomeTable3_Cover1
ON dbo.#SomeTable3 (Company, Year)
INCLUDE (Amount, Quantity, Quarter)
create statistics scompanyyear on #sometable3(company, year) with fullscan, norecompute;
GO
insert into #timer values ('Finished Loading Test Data',getdate());
go
--=====--
--===== "Pre-aggregated" Pivot
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #preA_Pivot_prep
FROM (SELECT Company, Year, Quarter, SUM(Amount) AS Amount FROM #SomeTable3 GROUP BY Company, Year, Quarter) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, SUM(Quantity) AS Quantity FROM #SomeTable3 GROUP BY Company, Year, Quarter) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
--insert into #timer values ('Finished "Pre-aggregated" Pivot',getdate());
go
--=====--
--===== "Pre-aggregated" Cross Tab
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #preA_CrossTab_prep
FROM (SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3 GROUP BY Company,Year,Quarter) d
GROUP BY Company, Year
ORDER BY Company, Year
go
--insert into #timer values ('Finished "Pre-aggregated" Cross Tab',getdate());
go
--=====--
insert into #timer values ('Finished Prep',getdate());
--=====--
--===== "Pre-aggregated" Pivot
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #preA_Pivot
FROM (SELECT Company, Year, Quarter, SUM(Amount) AS Amount FROM #SomeTable3 GROUP BY Company, Year, Quarter) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, SUM(Quantity) AS Quantity FROM #SomeTable3 GROUP BY Company, Year, Quarter) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
insert into #timer values ('Finished "Pre-aggregated" Pivot',getdate());
go
--=====--
--===== "Pre-aggregated" Cross Tab
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #preA_CrossTab
FROM (SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3 GROUP BY Company,Year,Quarter) d
GROUP BY Company, Year
ORDER BY Company, Year
go
insert into #timer values ('Finished "Pre-aggregated" Cross Tab',getdate());
go
--=====--
select
o.what
, started=isnull(convert(varchar(30),x.ended),o.ended)
, ended=convert(varchar(30),o.ended)
, DurationInMs=datediff(millisecond,x.ended,o.ended)
from #timer o
outer apply (select top 1 ended from #timer i where i.ended < o.ended order by i.ended desc) as x
returns:
+-------------------------------------+---------------------+---------------------+--------------+
| what | started | ended | DurationInMs |
+-------------------------------------+---------------------+---------------------+--------------+
| Start | Feb 19 2017 7:23PM | Feb 19 2017 7:23PM | NULL |
| Finished Loading Test Data | Feb 19 2017 7:23PM | Feb 19 2017 7:23PM | 5440 |
| Finished Prep | Feb 19 2017 7:23PM | Feb 19 2017 7:23PM | 1513 |
| Finished "Pre-aggregated" Pivot | Feb 19 2017 7:23PM | Feb 19 2017 7:23PM | 683 |
| Finished "Pre-aggregated" Cross Tab | Feb 19 2017 7:23PM | Feb 19 2017 7:23PM | 370 |
+-------------------------------------+---------------------+---------------------+--------------+
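The pre-aggregated variants rely on the cross tab giving the same answer whether it reads the raw rows or a subquery already summed by (Company, Year, Quarter). A minimal sketch of that equivalence, assuming SQLite through Python's sqlite3 module instead of SQL Server, with a small randomly generated stand-in for #SomeTable3 that uses integer amounts so the sums compare exactly. SQLite has no PIVOT operator, so only the cross tab form is exercised, and nothing here says anything about relative speed on SQL Server.

```python
import random
import sqlite3

random.seed(0)  # fixed seed so the illustrative data is reproducible
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTable3 "
    "(Company TEXT, Year INT, Quarter INT, Amount INT, Quantity INT)"
)
rows = [
    (random.choice("AB"), random.choice([2000, 2001]), random.randint(1, 4),
     random.randint(1, 1000), random.randint(1, 50))
    for _ in range(1000)
]
conn.executemany("INSERT INTO SomeTable3 VALUES (?, ?, ?, ?, ?)", rows)

# One SUM(CASE) column per quarter and measure, followed by the two totals.
columns = ",\n       ".join(
    f"SUM(CASE WHEN Quarter = {q} THEN {m} ELSE 0 END)"
    for q in (1, 2, 3, 4) for m in ("Amount", "Quantity")
)
cross_tab = f"""
SELECT Company, Year,
       {columns},
       SUM(Amount), SUM(Quantity)
FROM {{source}}
GROUP BY Company, Year
ORDER BY Company, Year
"""

# Cross tab straight over the detail rows.
direct = conn.execute(cross_tab.format(source="SomeTable3")).fetchall()

# Same cross tab over a subquery pre-aggregated to one row per quarter.
pre_agg_source = (
    "(SELECT Company, Year, Quarter, "
    "SUM(Amount) AS Amount, SUM(Quantity) AS Quantity "
    "FROM SomeTable3 GROUP BY Company, Year, Quarter) AS d"
)
pre_aggregated = conn.execute(cross_tab.format(source=pre_agg_source)).fetchall()

print(direct == pre_aggregated)
```

Pre-aggregation shrinks the input of the outer cross tab to at most one row per (Company, Year, Quarter); whether that pays off depends on the data and the plan chosen.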
rextester: http://rextester.com/WCTJH5484
create table #timer (what varchar(64), ended datetime);
insert into #timer values ('Start',getdate());
go
SELECT TOP 300000 --<<Look! Change this number for testing different size tables
RowNum = IDENTITY(INT,1,1),
Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65),
Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),
Quantity = ABS(CHECKSUM(NEWID()))%50000+1,
Date = CAST(Rand(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
Year = CAST(NULL AS SMALLINT),
Quarter = CAST(NULL AS TINYINT)
INTO #SomeTable3
FROM Master.sys.SysColumns t1
CROSS JOIN
Master.sys.SysColumns t2
--===== Fill in the Year and Quarter columns from the Date column
UPDATE #SomeTable3
SET Year = DATEPART(yy,Date),
Quarter = DATEPART(qq,Date)
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #SomeTable3
ADD PRIMARY KEY CLUSTERED (RowNum)
CREATE NONCLUSTERED INDEX IX_#SomeTable3_Cover1
ON dbo.#SomeTable3 (Company, Year)
INCLUDE (Amount, Quantity, Quarter)
create statistics syearquarter on #sometable3(year,quarter) with fullscan, norecompute;
GO
insert into #timer values ('Finished Loading Test Data',getdate());
go
--=====--
--===== "Pre-aggregated" Pivot with CTE
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #prea_Pivot_wcte_prep
FROM (SELECT Company, Year, Quarter, Amount FROM ctePreAgg) AS t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM ctePreAgg) AS t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
--insert into #timer values ('Finished "Pre-aggregated" Pivot with CTE',getdate());
go
--=====--
--===== "Pre-aggregated" Cross Tab with CTE
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #prea_CrossTab_wcte_prep
FROM ctePreAgg
GROUP BY Company, Year
ORDER BY Company, Year
go
--insert into #timer values ('Finished "Pre-aggregated" Cross Tab with CTE',getdate());
go
--=====--
insert into #timer values ('Finished Prep',getdate());
go
--=====--
--===== "Pre-aggregated" Pivot with CTE
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
into #prea_Pivot_wcte
FROM (SELECT Company, Year, Quarter, Amount FROM ctePreAgg) AS t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM ctePreAgg) AS t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
go
insert into #timer values ('Finished "Pre-aggregated" Pivot with CTE',getdate());
go
--=====--
--===== "Pre-aggregated" Cross Tab with CTE
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
into #prea_CrossTab_wcte
FROM ctePreAgg
GROUP BY Company, Year
ORDER BY Company, Year
go
insert into #timer values ('Finished "Pre-aggregated" Cross Tab with CTE',getdate());
go
--=====--
select
o.what
, started=isnull(convert(varchar(30),x.ended),o.ended)
, ended=convert(varchar(30),o.ended)
, DurationInMs=datediff(millisecond,x.ended,o.ended)
from #timer o
outer apply (select top 1 ended from #timer i where i.ended < o.ended order by i.ended desc) as x
returns:
+----------------------------------------------+---------------------+---------------------+--------------+
| what | started | ended | DurationInMs |
+----------------------------------------------+---------------------+---------------------+--------------+
| Start | Feb 19 2017 7:25PM | Feb 19 2017 7:25PM | NULL |
| Finished Loading Test Data | Feb 19 2017 7:25PM | Feb 19 2017 7:26PM | 5723 |
| Finished Prep | Feb 19 2017 7:26PM | Feb 19 2017 7:26PM | 950 |
| Finished "Pre-aggregated" Pivot with CTE | Feb 19 2017 7:26PM | Feb 19 2017 7:26PM | 580 |
| Finished "Pre-aggregated" Cross Tab with CTE | Feb 19 2017 7:26PM | Feb 19 2017 7:26PM | 323 |
+----------------------------------------------+---------------------+---------------------+--------------+
Just a note to correct a false assumption:
There are three, not two, main ways to write a pivot. The third uses a driving table and multiple joins (with LEFT JOIN or OUTER/CROSS APPLY).
This can be more or less efficient, depending on several details (table distributions, indexes, etc.) and on the requirements of the specific pivot operation. It differs from the SUM/GROUP BY method in several ways:
- It can avoid scanning the whole table if appropriate indexes exist (what counts as appropriate depends on the WHERE clause, the GROUP BY, and the columns being aggregated). In the specific example, that would be an index on (RateItemTypeID, RateID) INCLUDE (UnitPrice) if _WhereClause_ is empty. This matters when the table holds many different RateItemTypeID values but the query is interested in only a few: comparing a scan of the whole table (or even a whole index) with seeks into a smaller part of a narrower nonclustered index, I'd expect the latter to be more efficient.
- […] WHERE and aggregated columns.
- The GROUP BY, and even the whole driving subquery, can often be replaced by another table (a Rate table in the specific example).
- In several pivot variations, the GROUP BY in the subqueries can also be removed and the subqueries converted to simple LEFT joins (for example, if there is a UNIQUE constraint on (RateID, RateItemTypeID) in the specific case). This shows that the SUM in the "SUM/GROUP BY" method is there (in these cases) only because of the GROUP BY, and that it is summing one value (and several NULLs).
The query:
SELECT
d.RateID,
Sum1 = COALESCE(s1.Sum1, 0),
Sum2 = COALESCE(s2.Sum2, 0),
Sum3 = COALESCE(s3.Sum3, 0)
FROM
( SELECT RateID --
FROM rate_item
WHERE _WhereClause_
GROUP BY RateID
) AS d -- driving table with the DISTINCT RateID values
OUTER APPLY
( SELECT Sum1 = SUM(r1.UnitPrice)
FROM rate_item AS r1
WHERE _WhereClause_
AND r1.RateItemTypeID = 1
AND r1.RateID = d.RateID
) AS s1
OUTER APPLY
( SELECT Sum2 = SUM(r2.UnitPrice)
FROM rate_item AS r2
WHERE _WhereClause_
AND r2.RateItemTypeID = 2
AND r2.RateID = d.RateID
) AS s2
OUTER APPLY
( SELECT Sum3 = SUM(r3.UnitPrice)
FROM rate_item AS r3
WHERE _WhereClause_
AND r3.RateItemTypeID = 3
AND r3.RateID = d.RateID
) AS s3 ;
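To check that the driving-table form returns the same rows as SUM/GROUP BY, here is a minimal sketch under some loud assumptions: it runs on SQLite through Python's sqlite3 module, the rate_item rows are invented, _WhereClause_ is treated as empty, and correlated scalar subqueries stand in for OUTER APPLY, which SQLite does not support. It compares results only and says nothing about SQL Server plans or timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rate_item (RateID INT, RateItemTypeID INT, UnitPrice INT);
    INSERT INTO rate_item VALUES
        (1, 1, 10), (1, 2, 20), (1, 2, 5),
        (2, 1, 7),  (2, 3, 30),
        (3, 4, 99);   -- RateID 3 has none of the pivoted types
""")

# Conditional aggregation: a single pass with one CASE per output column.
sum_case = conn.execute("""
    SELECT RateID,
           SUM(CASE WHEN RateItemTypeID = 1 THEN UnitPrice ELSE 0 END),
           SUM(CASE WHEN RateItemTypeID = 2 THEN UnitPrice ELSE 0 END),
           SUM(CASE WHEN RateItemTypeID = 3 THEN UnitPrice ELSE 0 END)
    FROM rate_item
    GROUP BY RateID
    ORDER BY RateID
""").fetchall()

# Driving table of distinct RateID values, plus one correlated aggregate
# per output column (the stand-in for each OUTER APPLY).
driving = conn.execute("""
    SELECT d.RateID,
           COALESCE((SELECT SUM(r1.UnitPrice) FROM rate_item AS r1
                     WHERE r1.RateItemTypeID = 1 AND r1.RateID = d.RateID), 0),
           COALESCE((SELECT SUM(r2.UnitPrice) FROM rate_item AS r2
                     WHERE r2.RateItemTypeID = 2 AND r2.RateID = d.RateID), 0),
           COALESCE((SELECT SUM(r3.UnitPrice) FROM rate_item AS r3
                     WHERE r3.RateItemTypeID = 3 AND r3.RateID = d.RateID), 0)
    FROM (SELECT RateID FROM rate_item GROUP BY RateID) AS d
    ORDER BY d.RateID
""").fetchall()

print(sum_case)
print(driving)
```

Each correlated lookup filters on a single (RateItemTypeID, RateID) pair, which is the access pattern the index mentioned above is meant to serve.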
SQL Server transforms the query below:
SELECT
RateID, [1], [2], [3]
FROM PertinentRates
PIVOT (SUM(UnitPrice) FOR RateItemTypeID IN ([1], [2], [3])) PVT
into:
SELECT
RateID,
SUM(CASE WHEN RateItemTypeID = 1 THEN UnitPrice ELSE 0 END),
SUM(CASE WHEN RateItemTypeID = 2 THEN UnitPrice ELSE 0 END),
SUM(CASE WHEN RateItemTypeID = 3 THEN UnitPrice ELSE 0 END)
FROM rate_item WHERE supplierid = 2882874 AND rateplanid = 1 AND rateitemtypeid IN (1, 2, 3)
GROUP BY RateID
So choosing one over the other, AFAIK, comes down to readability.
Here is a short demo:
CREATE TABLE #Sales (EmpId INT, Yr INT, Sales MONEY)
INSERT #Sales VALUES(1, 2005, 12000)
INSERT #Sales VALUES(1, 2006, 18000)
INSERT #Sales VALUES(1, 2007, 25000)
INSERT #Sales VALUES(2, 2005, 15000)
INSERT #Sales VALUES(2, 2006, 6000)
INSERT #Sales VALUES(3, 2006, 20000)
INSERT #Sales VALUES(3, 2007, 24000)
Now the queries:
SELECT EmpId, [2005], [2006], [2007]
FROM (SELECT EmpId, Yr, Sales FROM #Sales) AS s
PIVOT (SUM(Sales) FOR Yr IN ([2005], [2006], [2007])) AS p
select
empid,
sum(Case when yr=2005 then sales end) '2005',
sum(Case when yr=2006 then sales end) '2006',
sum(Case when yr=2007 then sales end) '2007'
from
#sales
group by empid
Now, when both queries are run in a batch, both cost the same and the plans are substantially the same. The SHOWPLAN_TEXT output is:
|--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1018]=(0) THEN NULL ELSE [Expr1019] END, [Expr1004]=CASE WHEN [Expr1020]=(0) THEN NULL ELSE [Expr1021] END, [Expr1005]=CASE WHEN [Expr1022]=(0) THEN NULL ELSE [Expr1023] END))
|--Stream Aggregate(GROUP BY:([tempdb].[dbo].[#Sales].[EmpId]) DEFINE:([Expr1018]=COUNT_BIG(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2005) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END), [Expr1019]=SUM(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2005) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END), [Expr1020]=COUNT_BIG(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2006) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END), [Expr1021]=SUM(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2006) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END), [Expr1022]=COUNT_BIG(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2007) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END), [Expr1023]=SUM(CASE WHEN [tempdb].[dbo].[#Sales].[Yr]=(2007) THEN [tempdb].[dbo].[#Sales].[Sales] ELSE NULL END)))
|--Sort(ORDER BY:([tempdb].[dbo].[#Sales].[EmpId] ASC))
|--Table Scan(OBJECT:([tempdb].[dbo].[#Sales]))
|--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1021]=(0) THEN NULL ELSE [Expr1022] END, [Expr1004]=CASE WHEN [Expr1023]=(0) THEN NULL ELSE [Expr1024] END, [Expr1005]=CASE WHEN [Expr1025]=(0) THEN NULL ELSE [Expr1026] END))
|--Stream Aggregate(GROUP BY:([tempdb].[dbo].[#sales].[EmpId]) DEFINE:([Expr1021]=COUNT_BIG([Expr1006]), [Expr1022]=SUM([Expr1006]), [Expr1023]=COUNT_BIG([Expr1007]), [Expr1024]=SUM([Expr1007]), [Expr1025]=COUNT_BIG([Expr1008]), [Expr1026]=SUM([Expr1008])))
|--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [tempdb].[dbo].[#sales].[Yr]=(2005) THEN [tempdb].[dbo].[#sales].[Sales] ELSE NULL END, [Expr1007]=CASE WHEN [tempdb].[dbo].[#sales].[Yr]=(2006) THEN [tempdb].[dbo].[#sales].[Sales] ELSE NULL END, [Expr1008]=CASE WHEN [tempdb].[dbo].[#sales].[Yr]=(2007) THEN [tempdb].[dbo].[#sales].[Sales] ELSE NULL END))
|--Sort(ORDER BY:([tempdb].[dbo].[#sales].[EmpId] ASC))
|--Table Scan(OBJECT:([tempdb].[dbo].[#sales]))
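Both plans above evaluate CASE ... ELSE NULL inside the aggregate, whereas hand-written cross tabs are often written with ELSE 0. The two forms differ only for cells with no matching rows. A minimal sketch of that difference, assuming SQLite through Python's sqlite3 module and a hypothetical table name sales_demo loaded with the demo rows; it illustrates result semantics only, not SQL Server plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_demo (EmpId INT, Yr INT, Sales INT);
    INSERT INTO sales_demo VALUES
        (1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
        (2, 2005, 15000), (2, 2006, 6000),
        (3, 2006, 20000), (3, 2007, 24000);
""")

# The same cross tab, parameterised only by the ELSE branch of each CASE.
query = """
    SELECT EmpId,
           SUM(CASE WHEN Yr = 2005 THEN Sales {else_branch} END),
           SUM(CASE WHEN Yr = 2006 THEN Sales {else_branch} END),
           SUM(CASE WHEN Yr = 2007 THEN Sales {else_branch} END)
    FROM sales_demo
    GROUP BY EmpId
    ORDER BY EmpId
"""

with_null = conn.execute(query.format(else_branch="ELSE NULL")).fetchall()
with_zero = conn.execute(query.format(else_branch="ELSE 0")).fetchall()

print(with_null)  # empty cells come back as None (NULL)
print(with_zero)  # empty cells come back as 0
```

This is why the PIVOT queries in the timing tests wrap each pivoted column in COALESCE(..., 0) to line up with the ELSE 0 cross tabs.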