sql-server - w3schools - sql partition function
SQL Server: Difference between PARTITION BY and GROUP BY (7)
PARTITION BY Divides the result set into partitions. The window function is applied to each partition separately and computation restarts for each partition.
Found at this link: OVER Clause
I've been using
GROUP BY for all types of aggregate queries over the years. Recently, I've been reverse-engineering some code that uses
PARTITION BY to perform aggregations. In reading through all the documentation I can find about
PARTITION BY, it sounds a lot like
GROUP BY, maybe with a little extra functionality added in? Are they two versions of the same general functionality, or are they something different entirely?
PARTITION BY is analytic, while
GROUP BY is aggregate. In order to use
PARTITION BY, you have to contain it with an OVER clause.
As of my understanding Partition By is almost identical to Group By, but with the following differences:
That group by actually groups the result set returning one row per group, which results therefore in SQL Server only allowing in the SELECT list aggregate functions or columns that are part of the group by clause (in which case SQL Server can guarantee that there are unique results for each group).
Consider for example MySQL which allows to have in the SELECT list columns that are not defined in the Group By clause, in which case one row is still being returned per group, however if the column doesn't have unique results then there is no guarantee what will be the output!
But with Partition By, although the results of the function are identical to the results of an aggregate function with Group By, still you are getting the normal result set, which means that one is getting one row per underlying row, and not one row per group, and because of this one can have columns that are not unique per group in the SELECT list.
So as a summary, Group By would be best when needs an output of one row per group, and Partition By would be best when one needs all the rows but still wants the aggregate function based on a group.
Of course there might also be performance issues, see http://social.msdn.microsoft.com/Forums/ms-MY/transactsql/thread/0b20c2b5-1607-40bc-b7a7-0c60a2a55fba.
It provides rolled-up data without rolling up
i.e. Suppose I want to return the relative position of sales region
Using PARTITION BY, I can return the sales amount for a given region and the MAX amount across all sales regions in the same row.
This does mean you will have repeating data, but it may suit the end consumer in the sense that data has been aggregated but no data has been lost - as would be the case with GROUP BY.
Suppose we have 14 records of
name column in table
select name,count(*) as totalcount from person where name='Please fill out' group BY name;
it will give count in single row i.e 14
select row_number() over (partition by name) as total from person where name = 'Please fill out';
it will 14 rows of increase in count
They're used in different places.
group by modifies the entire query, like:
select customerId, count(*) as orderCount from Orders group by customerId
partition by just works on a window function, like
select row_number() over (partition by customerId order by orderId) as OrderNumberForThisCustomer from Orders
group by normally reduces the number of rows returned by rolling them up and calculating averages or sums for each row.
partition by does not affect the number of rows returned, but it changes how a window function's result is calculated.
-- BELOW IS A SAMPLE WHICH OUTLINES THE SIMPLE DIFFERENCES -- READ IT AND THEN EXECUTE IT -- THERE ARE THREE ROWS OF EACH COLOR INSERTED INTO THE TABLE -- CREATE A database called testDB -- use testDB USE [TestDB] GO -- create Paints table CREATE TABLE [dbo].[Paints]( [Color] [varchar](50) NULL, [glossLevel] [varchar](50) NULL ) ON [PRIMARY] GO -- Populate Table insert into paints (color, glossLevel) select 'red', 'eggshell' union select 'red', 'glossy' union select 'red', 'flat' union select 'blue', 'eggshell' union select 'blue', 'glossy' union select 'blue', 'flat' union select 'orange', 'glossy' union select 'orange', 'flat' union select 'orange', 'eggshell' union select 'green', 'eggshell' union select 'green', 'glossy' union select 'green', 'flat' union select 'black', 'eggshell' union select 'black', 'glossy' union select 'black', 'flat' union select 'purple', 'eggshell' union select 'purple', 'glossy' union select 'purple', 'flat' union select 'salmon', 'eggshell' union select 'salmon', 'glossy' union select 'salmon', 'flat' /* COMPARE 'GROUP BY' color to 'OVER (PARTITION BY Color)' */ -- GROUP BY Color -- row quantity defined by group by -- aggregate (count(*)) defined by group by select count(*) from paints group by color -- OVER (PARTITION BY... Color -- row quantity defined by main query -- aggregate defined by OVER-PARTITION BY select color , glossLevel , count(*) OVER (Partition by color) from paints /* COMPARE 'GROUP BY' color, glossLevel to 'OVER (PARTITION BY Color, GlossLevel)' */ -- GROUP BY Color, GlossLevel -- row quantity defined by GROUP BY -- aggregate (count(*)) defined by GROUP BY select count(*) from paints group by color, glossLevel -- Partition by Color, GlossLevel -- row quantity defined by main query -- aggregate (count(*)) defined by OVER-PARTITION BY select color , glossLevel , count(*) OVER (Partition by color, glossLevel) from paints