w3schools - sum over sql




The SQL OVER() clause-when and why is it useful? (5)

    USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
    ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
    ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail 
WHERE SalesOrderID IN(43659,43664);

I read about that clause and I don't understand why I need it. What does the function Over do? What does Partitioning By do? Why can't I make a query with writing Group By SalesOrderID?


If you only wanted to GROUP BY the SalesOrderID then you wouldn't be able to include the ProductID and OrderQty columns in the SELECT clause.

The PARTITION BY clause let's you break up your aggregate functions. One obvious and useful example would be if you wanted to generate line numbers for order lines on an order:

SELECT
    O.order_id,
    O.order_date,
    ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,
    OL.product_id
FROM
    Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id

(My syntax might be off slightly)

You would then get back something like:

order_id    order_date    line_item_no    product_id
--------    ----------    ------------    ----------
    1       2011-05-02         1              5
    1       2011-05-02         2              4
    1       2011-05-02         3              7
    2       2011-05-12         1              8
    2       2011-05-12         2              1

Let me explain with an example and you would be able to see how it works.

Assuming you have the following table DIM_EQUIPMENT:

VIN         MAKE    MODEL   YEAR    COLOR
-----------------------------------------
1234ASDF    Ford    Taurus  2008    White
1234JKLM    Chevy   Truck   2005    Green
5678ASDF    Ford    Mustang 2008    Yellow

Run below SQL

SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR ,
  COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT

The result would be as below

VIN         MAKE    MODEL   YEAR    COLOR     COUNT2
 ----------------------------------------------  
1234JKLM    Chevy   Truck   2005    Green     1
5678ASDF    Ford    Mustang 2008    Yellow    2
1234ASDF    Ford    Taurus  2008    White     2

See what happened.

You are able to count without Group By on YEAR and Match with ROW.

Another Interesting WAY to get same result if as below using WITH Clause, WITH works as in-line VIEW and can simplify the query especially complex ones, which is not the case here though since I am just trying to show usage

 WITH EQ AS
  ( SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
  )
SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR,
  COUNT2
FROM DIM_EQUIPMENT,
  EQ
WHERE EQ.YEAR2=DIM_EQUIPMENT.YEAR;

The OVER clause when combined with PARTITION BY state that the preceding function call must be done analytically by evaluating the returned rows of the query. Think of it as an inline GROUP BY statement.

OVER (PARTITION BY SalesOrderID) is stating that for SUM, AVG, etc... function, return the value OVER a subset of the returned records from the query, and PARTITION that subset BY the foreign key SalesOrderID.

So we will SUM every OrderQty record for EACH UNIQUE SalesOrderID, and that column name will be called 'Total'.

It is a MUCH more efficient means than using multiple inline views to find out the same information. You can put this query within an inline view and filter on Total then.

SELECT ...,
FROM (your query) inlineview
WHERE Total < 200

You can use GROUP BY SalesOrderID. The difference is, with GROUP BY you can only have the aggregated values for the columns that are not included in GROUP BY.

In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, although you are not doing that in your example query, you could retrieve both individual OrderQty values and their sums, counts, averages etc. over groups of same SalesOrderIDs.

Here's a practical example of why windowed aggregates are great. Suppose you need to calculate what percent of a total every value is. Without windowed aggregates you'd have to first derive a list of aggregated values and then join it back to the original rowset, i.e. like this:

SELECT
  orig.[Partition],
  orig.Value,
  orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
  INNER JOIN (
    SELECT
      [Partition],
      SUM(Value) AS TotalValue
    FROM OriginalRowset
    GROUP BY [Partition]
  ) agg ON orig.[Partition] = agg.[Partition]

Now look how you can do the same with a windowed aggregate:

SELECT
  [Partition],
  Value,
  Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig

Much easier and cleaner, isn't it?


  • Also Called Query Petition Clause.
  • Similar to the Group By Clause

    • break up data into chunks (or partitions)
    • separate by partition bounds
    • function performs within partitions
    • re-initialised when crossing parting boundary

Syntax:
function (...) OVER (PARTITION BY col1 col3,...)

  • Functions

    • Familiar functions such as COUNT(), SUM(), MIN(), MAX(), etc
    • New Functions as well (eg ROW_NUMBER(), RATION_TO_REOIRT(), etc.)


More info with example : http://msdn.microsoft.com/en-us/library/ms189461.aspx







clause