values - sql sum group by multiple columns




How do I use T-SQL Group By (5)

I know I need to have (although I don't know why) a GROUP BY clause on the end of a SQL query that uses any aggregate functions like count, sum, avg, etc:

SELECT count(userID), userName
FROM users
GROUP BY userName

When else would GROUP BY be useful, and what are the performance ramifications?


GROUP BY is similar to DISTINCT in that it groups multiple records into one.

This example, borrowed from http://www.devguru.com/technologies/t-sql/7080.asp, lists distinct products in the Products table.

SELECT Product FROM Products GROUP BY Product

Product
-------------
Desktop
Laptop
Mouse
Network Card
Hard Drive
Software
Book
Accessory

The advantage of GROUP BY over DISTINCT, is that it can give you granular control when used with a HAVING clause.

SELECT Product, count(Product) as ProdCnt
FROM Products
GROUP BY Product
HAVING count(Product) > 2

Product      ProdCnt
--------------------
Desktop          10
Laptop            5
Mouse             3
Network Card      9
Software          6

Group By forces the entire set to be populated before records are returned (since it is an implicit sort).

For that reason (and many others), never use a Group By in a subquery.


You can only include select columns which are referenced, as your error indicates, in a GROUP BY or an aggregate function.

You either need to include more columns in your GROUP BY, or reduce your SELECT * to SELECT FK_psPatRegisters.

SELECT FK_psPatRegisters
FROM psPatDiscounts 
WHERE cancelflag = '0' AND refdate BETWEEN '2014-01-01 14:03:00' AND '2014-01-31 14:03:00'
GROUP BY FK_psPatRegisters

In essence, the issue is that a single cell can't include multiple values. So to group by only one column and select others, you need to specify how to handle those multiple values.


Plant Shift Report

Select 
  DL.Machine, 
  DL.ShiftName,
  @MyDate,
  case when T.Time is null then P.Time else T.Time end as Time,
  T.Temperature,
  NULL as Pressure
From
  DataLogging as DL
  inner join Temperature as T on "reading time is within this shift's timespan"
Where
  DL.Date = @MyDate
  and DL.ShiftName = @MyShiftName

UNION

Select 
  DL.Machine, 
  DL.ShiftName,
  @MyDate,
  case when T.Time is null then P.Time else T.Time end as Time,
  NULL as Temperature,
  P.Pressure
From
  DataLogging as DL
  inner join Pressure as P on "reading time is within this shift's timespan"
Where
  DL.Date = @MyDate
  and DL.ShiftName = @MyShiftName

Let me explain the method here...

  1. Start off with DataLogging. You are concerned with the employee who had a particular shift. You define that employee in the WHERE clause by specifying an input DATE and input SHIFT.

  2. Join up with the Temperature table, because you want to see all the temperature readings from that person's shift.

  3. Join up with the Pressure table, because you want to see all the pressure readings from that person's shift.

  4. Why do we UNION? Well, mainly to prevent dupes (Dupes would exist if we joined with both readings-tables in the same Select statement). As you've noticed, it's hard to join Temperature readings and Pressure readings because they have different timestamps. If they are one second apart, then there is nothing to join on. Therefore, I think the best solution is to simply have a record per reading as I have done above. The above will list all Temperature readings first, followed by all Pressure Readings -- and combine them into the same Table that is returned via the UNION.

Note: You should probably be using a datetime field instead of a Date column and a Time column, because it will store both the date and the time in one field.

Note: You can combine those readings that have the same date & time by using a subquery and grouping the subquery on the Time.

Note: "reading time is within this shift's timespan" is included above to indicate some code you need to fill in. You will probably want to use a case statement to adjust the ON Clause to only include those items with a particular shift

References


SQL Server 2012 Query - Group by

To have a group by, you need a select column and an aggregate function, something like this:

select  psPatRegisters.ID, Count(*) 
  from psPatDiscounts 
 where cancelflag = '0' 
   and refdate between  '2014-01-01 14:03:00' 
   and '2014-01-31 14:03:00'  group by psPatRegisters.ID

In this case Count(*) is the aggregate function and it is grouped by ID





group-by