SQL OVER clause: dividing a partition into numbered sub-partitions
I have a challenge that I've come across on multiple occasions but have never been able to find an efficient solution to. Imagine I have a large table with data about, for example, bank accounts and their possible revolving moves from debit to credit:
AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa       d           2018-11-01
aaa       d           2018-11-02
aaa       c           2018-11-03
aaa       c           2018-11-04
aaa       c           2018-11-05
bbb       d           2018-11-02
ccc       c           2018-11-01
ccc       d           2018-11-02
ccc       d           2018-11-03
ccc       c           2018-11-04
ccc       d           2018-11-05
ccc       c           2018-11-06
In the example above, I would like to assign sub-partition numbers to the combination of AccountId and DebitCredit, where the partition number is incremented each time DebitCredit shifts within an account. In other words, I would like this result:
AccountId DebitCredit AsOfDate   PartNo
--------- ----------- ---------- ------
aaa       d           2018-11-01 1
aaa       d           2018-11-02 1
aaa       c           2018-11-03 2
aaa       c           2018-11-04 2
aaa       c           2018-11-05 2
bbb       d           2018-11-02 1
ccc       c           2018-11-01 1
ccc       d           2018-11-02 2
ccc       d           2018-11-03 2
ccc       c           2018-11-04 3
ccc       d           2018-11-05 4
ccc       c           2018-11-06 5
I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on tables with millions of rows.
In this example it is guaranteed that we will have consecutive rows for all accounts. However, a customer might of course open an account on the 15th of the month and/or close it on the 26th.
The challenge is to be solved on SQL Server 2016, but a solution that also works on 2012 (and maybe even 2008 R2) would be nice.
As you can imagine there's no way of telling whether there will only be debit or credit rows or whether the account will be revolving each day.
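To try the queries against the sample rows, a setup script along these lines may help. The question gives neither a table name nor column types, so the name `t` (the name used by the first query below), the types, and the primary key are all assumptions. The key on (AccountId, AsOfDate) assumes one row per account per day; an index in that order would typically let a window function partitioned by AccountId and ordered by AsOfDate read the rows pre-sorted, though this is worth checking in the actual execution plan.

```sql
-- Assumed schema: table name, column types and key are illustrative only.
create table t (
    AccountId   varchar(10) not null,
    DebitCredit char(1)     not null,   -- 'd' = debit, 'c' = credit
    AsOfDate    date        not null,
    -- Assumes at most one row per account per day.
    constraint PK_t primary key clustered (AccountId, AsOfDate)
);

-- Sample rows copied from the question.
insert into t (AccountId, DebitCredit, AsOfDate) values
    ('aaa', 'd', '2018-11-01'),
    ('aaa', 'd', '2018-11-02'),
    ('aaa', 'c', '2018-11-03'),
    ('aaa', 'c', '2018-11-04'),
    ('aaa', 'c', '2018-11-05'),
    ('bbb', 'd', '2018-11-02'),
    ('ccc', 'c', '2018-11-01'),
    ('ccc', 'd', '2018-11-02'),
    ('ccc', 'd', '2018-11-03'),
    ('ccc', 'c', '2018-11-04'),
    ('ccc', 'd', '2018-11-05'),
    ('ccc', 'c', '2018-11-06');
```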
If you have SQL Server 2012 or later, you can use lag() and a windowed running sum to get this:
select *,
       sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from (
    select *,
           case when DebitCredit = lag(DebitCredit, 1) over (partition by AccountId order by AsOfDate asc)
                then 0
                else 1
           end as PartNoAdd
    from t
) t2
order by AccountId asc, AsOfDate asc
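In the inner query, PartNoAdd checks whether the previous DebitCredit for this account is the same as the current one. If it is, it returns 0 (nothing should be added); otherwise it returns 1. For the first row of each account, lag() returns NULL, so the comparison is not true and the row gets 1. The outer query then computes a running sum of PartNoAdd per account, in AsOfDate order, which gives the partition number.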
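As a sketch of how the two steps fit together, the variant below inlines the six sample rows for account ccc so it runs on its own, and exposes PartNoAdd next to the running sum. It also spells out the window frame as `rows unbounded preceding`. When an OVER clause has ORDER BY but no frame, SQL Server defaults to `range between unbounded preceding and current row`; with one row per account and date both frames give the same numbers, and the ROWS form is often reported to be cheaper, though that should be measured on the real table. The CTE names `flagged` and the inlined `t` are illustrative, not part of the original answer.

```sql
-- Sample rows for account ccc from the question, inlined for a standalone run.
with t as (
    select v.AccountId, v.DebitCredit, cast(v.AsOfDate as date) as AsOfDate
    from (values
        ('ccc', 'c', '2018-11-01'),
        ('ccc', 'd', '2018-11-02'),
        ('ccc', 'd', '2018-11-03'),
        ('ccc', 'c', '2018-11-04'),
        ('ccc', 'd', '2018-11-05'),
        ('ccc', 'c', '2018-11-06')
    ) v (AccountId, DebitCredit, AsOfDate)
),
-- Step 1: flag each row with 1 when DebitCredit differs from the previous row
-- of the same account (or when there is no previous row), else 0.
flagged as (
    select AccountId, DebitCredit, AsOfDate,
           case when DebitCredit = lag(DebitCredit) over (partition by AccountId
                                                          order by AsOfDate)
                then 0
                else 1
           end as PartNoAdd
    from t
)
-- Step 2: running sum of the flags, with an explicit ROWS frame.
select AccountId, DebitCredit, AsOfDate, PartNoAdd,
       sum(PartNoAdd) over (partition by AccountId
                            order by AsOfDate
                            rows unbounded preceding) as PartNo
from flagged
order by AccountId, AsOfDate;
```

For these rows PartNoAdd comes out as 1, 1, 0, 1, 1, 1 and PartNo as 1, 2, 2, 3, 4, 5, which matches the expected result for ccc in the question.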
You can also do this with a recursive CTE. It needs only row_number(), so it also runs on versions before 2012:
; with
-- the purpose of `cte` is to generate a running number in the order of AsOfDate
cte as
(
    select AccountId, DebitCredit, AsOfDate,
           rn = row_number() over (partition by AccountId order by AsOfDate)
    from tbl
),
-- this is the recursive CTE
rcte as
(
    -- anchor member. Starts with `PartNo 1`
    select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
    from cte
    where rn = 1

    union all

    -- recursive member. Increment `PartNo` if there is a change in DebitCredit
    select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
           PartNo = case when r.DebitCredit = c.DebitCredit
                         then r.PartNo
                         else r.PartNo + 1
                    end
    from rcte r
    inner join cte c on r.AccountId = c.AccountId
                    and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
-- the default recursion limit is 100 levels, so accounts with more than
-- roughly 100 rows would otherwise fail; 0 removes the limit
option (maxrecursion 0)