one - 在SQL Server中查找重複的行




sql group by (10)

我有一個組織的SQL Server數據庫,並且有許多重複的行。 我想運行一個select語句來獲取所有這些以及大量數據,但也返回與每個組織關聯的id。

像這樣的陳述:

SELECT     orgName, COUNT(*) AS dupes  
FROM         organizations  
GROUP BY orgName  
HAVING      (COUNT(*) > 1)

會返回類似的東西

orgName        | dupes  
ABC Corp       | 7  
Foo Federation | 5  
Widget Company | 2 

但我也想獲取它們的ID。 有沒有辦法做到這一點? 也許就像一個

orgName        | dupeCount | id  
ABC Corp       | 1         | 34  
ABC Corp       | 2         | 5  
...  
Widget Company | 1         | 10  
Widget Company | 2         | 2  

原因是這裡還有一個單獨的用戶鏈接表,這些用戶鏈接到這些組織,我想統一它們(因此刪除欺騙用戶鏈接到同一個組織,而不是dupe orgs)。 但我想手動部分,所以我不會搞砸任何東西,但我仍然需要一個聲明來返回所有重複組織的ID,以便我可以查看用戶列表。


你可以試試這個,最適合你

 WITH CTE AS
    (
    SELECT *,RN=ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY orgName DESC) FROM organizations 
    )
    select * from CTE where RN>1
    go

你可以這樣做:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName

如果您只想返回可以刪除的記錄(每個記錄都有一個),則可以使用:

SELECT
    id, orgName
FROM (
     SELECT 
         orgName, id,
         ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
     FROM organizations
) AS d
WHERE intRow != 1

編輯:SQL Server 2000沒有ROW_NUMBER()函數。 相反,您可以使用:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount, MIN(id) AS minId
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName
WHERE d.minId != o.id

嘗試

SELECT orgName, id, count(*) as dupes
FROM organizations
GROUP BY orgName, id
HAVING count(*) > 1;

如果你想刪除重複項目:

WITH CTE AS(
   SELECT orgName,id,
       RN = ROW_NUMBER()OVER(PARTITION BY orgName ORDER BY Id)
   FROM organizations
)
DELETE FROM CTE WHERE RN > 1

您有幾種選擇duplicate rows

對於我的解決方案,首先考慮這個表格

CREATE TABLE #Employee
(
ID          INT,
FIRST_NAME  NVARCHAR(100),
LAST_NAME   NVARCHAR(300)
)

INSERT INTO #Employee VALUES ( 1, 'Ardalan', 'Shahgholi' );
INSERT INTO #Employee VALUES ( 2, 'name1', 'lname1' );
INSERT INTO #Employee VALUES ( 3, 'name2', 'lname2' );
INSERT INTO #Employee VALUES ( 2, 'name1', 'lname1' );
INSERT INTO #Employee VALUES ( 3, 'name2', 'lname2' );
INSERT INTO #Employee VALUES ( 4, 'name3', 'lname3' );

第一解

SELECT DISTINCT *
FROM   #Employee;

WITH #DeleteEmployee AS (
                     SELECT ROW_NUMBER()
                            OVER(PARTITION BY ID, First_Name, Last_Name ORDER BY ID) AS
                            RNUM
                     FROM   #Employee
                 )

SELECT *
FROM   #DeleteEmployee
WHERE  RNUM > 1

SELECT DISTINCT *
FROM   #Employee

Secound解決方案:使用identity字段

SELECT DISTINCT *
FROM   #Employee;

ALTER TABLE #Employee ADD UNIQ_ID INT IDENTITY(1, 1)

SELECT *
FROM   #Employee
WHERE  UNIQ_ID < (
    SELECT MAX(UNIQ_ID)
    FROM   #Employee a2
    WHERE  #Employee.ID = a2.ID
           AND #Employee.FIRST_NAME = a2.FIRST_NAME
           AND #Employee.LAST_NAME = a2.LAST_NAME
)

ALTER TABLE #Employee DROP COLUMN UNIQ_ID

SELECT DISTINCT *
FROM   #Employee

並且所有解決方案的結尾都使用此命令

DROP TABLE #Employee

我想我知道你需要什麼,我需要在答案中混合,我想我得到了他想要的解決方案:

select o.id,o.orgName, oc.dupeCount, oc.id,oc.orgName
from organizations o
inner join (
    SELECT MAX(id) as id, orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName

擁有最大ID將給你dublicate的ID和原來的一個,他要求的是:

id org name , dublicate count (missing out in this case) 
id doublicate org name , doub count (missing out again because does not help in this case)

只有悲傷的事情,你會以這種形式表達出來

id , name , dubid , name

希望它仍然有幫助


Select * from (Select orgName,id,
ROW_NUMBER() OVER(Partition By OrgName ORDER by id DESC) Rownum
From organizations )tbl Where Rownum>1

因此,rowum> 1的記錄將成為表中的重複記錄。 '按照'劃分第一組,然後通過給它們序列號序列化它們。 所以rownum> 1將是可以刪除的重複記錄。


select * from [Employees]

查找重複記錄1)使用CTE

with mycte
as
(
select Name,EmailId,ROW_NUMBER() over(partition by Name,EmailId order by id) as Duplicate from [Employees]
)
select * from mycte

2)通過使用GroupBy

select Name,EmailId,COUNT(name) as Duplicate from  [Employees] group by Name,EmailId 

select column_name, count(column_name)
from table_name
group by column_name
having count (column_name) > 1;

Src: https://.com/a/59242/1465252 : https://.com/a/59242/1465252


select o.orgName, oc.dupeCount, o.id
from organizations o
inner join (
    SELECT orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName




duplicates