mysql刪除重複 - mysql去重複
如何選擇具有MAX(列值)的行,DISTINCT由SQL中的另一列決定? (12)
@Michae在大多數情況下,接受的答案都可以正常工作,但下面的答案會失敗。
如果有2行的HomeID和Datetime相同,則查詢將返回兩行,而不是按需要返回不同的HomeID,以便在查詢中添加Distinct,如下所示。
SELECT DISTINCT tt.home , tt.MaxDateTime
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
我的表是:
id home datetime player resource
---|-----|------------|--------|---------
1 | 10 | 04/03/2009 | john | 399
2 | 11 | 04/03/2009 | juliet | 244
5 | 12 | 04/03/2009 | borat | 555
3 | 10 | 03/03/2009 | john | 300
4 | 11 | 03/03/2009 | juliet | 200
6 | 12 | 03/03/2009 | borat | 500
7 | 13 | 24/12/2008 | borat | 600
8 | 13 | 01/01/2009 | borat | 700
我需要選擇每個不同的home
保持datetime
的最大值。
結果將是:
id home datetime player resource
---|-----|------------|--------|---------
1 | 10 | 04/03/2009 | john | 399
2 | 11 | 04/03/2009 | juliet | 244
5 | 12 | 04/03/2009 | borat | 555
8 | 13 | 01/01/2009 | borat | 700
我努力了:
-- 1 ..by the MySQL manual:
SELECT DISTINCT
home,
id,
datetime AS dt,
player,
resource
FROM topten t1
WHERE datetime = (SELECT
MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC
不起作用。 結果集有130行,儘管數據庫可以保存187.結果包括home
一些重複。
-- 2 ..join
SELECT
s1.id,
s1.home,
s1.datetime,
s1.player,
s1.resource
FROM topten s1
JOIN (SELECT
id,
MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
ON s1.id = s2.id
ORDER BY datetime
不。 給出所有記錄。
-- 3 ..something exotic:
有了各種結果。
你也可以試試這個,對於大型表來說查詢性能會更好。 它適用於每個住宅不超過兩個記錄且日期不同的情況。 更好的一般MySQL查詢是來自Michael La Voie上面的一個。
SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
FROM t_scores_1 t1
INNER JOIN t_scores_1 t2
ON t1.home = t2.home
WHERE t1.date > t2.date
或者在Postgres或提供分析函數的那些dbs的情況下嘗試
SELECT t.* FROM
(SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
, row_number() over (partition by t1.home order by t1.date desc) rw
FROM topten t1
INNER JOIN topten t2
ON t1.home = t2.home
WHERE t1.date > t2.date
) t
WHERE t.rw = 1
即使您擁有兩個或更多行,每個home
都有相同的DATETIME
:
SELECT id, home, datetime, player, resource
FROM (
SELECT (
SELECT id
FROM topten ti
WHERE ti.home = t1.home
ORDER BY
ti.datetime DESC
LIMIT 1
) lid
FROM (
SELECT DISTINCT home
FROM topten
) t1
) ro, topten t2
WHERE t2.id = ro.lid
另一種方法是使用一個子查詢按照每個組計算最近一行,該查詢基本上為每個組的每行計算一個排名,然後按照排名= 1過濾掉最近的行
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and a.`datetime` < b.`datetime`
) +1 = 1
以下是為了更好理解每行的視覺演示
通過閱讀一些評論,如果有兩行具有相同的'家'和'日期時間'字段值?
以上查詢將失敗,並將返回超過1行以上的情況。 為了掩蓋這種情況,需要另一個標準/參數/列來決定哪一行應該在上述情況下進行。 通過查看示例數據集我假設有一個主鍵列id
應該設置為自動增量。 所以我們可以使用這個列通過在CASE
語句的幫助下調整相同的查詢來選擇最近的行
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and case
when a.`datetime` = b.`datetime`
then a.id < b.id
else a.`datetime` < b.`datetime`
end
) + 1 = 1
上面的查詢將在相同的datetime
值中選擇id最高的行
可視化演示 ,每行排名不分先後
我認為這會給你想要的結果:
SELECT home, MAX(datetime)
FROM my_table
GROUP BY home
但是如果您還需要其他欄目,只需加入原始表格(查看Michael La Voie
答案)
最好的祝福。
最快的MySQL
解決方案,沒有內部查詢和GROUP BY
:
SELECT m.* -- get the row that contains the max value
FROM topten m -- "m" from "max"
LEFT JOIN topten b -- "b" from "bigger"
ON m.home = b.home -- match "max" row with "bigger" row by `home`
AND m.datetime < b.datetime -- want "bigger" than "max"
WHERE b.datetime IS NULL -- keep only if there is no bigger than max
說明 :
使用home
欄加入表格本身。 使用LEFT JOIN
確保表m
中的所有行都出現在結果集中。 那些在表b
沒有匹配的對於b
的列將具有NULL
。
JOIN
上的另一個條件要求僅匹配來自b
的datetime
列上的值大於來自m
的行的行。
使用問題中發布的數據, LEFT JOIN
將生成以下對:
+------------------------------------------+--------------------------------+
| the row from `m` | the matching row from `b` |
|------------------------------------------|--------------------------------|
| id home datetime player resource | id home datetime ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | *
| 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | *
| 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | *
| 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... |
| 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... |
| 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... |
| 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... |
| 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | *
+------------------------------------------+--------------------------------+
最後, WHERE
子句只保留b
列中有NULL
的對(它們在上面的表中用*
標記); 這意味著,由於JOIN
子句的第二個條件,從m
中選擇的行在列datetime
具有最大值。
閱讀SQL反模式:避免數據庫編程的陷阱為其他SQL技巧。
由於人們似乎一直在使用此線程(評論日期範圍從1.5年)並不是那麼簡單:
SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home
不需要聚合功能...
乾杯。
試試這個SQL Server:
WITH cte AS (
SELECT home, MAX(year) AS year FROM Table1 GROUP BY home
)
SELECT * FROM Table1 a INNER JOIN cte ON a.home = cte.home AND a.year = cte.year
這裡是T-SQL版本:
-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
-- Answer
SELECT id, home, date, player, resource
FROM (SELECT id, home, date, player, resource,
RANK() OVER (PARTITION BY home ORDER BY date DESC) N
FROM @TestTable
)M WHERE N = 1
-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource
FROM @TestTable T
INNER JOIN
( SELECT TI.id, TI.home, TI.date,
RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
FROM @TestTable TI
WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id
編輯
不幸的是,MySQL中沒有RANK()OVER函數。
但是它可以被模擬,請參閱使用MySQL模擬分析(AKA排名)函數 。
所以這是MySQL版本:
SELECT id, home, date, player, resource
FROM TestTable AS t1
WHERE
(SELECT COUNT(*)
FROM TestTable AS t2
WHERE t2.home = t1.home AND t2.date > t1.date
) = 0
這裡是MySQL版本,它只打印一個條目,其中組中有重複的MAX(日期時間)。
你可以在這裡測試http://www.sqlfiddle.com/#!2/0a4ae/1
樣本數據
mysql> SELECT * from topten;
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 3 | 10 | 2009-03-03 00:00:00 | john | 300 |
| 4 | 11 | 2009-03-03 00:00:00 | juliet | 200 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 6 | 12 | 2009-03-03 00:00:00 | borat | 500 |
| 7 | 13 | 2008-12-24 00:00:00 | borat | 600 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
帶有用戶變量的MySQL版本
SELECT *
FROM (
SELECT ord.*,
IF (@prev_home = ord.home, 0, 1) AS is_first_appear,
@prev_home := ord.home
FROM (
SELECT t1.id, t1.home, t1.player, t1.resource
FROM topten t1
INNER JOIN (
SELECT home, MAX(datetime) AS mx_dt
FROM topten
GROUP BY home
) x ON t1.home = x.home AND t1.datetime = x.mx_dt
ORDER BY home
) ord, (SELECT @prev_home := 0, @seq := 0) init
) y
WHERE is_first_appear = 1;
+------+------+--------+----------+-----------------+------------------------+
| id | home | player | resource | is_first_appear | @prev_home := ord.home |
+------+------+--------+----------+-----------------+------------------------+
| 9 | 10 | borat | 700 | 1 | 10 |
| 10 | 11 | borat | 700 | 1 | 11 |
| 12 | 12 | borat | 700 | 1 | 12 |
| 8 | 13 | borat | 700 | 1 | 13 |
+------+------+--------+----------+-----------------+------------------------+
4 rows in set (0.00 sec)
已接受的答案'outout
SELECT tt.*
FROM topten tt
INNER JOIN
(
SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home
) groupedtt ON tt.home = groupedtt.home AND tt.datetime = groupedtt.MaxDateTime
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
7 rows in set (0.00 sec)
SELECT tt.*
FROM TestTable tt
INNER JOIN
(
SELECT coord, MAX(datetime) AS MaxDateTime
FROM rapsa
GROUP BY
krd
) groupedtt
ON tt.coord = groupedtt.coord
AND tt.datetime = groupedtt.MaxDateTime
SELECT c1, c2, c3, c4, c5 FROM table1 WHERE c3 = (select max(c3) from table)
SELECT * FROM table1 WHERE c3 = (select max(c3) from table1)