Wednesday, March 28, 2012

How to Determine the unique IDs of duplicated records

> This is a common problem with some solution

/************************************************** *********************************
*
* Problem:
* Determine the Duplicated Records in a table using single SELECT.
*
* We shall be using Northwind database, add some duplicate records.
*
* Here we want to know if 2 columns (CompanyName,
* PHone) are duplicated in a table.
*
*
* ShipperID CompanyName Phone
* ---- -------- ------
* 1 Speedy Express (503) 555-9831
* 2 United Package (503) 555-3199
* 3 Federal Shipping (503) 555-9931
* 4 Federal Shipping (503) 555-9931
* 5 Speedy Express (503) 555-9831
* 6 Federal Shipping (503) 555-9931
*
*
*
************************************************** **/

==================================================

SOLUTION 1: Gives me the IDs that are duplicated.

==================================================

SELECT
ShipperID, CompanyName, Phone
FROM
SHIPPERS
WHERE
EXISTS (
SELECT
NULL
FROM
SHIPPERS b
WHERE
b.CompanyName = SHIPPERS.CompanyName
AND b.Phone = SHIPPERS.Phone
GROUP BY
b.CompanyName, b.Phone
HAVING
SHIPPERS.ShipperID < MAX( b.ShipperID )
)

/* ********************
* Output results
********************/

ShipperID CompanyName Phone

---- ------------
--------
1 Speedy Express (503) 555-9831
3 Federal Shipping (503) 555-9931
4 Federal Shipping (503) 555-9931

(3 row(s) affected)

================================================== ===========

SOLUTION 2: Gives me the data which are duplicate but
not the IDs

================================================== ===========

SELECT
CompanyName, Phone
FROM
SHIPPERS
GROUP BY
CompanyName, Phone
HAVING
COUNT(*) > 1

/* ********************
* Output results
********************/

CompanyName Phone
------------ --------
Speedy Express (503) 555-9831
Federal Shipping (503) 555-9931

(2 row(s) affected)anonieko@.hotmail.com wrote:
> > This is a common problem with some solution
> /************************************************** *********************************
> *
> * Problem:
> * Determine the Duplicated Records in a table using single SELECT.
> *
> * We shall be using Northwind database, add some duplicate records.
> *
> * Here we want to know if 2 columns (CompanyName,
> * PHone) are duplicated in a table.
> *
> *
> * ShipperID CompanyName Phone
> * ---- -------- ------
> * 1 Speedy Express (503) 555-9831
> * 2 United Package (503) 555-3199
> * 3 Federal Shipping (503) 555-9931
> * 4 Federal Shipping (503) 555-9931
> * 5 Speedy Express (503) 555-9831
> * 6 Federal Shipping (503) 555-9931
> *
> *
> *
> ************************************************** **/
> ==================================================
> SOLUTION 1: Gives me the IDs that are duplicated.
> ==================================================
> SELECT
> ShipperID, CompanyName, Phone
> FROM
> SHIPPERS
> WHERE
> EXISTS (
> SELECT
> NULL
> FROM
> SHIPPERS b
> WHERE
> b.CompanyName = SHIPPERS.CompanyName
> AND b.Phone = SHIPPERS.Phone
> GROUP BY
> b.CompanyName, b.Phone
> HAVING
> SHIPPERS.ShipperID < MAX( b.ShipperID )
> )
> /* ********************
> * Output results
> ********************/
> ShipperID CompanyName Phone
> ---- ------------
> --------
> 1 Speedy Express (503) 555-9831
> 3 Federal Shipping (503) 555-9931
> 4 Federal Shipping (503) 555-9931
> (3 row(s) affected)
>
> ================================================== ===========
> SOLUTION 2: Gives me the data which are duplicate but
> not the IDs
> ================================================== ===========
>
> SELECT
> CompanyName, Phone
> FROM
> SHIPPERS
> GROUP BY
> CompanyName, Phone
> HAVING
> COUNT(*) > 1
>
> /* ********************
> * Output results
> ********************/
>
> CompanyName Phone
> ------------ --------
> Speedy Express (503) 555-9831
> Federal Shipping (503) 555-9931
> (2 row(s) affected)

Those aren't solutions, they are diagnostics. The solution is to fix
the stupid design of the Shippers table by adding a proper key.

:-)

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/...US,SQL.90).aspx
--sql

No comments:

Post a Comment