Duplicate records in Interbase 
Author Message
 Duplicate records in Interbase

Hi all,

Could someone please offer some advice.

I need to be able to locate duplicate records in an interbase table and
unfortunately do not possess enough delphi / SQL / Interbase knowledge
to do this on my own.

The table has 4 fields but only 2 -  Code and Name have any relevance
and there are indexes for both fields.

First prize would be to find all duplicate values in either of those 2
fields and output them to a file and / printer.  If possible also with a
count of the number of duplicates.

I would also be happy with an SQL script I could run in DB Explorer
which would just list the duplicates so that I could write them down.

Thanks in advance.

Doug Norval



Wed, 18 Jun 1902 08:00:00 GMT  
 Duplicate records in Interbase
On Fri, 2 Jun 2000 17:42:33 +0200, "Doug Norval"

Quote:

>Hi all,

>Could someone please offer some advice.

>I need to be able to locate duplicate records in an interbase table and
>unfortunately do not possess enough Delphi / SQL / Interbase knowledge
>to do this on my own.

>The table has 4 fields but only 2 -  Code and Name have any relevance
>and there are indexes for both fields.

>First prize would be to find all duplicate values in either of those 2
>fields and output them to a file and / printer.  If possible also with a
>count of the number of duplicates.

>I would also be happy with an SQL script I could run in DB Explorer
>which would just list the duplicates so that I could write them down.

In its simplest form:

  select code, name, count(code)
  from <YourTable>
  group by code, name
  having count(code) > 1

The above will give you duplicated values in both code and name.
Combine that with a search on duplicates in code or duplicates in name
to get them all:

  select code, name, count(code)
  from <YourTable>
  group by code, name
  having count(code) > 1
  union
  select t1.code, t1.name, -1
  from <YourTable> t1, <YourTable> t2
  where t1.code = t2.code
    and t1.name <> t2.name
  union
  select t1.code, t1.name, -2
  from <YourTable> t1, <YourTable> t2
  where t1.code <> t2.code
    and t1.name = t2.name
  order by 1,2

The count column will then indicate what you have;
- positive: true duplicates
- -1: duplicate code only
- -2: duplicate name only

This will only work well (performance considerations left out of it)
if you have no null values in either code or name, so you will have to
eliminate that first or cope with it in code too, but you get the
general idea.

If you run it with wisql (the little Interbase tool to fire queries to
the database), you should next pick File|Save result to file from the
menu to save yourself possibly a lot of writing.

Note: keep *very* concentrated if this is for deleting! And set a
primary key on both immediately afterwards. :) Good luck!

Jasper D.



Wed, 18 Jun 1902 08:00:00 GMT  
 Duplicate records in Interbase
Hi Jasper,

Thanks to you and those who privately responded to my query.

I will be trying this out very shortly.

What I had done previously was a general ISQL query to get all the
records and then saved that result to a text file that I queried using a
program I wrote in Clipper that outputted the duplicates to a printer.

This will make things much more effecient for our support department.

Quote:
> Note: keep *very* concentrated if this is for deleting! And set a
> primary key on both immediately afterwards. :) Good luck!

The code field is actually a primary key field.  The programmer's just
kept saying that "there was a bug".

I have just joined the programmer's team (from today) 'cos I and the
last programmer (Delphi) that was hired, wrote the company's internal
Programmer's exam together and I beat the programmer.  So this info is
going to help me no end !!

Thanks again to all who responded.

Doug

Johannesburg
South Africa



Wed, 18 Jun 1902 08:00:00 GMT  
 Duplicate records in Interbase
On Mon, 5 Jun 2000 11:52:36 +0200, "Doug Norval"

Quote:

>> Note: keep *very* concentrated if this is for deleting! And set a
>> primary key on both immediately afterwards. :) Good luck!

>The code field is actually a primary key field.  The programmer's just
>kept saying that "there was a bug".

Hmmm. I was a bit hasty when typing that line. Actually, a primary key
on code alone is much better as it will inhibit duplicates on code
alone, hence what I called 'true duplicates' as well, while a pk on
both would only inhibit those true duplicates, but not a duplicated
code (with a different name).

So, you should only find duplicates on name. If that must be
restricted too, you then should add a unique constraint on name as
well.

Jasper D.



Wed, 18 Jun 1902 08:00:00 GMT  
 Duplicate records in Interbase


Quote:

> In its simplest form:

>   select code, name, count(code)
>   from <YourTable>
>   group by code, name
>   having count(code) > 1

Hi Jasper,

It works like a dream !!!!

This will certainly save me plenty of time.  I plan to put a little
utility program toghether for the support department and will include
this  type of checking as there are a few errors caused by these
duplicates.

Thanks again to all who answered my query.

Doug Norval

Johannesburg
South Africa.



Wed, 18 Jun 1902 08:00:00 GMT  
 
 [ 5 post ] 

 Relevant Pages 

1. duplicate records in join

2. Duplicate Error on insert record

3. Handling Duplicate records

4. Interbase tables show duplicate rows in DBGrid

5. Using SQL to delete duplicate records

6. Duplicate records

7. Tracking duplicate records?

8. Duplicating Records

9. Duplicating records D4 & Paradox

10. How to Duplicate Records with TTable Component

11. duplicate records in TDBGrid

12. append records without duplicates keys

 

 
Powered by phpBB® Forum Software