Duplicates: The Horror!!!

December 27, 2006

Through my own stupidity and lack of organization I have managed to duplicate a large mass of files over the years. These duplicates are scattered all over the place and aren’t guaranteed to be in the same directory from backup disk to backup disk. Also, some of the backups are more recent than others, so the files’ contents are somewhat different. Then there’s the horror of duplicate trees within duplicate trees from attempts to organize the mess (not to mention duplicates that have been compressed and projects that have been duplicated to rework)! There are several gigs’ worth of data that I’d like to salvage in there! Surely someone has made a tool to deal with this type of situation. I’m currently trying some apps found on freshmeat.net to see if there’s anything at all to help.

I need something that can
find exact duplicates (MD5 hash?)
find similar duplicates (files that have been updated) and let me view the differences and merge the contents
and finally, find files that have the same name but are completely different
cross platform would be nice
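
For what it's worth, the first and third items on that list can be roughed out in a few lines of Python, which also covers the cross-platform wish. This is only a sketch under assumptions, not a tool from freshmeat.net: the function names (`file_digest`, `scan`, `exact_duplicates`, `same_name_different_content`) are made up for illustration, and it treats "different" as "the MD5 hashes differ", so it cannot tell a slightly updated file from a completely unrelated one.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def scan(roots):
    """Walk every root directory and map each file path to its digest."""
    digests = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path):
                    digests[path] = file_digest(path)
    return digests


def exact_duplicates(digests):
    """Group paths with identical digests, wherever they live in the trees."""
    groups = defaultdict(list)
    for path, digest in digests.items():
        groups[digest].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def same_name_different_content(digests):
    """Find file names that occur more than once with differing digests."""
    by_name = defaultdict(list)
    for path in digests:
        by_name[os.path.basename(path)].append(path)
    return {
        name: sorted(paths)
        for name, paths in by_name.items()
        if len({digests[p] for p in paths}) > 1
    }
```

Something like `scan(["/mnt/backup1", "/mnt/backup2"])` (hypothetical mount points) would produce the mapping that the other two functions consume. Because files are grouped by content hash rather than by location, the differing directory layouts between backup disks don't matter.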

Then I have my paper stacks to digitize, but I’ll save that for another day ;)

:P

Entry Filed under: Linux.

4 Comments

  • 1. Sebastiao Correia  |  January 7, 2007 at 5:00 pm

    Did you try Unison? It is very useful for synchronizing files between different places.
    It works on Windows and Linux.

    I hope this helps.

  • 2. dosnlinux  |  January 8, 2007 at 11:55 pm

    Thanks. I’ll give it a try. :)

  • 3. dosnlinux  |  February 17, 2007 at 11:30 pm

    Unison didn’t work out too well. It is meant for synchronizing files, but these files are not stored in a similar directory structure, so synchronizing isn’t much help here.

  • 4. The Hunt for Duplicates Continues « Life at the CLI  |  June 18, 2007 at 7:17 am

    [...] 17th, 2007 A while back I wrote about having to sort through several backup DVD’s to find a ton of duplicate files. I [...]

