The Hunt for Duplicates Continues
February 17, 2007
A while back I wrote about having to sort through several backup DVDs to find a ton of duplicate files. I decided to give fdupes a try. fdupes compares file sizes, then MD5 hashes, and finally does a byte-by-byte comparison, so it should be fairly safe to trust that the results are exact duplicates. (I will double-check the first couple just to make sure.)
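To make the size, hash, then byte-comparison idea concrete, here is a minimal Python sketch of the same three-stage filter. It is only an illustration of the approach, not fdupes's actual implementation; the names `md5_of` and `find_duplicates` are made up for this example.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of duplicate sets (lists of paths) found under root."""
    # Stage 1: group by size. Files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    duplicate_sets = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        # Stage 2: within one size group, group by MD5 hash.
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[md5_of(path)].append(path)
        # Stage 3: confirm hash matches with a full byte-by-byte comparison.
        for candidates in by_hash.values():
            remaining = sorted(candidates)
            while len(remaining) > 1:
                first, rest = remaining[0], remaining[1:]
                same = [first] + [
                    p for p in rest if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    duplicate_sets.append(same)
                remaining = [p for p in rest if p not in same]
    return duplicate_sets
```

Calling `find_duplicates("/path/to/backup")` (a placeholder path) would return one list of paths per set of identical files. The cheap size check runs first so that hashing and full comparison only happen for files that could plausibly match.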
I moved the data over from 9 DVDs (but there are more :rolleyes: ) to an external hard drive and sent fdupes off to do its work. It seems to be fairly fast: it compared about 141,000 files, ranging in size from 0 bytes to over a GB, in about an hour and a half. Now I need to find out how to pass the results to rm, leaving only 1 copy. (I think fdupes has an option for this.)
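One way to get from the results to a list of files for rm is to post-process fdupes's default output, which lists each set of duplicates one path per line with a blank line between sets. The Python sketch below keeps the first path in every set and returns the rest. It is a dry run that deletes nothing, the function names are made up for this example, and it assumes no filename contains a newline.

```python
def parse_fdupes_output(text):
    """Split fdupes-style output into sets of paths (blank line = new set)."""
    sets, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sets.append(current)
            current = []
    if current:
        sets.append(current)
    return sets


def files_to_remove(text):
    """Return every path except the first one in each duplicate set."""
    return [path for group in parse_fdupes_output(text) for path in group[1:]]


if __name__ == "__main__":
    # Hypothetical sample output; these paths are invented for illustration.
    sample = (
        "/mnt/backup/dvd1/photo.jpg\n"
        "/mnt/backup/dvd2/photo.jpg\n"
        "\n"
        "/mnt/backup/dvd1/notes.txt\n"
        "/mnt/backup/dvd3/notes.txt\n"
        "/mnt/backup/dvd4/notes.txt\n"
    )
    for path in files_to_remove(sample):
        print(path)
```

Reviewing the printed list before deleting anything is the point of keeping this as a dry run. Which copy counts as "first" is simply the order in which the paths appear in the output.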
I’ll try to post a tutorial if I get everything working OK. I’ll try running it with a couple of different options and run diff on the outputs to check what omit first (-f) does. It should leave the first file of each set out of the list, so deleting everything that is listed would preserve only 1 copy.
Entry Filed under: Linux. Tags: CLI, Command Line, Learning, Linux, software.
1 Comment
1.
J | February 4, 2009 at 2:19 am
So… How did this turn out?