linux - sizes - write a script to list all the differences between two directories
Given two directory trees, how can I find out which files differ?
Diffoscope is a great command line based directory diff tool.
What I especially like about it is that it can diff inside files:
It recursively unpacks archives of many kinds and transforms various binary formats into a more human-readable form before comparing them. It can compare two tarballs, ISO images, or PDFs just as easily.
It will not only tell you which files differ, but also how they differ.
If I want to find the differences between two directory trees, I usually just execute:
diff -r dir1/ dir2/
This outputs exactly what the differences are between corresponding files. I'm interested in just getting a list of corresponding files whose content differs. I assumed that this would simply be a matter of passing a command line option to
diff, but I couldn't find anything on the man page.
Meld is also a great tool for comparing two directories:
meld dir1/ dir2/
Meld has many options for comparing files or directories. If two files differ, it's easy to enter file comparison mode and see the exact differences.
I like to use
git diff --no-index dir1/ dir2/, because it can show the differences in color (if you have that option set in your git config) and because it shows all of the differences in a long paged output using "less".
The command I use is:
diff -qr dir1/ dir2/
It is exactly the same as Mark's :) But his answer bothered me as it uses different types of flags, and it made me look twice. Using Mark's more verbose flags it would be:
diff --brief --recursive dir1/ dir2/
I apologise for posting when the other answer is perfectly acceptable. Could not stop myself... working on being less pedantic.
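As a rough sketch of how the brief output can be narrowed to just the files whose content differs: `diff -qr` also reports files that exist on one side only ("Only in ..." lines), so the helper below keeps only the "Files ... differ" lines. The function name `changed_files` is made up for illustration, and the filter assumes the message format that GNU diff prints in the C locale.

```shell
#!/bin/sh
# changed_files DIR1 DIR2 (hypothetical helper name)
# Prints only the "Files X and Y differ" lines from a brief recursive diff.
# Assumption: message wording as printed by GNU diff in the C locale.
changed_files() {
    # diff exits with 1 when it finds differences; only 2 or more is an error.
    { LC_ALL=C diff -qr "$1" "$2" || [ "$?" -eq 1 ]; } |
        sed -n '/^Files .* differ$/p'
}
```

Calling `changed_files dir1/ dir2/` prints one line per pair of corresponding files with different content and skips the files that exist in only one tree.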
These two commands do basically the thing asked for:
diff --brief --recursive --no-dereference --new-file --no-ignore-file-name-case /dir1 /dir2 > dirdiff_1.txt
rsync --recursive --delete --links --checksum --verbose --dry-run /dir1/ /dir2/ > dirdiff_2.txt
The choice between them depends on the location of dir1 and dir2:
When the directories reside on two separate drives, diff outperforms rsync. But when the two directories being compared are on the same drive, rsync is faster. That is because diff puts an almost equal load on both directories in parallel, which makes the most of two drives but causes contention on a single one.
rsync calculates checksums in large chunks before actually comparing them. That groups the I/O operations into large chunks and leads to more efficient processing when everything takes place on a single drive.
You can also use find to build a sorted file list for each directory and then compare the lists:
find "$FOLDER" -type f | cut -d/ -f2- | sort > "/tmp/file_list_$FOLDER"
But files with the same names and in the same subfolders, but with different content, will not be shown in the lists.
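One way to work around that limitation, sketched under a few assumptions: build the sorted lists as above, use `comm` to split them into names found on one side only and names found on both sides, then run `cmp -s` on each common name to catch differing content. The function name `compare_trees` and the output labels are invented for this example, `mktemp` is assumed to be available, and file names containing newlines are not handled.

```shell
#!/bin/sh
# compare_trees DIR1 DIR2 (hypothetical helper name)
# Lists files present in only one tree, then common files whose bytes differ.
# Assumptions: mktemp exists; no file name contains a newline.
compare_trees() {
    dir1=$1
    dir2=$2
    list1=$(mktemp) || return 1
    list2=$(mktemp) || return 1

    # Paths relative to each root, sorted so that comm can merge them.
    (cd "$dir1" && find . -type f | LC_ALL=C sort) > "$list1"
    (cd "$dir2" && find . -type f | LC_ALL=C sort) > "$list2"

    # Names that appear on one side only.
    LC_ALL=C comm -23 "$list1" "$list2" | sed 's/^/only in 1: /'
    LC_ALL=C comm -13 "$list1" "$list2" | sed 's/^/only in 2: /'

    # Names on both sides: cmp -s is silent and fails when contents differ.
    LC_ALL=C comm -12 "$list1" "$list2" | while IFS= read -r path; do
        cmp -s "$dir1/$path" "$dir2/$path" || printf 'differs: %s\n' "$path"
    done

    rm -f "$list1" "$list2"
}
```

Running `compare_trees dir1 dir2` prints paths relative to each root (prefixed with `./`), so the output does not depend on where the two trees live.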
If you prefer a GUI, you can try Meld, which @Alexander mentioned. It works fine on both Windows and Linux.