Directory Tree Differ (difftree) can be found on
SourceForge.
What is dt?
dt is short for difftree, a fast command-line tool for comparing
two or more directories. It detects changes in file metadata
such as size, ownership, and permissions.
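To make the idea of a metadata-only comparison concrete, here is a minimal Python sketch. It is not dt's code; the function name `snapshot` and the record field names are made up for illustration. It walks a tree and records the kind of per-file metadata described above, using `os.lstat` so that symbolic links are described rather than followed.

```python
import os
import stat

def snapshot(root):
    """Map each path under root (relative to root) to a small metadata record."""
    records = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: describe a symlink itself, not its target
            rel = os.path.relpath(full, root)
            records[rel] = {
                "type": stat.S_IFMT(st.st_mode),          # file, directory, link, ...
                "size": st.st_size,
                "uid": st.st_uid,
                "gid": st.st_gid,
                "perms": stat.filemode(st.st_mode)[1:],   # e.g. "rwxr-xr-x"
                "mtime": int(st.st_mtime),
            }
    return records
```

Because no file contents are read, the cost is roughly one stat call per entry, which is why this style of scan stays close to the speed of a plain directory walk.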
Why use it?
I built dt during a security incident to compare directory snapshots on
a large SAN. I attempted to use both tripwire and osiris, and
neither could complete the comparison of directories in a reasonable
amount of time. dt trades the exhaustive comparisons those tools make,
using hashing and databases, for speed: by default it compares only a
minimal set of data available from fstat. The
following runs are against
four copies of my project directories with 2,994 files
totalling 2.8 GB of data.
As a baseline, here is how long it takes find to process the
first of the four copies of the directory tree:
% time find ~/cvs > /dev/null
real 0m0.011s user 0m0.003s sys 0m0.007s
A quick scan of all my project directories with comparisons across four versions:
% time ./src/dt -q ~/cvs ~/c1 ~/c2 ~/c3 > /dev/null
real 0m0.058s user 0m0.031s sys 0m0.019s
A standard scan of all my project directories with comparisons across four versions:
% time ./src/dt ~/cvs ~/c1 ~/c2 ~/c3 > /dev/null
real 0m0.085s user 0m0.035s sys 0m0.039s
A full scan with md5 hashing of all my project directories with comparisons across four versions:
% time ./src/dt -m ~/cvs ~/c1 ~/c2 ~/c3 > /dev/null
real 1m28.804s user 0m5.043s sys 0m34.487s
More details can be found in the README
and INSTALL files which are
included in the distribution.
How to use it.
Here is an example of running dt against a set of directories.
Each directory passed to dt will be compared to the previous
argument. This allows a quick comparison between each
directory to get a summary of the changes over time.
% ./src/dt ~/cvs ~/c1 ~/c2 ~/c3
Processing dir [/home/rdilley/cvs]
Processing dir [/home/rdilley/c1]
mt[2011/07/11@00:32:31->2011/07/10@20:10:05] d [/home/rdilley/c1/difftree]
mt[2011/07/10@23:52:12->2011/07/10@15:06:10] d [/home/rdilley/c1/difftree/CVS]
s[457->435] f [/home/rdilley/c1/difftree/CVS/Entries]
+ f [/difftree/CVS/Entries.Log]
mt[2011/07/11@01:07:39->2011/07/10@21:53:23] d [/home/rdilley/c1/difftree/src]
g[100>1000] d [/home/rdilley/c1/pdnsd]
p[rwxr-xr-x->rwxr-x---] d [/home/rdilley/c1/imspy]
- f [/bc.tar.gz]
- f [/difftree/configure.scan]
- f [/difftree/autoscan.log]
Processing dir [/home/rdilley/c2]
s[686328114->0] f [/home/rdilley/c2/dictionary.txt]
g[1000>100] f [/home/rdilley/c2/difftree/src/hash.c]
g[1000>100] d [/home/rdilley/c2/pdnsd]
g[100>1000] p[rwxr-xr-x->rwxr-sr-x] d [/home/rdilley/c2/quickparser]
p[rwxr-x---->rwxr-xr-x] d [/home/rdilley/c2/imspy]
+ f [/bc.tar.gz]
mt[2007/08/22@22:34:40->2011/07/11@01:08:34] f [/home/rdilley/c2/wsd-0.1.config]
Processing dir [/home/rdilley/c3]
g[1000>100] p[rwxr-sr-x->rwxr-xr-x] d [/home/rdilley/c3/quickparser]
+ f [/psmd.config.orig]
s[654->16] t[f->sl] p[rwxr-xr-x->rwxrwxrwx] sl [/home/rdilley/c3/psmd.config]
mt[2011/07/11@01:08:34->2007/08/22@22:34:40] f [/home/rdilley/c3/wsd-0.1.config]
- f [/dictionary.txt]
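The "each directory is compared to the previous argument" behavior can be sketched roughly as follows. This is an illustration, not dt's implementation: the names `diff_records` and `compare_chain` are hypothetical, and each snapshot is assumed to be a plain dict mapping a path to a dict of metadata fields.

```python
def diff_records(old, new):
    """Return a list of (changes, path) tuples describing how new differs from old."""
    out = []
    for path in sorted(new):
        if path not in old:
            out.append((["+"], path))  # present now, absent before
            continue
        changes = [
            f"{key}[{old[path][key]}->{new[path][key]}]"
            for key in new[path]
            if old[path].get(key) != new[path][key]
        ]
        if changes:
            out.append((changes, path))
    for path in sorted(old):
        if path not in new:
            out.append((["-"], path))  # present before, absent now
    return out

def compare_chain(snapshots):
    """Compare each snapshot to the one before it, in argument order."""
    return [diff_records(a, b) for a, b in zip(snapshots, snapshots[1:])]
```

With four snapshots this yields three change lists, one per adjacent pair, which mirrors the three comparison blocks in the output above.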
Here is an example of running dt against a directory tree and saving the data:
% ./dt -m -w cvs_dir.txt ~/cvs
Processing dir [/home/rdilley/cvs]
You can then use the file as one of the directory arguments to compare an existing
directory to the one previously saved with the '-w' option.
% ./dt -m cvs_dir.txt ~/cvs
Processing file [cvs_dir.txt]
Read [3006] and loaded [3005] lines from file [/home/rdilley/cvs] dated [2011/07/24@18:51:16]
Processing dir [/home/rdilley/cvs]
mt[2011/07/24@18:40:38->2011/07/24@18:51:45] d [/home/rdilley/cvs/difftree]
s[20480->24576] mt[2011/07/24@18:50:07->2011/07/24@18:52:02] f [/home/rdilley/cvs/difftree/.README.swp]
+ f [/difftree/cvs_dir.txt]
You can combine comparing a file to the current directory and writing a new file
into a single run:
% ./dt -m -w cvs_dir_new.txt cvs_dir.txt ~/cvs
Processing file [cvs_dir.txt]
Read [3006] and loaded [3005] lines from file [/home/rdilley/cvs] dated [2011/07/24@18:51:16]
Processing dir [/home/rdilley/cvs]
mt[2011/07/24@18:40:38->2011/07/24@18:51:45] d [/home/rdilley/cvs/difftree]
s[20480->24576] mt[2011/07/24@18:50:07->2011/07/24@18:53:05] f [/home/rdilley/cvs/difftree/.README.swp]
+ f [/difftree/cvs_dir.txt]
You can then compare just the two files:
% ./dt -m cvs_dir.txt cvs_dir_new.txt
Processing file [cvs_dir.txt]
Read [3006] and loaded [3005] lines from file [/home/rdilley/cvs] dated [2011/07/24@18:51:16]
Processing file [cvs_dir_new.txt]
+ f [/difftree/cvs_dir.txt]
mt[2011/07/24@18:40:38->2011/07/24@18:51:45] d [/home/rdilley/cvs/difftree]
s[20480->24576] mt[2011/07/24@18:50:07->2011/07/24@18:53:05] f [/home/rdilley/cvs/difftree/.README.swp]
Read [3007] and loaded [3006] lines from file [/home/rdilley/cvs] dated [2011/07/24@18:53:36]
To monitor a directory tree over time, you can run this tool once a week/day/hour
and store the files so that you can compare an arbitrary point in the past
to the current directory tree, or to any other point that you have stored.
% ./dt -m -w cvs_dir.`date '+%Y%m%d%H%M%S'`.txt ~/cvs
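If the snapshots are driven from a script rather than the shell, the same naming scheme can be sketched in Python. The helper names `snapshot_name` and `latest_two` are hypothetical; only the `%Y%m%d%H%M%S` timestamp format comes from the command above.

```python
from datetime import datetime

def snapshot_name(prefix="cvs_dir", when=None):
    """Build a name like cvs_dir.20110724185116.txt, matching date '+%Y%m%d%H%M%S'."""
    when = when or datetime.now()
    return f"{prefix}.{when.strftime('%Y%m%d%H%M%S')}.txt"

def latest_two(names):
    """Return the two most recent snapshot names, oldest first.

    The timestamp is fixed-width, so sorting the names as strings
    also sorts them chronologically (for a shared prefix).
    """
    return sorted(names)[-2:]
```

The pair returned by `latest_two` could then be passed to dt as two file arguments to see what changed between the most recent runs.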
Any changes are noted as {type}[{old}->{new}] and a file can
have multiple changes.
The change types are as
follows:
| + | New file |
| - | Missing file |
| s | Size changed |
| t | File type changed |
| u | UID changed |
| g | GID changed |
| p | Permissions changed |
| mt | Modify time changed |
| at | Access time changed (disabled) |
| ct | Create time changed (disabled) |
| md5 | Hash has changed |
The second column is the
file type:
| f | File |
| d | Directory |
| sl | Soft Link |
| blk | Block Device |
| fifo | FIFO |
| chr | Character Device |
| sok | Socket |
The third column is the fully qualified filename.
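For anyone post-processing dt's output, the three-column layout can be parsed with a couple of regular expressions. The following is a sketch based only on the sample output shown on this page, not on a formal grammar; `parse_record` and the pattern names are hypothetical. Because the sample output contains both `g[100>1000]` and `s[457->435]`, the dash before `>` is treated as optional.

```python
import re

# One change token: type[old->new]. The dash is optional because the sample
# output also contains tokens such as g[100>1000].
CHANGE = re.compile(r"(\w+)\[(.*?)-?>(.*?)\]")

# A record: optional change column, file type column, then [path].
RECORD = re.compile(
    r"^(?:(?P<changes>.*?)\s+)?"
    r"(?P<ftype>f|d|sl|blk|fifo|chr|sok)\s+"
    r"\[(?P<path>.*)\]$"
)

def parse_record(line):
    """Split one output line into its changes, file type, and path.

    Returns None for lines that are not change records, such as
    "Processing dir [...]" status lines.
    """
    m = RECORD.match(line.strip())
    if m is None:
        return None
    head = (m.group("changes") or "").strip()
    if head in ("+", "-"):
        changes = [(head, None, None)]  # new or missing file: no old/new values
    else:
        changes = [c.groups() for c in CHANGE.finditer(head)]
    return {"changes": changes, "ftype": m.group("ftype"), "path": m.group("path")}
```

Each change comes back as a `(type, old, new)` tuple, so a line with several changes produces several tuples for the same path.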
dt comes with a minimal set of options as follows:
% ./dt -h
dt v0.5 [Jul 26 2011 - 23:52:35]
syntax: dt [options] {dir}|{file} [{dir}|{file} ...]
 -d {lvl}   enable debugging info
 -h         this info
 -l {dir}   directory to create logs in (default: /var/log/dt)
 -m         hash files and compare (disables -q|--quick mode)
 -q         do quick comparisons only
 -v         display version information
The -l option is disabled.
The -q option detects new and missing files and changes in file size.
The -d option is only useful if dt is compiled with --enable-debug.
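The gap between the quick and hashed timings earlier on this page comes from the fact that hashing has to read every byte of every file. As a rough illustration of what a content hash involves (this is not dt's code, and the names `file_md5` and `content_changed` are hypothetical), here is a chunked MD5 in Python:

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def content_changed(path_a, path_b):
    """True when two files hash differently, even if their sizes match."""
    return file_md5(path_a) != file_md5(path_b)
```

A size-only check cannot see an edit that leaves the file length unchanged; a hash comparison can, at the cost of the extra I/O.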
What is in the works?
I am working on porting this utility to more operating systems
including those that don't directly support ftw/nftw.
Thanks to the sarcastic comments of a friend, I will also be
looking at adding the ability to run this tool in a client-server mode to
simplify execution across large numbers of systems with local storage.