annotate README.md @ 116:d2a7c0913ef1 default tip

Add project README
author Tom Fredrik Blenning <bfg@bfgconsult.no>
date Wed, 20 May 2026 23:25:34 +0200
# DeDupe

Qt/C++ file indexer and duplicate browser.

DeDupe scans a directory tree, records file metadata in a SQLite database, and
helps find likely duplicate files by checksum, name, size, modification time, or
filename edit distance. It includes both a Qt GUI (`DeDupe.App`) for browsing
and deleting duplicates, and command-line helpers for updating and querying the
index.

## Features

- Recursively indexes regular files without following symlinks.
- Stores paths, sizes, modification times, and SHA-1 checksums in
  `~/.DeDupe.sqlite`.
- Avoids recomputing hashes for files whose modification time has not changed.
- Removes database entries for files that no longer exist under the scanned
  prefix.
- GUI filters for duplicate name, size, modification time, checksum, and similar
  names by edit distance.
- Double-click opens a file with the desktop default application.
- Context-menu delete removes a file from disk and updates the database.
- Shell scripts report duplicate sets, duplicate statistics, and directory
  comparison results from the SQLite database.
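To make the checksum-based matching concrete, the sketch below groups files under a directory by SHA-1 using standard tools rather than DeDupe itself. The function name `list_checksum_duplicates` is hypothetical, `sha1sum` (GNU coreutils) is assumed to be installed, and nothing here reads or writes `~/.DeDupe.sqlite`.

```sh
# Print the sha1sum line of every regular file under "$1" whose
# checksum is shared with at least one other file.
# Assumes file names without newlines or backslashes (sha1sum escapes those).
list_checksum_duplicates() {
  # find does not follow symlinks by default; -type f keeps regular files only.
  find "$1" -type f -exec sha1sum {} + |
    sort |
    awk '{
      sum = substr($0, 1, 40)          # a SHA-1 digest is 40 hex characters
      if (sum == prev_sum) {
        if (!shown) print prev_line    # first member of the group
        print
        shown = 1
      } else {
        shown = 0
      }
      prev_sum = sum
      prev_line = $0
    }'
}
```

Calling `list_checksum_duplicates /path/to/photos` prints nothing when every file is unique. Unlike DeDupe, this sketch rehashes every file on each run because it keeps no index.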

## Requirements

The build is based on CMake and older Qt 4-era dependencies:

- C++ compiler, tested by the build files as `g++`.
- CMake 2.6.4 or newer.
- Qt 4 with QtGui, QtXml, QtSql, and QtOpenGL.
- SQLite 3 development headers and runtime.
- Boost filesystem, system, and test components.
- Optional `ccache`.
- Optional `lcov` and `genhtml` for coverage targets.

On Debian-like systems, `setup.sh` attempts to install the main dependencies:

```sh
./setup.sh
```
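On systems where `setup.sh` does not apply, a quick check for the command-line tools named above might look like the following. The helper `missing_tools` is hypothetical and only looks for executables on `PATH`; it does not verify the Qt, Boost, or SQLite development packages.

```sh
# Print each named command that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in the requirements list; ccache, lcov and genhtml are optional.
missing_tools g++ cmake make ccache lcov genhtml
```

An empty result means all six commands were found.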

## Building

Use an out-of-tree build directory:

```sh
mkdir build
cd build
../setup.sh
make
```

The main build products are:

- `DeDupe.App` - Qt GUI application.
- `updateDeDupe` - command-line index updater.
- `DeDupe` - project library used by the apps and tests.

## Updating the Index

Index the current directory:

```sh
./updateDeDupe
```

Index one or more explicit paths:

```sh
./updateDeDupe /path/to/photos /path/to/archive
```

The index is stored in `~/.DeDupe.sqlite` by default. Paths are canonicalized
before indexing.
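Canonicalization means that a relative path, a path containing `..`, and a symlinked path to the same directory all resolve to one absolute form. How DeDupe does this internally is not described here; the sketch below only illustrates the idea with POSIX shell built-ins, and `canonical_dir` is a hypothetical helper name.

```sh
# Print the absolute, symlink-free form of a directory path.
# Prints nothing and returns non-zero if the directory does not exist.
canonical_dir() {
  (cd -- "$1" 2>/dev/null && pwd -P)
}

canonical_dir .    # absolute form of the current directory
```

Because indexed paths are canonicalized, a form like this is probably also the safest thing to pass as a path prefix when querying the index later.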

## Browsing Duplicates

Run the GUI from the build directory:

```sh
./DeDupe.App
```

The GUI scans the current directory, updates the database, and then shows
possible duplicates. The toolbar controls which signals are used: name, size,
modification time, checksum, and edit distance. The "Show full path" menu item
switches between filenames and full paths.
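For readers unfamiliar with the edit-distance signal: it counts how many single-character changes separate two names, so near-identical filenames score low. The sketch below computes the classic Levenshtein distance with `awk`. Whether DeDupe uses exactly this variant, and what threshold it applies, is an assumption not confirmed by this README; `edit_distance` is a hypothetical helper name.

```sh
# Print the Levenshtein distance between two strings.
# Note: awk -v interprets backslash escapes, so names containing
# backslashes would need extra care.
edit_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    la = length(a); lb = length(b)
    for (j = 0; j <= lb; j++) prev[j] = j         # distance from empty prefix
    for (i = 1; i <= la; i++) {
      cur[0] = i
      for (j = 1; j <= lb; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = prev[j] + 1                                        # deletion
        if (cur[j-1] + 1 < best) best = cur[j-1] + 1              # insertion
        if (prev[j-1] + cost < best) best = prev[j-1] + cost      # substitution
        cur[j] = best
      }
      for (j = 0; j <= lb; j++) prev[j] = cur[j]
    }
    print prev[lb]
  }'
}

edit_distance IMG_0001.jpg 'IMG_0001 (1).jpg'    # prints 4
```

A copy renamed by a file manager, as in the usage line, differs only by the four inserted characters ` (1)`.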

## Command-Line Reports

The helper scripts query `~/.DeDupe.sqlite` directly:

```sh
scripts/duplicates.sh /path/prefix
scripts/duplicates.sh -s /path/prefix
scripts/statistics.sh /path/prefix
scripts/dircompare.sh /path/one /path/two
```

`duplicates.sh -s` strips the supplied prefix from the displayed paths.
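The effect of prefix stripping can be pictured with plain shell parameter expansion. This is only an illustration of the idea, not the script's actual code; `strip_prefix` is a hypothetical helper, and whether `duplicates.sh` keeps or drops the separating slash is not stated here.

```sh
# Print "$1" with the leading "$2" removed; paths outside the prefix are unchanged.
strip_prefix() {
  printf '%s\n' "${1#"$2"}"
}

strip_prefix /path/prefix/photos/a.jpg /path/prefix/    # prints photos/a.jpg
```

Quoting `"$2"` inside the expansion makes the prefix match literally, so characters such as `*` in a directory name are not treated as patterns.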

## Tests and Coverage

The CMake project defines unit-test executables for the database layer,
bit-array/bit-decoder code, edit distance, Huffman structures, exception types,
and other core classes.

After building, run tests through CTest:

```sh
ctest
```

Coverage support can be enabled with:

```sh
cmake -DCOVERAGE=ON ..
make coverage_presentation
```

## Repository Status

This is a Mercurial repository. The code targets an older Qt 4/CMake toolchain,
so modern systems may need compatibility packages or small build adjustments.