comparison README.md @ 116:d2a7c0913ef1 default tip

Add project README
author Tom Fredrik Blenning <bfg@bfgconsult.no>
date Wed, 20 May 2026 23:25:34 +0200
# DeDupe

Qt/C++ file indexer and duplicate browser.

DeDupe scans a directory tree, records file metadata in a SQLite database, and
helps find likely duplicate files by checksum, name, size, modification time, or
filename edit distance. It includes both a Qt GUI (`DeDupe.App`) for browsing
and deleting duplicates, and command-line helpers for updating and querying the
index.

## Features

- Recursively indexes regular files without following symlinks.
- Stores paths, sizes, modification times, and SHA-1 checksums in
  `~/.DeDupe.sqlite`.
- Avoids recomputing hashes for files whose modification time has not changed.
- Removes database entries for files that no longer exist under the scanned
  prefix.
- GUI filters for duplicate name, size, modification time, checksum, and similar
  names by edit distance.
- Double-click opens a file with the desktop default application.
- Context-menu delete removes a file from disk and updates the database.
- Shell scripts report duplicate sets, duplicate statistics, and directory
  comparison results from the SQLite database.

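The "similar names by edit distance" filter can be illustrated with the classic Levenshtein distance. The sketch below is not taken from the DeDupe sources: the function name `editDistance` is hypothetical, and the project's own implementation may use a different metric or additional optimizations.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not DeDupe's own code: classic Levenshtein distance
// (insertion, deletion, and substitution each cost 1), computed with a
// single rolling row of the dynamic-programming table.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) {
        row[j] = j;  // cost of turning "" into the first j chars of b
    }

    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];  // table cell (i-1, j-1)
        row[0] = i;                 // cost of turning a[0..i) into ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t above = row[j];  // table cell (i-1, j)
            const std::size_t subst = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            row[j] = std::min({row[j - 1] + 1, above + 1, subst});
            diag = above;
        }
    }
    return row[b.size()];
}
```

Under this metric, `IMG_0001.jpg` and `IMG_0001 (1).jpg` are 4 edits apart, so a filter threshold of 4 or more would pair them as similar names.
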
## Requirements

The build is based on CMake and older Qt 4-era dependencies:

- C++ compiler, tested by the build files as `g++`.
- CMake 2.6.4 or newer.
- Qt 4 with QtGui, QtXml, QtSql, and QtOpenGL.
- SQLite 3 development headers and runtime.
- Boost filesystem, system, and test components.
- Optional `ccache`.
- Optional `lcov` and `genhtml` for coverage targets.

On Debian-like systems, `setup.sh` attempts to install the main dependencies:

```sh
./setup.sh
```

## Building

Use an out-of-tree build directory:

```sh
mkdir build
cd build
../setup.sh
make
```

The main build products are:

- `DeDupe.App` - Qt GUI application.
- `updateDeDupe` - command-line index updater.
- `DeDupe` - project library used by the apps and tests.

## Updating the Index

Index the current directory:

```sh
./updateDeDupe
```

Index one or more explicit paths:

```sh
./updateDeDupe /path/to/photos /path/to/archive
```

The index is stored in `~/.DeDupe.sqlite` by default. Paths are canonicalized
before indexing.

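The incremental behaviour listed under Features (reuse stored checksums when the modification time is unchanged, drop entries whose files have disappeared) can be pictured as a comparison between the stored index and a fresh directory listing. Everything in this sketch is an assumption made for illustration: `IndexEntry`, `UpdatePlan`, and `planUpdate` are hypothetical names, and the in-memory maps stand in for the SQLite tables, whose actual schema is not described in this README.

```cpp
#include <ctime>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for one row of the index.
struct IndexEntry {
    std::time_t mtime;  // modification time recorded at the last scan
    std::string sha1;   // hex digest recorded at the last scan
};

struct UpdatePlan {
    std::vector<std::string> rehash;  // new or modified files
    std::vector<std::string> remove;  // indexed files missing from disk
};

// Compare the stored index with the files found on disk (path -> mtime).
// Both maps are assumed to cover the same scanned prefix.
UpdatePlan planUpdate(const std::map<std::string, IndexEntry>& index,
                      const std::map<std::string, std::time_t>& onDisk) {
    UpdatePlan plan;
    for (const auto& file : onDisk) {
        const auto it = index.find(file.first);
        // Hash only when the file is unknown or its mtime has changed.
        if (it == index.end() || it->second.mtime != file.second) {
            plan.rehash.push_back(file.first);
        }
    }
    for (const auto& entry : index) {
        // Anything indexed but no longer on disk is scheduled for removal.
        if (onDisk.find(entry.first) == onDisk.end()) {
            plan.remove.push_back(entry.first);
        }
    }
    return plan;
}
```

Files whose stored and current modification times match appear in neither list, which is what lets a rescan of a large, mostly unchanged tree skip the expensive hashing step.
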
## Browsing Duplicates

Run the GUI from the build directory:

```sh
./DeDupe.App
```

The GUI scans the current directory, updates the database, and then shows
possible duplicates. The toolbar controls which criteria are used to match
files: name, size, modification time, checksum, and edit distance. The
"Show full path" menu item switches between filenames and full paths.

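One way to picture the toolbar is as a choice of which fields make up a grouping key: files that agree on every enabled field land in the same group, and any group with more than one member is a set of candidate duplicates. The sketch below works under that assumption. `FileInfo`, `Criteria`, and `groupDuplicates` are hypothetical names rather than DeDupe classes, and edit-distance matching is left out because it is not an exact-key comparison.

```cpp
#include <cstdint>
#include <ctime>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one indexed file.
struct FileInfo {
    std::string path;
    std::string name;
    std::uint64_t size;
    std::time_t mtime;
    std::string sha1;
};

// Which fields take part in the comparison (mirrors the toolbar toggles).
struct Criteria {
    bool name;
    bool size;
    bool mtime;
    bool checksum;
};

typedef std::tuple<std::string, std::uint64_t, std::time_t, std::string> Key;

// Return the paths of every group of two or more files that agree on all
// enabled fields. With no field enabled, all files fall into one group.
std::vector<std::vector<std::string>> groupDuplicates(
        const std::vector<FileInfo>& files, const Criteria& use) {
    std::map<Key, std::vector<std::string>> groups;
    for (const FileInfo& f : files) {
        // Disabled fields collapse to a constant so they never split a group.
        Key key(use.name ? f.name : std::string(),
                use.size ? f.size : 0,
                use.mtime ? f.mtime : 0,
                use.checksum ? f.sha1 : std::string());
        groups[key].push_back(f.path);
    }

    std::vector<std::vector<std::string>> result;
    for (const auto& group : groups) {
        if (group.second.size() > 1) {
            result.push_back(group.second);
        }
    }
    return result;
}
```

Enabling more fields makes the key stricter, so the groups can only shrink or split; enabling only the checksum field finds files with identical content regardless of name or timestamp.
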
## Command-Line Reports

The helper scripts query `~/.DeDupe.sqlite` directly:

```sh
scripts/duplicates.sh /path/prefix
scripts/duplicates.sh -s /path/prefix
scripts/statistics.sh /path/prefix
scripts/dircompare.sh /path/one /path/two
```

`duplicates.sh -s` strips the supplied prefix from the displayed paths.

## Tests and Coverage

The CMake project defines unit-test executables for the database layer,
bit-array/bit-decoder code, edit distance, Huffman structures, exception types,
and other core classes.

After building, run tests through CTest:

```sh
ctest
```

Coverage support can be enabled with:

```sh
cmake -DCOVERAGE=ON ..
make coverage_presentation
```

## Repository Status

This is a Mercurial repository. The code targets an older Qt 4/CMake toolchain,
so modern systems may need compatibility packages or small build adjustments.