comparison README.md @ 116:d2a7c0913ef1 default tip
Add project README
| author | Tom Fredrik Blenning <bfg@bfgconsult.no> |
|---|---|
| date | Wed, 20 May 2026 23:25:34 +0200 |
| 115:404795616b1e | 116:d2a7c0913ef1 |
|---|---|
| 1 # DeDupe | |
| 2 | |
| 3 Qt/C++ file indexer and duplicate browser. | |
| 4 | |
| 5 DeDupe scans a directory tree, records file metadata in a SQLite database, and | |
| 6 helps find likely duplicate files by checksum, name, size, modification time, or | |
| 7 filename edit distance. It includes both a Qt GUI (`DeDupe.App`) for browsing | |
| 8 and deleting duplicates, and command-line helpers for updating and querying the | |
| 9 index. | |
| 10 | |
| 11 ## Features | |
| 12 | |
| 13 - Recursively indexes regular files without following symlinks. | |
| 14 - Stores paths, sizes, modification times, and SHA-1 checksums in | |
| 15 `~/.DeDupe.sqlite`. | |
| 16 - Avoids recomputing hashes for files whose modification time has not changed. | |
| 17 - Removes database entries for files that no longer exist under the scanned | |
| 18 prefix. | |
| 19 - GUI filters for duplicate name, size, modification time, checksum, and similar | |
| 20 names by edit distance. | |
| 21 - Double-click opens a file with the desktop default application. | |
| 22 - Context-menu delete removes a file from disk and updates the database. | |
| 23 - Shell scripts report duplicate sets, duplicate statistics, and directory | |
| 24 comparison results from the SQLite database. | |
| 25 | |
| 26 ## Requirements | |
| 27 | |
| 28 The build is based on CMake and older Qt 4-era dependencies: | |
| 29 | |
| 30 - C++ compiler (the build files check for `g++`). | |
| 31 - CMake 2.6.4 or newer. | |
| 32 - Qt 4 with QtGui, QtXml, QtSql, and QtOpenGL. | |
| 33 - SQLite 3 development headers and runtime. | |
| 34 - Boost filesystem, system, and test components. | |
| 35 - Optional `ccache`. | |
| 36 - Optional `lcov` and `genhtml` for coverage targets. | |
| 37 | |
| 38 On Debian-like systems, `setup.sh` attempts to install the main dependencies: | |
| 39 | |
| 40 ```sh | |
| 41 ./setup.sh | |
| 42 ``` | |
| 43 | |
| 44 ## Building | |
| 45 | |
| 46 Use an out-of-tree build directory: | |
| 47 | |
| 48 ```sh | |
| 49 mkdir build | |
| 50 cd build | |
| 51 ../setup.sh | |
| 52 make | |
| 53 ``` | |
| 54 | |
| 55 The main build products are: | |
| 56 | |
| 57 - `DeDupe.App` - Qt GUI application. | |
| 58 - `updateDeDupe` - command-line index updater. | |
| 59 - `DeDupe` - project library used by the apps and tests. | |
| 60 | |
| 61 ## Updating the Index | |
| 62 | |
| 63 Index the current directory: | |
| 64 | |
| 65 ```sh | |
| 66 ./updateDeDupe | |
| 67 ``` | |
| 68 | |
| 69 Index one or more explicit paths: | |
| 70 | |
| 71 ```sh | |
| 72 ./updateDeDupe /path/to/photos /path/to/archive | |
| 73 ``` | |
| 74 | |
| 75 The index is stored in `~/.DeDupe.sqlite` by default. Paths are canonicalized | |
| 76 before indexing. | |
| 77 | |
| 78 ## Browsing Duplicates | |
| 79 | |
| 80 Run the GUI from the build directory: | |
| 81 | |
| 82 ```sh | |
| 83 ./DeDupe.App | |
| 84 ``` | |
| 85 | |
| 86 The GUI scans the current directory, updates the database, and then shows | |
| 87 possible duplicates. The toolbar controls which signals are used: name, size, | |
| 88 modification time, checksum, and edit distance. The "Show full path" menu item | |
| 89 switches between filenames and full paths. | |
| 90 | |
| 91 ## Command-Line Reports | |
| 92 | |
| 93 The helper scripts query `~/.DeDupe.sqlite` directly: | |
| 94 | |
| 95 ```sh | |
| 96 scripts/duplicates.sh /path/prefix | |
| 97 scripts/duplicates.sh -s /path/prefix | |
| 98 scripts/statistics.sh /path/prefix | |
| 99 scripts/dircompare.sh /path/one /path/two | |
| 100 ``` | |
| 101 | |
| 102 `duplicates.sh -s` strips the supplied prefix from the displayed paths. | |
| 103 | |
| 104 ## Tests and Coverage | |
| 105 | |
| 106 The CMake project defines unit-test executables for the database layer, | |
| 107 bit-array/bit-decoder code, edit distance, Huffman structures, exception types, | |
| 108 and other core classes. | |
| 109 | |
| 110 After building, run tests through CTest: | |
| 111 | |
| 112 ```sh | |
| 113 ctest | |
| 114 ``` | |
| 115 | |
| 116 Coverage support can be enabled with: | |
| 117 | |
| 118 ```sh | |
| 119 cmake -DCOVERAGE=ON .. | |
| 120 make coverage_presentation | |
| 121 ``` | |
| 122 | |
| 123 ## Repository Status | |
| 124 | |
| 125 This is a Mercurial repository. The code targets an older Qt 4/CMake toolchain, | |
| 126 so modern systems may need compatibility packages or small build adjustments. |
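
To illustrate the checksum signal listed under Features, here is a minimal sketch that groups regular files by SHA-1 using only standard tools. It is an assumption-laden stand-in, not DeDupe's implementation: it does not read `~/.DeDupe.sqlite` (the database schema is not described in this README), it relies on `sha1sum` from GNU coreutils being installed, and the function name `list_sha1_duplicates` is hypothetical.

```sh
# Hypothetical helper, not part of DeDupe: print every regular file under DIR
# whose SHA-1 checksum is shared with at least one other file under DIR.
list_sha1_duplicates() {
    dir=${1:-.}
    # find without -L does not follow symlinks, and -type f keeps only
    # regular files, mirroring the indexing rule described in the README.
    find "$dir" -type f -exec sha1sum {} + |
        sort |
        awk '{
            hash = $1
            if (hash == prev) {
                # Second or later member of a group: emit the first member
                # once, then the current line.
                if (!printed) print prevline
                print
                printed = 1
            } else {
                printed = 0
            }
            prev = hash
            prevline = $0
        }'
}
```

Calling `list_sha1_duplicates /path/to/photos` prints one `checksum  path` line per duplicated file, with members of the same group on adjacent lines. Unlike the index described above, this sketch rehashes every file on each run and caches nothing.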
