diff README.md @ 116:d2a7c0913ef1 default tip
Add project README
| author | Tom Fredrik Blenning <bfg@bfgconsult.no> |
|---|---|
| date | Wed, 20 May 2026 23:25:34 +0200 |
| parents | |
| children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Wed May 20 23:25:34 2026 +0200
@@ -0,0 +1,126 @@
+# DeDupe
+
+Qt/C++ file indexer and duplicate browser.
+
+DeDupe scans a directory tree, records file metadata in a SQLite database, and
+helps find likely duplicate files by checksum, name, size, modification time, or
+filename edit distance. It includes both a Qt GUI (`DeDupe.App`) for browsing
+and deleting duplicates, and command-line helpers for updating and querying the
+index.
+
+## Features
+
+- Recursively indexes regular files without following symlinks.
+- Stores paths, sizes, modification times, and SHA-1 checksums in
+  `~/.DeDupe.sqlite`.
+- Avoids recomputing hashes for files whose modification time has not changed.
+- Removes database entries for files that no longer exist under the scanned
+  prefix.
+- GUI filters for duplicate name, size, modification time, checksum, and similar
+  names by edit distance.
+- Double-click opens a file with the desktop default application.
+- Context-menu delete removes a file from disk and updates the database.
+- Shell scripts report duplicate sets, duplicate statistics, and directory
+  comparison results from the SQLite database.
+
+## Requirements
+
+The build is based on CMake and older Qt 4-era dependencies:
+
+- C++ compiler, tested by the build files as `g++`.
+- CMake 2.6.4 or newer.
+- Qt 4 with QtGui, QtXml, QtSql, and QtOpenGL.
+- SQLite 3 development headers and runtime.
+- Boost filesystem, system, and test components.
+- Optional `ccache`.
+- Optional `lcov` and `genhtml` for coverage targets.
+
+On Debian-like systems, `setup.sh` attempts to install the main dependencies:
+
+```sh
+./setup.sh
+```
+
+## Building
+
+Use an out-of-tree build directory:
+
+```sh
+mkdir build
+cd build
+../setup.sh
+make
+```
+
+The main build products are:
+
+- `DeDupe.App` - Qt GUI application.
+- `updateDeDupe` - command-line index updater.
+- `DeDupe` - project library used by the apps and tests.
+
+## Updating the Index
+
+Index the current directory:
+
+```sh
+./updateDeDupe
+```
+
+Index one or more explicit paths:
+
+```sh
+./updateDeDupe /path/to/photos /path/to/archive
+```
+
+The index is stored in `~/.DeDupe.sqlite` by default. Paths are canonicalized
+before indexing.
+
+## Browsing Duplicates
+
+Run the GUI from the build directory:
+
+```sh
+./DeDupe.App
+```
+
+The GUI scans the current directory, updates the database, and then shows
+possible duplicates. The toolbar controls which signals are used: name, size,
+modification time, checksum, and edit distance. The "Show full path" menu item
+switches between filenames and full paths.
+
+## Command-Line Reports
+
+The helper scripts query `~/.DeDupe.sqlite` directly:
+
+```sh
+scripts/duplicates.sh /path/prefix
+scripts/duplicates.sh -s /path/prefix
+scripts/statistics.sh /path/prefix
+scripts/dircompare.sh /path/one /path/two
+```
+
+`duplicates.sh -s` strips the supplied prefix from the displayed paths.
+
+## Tests and Coverage
+
+The CMake project defines unit-test executables for the database layer,
+bit-array/bit-decoder code, edit distance, Huffman structures, exception types,
+and other core classes.
+
+After building, run tests through CTest:
+
+```sh
+ctest
+```
+
+Coverage support can be enabled with:
+
+```sh
+cmake -DCOVERAGE=ON ..
+make coverage_presentation
+```
+
+## Repository Status
+
+This is a Mercurial repository. The code targets an older Qt 4/CMake toolchain,
+so modern systems may need compatibility packages or small build adjustments.
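The README lists "similar names by edit distance" as one of the duplicate signals but does not say how the distance is computed. As a rough illustration only, the sketch below assumes a plain Levenshtein distance over filenames. The names `editDistance` and `similarNames` and the `maxDistance` threshold are hypothetical and are not taken from the DeDupe sources, which may use a different metric or data structure.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not from the DeDupe sources: classic Levenshtein
// distance (insert, delete, substitute each cost 1), computed with two
// rolling rows so memory stays proportional to the length of b.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);

    // Distance from the empty prefix of a to each prefix of b.
    for (std::size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j;
    }

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // deleting the first i characters of a
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitution =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            const std::size_t deletion = prev[j] + 1;
            const std::size_t insertion = curr[j - 1] + 1;
            curr[j] = std::min({substitution, deletion, insertion});
        }
        prev.swap(curr);
    }
    return prev[b.size()];
}

// Hypothetical filter: treat two filenames as "similar" when their edit
// distance does not exceed an assumed, caller-chosen threshold.
bool similarNames(const std::string& a, const std::string& b,
                  std::size_t maxDistance) {
    return editDistance(a, b) <= maxDistance;
}
```

Under these assumptions, `similarNames("IMG_0001.jpg", "IMG_0002.jpg", 1)` would flag the pair as a candidate, since the names differ by a single substituted character. A name-based match like this is only a hint, and comparing checksums remains the stronger signal for true duplicates.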
