changeset 116:d2a7c0913ef1 default tip

Add project README
author Tom Fredrik Blenning <bfg@bfgconsult.no>
date Wed, 20 May 2026 23:25:34 +0200
parents 404795616b1e
children
files README.md
diffstat 1 files changed, 126 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md	Wed May 20 23:25:34 2026 +0200
@@ -0,0 +1,126 @@
+# DeDupe
+
+Qt/C++ file indexer and duplicate browser.
+
+DeDupe scans a directory tree, records file metadata in a SQLite database, and
+helps find likely duplicate files by checksum, name, size, modification time, or
+filename edit distance. It includes both a Qt GUI (`DeDupe.App`) for browsing
+and deleting duplicates, and command-line helpers for updating and querying the
+index.
+
+## Features
+
+- Recursively indexes regular files without following symlinks.
+- Stores paths, sizes, modification times, and SHA-1 checksums in
+  `~/.DeDupe.sqlite`.
+- Avoids recomputing hashes for files whose modification time has not changed.
+- Removes database entries for files that no longer exist under the scanned
+  prefix.
+- GUI filters for duplicates by name, size, modification time, or checksum, and
+  for similar names by edit distance.
+- Double-click opens a file with the desktop default application.
+- Context-menu delete removes a file from disk and updates the database.
+- Shell scripts report duplicate sets, duplicate statistics, and directory
+  comparison results from the SQLite database.
+
+## Requirements
+
+The build is based on CMake and older Qt 4-era dependencies:
+
+- C++ compiler; the build files check for `g++`.
+- CMake 2.6.4 or newer.
+- Qt 4 with QtGui, QtXml, QtSql, and QtOpenGL.
+- SQLite 3 development headers and runtime.
+- Boost filesystem, system, and test components.
+- Optional `ccache`.
+- Optional `lcov` and `genhtml` for coverage targets.
+
+On Debian-like systems, `setup.sh` attempts to install the main dependencies:
+
+```sh
+./setup.sh
+```
+
+## Building
+
+Use an out-of-tree build directory:
+
+```sh
+mkdir build
+cd build
+../setup.sh
+make
+```
+
+The main build products are:
+
+- `DeDupe.App` - Qt GUI application.
+- `updateDeDupe` - command-line index updater.
+- `DeDupe` - project library used by the apps and tests.
+
+## Updating the Index
+
+Index the current directory:
+
+```sh
+./updateDeDupe
+```
+
+Index one or more explicit paths:
+
+```sh
+./updateDeDupe /path/to/photos /path/to/archive
+```
+
+The index is stored in `~/.DeDupe.sqlite` by default. Paths are canonicalized
+before indexing.
+
+## Browsing Duplicates
+
+Run the GUI from the build directory:
+
+```sh
+./DeDupe.App
+```
+
+The GUI scans the current directory, updates the database, and then shows
+possible duplicates. The toolbar controls which matching criteria are applied:
+name, size, modification time, checksum, and edit distance. The "Show full path"
+menu item switches between filenames and full paths.
+
+## Command-Line Reports
+
+The helper scripts query `~/.DeDupe.sqlite` directly:
+
+```sh
+scripts/duplicates.sh /path/prefix
+scripts/duplicates.sh -s /path/prefix
+scripts/statistics.sh /path/prefix
+scripts/dircompare.sh /path/one /path/two
+```
+
+`duplicates.sh -s` strips the supplied prefix from the displayed paths.
+
+## Tests and Coverage
+
+The CMake project defines unit-test executables for the database layer,
+bit-array/bit-decoder code, edit distance, Huffman structures, exception types,
+and other core classes.
+
+After building, run tests through CTest:
+
+```sh
+ctest
+```
+
+Coverage support can be enabled with:
+
+```sh
+cmake -DCOVERAGE=ON ..
+make coverage_presentation
+```
+
+## Repository Status
+
+This is a Mercurial repository. The code targets an older Qt 4/CMake toolchain,
+so modern systems may need compatibility packages or small build adjustments.