Reimplementation of prune

Alexander Weiss
2020-07-19 07:55:14 +02:00
committed by Alexander Neumann
parent 3b591ed987
commit 7f9a0a5907
5 changed files with 553 additions and 211 deletions

@@ -23,12 +23,11 @@ data that was referenced by the snapshot from the repository. This can
be automated with the ``--prune`` option of the ``forget`` command,
which runs ``prune`` automatically if snapshots have been removed.
.. Warning::
Pruning snapshots can be a very time-consuming process, taking nearly
as long as backups themselves. During a prune operation, the index is
locked and backups cannot be completed. Performance improvements are
planned for this feature.
Pruning snapshots can be a time-consuming process, depending on the
number of snapshots and the amount of data to process. During a prune
operation, the repository is locked and backups cannot be completed. Plan
your pruning so that there is enough time to complete it and it does not
interfere with regular backup runs.
It is advisable to run ``restic check`` after pruning, to make sure
you are alerted, should the internal data structures of the repository
@@ -82,20 +81,32 @@ command must be run:
$ restic -r /srv/restic-repo prune
enter password for repository:
repository 33002c5e opened successfully, password is correct
loading all snapshots...
loading indexes...
finding data that is still in use for 4 snapshots
[0:00] 100.00% 4 / 4 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:00] 100.00% 5 / 5 packs processed
to repack: 69 blobs / 1.078 MiB
this removes 67 blobs / 1.047 MiB
to delete: 7 blobs / 25.726 KiB
total prune: 74 blobs / 1.072 MiB
remaining: 16 blobs / 38.003 KiB
unused size after prune: 0 B (0.00% of remaining size)
repacking packs
[0:00] 100.00% 2 / 2 packs repacked
counting files in repo
building new index for repo
[0:00] 100.00% 22 / 22 files
repository contains 22 packs (8512 blobs) with 100.092 MiB bytes
processed 8512 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 1 snapshots
[0:00] 100.00% 1 / 1 snapshots
found 8433 of 8512 data blobs still in use
will rewrite 3 packs
creating new index
[0:00] 86.36% 19 / 22 files
saved new index as 544a5084
[0:00] 100.00% 3 / 3 packs
finding old index files
saved new indexes as [59270b3a]
remove 4 old index files
[0:00] 100.00% 4 / 4 files deleted
removing 3 old packs
[0:00] 100.00% 3 / 3 files deleted
done
Afterwards the repository is smaller.
@@ -119,19 +130,31 @@ to ``forget``:
8c02b94b 2017-02-21 10:48:33 mopped /home/user/work
1 snapshots have been removed, running prune
counting files in repo
building new index for repo
[0:00] 100.00% 37 / 37 packs
repository contains 37 packs (5521 blobs) with 151.012 MiB bytes
processed 5521 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 1 snapshots
loading all snapshots...
loading indexes...
finding data that is still in use for 1 snapshots
[0:00] 100.00% 1 / 1 snapshots
found 5323 of 5521 data blobs still in use, removing 198 blobs
will delete 0 packs and rewrite 27 packs, this frees 22.106 MiB
creating new index
[0:00] 100.00% 30 / 30 packs
saved new index as b49f3e68
searching used packs...
collecting packs for deletion and repacking
[0:00] 100.00% 5 / 5 packs processed
to repack: 69 blobs / 1.078 MiB
this removes 67 blobs / 1.047 MiB
to delete: 7 blobs / 25.726 KiB
total prune: 74 blobs / 1.072 MiB
remaining: 16 blobs / 38.003 KiB
unused size after prune: 0 B (0.00% of remaining size)
repacking packs
[0:00] 100.00% 2 / 2 packs repacked
counting files in repo
[0:00] 100.00% 3 / 3 packs
finding old index files
saved new indexes as [59270b3a]
remove 4 old index files
[0:00] 100.00% 4 / 4 files deleted
removing 3 old packs
[0:00] 100.00% 3 / 3 files deleted
done
Removing snapshots according to a policy
@@ -282,3 +305,44 @@ last-day-of-the-months (11 or 12 depends if the 5 weeklies cross a month).
And finally 75 last-day-of-the-year snapshots. All other snapshots are
removed.
Customize pruning
*****************
To understand the custom options, we first explain how the pruning process works:
- First all snapshots and directories within snapshots are scanned to determine
which data is still in use.
- Then for all pack files ``prune`` finds out if the file is fully used, partly
used or completely unused.
- Completely unused packs are marked for deletion. Fully used packs are kept.
A partially used pack is either kept or marked for repacking depending on user
options.
Note that for repacking, restic must download the file from the repository
storage and re-upload the needed data to the repository. This can be very
time-consuming for remote repositories.
- After deciding what to do, ``prune`` will actually perform the repack, modify
the index according to the changes and delete the obsolete files.
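The classification step described above can be illustrated with a minimal Python sketch. It is not restic's implementation: the ``Pack`` type, the ``classify_packs`` function and the sample blob IDs are hypothetical names introduced only for illustration, and a pack is assumed to be fully described by the IDs and sizes of the blobs it contains.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    """Hypothetical stand-in for a pack file: an ID plus its blobs."""
    pack_id: str
    blobs: dict  # blob ID -> size in bytes


def classify_packs(packs, used_blobs):
    """Sort packs into fully used, partly used and unused.

    used_blobs is the set of blob IDs still referenced by any snapshot,
    i.e. the result of the first scanning step.
    """
    result = {"fully_used": [], "partly_used": [], "unused": []}
    for pack in packs:
        used = [blob for blob in pack.blobs if blob in used_blobs]
        if not used:
            # No blob is referenced: the pack can be deleted.
            result["unused"].append(pack.pack_id)
        elif len(used) == len(pack.blobs):
            # Every blob is referenced: the pack is kept as it is.
            result["fully_used"].append(pack.pack_id)
        else:
            # Mixed: kept or repacked, depending on user options.
            result["partly_used"].append(pack.pack_id)
    return result


packs = [
    Pack("p1", {"a": 100, "b": 200}),
    Pack("p2", {"c": 300, "d": 50}),
    Pack("p3", {"e": 400}),
]
used_blobs = {"a", "b", "c"}
classification = classify_packs(packs, used_blobs)
print(classification)
```

In this example ``p1`` is fully used, ``p2`` is partly used because blob ``d`` is no longer referenced, and ``p3`` is completely unused.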
The ``prune`` command accepts the following options:
- ``--max-unused limit`` allows unused data up to the specified limit to remain in the repository.
This allows restic to keep partly used packs instead of repacking them.
The limit can be specified as a size, e.g. "200M", or as a percentage of the total
repository size, e.g. "0.5%".
``prune`` tries to repack as little data as possible while still ensuring this
limit for unused data.
If you want to minimize the space used by your repository, use a value of 0%.
If you want to minimize the time and bandwidth used by the ``prune`` command, use a
high value. A value of 100% will not require any pack file to be repacked.
The default value is 5%.
- ``--max-repack-size size`` if set, limits the total size of packs to repack.
Because ``prune`` first stores all repacked packs and only deletes the obsolete packs at the end,
this option is useful if you expect many packs to be repacked and are concerned about running low
on storage.
- ``--repack-cacheable-only`` if set to true, only pack files which are cacheable are repacked;
all other pack files are left as they are.
This allows very fast repacking using only cached data. It can, however, mean that the
unused data in your repository exceeds the value given by ``--max-unused``.
The default value is false.
- ``--dry-run`` only shows what ``prune`` would do.
- ``--verbose`` increases verbosity and shows additional statistics for ``prune``.
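To make the effect of ``--max-unused`` more concrete, the following Python sketch shows one way a limit could be turned into a repack decision. It rests on several assumptions and is not restic's actual algorithm: ``parse_limit`` and ``choose_repacks`` are hypothetical helpers, size suffixes are assumed to be binary (``M`` meaning MiB), the percentage is taken relative to a repository size passed in by the caller, and packs are chosen greedily by how little used data must be rewritten per reclaimed byte.

```python
def parse_limit(limit, repo_size):
    """Turn a limit such as "200M" or "0.5%" into a number of bytes.

    Assumption: suffixes are binary units and a percentage refers to
    repo_size, which the caller supplies in bytes.
    """
    limit = limit.strip()
    if limit.endswith("%"):
        return int(repo_size * float(limit[:-1]) / 100)
    units = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    suffix = limit[-1].upper()
    if suffix in units:
        return int(float(limit[:-1]) * units[suffix])
    return int(limit)


def choose_repacks(partly_used, max_unused_bytes):
    """Pick partly used packs to repack until the unused data fits the limit.

    partly_used is a list of (pack_id, used_bytes, unused_bytes) tuples
    with unused_bytes > 0. Packs that reclaim the most space for the
    least rewritten data are repacked first (an illustrative heuristic).
    """
    unused_total = sum(unused for _, _, unused in partly_used)
    candidates = sorted(partly_used, key=lambda p: p[1] / p[2])
    repack = []
    for pack_id, _used, unused in candidates:
        if unused_total <= max_unused_bytes:
            break  # the limit is already satisfied, keep the rest
        repack.append(pack_id)
        unused_total -= unused
    return repack, unused_total


repo_size = 1000
partly_used = [("p1", 10, 90), ("p2", 50, 50), ("p3", 95, 5)]
for limit in ("0%", "5%", "100%"):
    print(limit, choose_repacks(partly_used, parse_limit(limit, repo_size)))
```

With these sample numbers, a limit of 0% repacks all three packs, 5% repacks only ``p1`` and ``p2`` and tolerates the 5 unused bytes in ``p3``, and 100% repacks nothing, which matches the trade-off between space and time described for ``--max-unused``.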