A file-backup utility
Creates a directory named after the original file, containing a tarred copy of the file, optionally compressed.
Files are added to the tar archive only if they have changed, i.e. the modification time is later than that of the last archive and the size (or checksum) is different.
The directory containing the tar files is placed in a mirrored directory tree. Each backup is a separate tar file.
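The change check described above can be sketched roughly as follows. This is a minimal illustration, not rumar's actual code: the function names and the way the previous backup's mtime, size and checksum are passed in are assumptions made for the example. BLAKE2b is used because the settings section names it as the checksum.

```python
import hashlib
import tempfile
from pathlib import Path


def blake2b_hexdigest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the BLAKE2b checksum of a file, reading it in chunks."""
    digest = hashlib.blake2b()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, last_mtime: float, last_size: int, last_checksum=None) -> bool:
    """Hypothetical helper: decide whether a file needs a new archive."""
    stat = path.stat()
    if stat.st_mtime <= last_mtime:
        return False  # not modified since the last archive
    if stat.st_size != last_size:
        return True  # modified and the size differs
    if last_checksum is None:
        return False  # same size and no checksum comparison requested
    return blake2b_hexdigest(path) != last_checksum


# Small demonstration on a temporary file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "note.txt"
    sample.write_text("hello")
    st = sample.stat()
    checksum = blake2b_hexdigest(sample)
    same_mtime = has_changed(sample, st.st_mtime, st.st_size, checksum)
    newer_same_size = has_changed(sample, st.st_mtime - 10, st.st_size)
    newer_other_size = has_changed(sample, st.st_mtime - 10, st.st_size + 1)
    newer_other_checksum = has_changed(sample, st.st_mtime - 10, st.st_size, "0" * 128)
```

In this sketch the checksum is computed only when the size is unchanged, so the cheaper comparisons run first.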
- Install Python (at least 3.10)
- Download rumar.py
- Download rumar.toml to the same directory as rumar.py
- Edit rumar.toml and adapt it to your needs – see settings details
- Open a console/terminal (e.g. PowerShell) and change to the directory containing rumar.py
- If your installed Python version is below 3.11, run python -m pip install tomli to install the module tomli
- Run python rumar.py list-profiles → you should see your profile name(s) printed in the console
- Run python rumar.py create --profile "My Documents" to create a backup using the profile "My Documents"
- Optionally, add the create command to Task Scheduler or cron, to be run at an interval (e.g. each day/night)

- Run python rumar.py sweep --profile "My Documents" --dry-run and verify the files to be removed
- Run python rumar.py sweep --profile "My Documents" to remove old backups
- Optionally, add the sweep command to Task Scheduler or cron, to be run at an interval (e.g. each day/night)
Note: when --dry-run is used, rumar.py counts the backup files and selects those to be removed based on settings, but no files are actually deleted.
Unless a path is given with --toml path/to/your/settings.toml,
settings are loaded from rumar.toml in the same directory as rumar.py, or from rumar/rumar.toml inside $XDG_CONFIG_HOME ($HOME/.config if not set) on POSIX,
or inside %APPDATA% on NT (Windows).
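The lookup rule above could be expressed along these lines. This is only a sketch: the function name, its parameters and the order in which the two default locations are tried are assumptions, not taken from rumar.py.

```python
import os
from pathlib import Path


def candidate_settings_paths(script_dir, toml_arg=None, env=os.environ, os_name=os.name):
    """Hypothetical helper: list the places where rumar.toml would be looked for."""
    if toml_arg:
        # An explicit --toml path replaces the default locations
        return [Path(toml_arg)]
    if os_name == "nt":
        config_home = Path(env["APPDATA"])
    else:
        # $XDG_CONFIG_HOME, falling back to $HOME/.config when unset or empty
        config_home = Path(env.get("XDG_CONFIG_HOME") or Path(env["HOME"]) / ".config")
    # Assumption: the file next to rumar.py is tried first
    return [Path(script_dir) / "rumar.toml", config_home / "rumar" / "rumar.toml"]


default_paths = candidate_settings_paths(
    Path("/opt/rumar"), env={"HOME": "/home/mac"}, os_name="posix"
)
xdg_paths = candidate_settings_paths(
    Path("/opt/rumar"), env={"HOME": "/home/mac", "XDG_CONFIG_HOME": "/cfg"}, os_name="posix"
)
explicit = candidate_settings_paths(Path("/opt/rumar"), toml_arg="custom.toml")
```

Passing env and os_name as parameters keeps the sketch easy to try out for another platform without changing real environment variables.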
rumar.toml
# schema version
version = 2
# settings common for all profiles
backup_base_dir = 'C:\Users\Mac\Backup'
# setting for individual profiles - override any common ones
["My Documents"]
source_dir = 'C:\Users\Mac\Documents'
excluded_top_dirs = ['My Music', 'My Pictures', 'My Videos']
excluded_files_as_glob = ['desktop.ini', 'Thumbs.db']
[Desktop]
source_dir = 'C:\Users\Mac\Desktop'
excluded_files_as_glob = ['desktop.ini', '*.exe', '*.msi']
["# this profile's name starts with a hash, therefore it will be ignored"]
source_dir = "this setting won't be loaded"

# schema version
version = 3
# settings common for all profiles
backup_base_dir = 'C:\Users\Mac\Backup'
# setting for individual profiles - override any common ones
["My Documents"]
source_dir = 'C:\Users\Mac\Documents'
excluded_files = ['My Music\**', 'My Pictures\**', 'My Videos\**', '**\desktop.ini', '**\Thumbs.db']
[Desktop]
source_dir = 'C:\Users\Mac\Desktop'
excluded_files = ['**\desktop.ini', '**\*.exe', '**\*.msi']
["# this profile's name starts with a hash, therefore it will be ignored"]
source_dir = "this setting won't be loaded"

Each profile whose name starts with a hash # is ignored when rumar.toml is loaded.
version indicates the schema version – currently 3.
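To illustrate how common settings, per-profile overrides and hash-prefixed profile names fit together, here is a minimal sketch. It assumes that every top-level table is a profile and every other top-level key (apart from version) is a common setting; load_profiles is a hypothetical helper, not a function of rumar.py.

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # the tomli module used on Python 3.10

# Shortened copy of the example rumar.toml above
EXAMPLE = r"""
version = 3
backup_base_dir = 'C:\Users\Mac\Backup'

["My Documents"]
source_dir = 'C:\Users\Mac\Documents'

["# this profile's name starts with a hash, therefore it will be ignored"]
source_dir = "this setting won't be loaded"
"""


def load_profiles(toml_text: str) -> dict:
    """Hypothetical helper: merge common settings into each active profile."""
    data = tomllib.loads(toml_text)
    common = {k: v for k, v in data.items() if not isinstance(v, dict) and k != "version"}
    profiles = {}
    for name, table in data.items():
        if not isinstance(table, dict) or name.startswith("#"):
            continue  # skip common settings and ignored profiles
        profiles[name] = {**common, **table}  # profile values override common ones
    return profiles


profiles = load_profiles(EXAMPLE)
print(list(profiles))
```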
- backup_base_dir: str (used by: create, sweep)
  path to the base directory used for backup; usually set in the global space, common for all profiles
  ⓘ note: backup directory for each profile, i.e. backup_dir, is constructed as {backup_base_dir}/{profile}, unless backup_dir is set, which takes precedence
- backup_dir: str = None (used by: create, extract, sweep)
  path to the backup directory used for the profile
  ⚠️ caution: usually left unset; if so, its value defaults to {backup_base_dir}/{profile}
- archive_format: Literal['tar', 'tar.gz', 'tar.bz2', 'tar.xz', 'tar.zst'] = 'tar.zst' (used by: create, sweep)
  format of archive files to be created
  'tar.zst' requires Python 3.14 or higher, or backports.zstd
- compression_level: int = 3 (used by: create)
  0 to 9 for 'tar.gz', 'tar.bz2', 'tar.xz'
  0 to 22 for 'tar.zst'
- no_compression_suffixes_default: str = '7z,zip,zipx,jar,rar,tgz,gz,tbz,bz2,xz,zst,zstd,xlsx,docx,pptx,ods,odt,odp,odg,odb,epub,mobi,cbz,png,jpg,gif,mp4,mov,avi,mp3,m4a,aac,ogg,ogv,opus,flac,kdbx' (used by: create)
  comma-separated string of the default lower-case suffixes for which to use no compression
- no_compression_suffixes: str = '' (used by: create)
  extra lower-case suffixes in addition to no_compression_suffixes_default
- tar_format: Literal[0, 1, 2] = 1 (tarfile.GNU_FORMAT) (used by: create)
  see also https://docs.python.org/3/library/tarfile.html#supported-tar-formats and https://www.gnu.org/software/tar/manual/html_section/Formats.html
- source_dir: str (used by: create, extract)
  path to the directory which is to be archived
- included_files: list[str] (used by: create, sweep)
  ⚠️ caution: uses PurePath.full_match(...), which is available on Python 3.13 or higher
  a list of glob patterns, also known as shell-style wildcards, i.e. ** * ? [seq] [!seq]; ** means zero or more segments, * means a single segment or a part of a segment (as in My*)
  if present, only the matching files will be considered, together with included_files_as_regex, included_files_as_glob, included_top_dirs, included_dirs_as_regex
  the paths/globs can be absolute or relative to source_dir (or backup_dir in case of sweep), e.g. C:\My Documents\*.txt, my-file-in-source-dir.log
  absolute paths start with a root (/ or {drive}:\)
  on Windows, glob-pattern matching is case-insensitive, and both \ and / can be used
  see also https://docs.python.org/3.13/library/pathlib.html#pathlib-pattern-language
- excluded_files: list[str] (used by: create, sweep)
  ⚠️ caution: uses PurePath.full_match(...), which is available on Python 3.13 or higher
  the matching files will be ignored, together with excluded_files_as_regex, excluded_files_as_glob, excluded_top_dirs, excluded_dirs_as_regex
  see also included_files
- included_top_dirs: list[str] (used by: create, sweep)
  ❌ deprecated: use included_files instead, if on Python 3.13 or higher, e.g. ['top dir 1/**',]
  a list of top-directory paths
  if present, only the files from the directories and their descendant subdirs will be considered, together with included_dirs_as_regex, included_files, included_files_as_regex, included_files_as_glob
  the paths can be relative to source_dir or absolute, but always under source_dir (or backup_dir in case of sweep)
  absolute paths start with a root (/ or {drive}:\)
- excluded_top_dirs: list[str] (used by: create, sweep)
  ❌ deprecated: use excluded_files instead, if on Python 3.13 or higher, e.g. ['top dir 3/**',]
  the files from the directories and their subdirs will be ignored, together with excluded_dirs_as_regex, excluded_files, excluded_files_as_regex, excluded_files_as_glob
  see also included_top_dirs
- included_dirs_as_regex: list[str] (used by: create, sweep)
  a list of regex patterns (each to be passed to re.compile)
  if present, only the files from the matching directories will be considered, together with included_top_dirs, included_files, included_files_as_regex, included_files_as_glob
  / must be used as the path separator, also on Windows
  the patterns are matched (using re.search) against a path relative to source_dir (or backup_dir in case of sweep)
  the first segment in the relative path to match against also starts with a slash
  e.g. ['/B$',] will match each directory named B, at any level; ['^/B$',] will match only {source_dir}/B (or {backup_dir}/B in case of sweep)
  regex-pattern matching is case-sensitive – use (?i) at each pattern's beginning for case-insensitive matching, e.g. ['(?i)/b$',]
  see also https://docs.python.org/3/library/re.html
- excluded_dirs_as_regex: list[str] (used by: create, sweep)
  the files from the matching directories will be ignored, together with excluded_top_dirs, excluded_files, excluded_files_as_regex, excluded_files_as_glob
  see also included_dirs_as_regex
- included_files_as_glob: list[str] (used by: create, sweep)
  ❌ deprecated: use included_files instead, if on Python 3.13 or higher
  a list of glob patterns, also known as shell-style wildcards, i.e. * ? [seq] [!seq]
  if present, only the matching files will be considered, together with included_files, included_files_as_regex, included_top_dirs, included_dirs_as_regex
  the paths/globs can be partial, relative to source_dir or absolute, but always under source_dir (or backup_dir in case of sweep)
  unlike with glob patterns used in included_files, here matching is done from the right if the pattern is relative, e.g. ['B\b1.txt',] will match C:\A\B\b1.txt and C:\B\b1.txt
  ⚠️ caution: a leading path separator indicates an absolute path, but on Windows you also need a drive letter, e.g. ['\A\a1.txt'] will never match; use ['C:\A\a1.txt'] instead
  on Windows, glob-pattern matching is case-insensitive, and both \ and / can be used
  see also https://docs.python.org/3/library/fnmatch.html and https://en.wikipedia.org/wiki/Glob_(programming)
- excluded_files_as_glob: list[str] (used by: create, sweep)
  ❌ deprecated: use excluded_files instead, if on Python 3.13 or higher
  the matching files will be ignored, together with excluded_files, excluded_files_as_regex, excluded_top_dirs, excluded_dirs_as_regex
  see also included_files_as_glob
- included_files_as_regex: list[str] (used by: create, sweep)
  if present, only the matching files will be considered, together with included_files, included_files_as_glob, included_top_dirs, included_dirs_as_regex
  see also included_dirs_as_regex
- excluded_files_as_regex: list[str] (used by: create, sweep)
  the matching files will be ignored, together with excluded_files, excluded_files_as_glob, excluded_top_dirs, excluded_dirs_as_regex
  see also included_dirs_as_regex
- checksum_comparison_if_same_size: bool = False (used by: create)
  when False, a file is considered changed if its mtime is later than the latest backup's mtime and its size changed
  when True, BLAKE2b checksum is calculated to determine if the file changed despite having the same size
  mtime := last modification time
  see also https://en.wikipedia.org/wiki/File_verification
- file_deduplication: bool = False (used by: create)
  when True, an attempt is made to find and skip duplicates
  a duplicate file has the same suffix and size and part of its name, case-insensitive (suffix, name)
- min_age_in_days_of_backups_to_sweep: int = 2 (used by: sweep)
  only the backups which are older than the specified number of days are considered for removal
- number_of_backups_per_day_to_keep: int = 2 (used by: sweep)
  for each file, the specified number of backups per day is kept, if available
  more backups per day might be kept to satisfy number_of_backups_per_week_to_keep and/or number_of_backups_per_month_to_keep
  oldest backups are removed first
- number_of_backups_per_week_to_keep: int = 14 (used by: sweep)
  for each file, the specified number of backups per week is kept, if available
  more backups per week might be kept to satisfy number_of_backups_per_day_to_keep and/or number_of_backups_per_month_to_keep
  oldest backups are removed first
- number_of_backups_per_month_to_keep: int = 60 (used by: sweep)
  for each file, the specified number of backups per month is kept, if available
  more backups per month might be kept to satisfy number_of_backups_per_day_to_keep and/or number_of_backups_per_week_to_keep
  oldest backups are removed first
- commands_using_filters: list[str] = ['create'] (used by: create, sweep)
  determines which commands can use the filters specified in the included_* and excluded_* settings
  by default, filters are used only by create, i.e. sweep considers all created backups (no filter is applied)
  a filter for sweep could be used to e.g. never remove backups from the first day of a month:
  excluded_files = ['**/[0-9][0-9][0-9][0-9]-[0-9][0-9]-01_*.tar*']
  or
  excluded_files_as_regex = ['/\d\d\d\d-\d\d-01_\d\d,\d\d,\d\d(\.\d{6})?[+-]\d\d,\d\d~\d+(~.+)?\.tar(\.(gz|bz2|xz|zst))?$']
  it's best when the setting is part of a separate profile, i.e. a copy made for sweep,
  otherwise create will also seek such files to be excluded
- db_path: str = backup_base_dir/rumar.sqlite
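The excluded_files_as_regex filter given for commands_using_filters can be tried out in isolation. The sketch below applies the pattern with re.search to a few archive paths written relative to backup_dir with a leading slash, as described for the *_as_regex settings. The archive names are hypothetical, made up to fit the pattern; they are not copied from real rumar output.

```python
import re

# Pattern copied from the excluded_files_as_regex example
FIRST_OF_MONTH = re.compile(
    r"/\d\d\d\d-\d\d-01_\d\d,\d\d,\d\d(\.\d{6})?[+-]\d\d,\d\d~\d+(~.+)?\.tar(\.(gz|bz2|xz|zst))?$"
)

# Hypothetical archive paths relative to backup_dir, using / as the separator
candidates = [
    "/notes.txt/2025-03-01_22,15,00+01,00~2048.tar.zst",
    "/notes.txt/2025-03-02_22,15,00+01,00~2048.tar.zst",
    "/notes.txt/2025-04-01_08,00,00.123456+02,00~4096~abc123.tar",
]

# Archives matching the exclusion are skipped by sweep, i.e. never removed
protected = [p for p in candidates if FIRST_OF_MONTH.search(p)]
sweepable = [p for p in candidates if not FIRST_OF_MONTH.search(p)]

for path in protected:
    print("kept (first day of a month):", path)
for path in sweepable:
    print("considered for removal:", path)
```

Only the archive dated on the second day of the month remains a candidate for removal; the other two match the pattern and are left alone.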
Version 3 has the additional settings included_files and excluded_files.
They rely on PurePath.full_match(...), which was added in Python 3.13.
The new settings remove the need for the following ones:
- included_top_dirs
- excluded_top_dirs
- included_files_as_glob
- excluded_files_as_glob
Also, backup_base_dir_for_profile has been renamed to backup_dir.
Version 1 contained sha256_comparison_if_same_size.
In version 2, it was renamed to checksum_comparison_if_same_size.
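When updating an older rumar.toml by hand, the two renames mentioned above amount to a simple key mapping. The helper below is a hypothetical illustration only; rumar itself is not known to provide such a function, and the sketch does not touch the version number or the deprecated filter settings.

```python
# Old setting name -> current setting name, as listed above
RENAMED_KEYS = {
    "sha256_comparison_if_same_size": "checksum_comparison_if_same_size",  # version 1 -> 2
    "backup_base_dir_for_profile": "backup_dir",  # version 2 -> 3
}


def rename_legacy_keys(settings: dict) -> dict:
    """Hypothetical helper: return a copy of the settings with old names replaced."""
    return {RENAMED_KEYS.get(key, key): value for key, value in settings.items()}


old_profile = {
    "source_dir": r"C:\Users\Mac\Documents",
    "backup_base_dir_for_profile": r"D:\Backup\Docs",
    "sha256_comparison_if_same_size": True,
}
new_profile = rename_legacy_keys(old_profile)
print(new_profile)
```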
Logging is controlled by settings located in rumar/rumar.logging.toml inside $XDG_CONFIG_HOME ($HOME/.config if not set) on POSIX,
or inside %APPDATA% on NT (Windows).
You can copy the settings below to your own file and modify them as needed.
By default, rumar.log is created in the current directory (where rumar.py is executed).
This can be changed by setting filename=/path/to/rumar.log in [handlers.to_file].
To disable the creation of rumar.log,
put a hash # in front of "to_file" in [loggers.rumar].
version = 1
[formatters.f1]
format = "{levelShort} {asctime}: {funcName:24} {msg}"
style = "{"
validate = true
[handlers.to_console]
class = "logging.StreamHandler"
formatter = "f1"
#level = "DEBUG_14"
[handlers.to_file]
class = "logging.FileHandler"
filename = "rumar.log"
encoding = "UTF-8"
formatter = "f1"
#level = "DEBUG_14"
[loggers.rumar]
handlers = [
    "to_console",
    "to_file",
]
level = "DEBUG_14"

More information: https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
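Settings in this shape follow the dictionary schema of Python's logging.config, so a stand-alone script could load them roughly as shown below. This is a sketch under several assumptions: the numeric value 14 for DEBUG_14 is guessed from its name, the level is registered here by hand because rumar presumably does that itself, and {levelShort} is replaced by the standard {levelname} because levelShort is not a built-in log record attribute. The file handler is left out so that no rumar.log is created.

```python
import logging
import logging.config

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # the tomli module used on Python 3.10

# Trimmed-down variant of the settings above (console only)
LOGGING_TOML = """
version = 1

[formatters.f1]
format = "{levelname} {asctime}: {funcName:24} {msg}"
style = "{"
validate = true

[handlers.to_console]
class = "logging.StreamHandler"
formatter = "f1"

[loggers.rumar]
handlers = ["to_console"]
level = "DEBUG_14"
"""

# Assumption: 14 is the numeric value behind the custom level name
logging.addLevelName(14, "DEBUG_14")

config = tomllib.loads(LOGGING_TOML)
logging.config.dictConfig(config)

logger = logging.getLogger("rumar")
logger.log(14, "logging configured from TOML")
```

Without the addLevelName call, dictConfig would reject "DEBUG_14" as an unknown level.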
Copyright © 2023-2025 macmarrum
SPDX-License-Identifier: GPL-3.0-or-later

