diff --git a/docs/backup.config-example b/docs/backup.config-example new file mode 100644 index 000000000..39a4d417f --- /dev/null +++ b/docs/backup.config-example @@ -0,0 +1,140 @@ +# GitHub Enterprise Server backup configuration file + +# The hostname of the GitHub Enterprise Server appliance to back up. The host +# must be reachable via SSH from the backup host. +GHE_HOSTNAME="github.example.com" + +# Path to where backup data is stored. By default this is the "data" +# directory next to this file but can be set to an absolute path +# elsewhere for backing up to a separate partition / mount point. +GHE_DATA_DIR="data" + +# The number of backup snapshots to retain. Old snapshots are pruned after each +# successful ghe-backup run. This option should be tuned based on the frequency +# of scheduled backup runs. If backups are scheduled hourly, snapshots will be +# available for the past N hours; if backups are scheduled daily, snapshots will +# be available for the past N days ... +GHE_NUM_SNAPSHOTS=10 + +# Pruning snapshots can be scheduled outside of the backup process. +# If set to 'yes', snapshots will not be pruned by ghe-backup. +# Instead, ghe-prune-snapshots will need to be invoked separately via cron +#GHE_PRUNING_SCHEDULED=yes + +# If GHE_ROUTE_VERIFICATION is set to true then ghe-repository-backup and +# ghe-storage-backup will issue a warning if the repositories and objects in +# the backup do not match the pre-backup inventory of routes. +#GHE_ROUTE_VERIFICATION=false + +# If GHE_MANAGE_CONSOLE_PW_RESTORE is set to false then management-console password +# will not be restored from backed-up snapshot data, it is restored by default +#GHE_MANAGE_CONSOLE_PW_RESTORE=true + +# If GHE_SKIP_CHECKS is set to true (or if --skip-checks is used with ghe-backup) then ghe-host-check +# disk space validation and software version checks on the backup-host will be disabled. 
+#GHE_SKIP_CHECKS=false + +# Cluster filesystem to check if it's writable as part of ghe-host-check +# By default it is /data/user/tmp but can be updated if needed +#GHE_FILE_SYSTEM_WRITE_CHECK="/data/user/tmp" + +# The hostname of the GitHub appliance to restore. If you've set up a separate +# GitHub appliance to act as a standby for recovery, specify its IP or hostname +# here. The host to restore to may also be specified directly when running +# ghe-restore so use of this variable isn't strictly required. +# +#GHE_RESTORE_HOST="github-standby.example.com" + +# If set to 'yes', ghe-restore will omit the restore of audit logs. +# +#GHE_RESTORE_SKIP_AUDIT_LOGS=no + +# If set to 'yes', backup and restore of Elasticsearch indices will be skipped +# +#GHE_SKIP_SEARCH_INDICES=no + +# When verbose output is enabled with `-v`, it's written to stdout by default. If +# you'd prefer it to be written to a separate file, set this option. +# +#GHE_VERBOSE_LOG="/var/log/backup-verbose.log" + +# Any extra options passed to the SSH command. +# In a single instance environment, nothing is required by default. +# In a clustering environment, "-i abs-path-to-ssh-private-key" is required. +# +#GHE_EXTRA_SSH_OPTS="" +# +# All backup processes are run with the lowest priority for scheduling by default. +# To change throttling behaviour/allow higher priority for backup processes, set the following variables. +# default value for GHE_NICE=nice -n 19 +# default value for GHE_IONICE=ionice -c 3 +#GHE_NICE="" +#GHE_IONICE="" + +# Any extra options passed to the rsync command. Nothing required by default. +# +#GHE_EXTRA_RSYNC_OPTS="" + +# If set to 'yes', rsync will be set to use compression during backup and restore transfers. Defaults to 'no'. +# +#GHE_RSYNC_COMPRESSION_ENABLED=yes + +# If enabled and set to 'no', rsync warning message during backups will be suppressed. +#RSYNC_WARNING=no + + +# If set to 'yes', logging output will be colorized. 
+# +#OUTPUT_COLOR=no + +# If set to 'no', GHE_DATA_DIR will not be created automatically +# and restore/backup will exit 8 +# +#GHE_CREATE_DATA_DIR=yes + +# If set to 'yes', git fsck will run on the repositories +# and print some additional info. +# +# WARNING: do not enable this; it is only useful for debugging/development +#GHE_BACKUP_FSCK=no + +# Cadence of MSSQL backups +# <full>,<differential>,<transactionlog> all in minutes +# e.g. +# - Full backup every week (10080 minutes) +# - Differential backup every day (1440 minutes) +# - Transactionlog backup every 15 minutes +# +#GHE_MSSQL_BACKUP_CADENCE=10080,1440,15 + +# If set to 'yes', ghe-backup jobs will run in parallel. Defaults to 'no'. +# +#GHE_PARALLEL_ENABLED=yes + +# Sets the maximum number of jobs to run in parallel. Defaults to the number +# of available processing units on the machine. +# +#GHE_PARALLEL_MAX_JOBS=2 + +# Sets the maximum number of rsync jobs to run in parallel. Defaults to the +# configured GHE_PARALLEL_MAX_JOBS, or the number of available processing +# units on the machine. +# +#GHE_PARALLEL_RSYNC_MAX_JOBS=3 + +# When jobs are running in parallel wait as needed to avoid starting new jobs +# when the system's load average is not below the specified percentage. Defaults to +# unrestricted. +# +#GHE_PARALLEL_MAX_LOAD=50 + +# When running an external MySQL database, run this script to trigger a MySQL backup +# rather than attempting to back up via backup-utils directly. +#EXTERNAL_DATABASE_BACKUP_SCRIPT="/bin/false" + +# When running an external MySQL database, run this script to trigger a MySQL restore +# rather than attempting to restore via backup-utils directly. +#EXTERNAL_DATABASE_RESTORE_SCRIPT="/bin/false" + +# If set to 'yes', Pages data will be included in backup and restore. 
Defaults to 'yes' +#GHE_BACKUP_PAGES=no diff --git a/docs/getting-started.md b/docs/getting-started.md index ac85e5073..f8806e616 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -37,5 +37,5 @@ [1]: https://github.com/github/backup-utils/releases [2]: https://github.com/github/backup-utils/releases/tag/v2.11.4 -[3]: https://github.com/github/enterprise-backup-site/blob/master/backup.config-example +[3]: https://github.com/github/backup-utils/blob/master/docs/backup.config-example [4]: https://docs.github.com/enterprise-server/admin/configuration/configuring-your-enterprise/accessing-the-administrative-shell-ssh diff --git a/docs/incremental-mysql-backups-and-restores.md b/docs/incremental-mysql-backups-and-restores.md deleted file mode 100644 index 53543c927..000000000 --- a/docs/incremental-mysql-backups-and-restores.md +++ /dev/null @@ -1,38 +0,0 @@ -# Incremental MySQL Backups and Restores - -Customers who have large MySQL databases who wish to save storage space can use the `--incremental` flag with `ghe-backup` and `ghe-restore`. -Using this flag performs backups for other parts of GHES as normal, but only performs a MySQL backup of the changes to the database from the previous snapshot. -For larger databases this can conserve a lot of storage space for backups. - -## Configuring number of backups - -In your backup.config file you will need to set the variable `GHE_INCREMENTAL_MAX_BACKUPS`. -This variable determines how many cycles of full and incremental backups will be performed before the next full backup is created. -For example, if `GHE_INCREMENTAL_MAX_BACKUPS` is set to 14, backup-utils will run 1 full backup and then 13 incremental backups before performing another full backup on the next cycle. - -Incremental backups require the previous snapshot backups before them to work. -This means they do not follow the pruning strategy based on `GHE_NUM_SNAPSHOTS`. 
- -## Performing incremental backups - -To perform incremental backups: - -`bin/ghe-backup --incremental` - -the program will detect whether it needs to performa full or incremental snapshot based on what is currently in `GHE_DATA_DIR`. - -To see what snapshots are part of your full and incremental backups, you can reference `GHE_DATA_DIR/inc_full_backup` and `GHE_DATA_DIR/inc_snapshot_data`, respectively. - -## Performing incremental restores - -To perform incremental restores: - -`bin/ghe-restore --incremental -s ` - -The program will use the MySQL folders from each previous incremental backup and the full backup to restore the database. - -:warning: Incremental restores require the other snapshots in the cycle to complete a restore. Erasing snapshot directories that are part of a cycle corrupts the restore and makes it impossible to restore for the MySQL database. - -### Previous cycles - -To ensure there is a rolling window of mySQL backups, incremental MySQL backups from the cycle before the current one are kept. Those snapshots are pre-pended with `inc_previous`. To perform a restore from there, just use the full directory name for the snapshot ID. diff --git a/docs/requirements.md b/docs/requirements.md index 6fa54e567..c87a0775a 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -34,6 +34,10 @@ However, if your rsync package has backported the CVE fix without backporting th Option #3 is required if your operating system's package manager does not have access to rsync v3.2.5 or later (e.g. Ubuntu Focal). +Please note that some operating systems have their own versioning scheme for packages (including `rsync`). +If your backup host is using one of these operating systems, you will not be able to rely on a version check to determine whether you are +affected by the `rsync` performance degradation described above. 
+ ## Storage requirements Storage requirements vary based on current Git repository disk usage and growth diff --git a/docs/scheduling-backups.md b/docs/scheduling-backups.md index 0e00a5d67..3a7b0d12c 100644 --- a/docs/scheduling-backups.md +++ b/docs/scheduling-backups.md @@ -3,7 +3,16 @@ Regular backups should be scheduled using `cron(8)` or similar command scheduling service on the backup host. The backup frequency will dictate the worst case [recovery point objective (RPO)][1] in your backup plan. We recommend -hourly backups at the least. +hourly backups as a starting point. + +It's important to consider the duration of each backup operation on the +GitHub Enterprise Server (GHES) appliance. Backups of large datasets or +over slow network links can take more than an hour. Additionally, +maintenance queues are paused during a portion of a backup run. +We recommend scheduling backups to allow sufficient time for jobs +waiting in maintenance queues to process between backup runs. + +Only one backup may be in progress at a time. ## Example scheduling of backups @@ -19,7 +28,8 @@ storage. To schedule hourly backup snapshots with verbose informational output written to a log file and errors generating an email: -``` + +```shell MAILTO=admin@example.com 0 * * * * /opt/backup-utils/bin/ghe-backup -v 1>>/opt/backup-utils/backup.log 2>&1 @@ -27,13 +37,13 @@ MAILTO=admin@example.com To schedule nightly backup snapshots instead, use: -``` +```shell MAILTO=admin@example.com 0 0 * * * /opt/backup-utils/bin/ghe-backup -v 1>>/opt/backup-utils/backup.log 2>&1 ``` -## Example snapshot pruning +## Example snapshot pruning By default all expired and incomplete snapshots are deleted at the end of the main backup process `ghe-backup`. 
If pruning these snapshots takes a long time you can @@ -44,7 +54,7 @@ If this option is enabled you will need to schedule the pruning script `ghe-prun To schedule daily snapshot pruning, use: -``` +```shell MAILTO=admin@example.com 0 3 * * * /opt/backup-utils/share/github-backup-utils/ghe-prune-snapshots 1>>/opt/backup-utils/prune-snapshots.log 2>&1 diff --git a/docs/usage.md b/docs/usage.md index 4324b58cc..58a816bc8 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -13,7 +13,7 @@ These commands are run on the host you [installed][1] Backup Utilities on. You can supply your own configuration file or use the example configuration file as a template where you can set up your environment for backing up and restoring. -An example configuration file with documentation on possible settings can found in [backup.config-example](../backup.config-example). +An example configuration file with documentation on possible settings can be found in [backup.config-example](backup.config-example). There are a number of command-line options that can also be passed to the `ghe-restore` command. Of particular note, if you use an external MySQL service but are restoring from a snapshot prior to enabling this, or vice versa, you must migrate the MySQL data outside of the context of backup-utils first, then pass the `--skip-mysql` flag to `ghe-restore`. @@ -112,8 +112,7 @@ Please refer to [GHES Documentation](https://docs.github.com/en/enterprise-serve ## Incremental MySQL Backups and Restores -If you are interested in performing incremental backups of the MySQL data in your GitHub Enterprise Server instance, see [Incremental MySQL Backups and Restores](incremental-mysql-backups-and-restores.md) for details. - +Incremental MySQL backup has been deprecated since 3.17 due to data integrity concerns. Restoring backups created with incremental backups remains supported for compatibility reasons. 
## Rsync compression From backup-utils v3.11.0 onwards, we have disabled rsync compression by default to improve transfer speed and reduce CPU usage during the transfer process.