Backup and Restore a Standalone or Frontend install
Periodic backups of Chef Infra Server are essential to managing and maintaining a healthy configuration and ensuring the availability of important data for restoring your system, if required. The backup takes around 4 to 5 minutes for each GB of data on a t3.2xlarge AWS EC2 instance.
Requirements
- Chef Infra Server 14.11.36 or later
chef-server-ctl
For the majority of use cases, chef-server-ctl backup is the recommended way to take backups of the Chef Infra Server. Use the following commands for managing backups of Chef Infra Server data, and for restoring those backups.
backup
The backup subcommand is used to back up all Chef Infra Server data.
This subcommand:
- Requires rsync to be installed on the Chef Infra Server before running the command
- Requires a chef-server-ctl reconfigure before running the command
- Should not be run in a Chef Infra Server configuration with an external PostgreSQL database; use knife ec backup instead
- Puts the initial backup in the /var/opt/chef-backup directory as a tar.gz file; move this backup to a new location for safekeeping
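The last point above, moving the archive out of /var/opt/chef-backup, can be scripted. The following is a minimal sketch, not part of chef-server-ctl: the function names newest_backup and archive_backup are hypothetical, and the file patterns (.tar.gz and .tgz) are an assumption about how the archives are named on your system.

```shell
#!/bin/sh
# Print the most recently modified archive in a backup directory.
# Assumption: archives end in .tar.gz or .tgz; adjust the patterns if yours differ.
newest_backup() {
  newest=
  for f in "${1:-/var/opt/chef-backup}"/*.tar.gz "${1:-/var/opt/chef-backup}"/*.tgz; do
    [ -f "$f" ] || continue            # skip patterns that matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  [ -n "$newest" ] && echo "$newest"
}

# Copy the newest archive from SRC_DIR to DEST_DIR and print the new path.
archive_backup() {
  src=$(newest_backup "$1")
  [ -n "$src" ] || { echo "no backup archive found in $1" >&2; return 1; }
  mkdir -p "$2" && cp -p "$src" "$2"/ && echo "$2/$(basename "$src")"
}
```

For example, `archive_backup /var/opt/chef-backup /mnt/offsite/chef` would copy the latest archive to a (hypothetical) mounted storage location and leave the original in place.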
Options
This subcommand has the following options:
-y, --yes
Use to specify if the Chef Infra Server can go offline during tar.gz-based backups.
--pg-options
Use to specify and pass additional options to PostgreSQL during backups. See the PostgreSQL documentation for more information.
-c, --config-only
Back up the Chef Infra Server configuration without backing up data.
-t, --timeout
Set the maximum amount of time in seconds to wait for shell commands (default 600). Set this option to a value greater than 600 for backups that take longer than 10 minutes.
-h, --help
Show help message.
Syntax
This subcommand has the following syntax:
chef-server-ctl backup
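The timing guidance at the top of this page (4 to 5 minutes per GB) and the 600-second default for --timeout suggest sizing the timeout from the amount of data. A minimal sketch, assuming the rate measured on a t3.2xlarge roughly holds on your hardware and that the size is given in whole GB. The function names are hypothetical, and the command is only printed, not run.

```shell
#!/bin/sh
# Estimate a --timeout value (seconds) from the data size in whole GB,
# using the slower end of the quoted rate: 5 minutes per GB.
# The result never drops below the documented default of 600 seconds.
backup_timeout_seconds() {
  size_gb=$1
  estimate=$((size_gb * 5 * 60))
  if [ "$estimate" -lt 600 ]; then
    estimate=600
  fi
  echo "$estimate"
}

# Print (not run) a backup command sized for the given amount of data.
backup_command() {
  echo "chef-server-ctl backup --yes --timeout $(backup_timeout_seconds "$1")"
}
```

With 4 GB of data, `backup_command 4` prints `chef-server-ctl backup --yes --timeout 1200`. Treat the estimate as a starting point and measure on your own system.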
restore
The restore subcommand is used to restore Chef Infra Server data from a backup that was created by the backup subcommand. This subcommand may also be used to add Chef Infra Server data to a newly installed server. Do not run this command in a Chef Infra Server configuration that uses an external PostgreSQL database; use knife ec restore (from the knife-ec-backup plugin) instead. This subcommand:
- Requires rsync to be installed on the Chef Infra Server before running the command
- Requires a chef-server-ctl reconfigure before running the command
Ideally, the restore server will have the same FQDN as the server that you backed up. If the restore server has a different FQDN, then:
- Replace the FQDN in /etc/opscode/chef-server.rb.
- Replace the FQDN in /etc/opscode/chef-server-running.json.
- Delete the old SSL certificate, key, and -ssl.conf file from /var/opt/opscode/nginx/ca.
- If you use a CA-issued certificate instead of a self-signed certificate, copy the CA-issued certificate and key into /var/opt/opscode/nginx/ca.
- Update the /etc/chef/client.rb file on each client to point to the new server FQDN.
- Run chef-server-ctl reconfigure.
- Run chef-server-ctl restore.
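The first two steps above amount to a literal search-and-replace in two files. A minimal sketch, assuming both host names contain only letters, digits, hyphens, and dots. The function name replace_fqdn and the example host names are hypothetical. The function keeps a .bak copy of the file, and because it matches plain substrings, review the result before reconfiguring.

```shell
#!/bin/sh
# Replace every occurrence of OLD_FQDN with NEW_FQDN in FILE.
# A copy of the original is kept next to it as FILE.bak.
replace_fqdn() {
  old=$1 new=$2 file=$3
  # Escape the dots so they match literally instead of "any character".
  old_re=$(printf '%s\n' "$old" | sed 's/\./\\./g')
  cp -p "$file" "$file.bak" &&
    sed "s|$old_re|$new|g" "$file.bak" > "$file"
}
```

For example:

replace_fqdn old-chef.example.com new-chef.example.com /etc/opscode/chef-server.rb
replace_fqdn old-chef.example.com new-chef.example.com /etc/opscode/chef-server-running.json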
Options
This subcommand has the following options:
-c, --cleanse
Use to remove all existing data on the Chef Infra Server; it will be replaced by the data in the backup archive.
-d DIRECTORY, --staging-dir DIRECTORY
Use to specify the path to an empty directory to be used during the restore process. This directory must have enough disk space to expand all data in the backup archive.
--pg-options
Use to specify and pass additional options to PostgreSQL during restores. See the PostgreSQL documentation for more information.
-t, --timeout
Set the maximum amount of time in seconds to wait for shell commands. Set to greater than 600 for restores that take longer than 10 minutes. Default: 600.
-h, --help
Show help message.
Syntax
This subcommand has the following syntax:
chef-server-ctl restore PATH_TO_BACKUP (options)
Examples
chef-server-ctl restore /path/to/tar/archive.tar.gz
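Because --staging-dir expects an empty directory, it can help to check the inputs before starting a restore. A minimal sketch under that assumption: is_empty_dir and restore_command are hypothetical helper names, and the command is only printed so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Succeed only if DIR exists and has no entries, hidden files included.
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Print (not run) a restore command after checking the archive and staging dir.
restore_command() {
  archive=$1 staging=$2
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  is_empty_dir "$staging" || { echo "staging dir must exist and be empty: $staging" >&2; return 1; }
  echo "chef-server-ctl restore $archive --staging-dir $staging"
}
```

The check does not verify free disk space in the staging directory; that still has to be confirmed separately.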
Backup and restore a Chef Backend install
Warning
Chef Backend is deprecated and no longer under active development. Contact your Chef account representative for information about migrating to Chef Automate HA.
This document is no longer maintained.
In a disaster recovery scenario, the backup and restore processes allow you to restore a data backup into a newly built cluster. The restore process is not intended for recovering an individual machine in the Chef Backend cluster or for a point-in-time rollback of an existing cluster.
Backup
Restoring your data in an emergency requires existing backups in the .tar format of:
- The Chef Backend cluster data
- The Chef Infra Server configuration file
To make backups for use in future disaster scenarios:
- On a follower Chef Backend node, create the back-end data backup with:
chef-backend-ctl backup
- On a Chef Infra Server node, create the server configuration backup with:
chef-server-ctl backup --config-only
- Move the tar archives created in steps (1) and (2) to a long-term storage location
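Before moving the archives to long-term storage, a quick readability check can catch a truncated or corrupt file. A minimal sketch, assuming a tar implementation that detects compression automatically when listing (GNU tar and bsdtar do). The function name archives_readable is hypothetical, and passing this check does not prove that a restore will succeed.

```shell
#!/bin/sh
# Succeed only if tar can list the contents of every archive given.
archives_readable() {
  for archive in "$@"; do
    tar -tf "$archive" > /dev/null 2>&1 ||
      { echo "unreadable archive: $archive" >&2; return 1; }
  done
}
```

For example, `archives_readable /path/to/backend-backup.tar.gz /path/to/server-config-backup.tar.gz && echo "ok to move"`, where both paths are placeholders for the two archives created in the steps above.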
Restore
The restore process requires Chef Infra Server 14.11.36 or later.
Restoring Chef Backend for a Chef Infra Server cluster has two steps:
- Restore the back-end services
- Restore the front-end services
Backend Restore
Restoring the back-end services creates a new cluster. Select one node as the leader and restore the backup on that node first. Use the IP address of the leader node as the value for the --publish_address option.
chef-backend-ctl restore --publish_address my.company.ip.address /path/to/backup.tar.gz
For example:
chef-backend-ctl restore --publish_address 198.51.100.0 /backups/2021/backup.tar.gz
The restore process creates a new cluster and generates a JSON secrets file for setting up communication between the nodes. Locate the file in /etc/chef-backend/chef-backend-secrets.json and copy it to each node as /tmp/chef-backend-secrets.json.
Join follower nodes to your new Chef Backend cluster. For each follower node, run the join-cluster subcommand to establish communication in the cluster. The command uses:
- The IP address of the new leader node.
- The IP address of the follower node that joins, passed through the --publish_address option.
- The secrets option -s with the /tmp/chef-backend-secrets.json file on the node.
The join-cluster command is:
chef-backend-ctl join-cluster --accept-license --yes --quiet IP_OF_LEADER_NODE --publish_address IP_OF_FOLLOWER_NODE -s /tmp/chef-backend-secrets.json
For example:
chef-backend-ctl join-cluster --accept-license --yes --quiet 198.51.100.0 --publish_address 203.0.113.0 -s /tmp/chef-backend-secrets.json
Generate the configuration for the front end from the new cluster:
chef-backend-ctl gen-server-config chefserver.internal > /tmp/chef-server.rb
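A mistyped address in --publish_address is easy to miss, so it can be worth validating the leader and follower IPs before building the join-cluster command. A minimal sketch, assuming IPv4 addresses only. The helper names is_ipv4 and join_cluster_command are hypothetical, and the command is only printed, not run.

```shell
#!/bin/sh
# Succeed only for dotted-quad IPv4 addresses with each octet in 0-255.
is_ipv4() {
  case $1 in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, bad characters, empty fields
  esac
  old_ifs=$IFS
  IFS=.
  set -- $1                                # split the address on dots
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    { [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ]; } || return 1
  done
}

# Print (not run) the join-cluster command for one follower node.
join_cluster_command() {
  leader=$1 follower=$2
  is_ipv4 "$leader" || { echo "invalid leader IP: $leader" >&2; return 1; }
  is_ipv4 "$follower" || { echo "invalid follower IP: $follower" >&2; return 1; }
  echo "chef-backend-ctl join-cluster --accept-license --yes --quiet $leader --publish_address $follower -s /tmp/chef-backend-secrets.json"
}
```

Calling `join_cluster_command 198.51.100.0 203.0.113.0` prints the same command as the join-cluster example above, while an address with an out-of-range or overlong octet is rejected with an error.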
Frontend Restore
Note
Restore Chef Infra Server from your backed-up Infra Server configuration generated by the new cluster.
chef-server-ctl restore /path/to/chef-server-backup.tar.gz
Copy the Chef-generated config, /tmp/chef-server.rb, to the front-end node and use it to replace /etc/opscode/chef-server.rb.
Run reconfigure to apply the changes.
chef-server-ctl reconfigure
Run the reindex command to repopulate your search index:
chef-server-ctl reindex --all
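The front-end steps above can be chained so that a failure stops the sequence. A minimal sketch, assuming the generated config has already been copied to the front-end node. The names run and frontend_restore and the DRY_RUN variable are hypothetical. With DRY_RUN=1 the commands are only printed, which allows a review before anything is changed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Front-end restore sequence; each step runs only if the previous one succeeded.
frontend_restore() {
  backup=$1 generated_config=$2
  run chef-server-ctl restore "$backup" &&
    run cp "$generated_config" /etc/opscode/chef-server.rb &&
    run chef-server-ctl reconfigure &&
    run chef-server-ctl reindex --all
}
```

For example, `DRY_RUN=1; frontend_restore /path/to/chef-server-backup.tar.gz /tmp/chef-server.rb` lists the four commands in order without executing them.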
Note
If knife search does not return the expected results even though data is present in the Chef Infra Server after the reindex, verify the search index configuration.
Verify
The best practice for maintaining useful backups is to periodically verify them by restoring:
- One Chef Backend node
- One Chef Infra Server node
Verify that you can execute knife commands and Chef Infra Client runs against these restored nodes.
Troubleshoot
The restore process requires Chef Infra Server 14.11.36 or later.
For a quick fix, you can edit /opt/opscode/embedded/lib/ruby/gems/2.7.0/gems/chef-server-ctl-1.1.0/bin/chef-server-ctl and add the following methods:
# External Solr/ElasticSearch Commands
def external_status_opscode_solr4(_detail_level)
  solr = external_services['opscode-solr4']['external_url']
  begin
    Chef::HTTP.new(solr).get(solr_status_url)
    puts "run: opscode-solr4: connected OK to #{solr}"
  rescue StandardError => e
    puts "down: opscode-solr4: failed to connect to #{solr}: #{e.message.split("\n")[0]}"
  end
end

def external_cleanse_opscode_solr4(perform_delete)
  log <<-EOM
Cleansing data in a remote Solr4 instance is not currently supported.
EOM
end

# Status path depends on the configured search provider.
def solr_status_url
  case running_service_config('opscode-erchef')['search_provider']
  when "elasticsearch"
    "/chef"
  else
    "/admin/ping?wt=json"
  end
end