How to Back Up Your GitLab Server
GitLab is the premier web-based solution to store all your Git repositories, enabling professionals to carry out all the tasks in a project, from project planning and source code management to monitoring and security.
However, GitLab hasn’t given its users a foolproof solution to back up their critical data, which exposes them to the risk of losing their critical business information in the event of cyberattacks, system failures, and other mishaps.
As a result, GitLab users have started coming up with their own backup solutions. Unfortunately, a majority of these “solutions” are often faulty and sometimes lead to even more problems. To help you avoid making GitLab backup mistakes, we’ve outlined the steps to back up your GitLab server correctly and easily.
Let’s take a look.
Method 1: The Manual Method to Back Up Your GitLab Server
Let’s take a look at how you can create a GitLab backup manually. Here’s how to go about it:
- Log into your GitLab server using Secure Shell (SSH).
- Use the command sudo gitlab-rake gitlab:backup:create. This will immediately initiate the creation of your GitLab backup.
- If there are parts of your data you don’t want to include in the backup, use the command sudo gitlab-rake gitlab:backup:create SKIP=db,uploads instead. The components to skip are given as a comma-separated list, with no spaces around the = sign or the commas. Either command will create a GitLab backup tar file, which will be stored in the /var/opt/gitlab/backups directory.
Note: In this case, SKIP is an environment variable that tells the backup task what to leave out. If you don’t have anything to skip, you can create the backup by following Step 2.
- To view your created backup file, change to the backup directory with cd /var/opt/gitlab/backups.
- Once in the directory, type ls -1 to see the GitLab backup tar file.
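The steps above can be sketched as a pair of shell helpers. This is a minimal illustration, assuming the default backup directory mentioned in the steps; the function names build_backup_cmd and list_backups are made up for this example and are not part of GitLab. The first helper only prints the command, so nothing touches your server until you run the printed line yourself.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of GitLab.

# Default backup location from the steps above; override if yours differs.
BACKUP_DIR="${BACKUP_DIR:-/var/opt/gitlab/backups}"

# Print the backup command. An optional first argument is a comma-separated
# SKIP list with no spaces around "=" or ",", for example "db,uploads".
build_backup_cmd() {
  if [ -n "${1:-}" ]; then
    printf 'sudo gitlab-rake gitlab:backup:create SKIP=%s\n' "$1"
  else
    printf 'sudo gitlab-rake gitlab:backup:create\n'
  fi
}

# List the files in the backup directory (or a directory you pass in),
# one per line.
list_backups() {
  ls -1 "${1:-$BACKUP_DIR}"
}
```

Calling build_backup_cmd db,uploads prints the full command with the SKIP list attached, ready to paste into your SSH session. Depending on the permissions on the backup directory, list_backups may need to be run as a user that is allowed to read it.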
Congratulations! You’ve created your first manual backup of GitLab. If you want to use this backup, you will need to restore GitLab backup data.
As you can see, there are a fair number of commands to remember to back up GitLab manually. Not everyone has a sharp memory, which is why an automated way to back up GitLab is the preferred method of the two.
Method 2: The Automated Method to Back Up Your GitLab Server
When it comes to GitLab backups, going automated is both a simple and time-saving approach.
Here, you will use Backrightup, a user-friendly, GUI-based tool that makes it easier than ever to back up GitLab. You don’t have to remember any commands. Just a few clicks and you’re done.
The following are the steps to automatically back up GitLab using Backrightup.
- Create a Backrightup account by visiting its website. Enter your email and password to confirm user registration.
- Enter your GitLab organization’s details and access levels. You can either give read-only access or full read/write access to the tool.
As soon as you register and sign in to your GitLab organization, Backrightup will automatically start the backup of all your GitLab data.
Backrightup will run one backup each day on all your repositories, but you have the option to change the settings. Go to Account settings and click on Repository settings. If you want to back up your GitLab data on demand, simply click on Run backup(s) on the interface screen and select the data you want to back up from your GitLab.
You can also change the storage settings for your GitLab backup data. By default, Backrightup gives you its own storage, allowing you to store all backups directly on the platform, but you can provide your own storage for GitLab backups as well. To do this, go to Account settings, followed by Storage settings, and make the changes accordingly.
That’s it—you have now successfully created an automated backup for your GitLab data.
What Does a GitLab Backup Include?
GitLab has a built-in backup utility that exports data created by users directly on your GitLab instance. This typically includes everything in that specific GitLab database and on your on-disk Git repositories.
As soon as you restore the backup, it will reinstate your projects, users, groups, issues, uploaded file attachments, and CI/CD job logs. The backup will also cover the GitLab Pages website and Docker images uploaded to the integrated container registry.
Keep in mind that packages added to GitLab’s package registry are not supported by the backup. Instead, you’ll have to configure your installation to save packages to an external object storage provider. This keeps the packages recoverable without you having to worry about a manual rebuild.
Common Problems When Backing Up Your GitLab Server
Below, we will review some of the most common mistakes that developers make when backing up their GitLab data. Keep the following errors in mind to avoid losing any critical or sensitive information:
Forgetting to Factor In Human Error
Human error is often the main cause of data loss. Thinking your staff or teammates won’t ever make a mistake is just wishful thinking. To protect yourself against these errors, you need a backup solution for your critical systems.
Case in point: GitLab’s own massive backup failure.
According to reports, a system admin at GitLab tried to push a fix on the website by clearing out the backup database and restarting the copying process. Instead, he accidentally deleted the primary database, causing the platform to lose around 300 GB of data.
Not Testing Backup Systems
Testing backup systems is as important as having a backup.
Don’t postpone your testing or relegate it to a monthly or yearly event, especially if you create data at an accelerated pace. The quantity of data you lose every hour of downtime can be mind-boggling. GitLab devs learned this the hard way.
The platform had five backup/replication techniques deployed, but none of them were working reliably or set up properly. In the end, devs had to restore a six-hour-old backup, so everything created in those six hours was lost on top of the 300 GB deleted from the primary database.
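A quick check like the one below can be run regularly between full restore tests. It is only a sketch under stated assumptions: the function name check_latest_backup and the one-day default are made up for this example, and confirming that an archive is recent and readable is not a substitute for actually restoring it on a test instance.

```shell
#!/bin/sh
# Sketch only: a basic sanity check, not a full restore test.

# Succeeds if the newest *.tar file in the given directory can be listed by
# tar and was modified within the last N minutes (default 1440, one day).
check_latest_backup() {
  dir="$1"
  max_age_min="${2:-1440}"

  # Newest archive by modification time; empty if there is none.
  latest=$(ls -1t "$dir"/*.tar 2>/dev/null | head -n 1)
  if [ -z "$latest" ]; then
    echo "no backup archive found in $dir" >&2
    return 1
  fi

  # tar -tf lists the contents without extracting; it fails on a broken file.
  if ! tar -tf "$latest" >/dev/null 2>&1; then
    echo "archive is not readable: $latest" >&2
    return 1
  fi

  # find prints the path only if the file is newer than max_age_min minutes.
  if [ -z "$(find "$latest" -mmin "-$max_age_min")" ]; then
    echo "latest backup is older than $max_age_min minutes: $latest" >&2
    return 1
  fi

  echo "ok: $latest"
}
```

For example, check_latest_backup /var/opt/gitlab/backups 360 would fail if the newest archive is more than six hours old, which makes it easy to hook into whatever alerting you already use.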
Understanding GitLab’s Optional Copy Strategy
GitLab has a default backup strategy that involves streaming data continuously to the tar archive. While this is usually a good option, it can create problems on very active GitLab instances. For instance, your data may change in the source directory before it’s finished reaching the archive, causing tar to skip it with a “file changed as we read it” error.
It’s to prevent situations like this that GitLab introduced an optional copy strategy.
Under this, the platform will copy all eligible backup data to a temporary directory before streaming the copied content into the final tar archive. This makes sure tar isn’t reading from a live GitLab instance. While this does eliminate the above problem, it can temporarily increase GitLab’s storage consumption and have a negative effect on backup performance, especially on slower storage devices.
To activate the copy strategy, you need to set the STRATEGY environment variable to copy when running the backup command, for example sudo gitlab-rake gitlab:backup:create STRATEGY=copy.
Make sure you’ve got enough disk space available. Because GitLab runs the backup one data type at a time, a good rule of thumb is to have free space equal to the size of your largest data type, so that the data and its temporary copy together take up double that size. For example, if you have 2 GB of GitLab repositories and 5 GB of container registry data, you will need 5 GB of extra available space and not 7 GB.
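The rule of thumb above can be turned into a small pre-flight check. This is a rough sketch: the helper names are invented for this example, the sizes are whole numbers in GB that you supply yourself, and it assumes the temporary copy lands on the same filesystem as the path you check.

```shell
#!/bin/sh
# Sketch only: helper names and sizes are illustrative.

# Print the largest of the given sizes (whole numbers, all in the same unit).
# Per the rule of thumb, this is the extra free space the copy strategy needs.
largest_size() {
  max=0
  for size in "$@"; do
    if [ "$size" -gt "$max" ]; then
      max="$size"
    fi
  done
  echo "$max"
}

# Free space, in whole GB, on the filesystem that holds the given path.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeeds if the path has at least as much free space as the largest size.
# Usage: enough_space_for_copy PATH SIZE_GB [SIZE_GB ...]
enough_space_for_copy() {
  path="$1"
  shift
  [ "$(free_gb "$path")" -ge "$(largest_size "$@")" ]
}
```

With the figures from the example, enough_space_for_copy /var/opt/gitlab/backups 2 5 succeeds only if at least 5 GB is free. If it does, you can go ahead with sudo gitlab-rake gitlab:backup:create STRATEGY=copy.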
Not Prioritizing the Backup of Your Config File
GitLab’s backup script only focuses on and manages user-created data. In addition to this, there are two other critical files that are important for your GitLab server to operate properly, and therefore, must be backed up to ensure the successful recovery of your entire instance.
They are as follows:
- /etc/gitlab/gitlab.rb – This is your GitLab configuration file. All installations except the most basic ones commonly acquire many modifications over a period of time. When you back up this particular file, you can drop it into a new GitLab installation without having to start from scratch.
- /etc/gitlab/gitlab-secrets.json – This is another critical file that you must back up because it includes your database encryption key, secrets used for two-factor authentication, and other (non-recoverable) sensitive information. If you lose this file, the data encrypted with those secrets cannot be recovered, even when you have a functioning backup archive with you.
We recommend using a separate cron task to back up these two files. They should be copied off your server to ensure you still have access to them in case of a hardware failure.
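One possible shape for such a cron task is sketched below. The function name, the destination directory, the script path, and the schedule are all assumptions made for this example; only the two file names come from the list above.

```shell
#!/bin/sh
# Sketch only: the function name, paths, and schedule are assumptions.

# Archive gitlab.rb and gitlab-secrets.json from the config directory into
# the destination directory, then print the path of the archive.
backup_gitlab_config() {
  config_dir="${1:-/etc/gitlab}"
  dest_dir="${2:-/var/backups/gitlab-config}"
  archive="$dest_dir/gitlab-config-$(date +%Y%m%d-%H%M%S).tar.gz"

  mkdir -p "$dest_dir" || return 1
  # The secrets file is sensitive, so create the archive readable only by
  # its owner. The subshell keeps the umask change local.
  (
    umask 077
    tar -czf "$archive" -C "$config_dir" gitlab.rb gitlab-secrets.json
  ) || return 1
  echo "$archive"
}

# Hypothetical crontab entry for root, running a script that calls the
# function above every day at 02:30:
# 30 2 * * * /usr/local/bin/backup-gitlab-config.sh
```

The archive this produces still sits on the GitLab server, so the cron script should follow up by copying it to another machine or storage service with whatever transfer tool you already trust.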