[{"data":1,"prerenderedAt":715},["ShallowReactive",2],{"/en-us/blog/postmortem-of-database-outage-of-january-31/":3,"navigation-en-us":32,"banner-en-us":460,"footer-en-us":477,"GitLab":687,"next-steps-en-us":700},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"seo":8,"content":16,"config":22,"_id":25,"_type":26,"title":27,"_source":28,"_file":29,"_stem":30,"_extension":31},"/en-us/blog/postmortem-of-database-outage-of-january-31","blog",false,"",{"title":9,"description":10,"ogTitle":9,"ogDescription":10,"noIndex":6,"ogImage":11,"ogUrl":12,"ogSiteName":13,"ogType":14,"canonicalUrls":12,"schema":15},"Postmortem of database outage of January 31","Postmortem on the database outage of January 31 2017 with the lessons we learned.","https://res.cloudinary.com/about-gitlab-com/image/upload/v1749663397/Blog/Hero%20Images/logoforblogpost.jpg","https://about.gitlab.com/blog/postmortem-of-database-outage-of-january-31","https://about.gitlab.com","article","\n                        {\n        \"@context\": \"https://schema.org\",\n        \"@type\": \"Article\",\n        \"headline\": \"Postmortem of database outage of January 31\",\n        \"author\": [{\"@type\":\"Person\",\"name\":\"GitLab\"}],\n        \"datePublished\": \"2017-02-10\",\n      }",{"title":9,"description":10,"authors":17,"heroImage":11,"date":19,"body":20,"category":21},[18],"GitLab","2017-02-10","\n\nOn January 31st 2017, we experienced a major service outage for one of our products, the online service GitLab.com. The outage was caused by an accidental removal of data from our primary database server.\n\nThis incident caused the GitLab.com service to be unavailable for many hours. We also lost some production data that we were eventually unable to recover. Specifically, we lost modifications to database data such as projects, comments, user accounts, issues and snippets, that took place between 17:20 and 00:00 UTC on January 31. 
Our best estimate is that it affected roughly 5,000 projects, 5,000 comments and 700 new user accounts. Code repositories or wikis hosted on GitLab.com were unavailable during the outage, but were not affected by the data loss. [GitLab Enterprise](/enterprise/) customers, GitHost customers, and self-managed GitLab CE users were not affected by the outage, or the data loss.\n\nLosing production data is unacceptable. To ensure this does not happen again we're working on multiple improvements to our operations & recovery procedures for GitLab.com. In this article we'll look at what went wrong, what we did to recover, and what we'll do to prevent this from happening in the future.\n\nTo the GitLab.com users whose data we lost and to the people affected by the outage: we're sorry. I apologize personally, as GitLab's CEO, and on behalf of everyone at GitLab.\n\n## Database setup\n\nGitLab.com currently uses a single primary and a single secondary in hot-standby\nmode. The standby is only used for failover purposes. In this setup a single\ndatabase has to handle all the load, which is not ideal. The primary's hostname\nis `db1.cluster.gitlab.com`, while the secondary's hostname is\n`db2.cluster.gitlab.com`.\n\nIn the past we've had various other issues with this particular setup due to\n`db1.cluster.gitlab.com` being a single point of failure. For example:\n\n* [A database outage on November 28th, 2016 due to project_authorizations having too much bloat](https://gitlab.com/gitlab-com/infrastructure/issues/791)\n* [CI distributed heavy polling and exclusive row locking for seconds takes GitLab.com down](https://gitlab.com/gitlab-com/infrastructure/issues/514)\n* [Scary DB spikes](https://gitlab.com/gitlab-com/infrastructure/issues/364)\n\n## Timeline\n\nOn January 31st an engineer started setting up multiple PostgreSQL servers in\nour staging environment. 
The plan was to try out\n[pgpool-II](http://www.pgpool.net/mediawiki/index.php/Main_Page) to see if it\nwould reduce the load on our database by load balancing queries between the\navailable hosts. Here is the issue for that plan:\n[infrastructure#259](https://gitlab.com/gitlab-com/infrastructure/issues/259).\n\n**± 17:20 UTC:** prior to starting this work, our engineer took an LVM snapshot\nof the production database and loaded this into the staging environment. This was\nnecessary to ensure the staging database was up to date, allowing for more\naccurate load testing. This procedure normally happens automatically once every\n24 hours (at 01:00 UTC), but they wanted a more up to date copy of the\ndatabase.\n\n**± 19:00 UTC:** GitLab.com starts experiencing an increase in database load due\nto what we suspect was spam. In the week leading up to this event GitLab.com had\nbeen experiencing similar problems, but not this severe. One of the problems\nthis load caused was that many users were not able to post comments on issues\nand merge requests. Getting the load under control took several hours.\n\nWe would later find out that part of the load was caused by a background job\ntrying to remove a GitLab employee and their associated data. This was the\nresult of their account being flagged for abuse and accidentally scheduled for removal. More information regarding this particular problem can be found in the\nissue [\"Removal of users by spam should not hard\ndelete\"](https://gitlab.com/gitlab-org/gitlab-ce/issues/27581).\n\n**± 23:00 UTC:** Due to the increased load, our PostgreSQL secondary's\nreplication process started to lag behind. The replication failed as WAL\nsegments needed by the secondary were already removed from the primary. As\nGitLab.com was not using WAL archiving, the secondary had to be re-synchronised\nmanually. 
This involves removing the\nexisting data directory on the secondary, and running\n[pg_basebackup](https://www.postgresql.org/docs/9.6/static/app-pgbasebackup.html)\nto copy over the database from the primary to the secondary.\n\nOne of the engineers went to the secondary and wiped the data directory, then\nran `pg_basebackup`. Unfortunately `pg_basebackup` would hang, producing no\nmeaningful output, despite the `--verbose` option being set. After a few tries\n`pg_basebackup` mentioned that it could not connect due to the master not having\nenough available replication connections (as controlled by the `max_wal_senders`\noption).\n\nTo resolve this our engineers decided to temporarily increase\n`max_wal_senders` from the default value of `3` to `32`. When applying the\nsettings, PostgreSQL refused to restart, claiming too many semaphores were being\ncreated. This can happen when, for example, `max_connections` is set too high. In\nour case this was set to `8000`. Such a value is way too high, yet it had been\napplied almost a year ago and was working fine until that point. To resolve this\nthe setting's value was reduced to `2000`, resulting in PostgreSQL restarting\nwithout issues.\n\nUnfortunately this did not resolve the problem of `pg_basebackup` not starting\nreplication immediately. One of the engineers decided to run it with `strace` to\nsee what it was blocking on. `strace` showed that `pg_basebackup` was hanging in\na `poll` call, but that did not provide any other meaningful information that might\nhave explained why.\n\n**± 23:30 UTC:** one of the engineers thinks that perhaps `pg_basebackup`\ncreated some files in the PostgreSQL data directory of the secondary during the\nprevious attempts to run it. While normally `pg_basebackup` prints an error when\nthis is the case, the engineer in question wasn't too sure what was going on. 
It\nwould later be revealed by another engineer (who wasn't around at the time) that\nthis is normal behaviour: `pg_basebackup` will wait for the primary to start\nsending over replication data and it will sit and wait silently until that time.\nUnfortunately this was not clearly documented in our [engineering\nrunbooks](https://gitlab.com/gitlab-com/runbooks) nor in the official\n`pg_basebackup` documentation.\n\nTrying to restore the replication process, an engineer proceeded to wipe the\nPostgreSQL database directory, mistakenly thinking they were doing so on the\nsecondary. Unfortunately this process was executed on the primary instead. The\nengineer terminated the process a second or two after noticing their mistake,\nbut at this point around 300 GB of data had already been removed.\n\nHoping they could restore the database, the engineers involved went to look for\nthe database backups, and asked for help on Slack. Unfortunately the process of\nboth finding and using backups failed completely.\n\n## Broken recovery procedures\n\nThis brings us to the recovery procedures. Normally in an event like this, one\nshould be able to restore a database in relatively little time using a recent\nbackup, though some form of data loss cannot always be prevented. For\nGitLab.com we have the following procedures in place:\n\n1. Every 24 hours a backup is generated using `pg_dump`; this backup is uploaded\n   to Amazon S3. Old backups are automatically removed after some time.\n1. Every 24 hours we generate an LVM snapshot of the disk storing the production\n   database data. This snapshot is then loaded into the staging environment,\n   allowing us to more safely test changes without impacting our production\n   environment. Direct access to the staging database is restricted, similar to\n   our production database.\n1. For various servers (e.g. the NFS servers storing Git data) we use Azure disk\n   snapshots. These snapshots are taken once per 24 hours.\n1. 
Replication between PostgreSQL hosts, primarily used for failover purposes\n   and not for disaster recovery.\n\nAt this point the replication process was broken and data had already been wiped\nfrom both the primary and secondary, meaning we could not restore from either\nhost.\n\n### Database backups using pg_dump\n\nWhen we went to look for the `pg_dump` backups we found out they were not there.\nThe S3 bucket was empty, and there was no recent backup to be found anywhere.\nUpon closer inspection we found out that the backup procedure was using\n`pg_dump` 9.2, while our database is running PostgreSQL 9.6 (for Postgres, 9.x\nreleases are considered major). A difference in major versions results in\n`pg_dump` producing an error, terminating the backup procedure.\n\nThe difference is the result of how our Omnibus package works. We currently\nsupport both PostgreSQL 9.2 and 9.6, allowing users to upgrade (either manually\nor using commands provided by the package). To determine the correct version to\nuse the Omnibus package looks at the PostgreSQL version of the database cluster\n(as determined by `$PGDIR/PG_VERSION`, with `$PGDIR` being the path to the data\ndirectory). When PostgreSQL 9.6 is detected Omnibus ensures all binaries use\nPostgreSQL 9.6, otherwise it defaults to PostgreSQL 9.2.\n\nThe `pg_dump` procedure was executed on a regular application server, not the\ndatabase server. As a result there is no PostgreSQL data directory present on\nthese servers, thus Omnibus defaults to PostgreSQL 9.2. This in turn resulted in\n`pg_dump` terminating with an error.\n\nWhile notifications are enabled for any cronjobs that error, these notifications\nare sent by email. For GitLab.com we use [DMARC](https://dmarc.org/).\nUnfortunately DMARC was not enabled for the cronjob emails, resulting in them\nbeing rejected by the receiver. 
This means we were never aware of the backups\nfailing, until it was too late.\n\n### Azure disk snapshots\n\nAzure disk snapshots are used to generate a snapshot of an entire disk. These\nsnapshots don't make it easy to restore individual chunks of data (e.g. a lost\nuser account), though it's possible. The primary purpose is to restore entire\ndisks in case of disk failure.\n\nIn Azure a snapshot belongs to a storage account, and a storage account in turn\nis linked to one or more hosts. Each storage account has a limit of roughly 30\nTB. When restoring a snapshot using a host in the same storage account, the\nprocedure usually completes very quickly. However, when using a host in a\ndifferent storage account the procedure can take hours if not days to complete.\nFor example, in one such case it took over a week to restore a snapshot. As a\nresult we try not to rely on this system too much.\n\nWhile enabled for the NFS servers, these snapshots were not enabled for any of\nthe database servers as we assumed that our other backup procedures were\nsufficient.\n\n### LVM snapshots\n\nThe LVM snapshots are primarily used to easily copy data from our production\nenvironment to our staging environment. While this process was working as\nintended, the produced snapshots are not really meant to be used for disaster\nrecovery. At the time of the outage we had two snapshots available:\n\n1. The snapshot created automatically for our staging environment every 24\n   hours, taken almost 24 hours before the outage happened.\n1. A snapshot created manually by one of the engineers roughly 6 hours before\n   the outage.\n\nWhen we generate a snapshot the following steps are taken:\n\n1. Generate a snapshot of production.\n1. Copy the snapshot to staging.\n1. Create a new disk using this snapshot.\n1. 
Remove all webhooks from the resulting database, to prevent them from being\n   triggered by accident.\n\n## Recovering GitLab.com\n\nTo recover GitLab.com we decided to use the LVM snapshot created 6 hours before\nthe outage, as it was our only option to reduce data loss as much as possible\n(the alternative was to lose almost 24 hours of data). This process would\ninvolve the following steps:\n\n1. Copy the existing staging database to production, which would not contain any\n   webhooks.\n1. In parallel, copy the snapshot used to set up the database as this snapshot\n   might still contain the webhooks (we weren't entirely sure).\n1. Set up a production database using the snapshot from step 1.\n1. Set up a separate database using the snapshot from step 2.\n1. Restore webhooks using the database set up in the previous step.\n1. Increment all database sequences by 100,000 so one can't re-use IDs that\n   might have been used before the outage.\n1. Gradually re-enable GitLab.com.\n\nFor our staging environment we were using Azure classic, without Premium Storage.\nThis is primarily done to save costs as premium storage is quite expensive. As a\nresult the disks are very slow, resulting in them being the main bottleneck in\nthe restoration process. Because LVM snapshots are stored on the hosts they are\ntaken for, we had two options to restore data:\n\n1. Copy over the LVM snapshot\n1. Copy over the PostgreSQL data directory\n\nIn both cases the amount of data to copy would be roughly the same. Since\ncopying over and restoring the data directory would be easier we decided to go\nwith this solution.\n\nCopying the data from the staging to the production host took around 18 hours. These disks are network disks throttled to a very low rate (around 60Mbps), and there is no way to move from cheap storage to premium, so this was the performance we would get out of them. 
There was no network or processor bottleneck; the bottleneck was in the drives.\nOnce copied we were able to restore the database (including webhooks) to the\nstate it was in on January 31st, 17:20 UTC.\n\nOn February 1st at 17:00 UTC we managed to restore the GitLab.com database\nwithout webhooks. Restoring webhooks was done by creating a separate staging\ndatabase using the LVM snapshot, but without triggering the removal of webhooks.\nThis allowed us to generate a SQL dump of the table and import this into the\nrestored GitLab.com database.\n\nAround 18:00 UTC we finished the final restoration procedures such as restoring\nthe webhooks and confirming everything was operating as expected.\n\n## Publication of the outage\n\nIn the spirit of transparency we kept track of progress and notes in a\n[publicly visible Google document](https://docs.google.com/document/d/1GCK53YDcBWQveod9kfzW-VCxIABGiryG7_z_6jHdVik/pub).\nWe also streamed the recovery procedure on YouTube, with a peak viewer count of\naround 5000 (resulting in the stream being the #2 live stream on YouTube for\nseveral hours). The stream was used to give our users live updates about the\nrecovery procedure. Finally we used Twitter (\u003Chttps://twitter.com/gitlabstatus>)\nto inform those that might not be watching the stream.\n\nThe document in question was initially private to GitLab employees and contained\nthe name of the engineer who accidentally removed the data. While the name was added\nby the engineer themselves (and they had no problem with this being public), we\nwill redact names in future cases as other engineers may not be comfortable with\ntheir name being published.\n\n## Data loss impact\n\nDatabase data such as projects, issues, snippets, etc. created between January\n31st 17:20 UTC and 23:30 UTC has been lost. 
Git repositories and Wikis were not\nremoved as they are stored separately.\n\nIt's hard to estimate how much data has been lost exactly, but we estimate we\nhave lost at least 5000 projects, 5000 comments, and roughly 700 users. This\nonly affected users of GitLab.com; self-managed instances and GitHost instances\nwere not affected.\n\n## Impact on GitLab itself\n\nSince GitLab uses GitLab.com to develop GitLab, the outage meant that for some it\nwas harder to get work done. Most developers could continue working using their\nlocal Git repositories, but creating issues and such had to be delayed. To\npublish the blog post [\"GitLab.com Database\nIncident\"](/blog/gitlab-dot-com-database-incident/)\nwe used a private GitLab instance we normally use for private/sensitive\nworkflows (e.g. security releases). This allowed us to build and deploy a new\nversion of the website while GitLab.com was unavailable.\n\nWe also have a public monitoring website located at\n\u003Chttps://dashboards.gitlab.com/>. Unfortunately the current setup for this website\nwas not able to handle the load produced by users using this service during the\noutage. Fortunately our internal monitoring systems (which dashboards.gitlab.com is\nbased on) were not affected.\n\n## Root cause analysis\n\nTo analyse the root cause of these problems we'll use a technique called [\"The 5\nWhys\"](https://en.wikipedia.org/wiki/5_Whys). We'll break up the incident into 2\nmain problems: GitLab.com being down, and it taking a long time to restore\nGitLab.com.\n\n**Problem 1:** GitLab.com was down for about 18 hours.\n\n1. **Why was GitLab.com down?** - The database directory of the primary database\n   was removed by accident, instead of removing the database directory of the\n   secondary.\n1. **Why was the database directory removed?** - Database replication stopped,\n   requiring the secondary to be reset/rebuilt. This in turn requires that the\n   PostgreSQL data directory is empty. 
Restoring this required manual work as\n   this was not automated, nor was it documented properly.\n1. **Why did replication stop?** - A spike in database load caused the database\n   replication process to stop. This was due to the primary removing WAL\n   segments before the secondary could replicate them.\n1. **Why did the database load increase?** - This was caused by two events\n   happening at the same time: an increase in spam, and a process trying to\n   remove a GitLab employee and their associated data.\n1. **Why was a GitLab employee scheduled for removal?** - The employee was\n   reported for abuse by a troll. The current system used for responding to\n   abuse reports makes it too easy to overlook the details of those reported. As\n   a result the employee was accidentally scheduled for removal.\n\n**Problem 2:** restoring GitLab.com took over 18 hours.\n\n1. **Why did restoring GitLab.com take so long?** - GitLab.com had to be\n   restored using a copy of the staging database. This was hosted on slower\n   Azure VMs in a different region.\n1. **Why was the staging database needed for restoring GitLab.com?** - Azure\n   disk snapshots were not enabled for the database servers, and the periodic\n   database backups using `pg_dump` were not working.\n1. **Why could we not fail over to the secondary database host?** - The\n   secondary database's data was wiped as part of restoring database\n   replication. As such it could not be used for disaster recovery.\n1. **Why could we not use the standard backup procedure?** - The standard backup\n   procedure uses `pg_dump` to perform a logical backup of the database. This\n   procedure failed silently because it was using PostgreSQL 9.2, while\n   GitLab.com runs on PostgreSQL 9.6.\n1. **Why did the backup procedure fail silently?** - Notifications were\n   sent upon failure, but because the emails were rejected there was no\n   indication of failure. 
The sender was an automated process with no other\n   means to report any errors.\n1. **Why were the emails rejected?** - Emails were rejected by the receiving\n   mail server due to the emails not being signed using DMARC.\n1. **Why were Azure disk snapshots not enabled?** - We assumed our other backup\n   procedures were sufficient. Furthermore, restoring these snapshots can take\n   days.\n1. **Why was the backup procedure not tested on a regular basis?** - Because\n   there was no ownership, nobody was responsible for testing this\n   procedure.\n\n## Improving recovery procedures\n\nWe are currently working on fixing and improving our various recovery\nprocedures. Work is split across the following issues:\n\n1. [Overview of status of all issues listed in this blog post (#1684)](https://gitlab.com/gitlab-com/infrastructure/issues/1684)\n1. [Update PS1 across all hosts to more clearly differentiate between hosts and environments (#1094)](https://gitlab.com/gitlab-com/infrastructure/issues/1094)\n1. [Prometheus monitoring for backups (#1095)](https://gitlab.com/gitlab-com/infrastructure/issues/1095)\n1. [Set PostgreSQL's max_connections to a sane value (#1096)](https://gitlab.com/gitlab-com/infrastructure/issues/1096)\n1. [Investigate Point in time recovery & continuous archiving for PostgreSQL (#1097)](https://gitlab.com/gitlab-com/infrastructure/issues/1097)\n1. [Hourly LVM snapshots of the production databases (#1098)](https://gitlab.com/gitlab-com/infrastructure/issues/1098)\n1. [Azure disk snapshots of production databases (#1099)](https://gitlab.com/gitlab-com/infrastructure/issues/1099)\n1. [Move staging to the ARM environment (#1100)](https://gitlab.com/gitlab-com/infrastructure/issues/1100)\n1. [Recover production replica(s) (#1101)](https://gitlab.com/gitlab-com/infrastructure/issues/1101)\n1. [Automated testing of recovering PostgreSQL database backups (#1102)](https://gitlab.com/gitlab-com/infrastructure/issues/1102)\n1. 
[Improve PostgreSQL replication documentation/runbooks (#1103)](https://gitlab.com/gitlab-com/infrastructure/issues/1103)\n1. [Investigate pgbarman for creating PostgreSQL backups (#1105)](https://gitlab.com/gitlab-com/infrastructure/issues/1105)\n1. [Investigate using WAL-E as a means of Database Backup and Realtime Replication (#494)](https://gitlab.com/gitlab-com/infrastructure/issues/494)\n1. [Build Streaming Database Restore](https://gitlab.com/gitlab-com/infrastructure/issues/1152)\n1. [Assign an owner for data durability](https://gitlab.com/gitlab-com/infrastructure/issues/1163)\n\nWe are also working on setting up multiple secondaries and balancing the load\namongst these hosts. More information on this can be found at:\n\n* [Bundle pgpool-II 3.6.1 (!1251)](https://gitlab.com/gitlab-org/omnibus-gitlab/merge_requests/1251)\n* [Connection pooling/load balancing for PostgreSQL (#259)](https://gitlab.com/gitlab-com/infrastructure/issues/259)\n\nOur main focus is to improve disaster recovery and to make it more obvious\nwhich host you're using, rather than to prevent production engineers from running\ncertain commands. For example, one could alias `rm` to something safer but in\ndoing so would only protect themselves against accidentally running `rm -rf\n/important-data`, not against disk corruption or any of the many other ways you\ncan lose data.\n\nAn ideal environment is one in which you _can_ make mistakes but easily and\nquickly recover from them with minimal to no impact. This in turn requires you\nto be able to perform these procedures on a regular basis, and make it easy to\ntest and roll back any changes. For example, we are in the process of setting up\nprocedures that allow developers to test their database migrations. 
More\ninformation on this can be found in the issue\n[\"Tool for executing and reverting Rails migrations on staging\"](https://gitlab.com/gitlab-com/infrastructure/issues/811).\n\nWe're also looking into ways to build better recovery procedures for the entire\nGitLab.com infrastructure, and not just the database; and to ensure there is\nownership of these procedures. The issue for this is\n[\"Disaster recovery for everything that is not the database\"](https://gitlab.com/gitlab-com/infrastructure/issues/1161).\n\nOn the monitoring side, we also started working on a public backup monitoring dashboard,\nwhich can be found at \u003Chttps://dashboards.gitlab.com/dashboard/db/postgresql-backups>.\nCurrently this dashboard only contains data from our `pg_dump` backup procedure,\nbut we aim to add more data over time.\n\nOne might notice that at the moment our `pg_dump` backups are 3 days old. We\nperform these backups on a secondary as `pg_dump` can put quite a bit of\npressure on a database. Since we are in the process of rebuilding our\nsecondaries the `pg_dump` backup procedure is suspended for the time being. Fear\nnot however, as LVM snapshots are now taken every hour instead of once per 24\nhours. Enabling Azure disk snapshots is something we're still looking into.\n\nFinally, we're looking into improving our abuse reporting and response system.\nMore information regarding this can be found in the issue\n[\"Removal of users by spam should not hard delete\"](https://gitlab.com/gitlab-org/gitlab-ce/issues/27581).\n\nIf you think there are additional measures we can take to prevent incidents like this, please let us know in the comments.\n\n## Troubleshooting FAQ\n\n### Some of my merge requests are shown as being open, but their commits have already been merged into the default branch. 
How can I resolve this?\n\nPushing to the default branch will automatically update the merge request so\nthat it's aware of there not being any differences between the source and target\nbranch. At this point you can safely close the merge request.\n\n### My merge request has not yet been merged, and I am not seeing my changes. How can I resolve this?\n\nThere are 3 options to resolve this:\n\n1. Close the MR and create a new one\n1. Push new changes to the merge request's source branch\n1. Rebase/amend, and force push to the merge request's source branch\n\n### My GitLab Pages website was not updated. How can I solve this?\n\nGo to your project, then \"Pipelines\", \"New Pipeline\", use \"master\" as the\nbranch, then create the pipeline. This will create and start a new pipeline\nusing your master branch, which should result in your website being updated.\n\n### My Pipelines were not executed\n\nMost likely they were, but the database is not aware of this. To solve this,\ncreate a new pipeline using the right branch and run it.\n\n### Some commits are not showing up\n\nPushing new commits should automatically solve this. Alternatively you can try\nforce pushing to the target branch.\n\n### I created a project after 17:20 UTC and it shows up, but my issues are gone.  What happened?\n\nProject details are stored in the database. This meant that this data was lost\nfor projects created after 17:20. We ran a procedure to restore these\nprojects based on their Git repositories that were still stored in our NFS\ncluster. 
This procedure however was only able to restore projects in their most\nbasic form, without associated data such as issues and merge requests.\n","company",{"slug":23,"featured":6,"template":24},"postmortem-of-database-outage-of-january-31","BlogPost","content:en-us:blog:postmortem-of-database-outage-of-january-31.yml","yaml","Postmortem Of Database Outage Of January 31","content","en-us/blog/postmortem-of-database-outage-of-january-31.yml","en-us/blog/postmortem-of-database-outage-of-january-31","yml",{"_path":33,"_dir":34,"_draft":6,"_partial":6,"_locale":7,"data":35,"_id":456,"_type":26,"title":457,"_source":28,"_file":458,"_stem":459,"_extension":31},"/shared/en-us/main-navigation","en-us",{"logo":36,"freeTrial":41,"sales":46,"login":51,"items":56,"search":387,"minimal":418,"duo":437,"pricingDeployment":446},{"config":37},{"href":38,"dataGaName":39,"dataGaLocation":40},"/","gitlab logo","header",{"text":42,"config":43},"Get free trial",{"href":44,"dataGaName":45,"dataGaLocation":40},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com&glm_content=default-saas-trial/","free trial",{"text":47,"config":48},"Talk to sales",{"href":49,"dataGaName":50,"dataGaLocation":40},"/sales/","sales",{"text":52,"config":53},"Sign in",{"href":54,"dataGaName":55,"dataGaLocation":40},"https://gitlab.com/users/sign_in/","sign in",[57,101,199,204,309,368],{"text":58,"config":59,"cards":61,"footer":84},"Platform",{"dataNavLevelOne":60},"platform",[62,68,76],{"title":58,"description":63,"link":64},"The most comprehensive AI-powered DevSecOps Platform",{"text":65,"config":66},"Explore our Platform",{"href":67,"dataGaName":60,"dataGaLocation":40},"/platform/",{"title":69,"description":70,"link":71},"GitLab Duo (AI)","Build software faster with AI at every stage of development",{"text":72,"config":73},"Meet GitLab Duo",{"href":74,"dataGaName":75,"dataGaLocation":40},"/gitlab-duo/","gitlab duo ai",{"title":77,"description":78,"link":79},"Why GitLab","10 reasons why 
Enterprises choose GitLab",{"text":80,"config":81},"Learn more",{"href":82,"dataGaName":83,"dataGaLocation":40},"/why-gitlab/","why gitlab",{"title":85,"items":86},"Get started with",[87,92,97],{"text":88,"config":89},"Platform Engineering",{"href":90,"dataGaName":91,"dataGaLocation":40},"/solutions/platform-engineering/","platform engineering",{"text":93,"config":94},"Developer Experience",{"href":95,"dataGaName":96,"dataGaLocation":40},"/developer-experience/","Developer experience",{"text":98,"config":99},"MLOps",{"href":100,"dataGaName":98,"dataGaLocation":40},"/topics/devops/the-role-of-ai-in-devops/",{"text":102,"left":103,"config":104,"link":106,"lists":110,"footer":181},"Product",true,{"dataNavLevelOne":105},"solutions",{"text":107,"config":108},"View all Solutions",{"href":109,"dataGaName":105,"dataGaLocation":40},"/solutions/",[111,136,160],{"title":112,"description":113,"link":114,"items":119},"Automation","CI/CD and automation to accelerate deployment",{"config":115},{"icon":116,"href":117,"dataGaName":118,"dataGaLocation":40},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[120,124,128,132],{"text":121,"config":122},"CI/CD",{"href":123,"dataGaLocation":40,"dataGaName":121},"/solutions/continuous-integration/",{"text":125,"config":126},"AI-Assisted Development",{"href":74,"dataGaLocation":40,"dataGaName":127},"AI assisted development",{"text":129,"config":130},"Source Code Management",{"href":131,"dataGaLocation":40,"dataGaName":129},"/solutions/source-code-management/",{"text":133,"config":134},"Automated Software Delivery",{"href":117,"dataGaLocation":40,"dataGaName":135},"Automated software delivery",{"title":137,"description":138,"link":139,"items":144},"Security","Deliver code faster without compromising security",{"config":140},{"href":141,"dataGaName":142,"dataGaLocation":40,"icon":143},"/solutions/security-compliance/","security and 
compliance","ShieldCheckLight",[145,150,155],{"text":146,"config":147},"Application Security Testing",{"href":148,"dataGaName":149,"dataGaLocation":40},"/solutions/application-security-testing/","Application security testing",{"text":151,"config":152},"Software Supply Chain Security",{"href":153,"dataGaLocation":40,"dataGaName":154},"/solutions/supply-chain/","Software supply chain security",{"text":156,"config":157},"Software Compliance",{"href":158,"dataGaName":159,"dataGaLocation":40},"/solutions/software-compliance/","software compliance",{"title":161,"link":162,"items":167},"Measurement",{"config":163},{"icon":164,"href":165,"dataGaName":166,"dataGaLocation":40},"DigitalTransformation","/solutions/visibility-measurement/","visibility and measurement",[168,172,176],{"text":169,"config":170},"Visibility & Measurement",{"href":165,"dataGaLocation":40,"dataGaName":171},"Visibility and Measurement",{"text":173,"config":174},"Value Stream Management",{"href":175,"dataGaLocation":40,"dataGaName":173},"/solutions/value-stream-management/",{"text":177,"config":178},"Analytics & Insights",{"href":179,"dataGaLocation":40,"dataGaName":180},"/solutions/analytics-and-insights/","Analytics and insights",{"title":182,"items":183},"GitLab for",[184,189,194],{"text":185,"config":186},"Enterprise",{"href":187,"dataGaLocation":40,"dataGaName":188},"/enterprise/","enterprise",{"text":190,"config":191},"Small Business",{"href":192,"dataGaLocation":40,"dataGaName":193},"/small-business/","small business",{"text":195,"config":196},"Public Sector",{"href":197,"dataGaLocation":40,"dataGaName":198},"/solutions/public-sector/","public sector",{"text":200,"config":201},"Pricing",{"href":202,"dataGaName":203,"dataGaLocation":40,"dataNavLevelOne":203},"/pricing/","pricing",{"text":205,"config":206,"link":208,"lists":212,"feature":296},"Resources",{"dataNavLevelOne":207},"resources",{"text":209,"config":210},"View all 
resources",{"href":211,"dataGaName":207,"dataGaLocation":40},"/resources/",[213,246,268],{"title":214,"items":215},"Getting started",[216,221,226,231,236,241],{"text":217,"config":218},"Install",{"href":219,"dataGaName":220,"dataGaLocation":40},"/install/","install",{"text":222,"config":223},"Quick start guides",{"href":224,"dataGaName":225,"dataGaLocation":40},"/get-started/","quick setup checklists",{"text":227,"config":228},"Learn",{"href":229,"dataGaLocation":40,"dataGaName":230},"https://university.gitlab.com/","learn",{"text":232,"config":233},"Product documentation",{"href":234,"dataGaName":235,"dataGaLocation":40},"https://docs.gitlab.com/","product documentation",{"text":237,"config":238},"Best practice videos",{"href":239,"dataGaName":240,"dataGaLocation":40},"/getting-started-videos/","best practice videos",{"text":242,"config":243},"Integrations",{"href":244,"dataGaName":245,"dataGaLocation":40},"/integrations/","integrations",{"title":247,"items":248},"Discover",[249,254,258,263],{"text":250,"config":251},"Customer success stories",{"href":252,"dataGaName":253,"dataGaLocation":40},"/customers/","customer success stories",{"text":255,"config":256},"Blog",{"href":257,"dataGaName":5,"dataGaLocation":40},"/blog/",{"text":259,"config":260},"Remote",{"href":261,"dataGaName":262,"dataGaLocation":40},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"text":264,"config":265},"TeamOps",{"href":266,"dataGaName":267,"dataGaLocation":40},"/teamops/","teamops",{"title":269,"items":270},"Connect",[271,276,281,286,291],{"text":272,"config":273},"GitLab 
Services",{"href":274,"dataGaName":275,"dataGaLocation":40},"/services/","services",{"text":277,"config":278},"Community",{"href":279,"dataGaName":280,"dataGaLocation":40},"/community/","community",{"text":282,"config":283},"Forum",{"href":284,"dataGaName":285,"dataGaLocation":40},"https://forum.gitlab.com/","forum",{"text":287,"config":288},"Events",{"href":289,"dataGaName":290,"dataGaLocation":40},"/events/","events",{"text":292,"config":293},"Partners",{"href":294,"dataGaName":295,"dataGaLocation":40},"/partners/","partners",{"backgroundColor":297,"textColor":298,"text":299,"image":300,"link":304},"#2f2a6b","#fff","Insights for the future of software development",{"altText":301,"config":302},"the source promo card",{"src":303},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758208064/dzl0dbift9xdizyelkk4.svg",{"text":305,"config":306},"Read the latest",{"href":307,"dataGaName":308,"dataGaLocation":40},"/the-source/","the source",{"text":310,"config":311,"lists":312},"Company",{"dataNavLevelOne":21},[313],{"items":314},[315,320,326,328,333,338,343,348,353,358,363],{"text":316,"config":317},"About",{"href":318,"dataGaName":319,"dataGaLocation":40},"/company/","about",{"text":321,"config":322,"footerGa":325},"Jobs",{"href":323,"dataGaName":324,"dataGaLocation":40},"/jobs/","jobs",{"dataGaName":324},{"text":287,"config":327},{"href":289,"dataGaName":290,"dataGaLocation":40},{"text":329,"config":330},"Leadership",{"href":331,"dataGaName":332,"dataGaLocation":40},"/company/team/e-group/","leadership",{"text":334,"config":335},"Team",{"href":336,"dataGaName":337,"dataGaLocation":40},"/company/team/","team",{"text":339,"config":340},"Handbook",{"href":341,"dataGaName":342,"dataGaLocation":40},"https://handbook.gitlab.com/","handbook",{"text":344,"config":345},"Investor relations",{"href":346,"dataGaName":347,"dataGaLocation":40},"https://ir.gitlab.com/","investor relations",{"text":349,"config":350},"Trust 
Center",{"href":351,"dataGaName":352,"dataGaLocation":40},"/security/","trust center",{"text":354,"config":355},"AI Transparency Center",{"href":356,"dataGaName":357,"dataGaLocation":40},"/ai-transparency-center/","ai transparency center",{"text":359,"config":360},"Newsletter",{"href":361,"dataGaName":362,"dataGaLocation":40},"/company/contact/","newsletter",{"text":364,"config":365},"Press",{"href":366,"dataGaName":367,"dataGaLocation":40},"/press/","press",{"text":369,"config":370,"lists":371},"Contact us",{"dataNavLevelOne":21},[372],{"items":373},[374,377,382],{"text":47,"config":375},{"href":49,"dataGaName":376,"dataGaLocation":40},"talk to sales",{"text":378,"config":379},"Get help",{"href":380,"dataGaName":381,"dataGaLocation":40},"/support/","get help",{"text":383,"config":384},"Customer portal",{"href":385,"dataGaName":386,"dataGaLocation":40},"https://customers.gitlab.com/customers/sign_in/","customer portal",{"close":388,"login":389,"suggestions":396},"Close",{"text":390,"link":391},"To search repositories and projects, login to",{"text":392,"config":393},"gitlab.com",{"href":54,"dataGaName":394,"dataGaLocation":395},"search login","search",{"text":397,"default":398},"Suggestions",[399,401,405,407,411,415],{"text":69,"config":400},{"href":74,"dataGaName":69,"dataGaLocation":395},{"text":402,"config":403},"Code Suggestions (AI)",{"href":404,"dataGaName":402,"dataGaLocation":395},"/solutions/code-suggestions/",{"text":121,"config":406},{"href":123,"dataGaName":121,"dataGaLocation":395},{"text":408,"config":409},"GitLab on AWS",{"href":410,"dataGaName":408,"dataGaLocation":395},"/partners/technology-partners/aws/",{"text":412,"config":413},"GitLab on Google Cloud",{"href":414,"dataGaName":412,"dataGaLocation":395},"/partners/technology-partners/google-cloud-platform/",{"text":416,"config":417},"Why 
GitLab?",{"href":82,"dataGaName":416,"dataGaLocation":395},{"freeTrial":419,"mobileIcon":424,"desktopIcon":429,"secondaryButton":432},{"text":420,"config":421},"Start free trial",{"href":422,"dataGaName":45,"dataGaLocation":423},"https://gitlab.com/-/trials/new/","nav",{"altText":425,"config":426},"Gitlab Icon",{"src":427,"dataGaName":428,"dataGaLocation":423},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":425,"config":430},{"src":431,"dataGaName":428,"dataGaLocation":423},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":433,"config":434},"Get Started",{"href":435,"dataGaName":436,"dataGaLocation":423},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/compare/gitlab-vs-github/","get started",{"freeTrial":438,"mobileIcon":442,"desktopIcon":444},{"text":439,"config":440},"Learn more about GitLab Duo",{"href":74,"dataGaName":441,"dataGaLocation":423},"gitlab duo",{"altText":425,"config":443},{"src":427,"dataGaName":428,"dataGaLocation":423},{"altText":425,"config":445},{"src":431,"dataGaName":428,"dataGaLocation":423},{"freeTrial":447,"mobileIcon":452,"desktopIcon":454},{"text":448,"config":449},"Back to pricing",{"href":202,"dataGaName":450,"dataGaLocation":423,"icon":451},"back to pricing","GoBack",{"altText":425,"config":453},{"src":427,"dataGaName":428,"dataGaLocation":423},{"altText":425,"config":455},{"src":431,"dataGaName":428,"dataGaLocation":423},"content:shared:en-us:main-navigation.yml","Main Navigation","shared/en-us/main-navigation.yml","shared/en-us/main-navigation",{"_path":461,"_dir":34,"_draft":6,"_partial":6,"_locale":7,"title":462,"button":463,"image":468,"config":472,"_id":474,"_type":26,"_source":28,"_file":475,"_stem":476,"_extension":31},"/shared/en-us/banner","is now in public beta!",{"text":464,"config":465},"Try the 
Beta",{"href":466,"dataGaName":467,"dataGaLocation":40},"/gitlab-duo/agent-platform/","duo banner",{"altText":469,"config":470},"GitLab Duo Agent Platform",{"src":471},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1753720689/somrf9zaunk0xlt7ne4x.svg",{"layout":473},"release","content:shared:en-us:banner.yml","shared/en-us/banner.yml","shared/en-us/banner",{"_path":478,"_dir":34,"_draft":6,"_partial":6,"_locale":7,"data":479,"_id":683,"_type":26,"title":684,"_source":28,"_file":685,"_stem":686,"_extension":31},"/shared/en-us/main-footer",{"text":480,"source":481,"edit":487,"contribute":492,"config":497,"items":502,"minimal":675},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":482,"config":483},"View page source",{"href":484,"dataGaName":485,"dataGaLocation":486},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":488,"config":489},"Edit this page",{"href":490,"dataGaName":491,"dataGaLocation":486},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":493,"config":494},"Please contribute",{"href":495,"dataGaName":496,"dataGaLocation":486},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":498,"facebook":499,"youtube":500,"linkedin":501},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[503,526,582,611,645],{"title":58,"links":504,"subMenu":509},[505],{"text":506,"config":507},"DevSecOps platform",{"href":67,"dataGaName":508,"dataGaLocation":486},"devsecops platform",[510],{"title":200,"links":511},[512,516,521],{"text":513,"config":514},"View plans",{"href":202,"dataGaName":515,"dataGaLocation":486},"view plans",{"text":517,"config":518},"Why 
Premium?",{"href":519,"dataGaName":520,"dataGaLocation":486},"/pricing/premium/","why premium",{"text":522,"config":523},"Why Ultimate?",{"href":524,"dataGaName":525,"dataGaLocation":486},"/pricing/ultimate/","why ultimate",{"title":527,"links":528},"Solutions",[529,534,536,538,543,548,552,555,559,564,566,569,572,577],{"text":530,"config":531},"Digital transformation",{"href":532,"dataGaName":533,"dataGaLocation":486},"/topics/digital-transformation/","digital transformation",{"text":146,"config":535},{"href":148,"dataGaName":146,"dataGaLocation":486},{"text":135,"config":537},{"href":117,"dataGaName":118,"dataGaLocation":486},{"text":539,"config":540},"Agile development",{"href":541,"dataGaName":542,"dataGaLocation":486},"/solutions/agile-delivery/","agile delivery",{"text":544,"config":545},"Cloud transformation",{"href":546,"dataGaName":547,"dataGaLocation":486},"/topics/cloud-native/","cloud transformation",{"text":549,"config":550},"SCM",{"href":131,"dataGaName":551,"dataGaLocation":486},"source code management",{"text":121,"config":553},{"href":123,"dataGaName":554,"dataGaLocation":486},"continuous integration & delivery",{"text":556,"config":557},"Value stream management",{"href":175,"dataGaName":558,"dataGaLocation":486},"value stream management",{"text":560,"config":561},"GitOps",{"href":562,"dataGaName":563,"dataGaLocation":486},"/solutions/gitops/","gitops",{"text":185,"config":565},{"href":187,"dataGaName":188,"dataGaLocation":486},{"text":567,"config":568},"Small business",{"href":192,"dataGaName":193,"dataGaLocation":486},{"text":570,"config":571},"Public sector",{"href":197,"dataGaName":198,"dataGaLocation":486},{"text":573,"config":574},"Education",{"href":575,"dataGaName":576,"dataGaLocation":486},"/solutions/education/","education",{"text":578,"config":579},"Financial services",{"href":580,"dataGaName":581,"dataGaLocation":486},"/solutions/finance/","financial 
services",{"title":205,"links":583},[584,586,588,590,593,595,597,599,601,603,605,607,609],{"text":217,"config":585},{"href":219,"dataGaName":220,"dataGaLocation":486},{"text":222,"config":587},{"href":224,"dataGaName":225,"dataGaLocation":486},{"text":227,"config":589},{"href":229,"dataGaName":230,"dataGaLocation":486},{"text":232,"config":591},{"href":234,"dataGaName":592,"dataGaLocation":486},"docs",{"text":255,"config":594},{"href":257,"dataGaName":5,"dataGaLocation":486},{"text":250,"config":596},{"href":252,"dataGaName":253,"dataGaLocation":486},{"text":259,"config":598},{"href":261,"dataGaName":262,"dataGaLocation":486},{"text":272,"config":600},{"href":274,"dataGaName":275,"dataGaLocation":486},{"text":264,"config":602},{"href":266,"dataGaName":267,"dataGaLocation":486},{"text":277,"config":604},{"href":279,"dataGaName":280,"dataGaLocation":486},{"text":282,"config":606},{"href":284,"dataGaName":285,"dataGaLocation":486},{"text":287,"config":608},{"href":289,"dataGaName":290,"dataGaLocation":486},{"text":292,"config":610},{"href":294,"dataGaName":295,"dataGaLocation":486},{"title":310,"links":612},[613,615,617,619,621,623,625,629,634,636,638,640],{"text":316,"config":614},{"href":318,"dataGaName":21,"dataGaLocation":486},{"text":321,"config":616},{"href":323,"dataGaName":324,"dataGaLocation":486},{"text":329,"config":618},{"href":331,"dataGaName":332,"dataGaLocation":486},{"text":334,"config":620},{"href":336,"dataGaName":337,"dataGaLocation":486},{"text":339,"config":622},{"href":341,"dataGaName":342,"dataGaLocation":486},{"text":344,"config":624},{"href":346,"dataGaName":347,"dataGaLocation":486},{"text":626,"config":627},"Sustainability",{"href":628,"dataGaName":626,"dataGaLocation":486},"/sustainability/",{"text":630,"config":631},"Diversity, inclusion and belonging (DIB)",{"href":632,"dataGaName":633,"dataGaLocation":486},"/diversity-inclusion-belonging/","Diversity, inclusion and 
belonging",{"text":349,"config":635},{"href":351,"dataGaName":352,"dataGaLocation":486},{"text":359,"config":637},{"href":361,"dataGaName":362,"dataGaLocation":486},{"text":364,"config":639},{"href":366,"dataGaName":367,"dataGaLocation":486},{"text":641,"config":642},"Modern Slavery Transparency Statement",{"href":643,"dataGaName":644,"dataGaLocation":486},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"title":646,"links":647},"Contact Us",[648,651,653,655,660,665,670],{"text":649,"config":650},"Contact an expert",{"href":49,"dataGaName":50,"dataGaLocation":486},{"text":378,"config":652},{"href":380,"dataGaName":381,"dataGaLocation":486},{"text":383,"config":654},{"href":385,"dataGaName":386,"dataGaLocation":486},{"text":656,"config":657},"Status",{"href":658,"dataGaName":659,"dataGaLocation":486},"https://status.gitlab.com/","status",{"text":661,"config":662},"Terms of use",{"href":663,"dataGaName":664,"dataGaLocation":486},"/terms/","terms of use",{"text":666,"config":667},"Privacy statement",{"href":668,"dataGaName":669,"dataGaLocation":486},"/privacy/","privacy statement",{"text":671,"config":672},"Cookie preferences",{"dataGaName":673,"dataGaLocation":486,"id":674,"isOneTrustButton":103},"cookie preferences","ot-sdk-btn",{"items":676},[677,679,681],{"text":661,"config":678},{"href":663,"dataGaName":664,"dataGaLocation":486},{"text":666,"config":680},{"href":668,"dataGaName":669,"dataGaLocation":486},{"text":671,"config":682},{"dataGaName":673,"dataGaLocation":486,"id":674,"isOneTrustButton":103},"content:shared:en-us:main-footer.yml","Main 
Footer","shared/en-us/main-footer.yml","shared/en-us/main-footer",[688],{"_path":689,"_dir":690,"_draft":6,"_partial":6,"_locale":7,"content":691,"config":694,"_id":696,"_type":26,"title":697,"_source":28,"_file":698,"_stem":699,"_extension":31},"/en-us/blog/authors/gitlab","authors",{"name":18,"config":692},{"headshot":693,"ctfId":18},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749659488/Blog/Author%20Headshots/gitlab-logo-extra-whitespace.png",{"template":695},"BlogAuthor","content:en-us:blog:authors:gitlab.yml","Gitlab","en-us/blog/authors/gitlab.yml","en-us/blog/authors/gitlab",{"_path":701,"_dir":34,"_draft":6,"_partial":6,"_locale":7,"header":702,"eyebrow":703,"blurb":704,"button":705,"secondaryButton":709,"_id":711,"_type":26,"title":712,"_source":28,"_file":713,"_stem":714,"_extension":31},"/shared/en-us/next-steps","Start shipping better software faster","50%+ of the Fortune 100 trust GitLab","See what your team can do with the intelligent\n\n\nDevSecOps platform.\n",{"text":42,"config":706},{"href":707,"dataGaName":45,"dataGaLocation":708},"https://gitlab.com/-/trial_registrations/new?glm_content=default-saas-trial&glm_source=about.gitlab.com/","feature",{"text":47,"config":710},{"href":49,"dataGaName":50,"dataGaLocation":708},"content:shared:en-us:next-steps.yml","Next Steps","shared/en-us/next-steps.yml","shared/en-us/next-steps",1758326242751]