This article describes our Host Server Down (HSD) process.
What is an HSD event?
Virtualized Cloud Servers, Block Storage, and Databases reside on physical servers called hosts (sometimes called hypervisors). SpinUp is responsible for managing the underlying hardware (the host), and you are responsible for managing everything relating to your virtual Cloud Server, Block Storage, and/or Database.
Host hardware, like all technology, eventually fails or needs maintenance. When a host goes offline because of a problem, all virtual devices on that host go offline with it, and the downtime impacts you unless you’ve made the appropriate preparations.
When a host or hardware failure impacts you, SpinUp sends an email to the primary contact on the account to inform you about the HSD event. By that time, engineers are already aware of the problem and are actively working to resolve it. When the host comes back online, we notify you with a final update.
Actions to take after HSD events
We strongly recommend that you don't make any changes to your affected resources during an ongoing HSD event.
When you receive a notification that a host has been recovered, log in to the affected Cloud Server, or access the affected volume or database, and verify that it is operating as you expect.
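As a starting point for that verification, the following minimal Python sketch checks whether the TCP ports your services listen on accept connections again. The function names are illustrative and not part of any SpinUp tooling. A reachable port only shows that something is listening, so you should still log in and confirm that the application itself behaves correctly.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, ports: list[int]) -> dict[int, bool]:
    """Map each port to whether it currently accepts TCP connections."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, calling check_services with your Cloud Server's public IP address and the list [22, 80, 443] reports whether SSH, HTTP, and HTTPS are reachable. Substitute the ports your own services use.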
Remember that any configuration change that was not set to persist through a reboot is lost and must be re-applied. For example, if you created a rule in the Linux iptables configuration of a Cloud Server but did not save it persistently, you need to re-create the rule and then save it so that it survives the next reboot.
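To catch unsaved iptables rules before the next reboot, you can compare the running rule set with the saved one. The sketch below is one possible approach, not an official tool: it takes two dumps in iptables-save format and lists the rules that exist only in the running dump. The function names are illustrative.

```python
def rules_by_table(dump: str) -> set[tuple[str, str]]:
    """Extract (table, rule) pairs from text in iptables-save format."""
    rules = set()
    table = ""
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith("*"):
            # Lines such as "*filter" open a new table section.
            table = line[1:]
        elif line.startswith("-A "):
            # Only appended rules are compared; chain policies and
            # packet counters are ignored on purpose.
            rules.add((table, line))
    return rules


def unsaved_rules(running_dump: str, saved_dump: str) -> list[tuple[str, str]]:
    """Return rules present in the running dump but missing from the saved one."""
    return sorted(rules_by_table(running_dump) - rules_by_table(saved_dump))
```

You can obtain the running dump from the output of the iptables-save command, run as root. The location of the saved rules depends on your distribution and is an assumption here: /etc/iptables/rules.v4 is common on Debian and Ubuntu with the iptables-persistent package, and /etc/sysconfig/iptables is common on RHEL-based systems. An empty result means every running rule is also in the saved file.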
How to avoid future downtime
We recommend that you set up your infrastructure so that there is no single point of failure.
For example, use a Load Balancer with Cloud Servers to maintain continuity of your service if a host goes down. Even in a single Cloud Server environment, a Load Balancer lets you keep the same public IP address while you clone a replacement Cloud Server from a snapshot. To accomplish this with minimal downtime, we also recommend that you set up scheduled snapshots and backups, and test both periodically to ensure that you can quickly clone production Cloud Servers or restore data from backups to a new Cloud Server.
For Cloud Databases, use highly available instances, which feature automated failover and replication. You can also configure your Cloud Server to collect scheduled MySQL® dumps from the database as a form of data backup.
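A scheduled dump can be as simple as a small script run from cron on your Cloud Server. The following Python sketch wraps the mysqldump client. The host name, database name, file paths, and function names are placeholders chosen for illustration, and the script assumes that mysqldump is installed and that the credentials are stored in a MySQL option file readable only by the backup user.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_dump_command(host: str, database: str,
                       defaults_file: Path, out_file: Path) -> list[str]:
    """Build the mysqldump argument list without running it."""
    return [
        "mysqldump",
        # mysqldump requires this option to come first. The option file
        # holds the user name and password, keeping them off the command line.
        f"--defaults-extra-file={defaults_file}",
        f"--host={host}",
        # Takes a consistent snapshot of InnoDB tables without locking them.
        "--single-transaction",
        f"--result-file={out_file}",
        database,
    ]


def dump_database(host: str, database: str,
                  defaults_file: Path, backup_dir: Path) -> Path:
    """Write a timestamped SQL dump into backup_dir and return its path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_file = backup_dir / f"{database}-{stamp}.sql"
    # check=True raises CalledProcessError if mysqldump exits with an error.
    subprocess.run(
        build_dump_command(host, database, defaults_file, out_file),
        check=True,
    )
    return out_file
```

Store the resulting files somewhere other than the Cloud Server itself, and restore one periodically to a test database to confirm that the dumps are usable.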
With Block Storage, take regular snapshots of your devices, or use file-level backups, to ensure that you can recreate or recover data that is affected by an infrastructure incident.
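For file-level backups, one simple option is a compressed archive of the directory where the volume is mounted. The Python sketch below uses only the standard library. The function names are illustrative, and the mount point and backup destination you pass in are your own choices. It also includes a basic readability check, because an archive that cannot be opened is of no use during a recovery.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def archive_directory(source: Path, backup_dir: Path) -> Path:
    """Create a timestamped .tar.gz archive of source inside backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = backup_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths in the archive relative to the source folder.
        tar.add(str(source), arcname=source.name)
    return archive


def archive_is_readable(archive: Path) -> bool:
    """Return True if the archive opens and lists at least one member."""
    try:
        with tarfile.open(archive, "r:gz") as tar:
            return len(tar.getmembers()) > 0
    except (tarfile.TarError, OSError, EOFError):
        return False
```

Write the archive to a location outside the volume being backed up, such as another volume or remote storage, so that a single incident cannot affect both the data and its backup.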