Got another great question today:
We’re migrating a Swarm Scheduler instance to Replicated’s Kubernetes-Based KOTS App Manager.
The end user would like to stop the Swarm Scheduler processes, install the KOTS version, and then restore all their configuration.
If anything goes wrong, they need the option to roll back the change and go back to running the Swarm Scheduler version.
What are the best practices for maximizing the chances of success for this? What steps should we follow?
This assumes the deployed app has no state stored on the node in question.
This will help to migrate internal Replicated state like configuration options, but all app state is expected to live in external databases configured via customer-provided settings.
For applications that store state in embedded databases, you will need to build a plan to migrate that manually as well.
- Familiarize yourself with the methods for collecting a support bundle and sharing support bundles with the Replicated team. Ensure your team has given you access to Replicated - Vendor, and access to the shared private GitHub repo used to track support requests.
- Familiarize yourself with the process for submitting a support issue at Replicated - Vendor.
- Take a snapshot of the instance in case you need to restore to a new instance.
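As a sketch, a snapshot can be started from the command line with the `replicatedctl` CLI that ships with the Swarm/native scheduler (assuming it is on the PATH):

```shell
# Start an on-demand snapshot of the Replicated instance.
replicatedctl snapshot save

# List snapshots and watch for completion before moving on.
replicatedctl snapshot ls
```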
- This guide assumes no application state, and that all state is stored externally.
If you skipped the “Before you start” steps, go back and do them, including taking a snapshot of the running application.
If you’re using direct-to-disk snapshots, it might be worth backing up the snapshots directory on a separate server.
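If snapshots are written direct-to-disk, copying them off-box is a one-liner. The path and host below are placeholders; substitute your configured snapshot directory and backup server:

```shell
# Copy the local snapshot directory to a separate backup server.
# The source path and backup-host are placeholder values.
rsync -avz /var/lib/replicated/snapshots/ backup-user@backup-host:/backups/replicated-snapshots/
```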
Use the `replicatedctl` CLI to export the application’s configuration options, and store the export in a safe place. See the docs. Run `replicatedctl app-config export --hidden` to export all configuration including passwords, or `replicatedctl app-config export` (without `--hidden`) to export only non-password items.
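For example, the full export (including password fields) can be written straight to a dated file in your working directory; the filename here is just a suggestion:

```shell
# Export all configuration, including hidden/password fields, to a dated JSON file.
replicatedctl app-config export --hidden > "app-config-$(date +%Y%m%d).json"
```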
For Swarm Scheduler apps, these will be on-disk at $LOCATION, or you can re-provision these from wherever you normally get your certs.
In either case, drop them in a working directory on the server as tls.crt and tls.key (or whatever filenames you prefer).
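A minimal sketch of staging the cert and key, assuming they are already on the server (the source paths are placeholders):

```shell
# Collect the TLS assets into one working directory for the KOTS install.
mkdir -p ~/kots-migration
cp /path/to/your/cert.pem ~/kots-migration/tls.crt   # placeholder source path
cp /path/to/your/key.pem  ~/kots-migration/tls.key   # placeholder source path
chmod 600 ~/kots-migration/tls.key                   # keep the private key locked down
```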
Stop the application via the Replicated UI.
Use the relevant init command to stop the Replicated processes. For example, on systemd servers, run the following as root or with sudo:

systemctl stop replicated && systemctl disable replicated
systemctl stop replicated-ui && systemctl disable replicated-ui
systemctl stop replicated-operator && systemctl disable replicated-operator
To Do – are there other containers that have to be stopped? Are there any that should be stopped as a precaution?
To Do – as long as the app is stopped, can the replicated containers be left up and running during this process, and only stopped on success?
Run the Kubernetes installer command, something like:
curl https://k8s.kurl.sh/app-name | sudo bash
Upload your certs, upload a license, and use your exported config options to fill the new config screen.
Check to make sure everything is working as expected.
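A quick sanity pass, assuming kubectl was configured by the kURL installer, might look like this (the `kotsadm` deployment name and `default` namespace reflect a typical kURL install, so verify them against your cluster):

```shell
# All pods (app workloads plus the admin console) should be Running or Completed.
kubectl get pods --all-namespaces

# Check the KOTS admin console deployment specifically.
kubectl get deploy kotsadm -n default
```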
To do – is there a safe way to do this? The kURL uninstall script will likely also try to rip out Docker and friends, which might mean you have to re-run the Replicated Swarm install script to reinstall them – but will it pick up the previous data dirs?
Proposal (needs sudo):

kubeadm reset
systemctl start replicated && systemctl enable replicated
systemctl start replicated-ui && systemctl enable replicated-ui
systemctl start replicated-operator && systemctl enable replicated-operator
# start the app in the Replicated UI
# if anything goes wrong, restart the server and repeat