About login issues on 2019-10-08
What happened
From 6:11AM to 8:05AM users could not log in to the platform due to authentication issues.
A quick fix was implemented, but we experienced issues when trying to deploy it. The issues were related to:
- The cluster losing one of its nodes and thus not having enough compute power to run the deployment pipeline.
- Integrates’s previous deployment being stuck due to hard limit policies.
What we have done
We have:
- Re-deployed Integrates in our cluster with the proper fix to the specific issue.
- Increased the number of nodes in our Kubernetes cluster to improve performance.
- Improved Integrates’s deployment rules to avoid future deployment errors.
What is the impact
Login attempts to Integrates failed from 6:11AM to 8:05AM, and users received an error message saying that they were not authorized to access the platform. At most 38 users were affected.
What we are doing to help
We are improving our cluster’s ability to recover from undesired states.