Resolved
The issue has been fully resolved. All affected components are operating normally again.
Monitoring
After the initial mitigation, about 20 projects remained stuck scheduling onto storage nodes. We have repaired their state, and impact should now be resolved for all users.
Identified
We have mitigated the issue for the vast majority of users. We're continuing to investigate a small rate of branch creation failures.
Identified
We have identified the underlying cause of this incident and are actively working to implement a resolution. Our team is committed to resolving this issue as quickly as possible to minimize any further impact.
Investigating
We are aware of an issue affecting new project and branch creation, as well as waking up idle endpoints that have been inactive for an extended period. We're investigating the scope, root cause, and potential mitigations.