> yet when they just dipped their toes into the Kubernetes world, switching their Docker image registry to run on kube, they quickly discovered glaring omissions like a missing liveness probe and a storm of errors on any version update.
You are right, we had some missteps with the Helm chart that unfortunately neither we nor others caught. Scaling the registry up and down worked perfectly in all of our synthetic tests, so it was not obvious to us that the liveness probe was missing. In hindsight it is quite obvious, but at the time we had 16 other charts to write and some things slipped through. For a number of other services, community and paying users reported issues that we fixed as we went. The registry is one of the components that receives traffic in bursts, so for the majority of users this was probably never an issue, or it was one of those "gremlin" moments.
For GitLab.com, things are quite different. We hold 2PB of data in Docker images alone and there is a continuous flow of traffic. At GitLab.com scale, none of the services we are porting to K8s have the luxury of bursty traffic, so we are careful about how and when we switch traffic over. The good thing is that all the edge cases we found are fixed and now included in the Helm chart releases, so users can really treat the registry on K8s as ready. If you are curious about all the issues we had to cover during this process, see the main registry migration epic: https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/70
It is also worth noting that, over time, the omnibus-gitlab package has benefited greatly from being used to deploy GitLab.com. As we migrate more of the workloads into Kubernetes using the charts, we expect to see some of the same benefits and insights there. See our handbook for more details on how we use dogfooding: https://about.gitlab.com/handbook/engineering/#dogfooding
Every case of dogfooding brings tremendous benefits to the depth and quality of your offering, no doubt about it, and these improvements are very much welcome. My issue is not with the dynamic, but with the current state of affairs, where a wide spectrum of features is presented as ready, yet they are not of as high a quality as GitLab marketing suggests.
(GitLab employee) Thank you so much for the feedback! We agree that a lot of our features aren't as polished as they are presented to be, which is why we hope the Maturity page adds more transparency.