Another difference between readiness and liveness probes is how you want each to behave under high load. Under load, probes may fail more often because they time out. If your liveness probe is very sensitive (i.e. a short timeout and/or a small failure threshold), Kubernetes may end up killing your containers at the worst possible moment, during peak load.
The practice that works best for us is to make the liveness probe quite tolerant of probe failures: for example, a timeout twice as long as the response time our app is designed for, and a failure threshold of 5 or 10. This lets an app under load miss a few probes without being killed, so it can keep processing user requests.
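As a rough illustration, a tolerant liveness probe might look like the fragment below. This is only a sketch: the `/healthz` path, port `8080`, and the 1-second design timeout are assumptions made for the example, not values from our setup.

```yaml
# Fragment of a container spec (goes under spec.containers[] in a Pod template).
# Assumption: the app is designed to answer health checks within 1 second.
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical health endpoint
    port: 8080            # hypothetical container port
  periodSeconds: 10       # probe every 10 seconds (the Kubernetes default)
  timeoutSeconds: 2       # twice the assumed 1-second design timeout
  failureThreshold: 5     # restart only after 5 consecutive failures
```

With these numbers the container has to fail its probes for roughly 50 seconds in a row before Kubernetes restarts it, which leaves room for a short spike in load.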
Then we set up a readiness probe that checks whether the app responds as designed, i.e. within the design timeout and with a failure threshold of 2 or even 1. When a pod is under load heavy enough to degrade its service, Kubernetes stops sending new traffic to it. This lets the pod work through its backlog and then return to the pool of available pods, so the pods effectively balance the load among themselves.
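A strict readiness probe along these lines could be sketched as follows. As before, the `/healthz` path, port `8080`, and the 1-second design timeout are assumptions chosen for illustration.

```yaml
# Fragment of a container spec (goes under spec.containers[] in a Pod template).
# Assumption: the app is designed to answer health checks within 1 second.
readinessProbe:
  httpGet:
    path: /healthz        # hypothetical health endpoint
    port: 8080            # hypothetical container port
  periodSeconds: 10       # probe every 10 seconds (the Kubernetes default)
  timeoutSeconds: 1       # the assumed design timeout, with no extra slack
  failureThreshold: 2     # mark the pod not ready after 2 consecutive failures
  successThreshold: 1     # one passing probe puts the pod back in rotation
```

A failing readiness probe never restarts the container. It only takes the pod out of the Service endpoints until the probe passes again, which is why it can afford to be much stricter than the liveness probe.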