The Promise of Planned Reboots: a Robust & Healthy Server
Rebooting a server is a misunderstood practice, often perceived negatively because planned downtime gets mistaken for instability and unreliability. It is gratifying to watch a server’s count of consecutive running days climb. It is great to see that your chosen cloud host has hardware so dependable that your server’s uptime counter never has to reset to 0.
Consider, however, that this run of constancy could be setting your server up for a situation that it can’t easily recover from.
tl;dr: Your server needs an occasional reboot to stay robust in operation and healthy on the backend.
Why, you ask?
Some software changes made on a server require a reboot to take effect, and many conflicts will not show themselves until after a reboot. If enough time passes between reboots, there’s a good chance that numerous changes have accumulated on the server without ever being applied.
A reboot restarts everything the server runs, from the kernel and system services to your application and network configuration, so all of it comes back up on its current version. A reboot also lets you confirm that a recent change will still take effect at startup. And a little-known fact: a reboot can clear out unnecessary files and free up space. Your server keeps discardable data in its temporary directory (/tmp on Linux), but files that running processes still hold open cannot be fully reclaimed while those processes run. On many distributions, a reboot takes care of that for you.
Some sysadmins argue that instead of rebooting, you can simply restart the services that received changes on the server. This is an effective action in most cases, but not all. Unfortunately, restarting services won’t account for how well everything else in the system works with the now-changed service.
In addition, some services require you to change into a specific directory and run a startup.sh file to start everything properly. While this may be common knowledge for some admins, it’s one more task that’s specific to that service, and one more nuance to remember while testing whether your services start properly. Add this to every other service with its own quirks and things get convoluted. It’s a lot to remember!
The easier, safer solution is to reboot the server and check that all services are running properly.
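As a minimal sketch of that post-reboot check, assuming a systemd-based Linux distribution: the snippet below asks systemctl for units in the failed state and returns their names. The function names are illustrative, not part of any existing tool.

```python
import subprocess


def parse_failed_units(output: str) -> list[str]:
    """Return the unit names found in plain, legend-free systemctl output."""
    units = []
    for line in output.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            units.append(fields[0])  # first column is the unit name
    return units


def failed_units() -> list[str]:
    """Ask systemd which units are currently in the failed state."""
    result = subprocess.run(
        ["systemctl", "list-units", "--state=failed",
         "--plain", "--no-legend", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return parse_failed_units(result.stdout)
```

Calling failed_units() a minute or two after the server comes back up, and treating any non-empty result as something to investigate, gives you a quick first signal. It does not replace checking that your application actually responds.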
Even an amazing system administrator can’t remember every changed configuration and every installed service version after many months have gone by. You should know whether your server can survive a reboot, and the longer the time between reboots, the lower the chances that everything will come back up the way you last remember it.
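One way to jog your memory before a reboot is to list the files that have changed since the server last started. The sketch below assumes a Linux system that exposes /proc/uptime and that the configuration of interest lives under a directory such as /etc; both helper names are hypothetical.

```python
import os
import time


def boot_time(uptime_path: str = "/proc/uptime") -> float:
    """Estimate the boot time as a Unix timestamp from /proc/uptime (Linux)."""
    with open(uptime_path) as f:
        uptime_seconds = float(f.read().split()[0])  # first field: seconds up
    return time.time() - uptime_seconds


def changed_since(root: str, since: float) -> list[str]:
    """List files under root whose modification time is later than since."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > since:
                    changed.append(path)
            except OSError:
                continue  # unreadable file or broken symlink: skip it
    return sorted(changed)
```

For example, changed_since("/etc", boot_time()) returns the files under /etc modified since the last boot. A modification time only shows that a file was touched, not what changed in it, so treat the list as a starting point for review.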
Another thing to keep in mind is that your server needs frequent updates, upgrades and file-system checks to stay current and protected. Because Linux distributions release updates frequently, it’s good practice to run an update daily, or at least weekly. Some updates, kernel updates in particular, only take full effect after a reboot, so regular reboots ensure that patches are actually applied and that the server is protected from recently discovered vulnerabilities. When an update asks for a reboot, I suggest you heed the advice.
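On Debian and Ubuntu systems, packages that need a reboot typically signal it by creating the file /var/run/reboot-required, often alongside /var/run/reboot-required.pkgs, which names the packages responsible. The sketch below assumes that convention; other distributions use different mechanisms, and the function name is illustrative.

```python
from pathlib import Path


def reboot_required(
    flag: Path = Path("/var/run/reboot-required"),
    pkgs: Path = Path("/var/run/reboot-required.pkgs"),
) -> tuple[bool, list[str]]:
    """Report whether a reboot is pending and which packages asked for it.

    Assumes the Debian/Ubuntu flag-file convention.
    """
    if not flag.exists():
        return False, []
    packages: list[str] = []
    if pkgs.exists():
        # De-duplicate and sort the package names, ignoring blank lines.
        names = {line.strip() for line in pkgs.read_text().splitlines()}
        packages = sorted(name for name in names if name)
    return True, packages


if __name__ == "__main__":
    pending, packages = reboot_required()
    if pending:
        print("Reboot required by:", ", ".join(packages) or "(unknown packages)")
    else:
        print("No reboot flag found.")
```

A missing flag file only means that no package has requested a reboot. It does not prove that every running process is using its latest installed version.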
So, how often should you reboot your server?
For a regular cloud server, I recommend about once a week. Of course, depending on setup and maintenance, you could have a stable, working server running for months; an experienced admin should have a good idea of how stable their own setups are. Each server is different and may run without issue for a long time, but for those who are unsure, a weekly reboot isn’t sacrificing much (Linodes go from graceful shutdown to booted in a few seconds).
It’s not uncommon for major companies around the world that offer 24/7 server access to set aside fifteen minutes to three hours of downtime a week for server maintenance. Many of these companies rely on load balancing to handle their traffic, especially during maintenance periods. Load balancing practically eliminates downtime, as one server can remain online while the others reboot.
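The idea of rebooting behind a load balancer can be sketched as a rolling reboot: take one server out of rotation, reboot it, wait until it is healthy, put it back, and only then move on to the next. The four callables below (drain, reboot, is_healthy, restore) are hypothetical placeholders for whatever your load balancer and hosting provider actually offer; this outlines the control flow and is not a ready-made tool.

```python
import time
from typing import Callable, Iterable


def rolling_reboot(
    servers: Iterable[str],
    drain: Callable[[str], None],       # remove the server from rotation
    reboot: Callable[[str], None],      # trigger the reboot
    is_healthy: Callable[[str], bool],  # health check after the reboot
    restore: Callable[[str], None],     # put the server back in rotation
    attempts: int = 30,
    delay: float = 10.0,
) -> list[str]:
    """Reboot servers one at a time, halting if one fails to recover."""
    done = []
    for server in servers:
        drain(server)
        reboot(server)
        for _ in range(attempts):
            if is_healthy(server):
                break
            time.sleep(delay)  # give the server time to come back up
        else:
            # Never healthy: stop so the remaining servers stay online.
            raise RuntimeError(f"{server} did not become healthy; halting")
        restore(server)
        done.append(server)
    return done
```

Halting at the first server that fails to recover is the deliberate choice here: the servers that have not been touched yet keep serving traffic while you investigate.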
Looking at this issue another way: if a single server is so critical to your entire company that taking it offline for reboots or other maintenance is out of the question, you’ve got to ask if you really want to be in such a volatile situation. What happens if that single server goes down for reasons outside your control? Something so important should have a backup plan. And planned reboots can support that backup plan.
I know, rebooting a server can seem risky. And the longer a server has been running since its last reboot, the riskier it can seem. But trust me when I tell you, rebooting is necessary. Get it done, regularly.
If it’s been years since your server’s last reboot, the safer route is to create a new, up-to-date server with all the required services, then move your content from the old server to the new one. This is the strongest recommendation I can make for anyone unsure whether their server can withstand its next reboot.
The idea that a server staying online for several months (or even years) is a badge of honor is obsolete, and it could lead to a server’s irrecoverable downfall.
You want updates. You want every service current. You want a smoothly running, online website with as few backend hassles as possible. So reboot your servers regularly. Please. You’ll be glad you did.
edit: I do want to add that a lot of factors can change what ‘regularly rebooting’ means. Depending on everything from the distribution you use to your own intimate knowledge of your server and how to maintain it, you can go from weeks to some months before the server requires a reboot. No one knows your server better than you.