How to Set Up a Server

Things to do and things to avoid

Matteo Anselmi
Growens Innovation Blog
Feb 12, 2020

--

Hello everyone!

Matteo Anselmi here, Infrastructure Engineer at MailUp Group, working in the IT sector since 2013.

Over these years I’ve set up countless servers, learning how to make them more reliable and how to give them a long life span with little or no maintenance.

I thought: “why not share what I’ve learned to make life easier for fellow IT people?”

So… here are my do’s and don’ts when setting up a new server. Let’s dive into it!

Things to do

Always do a complete inspection if you are recycling old hardware

Ok! This sounds obvious, huh? Well… sometimes obvious things are the most important and the easiest to forget! If you are recycling old hardware, or fairly new hardware that has been in service for a reasonable amount of time (about 1–2 years), the first thing you should do is inspect every single part of it: check the condition of the motherboard, power supplies, RAM sockets, RAID controller, disk slots, IPMI/BMC, etc. After that, remove the CPU heatsinks and check the condition of the CPUs and CPU sockets (if you have a server with LGA sockets, DOUBLE CHECK that none of the pins are bent).

Use the correct HDD caddies

This is a lesson I had to learn the hard way. Once I tried to squeeze an SSD into a slot with an HP caddy that wasn’t set up correctly (I added the “anti-short circuit metal screen” thinking that it would fit like a normal 2.5” SAS HDD). It got stuck in the slot, and I completely destroyed the caddy while extracting it. This is exactly what you SHOULDN’T do. You’ll thank me later!

Respect the hardware compatibility matrix of your vendor

Although some motherboards and RAID controllers are pretty flexible in terms of compatibility, respect the hardware compatibility matrix if you want to avoid any kind of “mysterious” problem in the near future (I have been bitten by this mostly with RAM sticks).
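As a rough illustration, a check like this can be scripted once you have copied the vendor’s matrix into a dictionary by hand. This is only a sketch: the server model, the part numbers and the function name below are hypothetical placeholders, not real vendor data.

```python
# Hypothetical compatibility matrix, transcribed by hand from the vendor's
# documentation. Every model name and part number here is made up.
COMPATIBILITY_MATRIX = {
    "example-server-gen9": {
        "ram": {"RAM-PN-001", "RAM-PN-002"},
        "raid_controller": {"RAID-PN-100"},
    },
}


def find_unsupported_parts(server_model, installed_parts):
    """Return the (category, part_number) pairs not listed for this server.

    installed_parts is a list of (category, part_number) tuples.
    A KeyError is raised if the server model is not in the matrix at all.
    """
    supported = COMPATIBILITY_MATRIX[server_model]
    return [
        (category, part)
        for category, part in installed_parts
        if part not in supported.get(category, set())
    ]


if __name__ == "__main__":
    parts = [("ram", "RAM-PN-001"), ("ram", "RAM-PN-999")]
    print(find_unsupported_parts("example-server-gen9", parts))
```

Anything the function returns is a part to swap out before the server goes into the rack.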

When possible, clean dust and dirt off the servers

Unfortunately, servers cannot be turned on and off like an ordinary PC. When you turn them off, 99% of the time you are compromising one or more parts of your infrastructure. My advice is to clean dust and dirt off the servers with compressed air and alcohol whenever you have a scheduled maintenance window or are re-purposing the server.

If you have time (and patience) perform a PSU and RAID Controller crash test

Even though a visual inspection may be sufficient, if you want to be 100% sure that your server will work reliably, perform a PSU and RAID Controller crash test:

  • Plug every PSU into the power outlet. With the server on, remove the plug from one PSU, reinsert it and repeat this procedure one more time. This is useful for testing the power delivery circuits and the fail-over mechanism.
  • Create a RAID 1 or RAID 5 array with your HDDs, then try taking a disk out and putting it back. Check whether your RAID Controller rebuilds the logical volume correctly.

These simple tests can prevent you from putting an unstable server online (not to mention the potential economic loss!)
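If you like keeping a record, the two tests above can be turned into a simple pass/fail check on notes you take by hand. This is just a sketch under a few assumptions: every PSU gets pulled twice, the PSU labels are hypothetical, and the RAID state names are placeholders for whatever your controller actually reports.

```python
from dataclasses import dataclass


@dataclass
class PsuPull:
    """One unplug/replug of a PSU, as noted by hand during the test."""
    psu: str                # label on the chassis, e.g. "PSU1" (hypothetical)
    server_stayed_on: bool


def psu_test_passed(pulls, psu_labels, pulls_per_psu=2):
    """Pass only if each PSU was pulled the required number of times
    and the server stayed on every single time."""
    for label in psu_labels:
        mine = [p for p in pulls if p.psu == label]
        if len(mine) < pulls_per_psu:
            return False
        if not all(p.server_stayed_on for p in mine):
            return False
    return True


def raid_test_passed(states_after_reinsert, healthy_state="optimal"):
    """Pass if the last state noted after reinserting the disk is healthy.

    The state names are assumptions; use whatever your controller reports.
    """
    return bool(states_after_reinsert) and states_after_reinsert[-1] == healthy_state


if __name__ == "__main__":
    pulls = [PsuPull("PSU1", True), PsuPull("PSU1", True),
             PsuPull("PSU2", True), PsuPull("PSU2", True)]
    print(psu_test_passed(pulls, ["PSU1", "PSU2"]))
    print(raid_test_passed(["degraded", "rebuilding", "optimal"]))
```

A server goes online only when both functions return True.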

Things to avoid

Never change your CPU thermal paste

Don’t you dare! If you are re-purposing old hardware ALWAYS change your thermal paste! Temperature is your n°1 CPU killer, along with dust.

Old thermal paste costs you several degrees of cooling capacity and, as a result, puts more stress on the CPU.

Given that many old servers constantly run at a core temperature of 60°C, it’s easy to understand why they sometimes burn out and take the whole chassis down with them.

Needless to say, buy new thermal paste! You needn’t spend tons of money on a tube of Grizzly Aerocool. You can use, for example, a mid-range paste like the Arctic MX or an OEM paste.

It’s up to you whether to spend 10 euros today or 2000 euros tomorrow on a new CPU (and possibly a new server). When a CPU dies from thermal stress, most of the time the motherboard succumbs too.
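To spot the servers that need new paste first, you can flag the hot cores from temperature readings you have already collected. A minimal sketch, assuming the readings come from whatever monitoring you use (for example the BMC) and treating the 60°C figure above as an illustrative threshold, not a vendor limit. The function and core names are hypothetical.

```python
def hot_cores(core_temps_c, threshold_c=60.0):
    """Return {core_name: temperature} for cores at or above the threshold.

    core_temps_c maps a core name to its temperature in degrees Celsius.
    The default threshold is only illustrative; check your CPU's datasheet.
    """
    return {
        core: temp
        for core, temp in core_temps_c.items()
        if temp >= threshold_c
    }


if __name__ == "__main__":
    # Hypothetical readings, typed in by hand for the example.
    readings = {"cpu0-core0": 48.0, "cpu0-core1": 63.5, "cpu1-core0": 60.0}
    print(hot_cores(readings))
```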

Mix RDIMM and UDIMM RAM Modules

Disclaimer: your RAM modules are 100% fine (you bet!)

The server is doing POST as usual… what if it freezes up out of the blue? What if it turns off after the OS boot?

Congrats, you’ve combined RDIMM and UDIMM modules just like me.

The memory controller can’t support this mixed setup.

I strongly advise you to use only RDIMM modules because they give you more system stability. Sometimes they also let the controller support more RAM capacity (many controllers are designed to perform best with RDIMM-only setups).
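Before you power anything on, a quick check of your DIMM inventory catches this mistake. A minimal sketch, assuming you have written down the type printed on each module’s label. The function name and the sample inventory are hypothetical.

```python
def mixed_dimm_types(dimm_types):
    """Return the sorted distinct DIMM types if more than one is present.

    dimm_types is a list of strings read off the module labels,
    e.g. ["RDIMM", "RDIMM", "UDIMM"]. An empty list means the set is uniform.
    """
    kinds = sorted({t.strip().upper() for t in dimm_types})
    return kinds if len(kinds) > 1 else []


if __name__ == "__main__":
    inventory = ["RDIMM", "RDIMM", "UDIMM", "RDIMM"]
    mixed = mixed_dimm_types(inventory)
    if mixed:
        print("Do not install, mixed module types:", ", ".join(mixed))
    else:
        print("All modules are the same type.")
```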

Don’t secure the server rack rails with screws

You would be sitting on a time bomb: sometimes you have to install a fully loaded server that can weigh more than 40 kilograms. Screws in the server rack rails are an additional layer of protection against a catastrophic server fall (or even worse). If the server is installed in the upper part of the rack, a fall can completely destroy the hardware installed below it. Usually network switches and routers are located in the upper part of the rack, so… without a few screws you can knock an entire datacenter housing offline!

Mix CPU models that belong to the same family

Well… this is not a serious mistake. If you mix them, in the worst-case scenario, the server won’t even power on.

Install a fully loaded server in the upper part of an empty rack

Once I received a rack with three fully loaded servers installed in the upper part of it (and nothing else in it). Every time I touched them, even slightly, the rack started to wobble as if it were going to fall over.

The first thing that I did was to put some objects on the base to make it stable. Then I reinstalled the servers in the lower part of the rack.
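The wobble is easy to explain with a bit of arithmetic: the higher the heavy servers sit, the higher the rack’s center of gravity. Here is a rough sketch that ignores the weight of the rack frame itself. The 40 kg figure comes from the rails section above, while the 2U server height and the rack positions are assumptions made for the example.

```python
def center_of_gravity_u(devices):
    """Height of the combined center of mass, in rack units from the bottom.

    devices is a list of (weight_kg, bottom_u, height_u) tuples, where
    bottom_u is the lowest rack unit the device occupies (numbered from 1).
    The rack frame itself is ignored, so this is only a rough estimate.
    """
    total_weight = sum(weight for weight, _, _ in devices)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted_heights = sum(
        weight * (bottom_u - 1 + height_u / 2)
        for weight, bottom_u, height_u in devices
    )
    return weighted_heights / total_weight


if __name__ == "__main__":
    # Three assumed 2U, 40 kg servers: top of a 42U rack vs. bottom of it.
    top = [(40, 37, 2), (40, 39, 2), (40, 41, 2)]
    bottom = [(40, 1, 2), (40, 3, 2), (40, 5, 2)]
    print(center_of_gravity_u(top))     # 39.0
    print(center_of_gravity_u(bottom))  # 3.0
```

With these numbers the center of gravity drops from 39U to 3U just by moving the same three servers to the bottom of the rack.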

Remember that nothing is more important than your life: be safe while handling servers!

That’s all, folks! See you in the next article!
