It’s probably best to start with a question: what is a data centre? The Oxford English Dictionary defines it as:
(noun) a large group of networked computer servers typically used by organisations for the remote storage, processing or distribution of large amounts of data.
A data centre is a room (or building) that is part of an organisation’s IT / Operational infrastructure. It is often critical to the reliable operation of the business and to the mission-critical data and applications hosted and operated within it. This room should be cooled by precision air handling equipment and have UPS-backed power supplies, dual-fed data cabinets or racks, and either hot or cold aisle containment. It should also be a ‘clean environment’, with some form of fire detection and/or suppression and a monitoring system to ensure it is all working.
A generator is a handy thing to have as well to protect against power cuts, but not all organisations take their resilience to this level. Seems like a lot? The question you should be asking is what the cost or impact to the business is if it stops working for an hour, a day or longer. The reality is a data centre runs behind most businesses and should be considered the essential heartbeat of the business. The commercial losses resulting from a data centre outage can be extreme, but the reputational damage can be even worse.
In this series, we will discuss some aspects of data centre maintenance, why they are important and who should be doing them. To ensure a data centre is running at maximum efficiency, a regular maintenance regime is essential. This is what the industry refers to as Planned Preventative Maintenance, or PPM for short.
In the first part of this series, we will cover what should be included in a good, well-managed PPM Schedule.
Data Centre PPM Schedule
- Air-conditioning / AHU / CRAC or CRAH – These should be serviced quarterly, or in some cases less often. The right interval depends heavily on the type of cooling you have, the environment around the external plant and the age of the installed infrastructure.
- Uninterruptible Power Supplies / UPS – In most cases these are serviced annually, but some manufacturers will service twice a year. Additional care is required between years 5 and 7, when batteries, fans and capacitors are likely to need replacing.
- Generators – These are serviced twice a year, and the services are split into major and minor. Oil sampling and Fuel sampling are often conducted during the major service. There is a lot more that should be done when it comes to generator care, but we will discuss that later in the series.
- BMS / EMS / Monitoring Systems – This is less straightforward, as a multitude of products fall into this category. Following the manufacturer’s requirements for the specific product is the best approach. In most cases, these systems can be tested once a year by a qualified engineer and will remain functional and optimised until the next service.
- Fire Suppression / Detection / VESDA – System dependent but a minimum requirement is twice a year, with some systems having to be maintained four times a year. In year 10, any gas bottles must be hydrostatically tested. This is a legal requirement, so don’t forget.
- Electrical Testing / Switchgear / PDU – This is an area that makes all data centre owners/operators twitchy. Every 5 years the data centre must be Periodically Tested, which is a legal requirement. In the interim years, annual electrical inspections, thermal imaging and branch circuit testing are sufficient and considered good practice. Electrical testing is an area we will cover later in the series.
- Emergency Lighting – All emergency lighting ‘should’ be tested once a month – a short functional test should be sufficient to confirm that each luminaire (a light, to you and me) operates correctly. It is worth checking who in the business looks after this, as it is often incorporated into the wider building testing regime.
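If you track your PPM dates in a spreadsheet or script, the frequencies above can be captured in a few lines. The Python below is only a rough sketch, not a definitive tool: the system labels and the function names (add_months, next_due, overdue) are made up for illustration, and the intervals are the minimum or typical figures from the list, which your manufacturer or site conditions may well override. The 120-month entry for gas bottles assumes you count from the installation date.

```python
import calendar
from datetime import date

# Service intervals in months, taken from the schedule above.
# The keys are hypothetical labels chosen for this sketch.
PPM_INTERVAL_MONTHS = {
    "cooling": 3,                        # quarterly (sometimes less often)
    "ups": 12,                           # annually in most cases
    "generator": 6,                      # twice a year (major / minor)
    "bms_monitoring": 12,                # once a year in most cases
    "fire_suppression": 6,               # minimum of twice a year
    "electrical_inspection": 12,         # annual inspection / thermal imaging
    "electrical_periodic_test": 60,      # every 5 years
    "gas_bottle_hydrostatic_test": 120,  # year 10 (assumed: from install date)
    "emergency_lighting": 1,             # monthly functional test
}


def add_months(start: date, months: int) -> date:
    """Return the date `months` after `start`, clamped to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def next_due(system: str, last_done: date) -> date:
    """Date the next service falls due for one system."""
    return add_months(last_done, PPM_INTERVAL_MONTHS[system])


def overdue(last_done: dict[str, date], today: date) -> list[str]:
    """Systems whose next service date has already passed, sorted by name."""
    return sorted(s for s, d in last_done.items() if next_due(s, d) < today)
```

For example, next_due("cooling", date(2024, 11, 15)) gives 15 February 2025, and passing your last service dates to overdue() lists anything that has slipped past its due date.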
So that’s it for the first part of the series. I hope you have enjoyed it and I haven’t bored you. Time to get back to work, or start work, or finish lunch, or whatever you were doing before reading this.
My name is Richard Stacey and I am the Director of Operational Infrastructure at Future-tech. I’ve been working in data centres for over 10 years and I’m an accredited Uptime AOS Specialist. Please feel free to direct message me if you want more information or a chat about your data centre.
Next in the series: The Importance of Data Centre Maintenance