We all know the feeling of losing unsaved work during a sudden power outage. Data can often be recovered, but businesses cannot afford the downtime. This is even more true for banks and hospitals, where even a major failure should not interrupt the continuity of work. It is best to find and strengthen the weakest links before one of them becomes the pebble that starts an avalanche. In short, it is time to look for single points of failure (SPOF).
As with backup and data recovery, we are only as safe as the copies we have. A single point of failure in a company’s infrastructure is nothing more than a lack of redundancy, i.e. of proper replication or diversification of devices, suppliers or software. Properly designed redundancy makes it possible to build a high availability (HA) infrastructure, which is invaluable when responding to a power failure, loss or destruction of equipment, data loss and other random events.
The same threats apply to large business centers and small businesses alike. Of course, the options for preventing a failure and reacting to it differ with the size of the organization (from the number of redundant devices and connections to a full Disaster Recovery setup). Nevertheless, every organization should take remedial measures according to its capabilities.
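To make the effect of redundancy concrete, here is a minimal sketch of the standard availability arithmetic. It assumes that failures are independent (which does not hold if, say, both devices share one power feed), and the 99% figure is a hypothetical value chosen only for illustration.

```python
def series_availability(availabilities):
    """Availability of a chain in which every component must work."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel_availability(availability, copies):
    """Availability of `copies` identical, independent redundant units,
    where one working unit is enough to keep the service up."""
    return 1.0 - (1.0 - availability) ** copies


# Hypothetical figure: a single switch that is up 99% of the time.
single_switch = 0.99
print(parallel_availability(single_switch, 1))  # one switch: about 0.99
print(parallel_availability(single_switch, 2))  # redundant pair: about 0.9999
# Two non-redundant devices in a row are weaker than either one alone.
print(series_availability([single_switch, single_switch]))  # about 0.98
```

Under these assumptions, a second unit turns roughly 3.7 days of expected downtime per year into under an hour, while chaining non-redundant devices makes the whole path less available.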
A SPOF can affect any element of infrastructure (as seen in the graphic below), so let’s look at each layer – from PCs to virtual machines, network infrastructure, and even physical factors. Sometimes, a small thing can threaten a company’s existence.
Redundant architecture without SPOF
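One way to search for SPOFs layer by layer is to model the infrastructure as a graph and check which single element, when removed, cuts users off from what they need. The sketch below is only an illustration: the topology and all node names are hypothetical, and a real analysis would also have to cover power, software and suppliers.

```python
from collections import deque


def reachable(graph, start, goal, removed):
    """Breadth-first search from start to goal, ignoring the `removed` node."""
    if removed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def find_spofs(graph, start, goal):
    """Return nodes (other than start and goal) whose failure alone
    disconnects start from goal."""
    candidates = set(graph) - {start, goal}
    return sorted(n for n in candidates if not reachable(graph, start, goal, n))


# Hypothetical topology; each link is listed in both directions.
topology = {
    "office_lan": ["switch_a", "switch_b"],
    "switch_a": ["office_lan", "firewall"],
    "switch_b": ["office_lan", "firewall"],
    "firewall": ["switch_a", "switch_b", "isp_1"],
    "isp_1": ["firewall", "internet"],
    "internet": ["isp_1"],
}

print(find_spofs(topology, "office_lan", "internet"))  # ['firewall', 'isp_1']
```

In this example the two switches back each other up, so neither is reported, while the single firewall and the single internet provider are.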
Employee computers are among the devices most vulnerable to damage or loss in any organization. A user spills a drink on a laptop, or the laptop is stolen from a car, and sometimes weeks of work are lost. Without that computer, we cannot start a machine needed in a process or send an important order, and the dominoes start to fall.
Here we need to protect ourselves on two levels, the system and the data, but fortunately this does not require a large financial outlay or a lot of work. The idea is to create additional storage space from which, if a computer is lost, we can restore both the system and the contents of the disks.
Such space is of little use if it stays empty, so it helps to set up automatic synchronization or to make employees aware that files need to be saved in the appropriate place.
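As a rough illustration of automatic synchronization, the sketch below copies new or changed files from a working folder to a backup location. The function name and folder layout are assumptions made for this example. In practice a dedicated backup or sync tool would normally do this job, and the backup folder must not sit inside the source folder.

```python
import shutil
from pathlib import Path


def sync_to_backup(source_dir, backup_dir):
    """Copy new or modified files from source_dir into backup_dir.

    Returns the relative paths of the files that were copied.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dst_file = backup / relative
        # Copy when the backup is missing or older than the working file.
        if (not dst_file.exists()
                or src_file.stat().st_mtime > dst_file.stat().st_mtime):
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)  # copy2 also keeps timestamps
            copied.append(relative.as_posix())
    return sorted(copied)
```

Run on a schedule (for example by the operating system's task scheduler), a routine like this keeps the backup space from staying empty without relying on anyone's memory.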
Being virtual does not mean something cannot fail, which is why replication also applies to virtual machines. For example, if a virtual Wi-Fi controller has no redundancy, its failure can affect the entire network and cause problems in many places at once. The solution is not only virtualization, but also hyperconvergence. When configuring the software, we need to choose appropriate algorithms that keep the virtualized infrastructure running without interruption.
Preventing SPOF in the virtualization layer
Switches, firewalls, servers, Wi-Fi controllers: it is good to have more than one of each, because with the right configuration we can combine them into clusters, i.e. groups of devices that perform the same function. When one of them stops working, the others take over its role.
Every device, not just a network device, can be made redundant in some way. When buying a server or choosing a storage array, it is worth checking whether it has, for example, two power supplies, or whether it supports hot-swap, i.e. replacing a failed component without turning off the device.
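The takeover behaviour of a cluster can be sketched in a few lines. This is a simplified active/standby model, not how any particular vendor implements clustering: the `Node` class and its `healthy` flag are hypothetical stand-ins for a real device and its health check.

```python
class Node:
    """A cluster member; `healthy` stands in for a real health check."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


def pick_active(nodes):
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if node.healthy:
            return node
    return None


cluster = [Node("firewall-1"), Node("firewall-2")]
print(pick_active(cluster).name)  # firewall-1
cluster[0].healthy = False        # simulate a failure of the primary
print(pick_active(cluster).name)  # firewall-2
```

Real clusters also have to synchronize state between members and detect failures reliably, which is where most of the configuration effort goes.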
SPOF in the network layer
The two worst-case scenarios these days are power outages and internet outages.
Although blackouts are rare today, their frequency is likely to increase. Climate change and rising temperatures, combined with an inefficient energy system, are already putting excessive strain on the grid. Automatic generators are ideal for such an eventuality. However, their high cost and the need to provide the right operating conditions mean that usually only the largest companies and sensitive facilities such as banks and hospitals can afford them. Smaller companies can equip themselves with UPS devices that maintain power for anywhere from a few minutes to several dozen minutes. That is enough time to save unfinished work or ride out a brief power outage.
Both electricity and internet connections are at risk from nearby earthworks. One wrong move by an excavator operator and the cables are severed. The key here is diversification of suppliers. This is not always possible, especially in a city where there is only one energy operator or one fiber optic infrastructure. However, where the opportunity exists, it is worth taking. For the internet this is easier, because a radio or satellite link can serve as a backup connection.
Finally, there are fires and floods, which are rare but not improbable. What good is having two servers if both sit in the same flooded room? If conditions allow, consider moving the redundant infrastructure to another location, e.g. another building or another part of the city.
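For a first rough idea of how long a UPS might last, the back-of-the-envelope estimate below divides usable battery energy by the connected load. It is a simplification under stated assumptions: the 0.9 efficiency factor and the example numbers are hypothetical, and real runtime also depends on battery age, temperature and the discharge curve, so the manufacturer's runtime charts remain the reference.

```python
def estimate_ups_runtime_minutes(battery_wh, load_w, efficiency=0.9):
    """Rough UPS runtime estimate in minutes.

    battery_wh -- battery capacity in watt-hours
    load_w     -- power drawn by the connected equipment in watts
    efficiency -- assumed inverter efficiency (hypothetical default)
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    usable_wh = battery_wh * efficiency
    return usable_wh / load_w * 60


# Hypothetical example: a 500 Wh battery feeding a 300 W load.
print(round(estimate_ups_runtime_minutes(500, 300)))  # 90
```

Halving the load roughly doubles the estimate, which is why it pays to connect only the essential equipment to the UPS.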
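The "two servers in one flooded room" problem can be checked mechanically from an inventory. The sketch below assumes a simple mapping from each service to the locations of its copies. The service and location names are hypothetical.

```python
def shared_location_risks(placements):
    """Return services whose copies all sit in a single location."""
    risky = []
    for service, locations in placements.items():
        if len(set(locations)) < 2:
            risky.append(service)
    return sorted(risky)


# Hypothetical inventory: service name -> location of each copy.
placements = {
    "file_server": ["server_room_a", "server_room_a"],
    "backup_store": ["server_room_a", "branch_office"],
}

print(shared_location_risks(placements))  # ['file_server']
```

Here the file server has two copies but still counts as a risk, because a single fire or flood in `server_room_a` would take out both.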
Connection redundancy mitigates SPOF
It is impossible to completely eliminate the threat posed by single points of failure, especially where random events are concerned. However, a thorough analysis of the system and the implementation of safeguards in the areas described above will significantly reduce it. As with backups, these safeguards need to be tested regularly, and weaknesses identified in a security audit report should be addressed wherever possible.
In addition to physical and software solutions, always remember procedures, especially those for backup and data recovery. When all else fails, they give you a chance to recover at least some of the work and avoid a disaster.
Experienced in the commercial areas of network and network & data security. Active in the area of communication with clients, he will help in recognizing the problem, selecting solutions and suggesting an effective implementation model. His competence is confirmed by technical certificates from Cisco, Sophos, Palo Alto and Fortinet brands.