In recent years, the notion of resilience has become exceptionally pertinent to disaster risk science. In general, resilience aims to minimize the impact of disruptions to systems through the fast recovery of critical functionality, and during a disaster, accurate sensing information is key to efficient recovery efforts. However, resilient design may require redundancy and could increase costs.
Balancing Efficiency and Resilience in Sensor Networks
Modern engineered systems have increasingly integrated cyber and physical components for monitoring and control. Advances in sensor hardware and networks have been critical to this development, as sensors form the key linkage between the cyber and physical domains. In this way, sensors are often the key source for real-time disaster risk information and can detect functional degradation when disasters occur.
These advances have greatly increased the number of potential data streams available for users to analyze. They have been accompanied by the rise of increasingly advanced analytical methods such as artificial intelligence, machine learning, and digital twins. As the capacity to draw insights from more complex and heterogeneous data sources has improved, advanced sensor networks have become nearly ubiquitous in applications across diverse domains from intruder detection to power and water system monitoring.
However, with this increased cross-domain integration also come new and poorly understood threats that may cascade and propagate in unexpected ways. Sensors in particular are vulnerable to attacks in both the cyber (e.g., cyberattacks) and physical domains (e.g., targeted attacks, natural hazards, and disasters). The vulnerability at the data source can have downstream impacts on disaster response and recovery.
At the same time, the design of sensor networks tends to focus on how to make a given system operate more efficiently under normal conditions. In this stable environment, efficiency is the primary concern, typically in terms of maximizing coverage with minimal sensors. But in such highly optimized sensor networks, a disruption to even a single sensor can lead to missing information and failure to detect anomalies.
Thus, while sensors are crucial for disaster detection and response, they are also vulnerable to these same events. Applications of advanced sensor networks should accept that disruptions to the network will occur. Instead of working to minimize or eliminate all risks, developers must plan for resilience, designing sensor networks to absorb and quickly recover from disruption.
Defining Resilience in Sensor Networks
The National Academies defines resilience as the ability to plan for, absorb, recover from, and more successfully adapt to actual or potential adverse events. In this article, the focus is on the first two aspects of resilience: to plan for and absorb.
The objective is to determine sensor locations that would preserve wireless sensor network (WSN) functionality if some sensors were disabled, hence absorbing the disruption. This can be thought of at the system level, where the primary focus is the persistent delivery of a system’s critical function, as opposed to the hardening of a specific asset to resist failure.
Linkov et al. (2013) proposed a resilience matrix framework that divides resilience into the four temporal domains defined by the National Academies (plan, absorb, recover, adapt) as well as into the four components of network-centric operations: physical, information, cognitive, and social. Within an integrated cyber-physical system, sensors are key for ensuring access to the information domain through all stages of a disaster. Furthermore, sensors can be impacted by threats in the physical domain, while threats to cyber operations can impact the cognitive ability of decision-makers to process and understand data collected from sensors. In this way, sensor networks have cross-cutting impacts on and relationships with the different components of resilience, requiring further research on methods for the resilient design of sensor networks.
Existing Research on Sensor Resilience
Generally, research on the resilience of sensors and sensor networks is fairly limited, with most studies focusing on only one aspect of resilience. Security is the most common risk-minimization strategy for sensors, with many methods such as key assignment applied to secure sensor networks from threats. While strengthening security is an important part of the planning phase for the sake of mitigating risk, it cannot be assumed to protect against all forms of risk.
Other studies address the fault-tolerance of sensors, such as the introduction of a clustering method for energy-efficient routing of data through a wireless sensor network. Similarly, some work addresses the reliability of sensors under disruption. While both concepts are related to mitigating risks and absorbing disruptions, resilience focuses on critical functionality more holistically and should also consider the ability to recover lost critical function of the network.
The literature on sensor resilience was reviewed in this context to identify the resilience phase addressed by past work, as shown in Table 1. Because of the broad role wireless sensor networks play across network-centric operations, it is necessary to consider their resilience at all levels of the system, which can broadly be divided into the subsystems of the sensor domain, sensor hardware, processing, communication, and power supply.
Table 1. Resilience phases addressed in existing research on sensor networks.

| Resilience Phase | Existing Research |
| --- | --- |
| Plan | Security, fault-tolerance, reliability |
| Absorb | Resilient communication networks, sensor placement for resilience |
| Recover | Limited research |
| Adapt | Limited research |
The bulk of research has focused on resilient communication networks, reflected in studies on improving path-routing algorithms or designing the network structure to maintain functionality during disruption. At the same time, future work must place more emphasis on the additional need to quickly recover a failed network to an acceptable level of service when faults do occur.
Furthermore, this definition of resilience is applied only to the computing network itself, but sensors have a physical hardware component often ignored in network-based studies. In sensor research, the optimal placement of sensor hardware is an important design criterion, typically based on efficiency only. Research on the resilience of sensors, however, focuses primarily on the communication network and not on the physical arrangement of sensors.
One study examined the optimization of sensor placement within a water quality sensor network for resilience by measuring how overall relative performance changes with disrupted sensors. Further developments are needed to extend these methods to assess the recovery of sensors. More research is necessary on the physical placement of sensors within a network in order to ensure resilience of WSNs at a system level.
A Methodology for Resilient Sensor Placement
To incorporate the concept of resilience by design into sensor networks, this article advances a methodology for optimizing the placement of sensors by taking resilience as well as efficiency into account.
Conventionally, the sensor placement problem centers on minimizing a desired objective, such as the number of sensors required to achieve a specified level of coverage. However, in this most efficient case, disabling any single sensor leads to a loss of coverage in a certain area and therefore potentially a degradation in the critical functionality of the entire network.
Incorporating additional sensors can introduce redundancy to the system such that if one sensor fails, particularly in a high-interest area, another sensor will still be in place to collect data. These additional sensors come with additional resource costs, making it cost-prohibitive to have a fully redundant or resilient network. Balancing the trade-offs between these two objectives in a user-controlled way is the goal of the method proposed here.
The most common approaches to the optimal sensor placement problem are based on combinatorial optimization, random placement (appropriate for cheap sensors in large open areas), or heuristic placement strategies. The core approach in this article is based on the algorithm described in Vecherin et al. (2011, 2017), a fast algorithm that provides an approximate solution to a binary linear programming problem in which sensor performance is formulated in a probabilistic framework.
Formulating the Optimization Problem
In this approach, the performance of any sensor is characterized in terms of the probability of detection (P_d) and the probability of false alarm (P_fa). Along with information about noise energy probability density functions, P_fa determines the value of the threshold for signal detection by a sensor, which, along with the signal energy probability density function, allows for the calculation of P_d at a location r by a sensor at location r_s.
Another parameter that must be specified for each location where coverage is required is a desired minimal probability of detection, P_pr(r). Requiring coverage by at least one sensor in a probabilistic sense leads to a different, more efficient sensor network design than requiring that, for any given point, some specific sensor alone achieve the desired P_d at that point.
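To make these performance measures concrete, the sketch below computes a detection probability from a prescribed false-alarm probability under a simple Gaussian signal-in-noise model. The Gaussian assumption and the decibel SNR parameterization are illustrative choices for this sketch, not the specific sensor model used in this article.

```python
import numpy as np
from scipy.stats import norm

def detection_probability(p_fa, snr_db):
    """Illustrative P_d for a prescribed P_fa under an assumed Gaussian
    signal-plus-noise model (actual sensor models vary).

    The detection threshold is set from the noise distribution so the
    false-alarm probability equals p_fa; P_d is then the probability
    that signal-plus-noise exceeds that threshold.
    """
    threshold = norm.ppf(1.0 - p_fa)        # threshold in noise std units
    snr = 10.0 ** (snr_db / 20.0)           # amplitude SNR
    return 1.0 - norm.cdf(threshold - snr)  # P(signal + noise > threshold)

# Example: detection probability for P_fa = 1e-3 and a 12 dB signal.
print(detection_probability(p_fa=1e-3, snr_db=12.0))
```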
Denoting the total number of possible source locations as Q and the number of candidate sensor locations as K, the set of conditions that must be satisfied at all points can be written in matrix notation as:

$$A_{Q\times K}\, p_{K\times 1} \ge b_{Q\times 1}$$

where:
– p_{K×1} is a binary column vector with only possible values of 0 or 1, where a value of 1 indicates the placement of a sensor at that location.
– b_{Q×1} is a preference vector where b_i = -ln(1 - P_pr(r_i)).
– A_{Q×K} is a coverage matrix where A_is = -ln(P_md(r_i, r_s)), with P_md = 1 - P_d denoting the probability of missed detection. The minus signs make all entries nonnegative, so the inequality expresses the requirement that the combined probability of missed detection at each point not exceed 1 - P_pr.
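As a concrete illustration of assembling these quantities, the snippet below builds A and b from a small, made-up matrix of detection probabilities; the values and array shapes are hypothetical.

```python
import numpy as np

# Hypothetical detection probabilities P_d[i, s]: probability that a
# sensor at candidate location s detects a source at location i.
P_d = np.array([
    [0.90, 0.10, 0.40],
    [0.20, 0.85, 0.30],
    [0.05, 0.60, 0.70],
])
P_pr = np.array([0.9, 0.9, 0.8])  # desired minimal detection probabilities

# Coverage matrix and preference vector (the minus signs make all
# entries nonnegative, so A @ p >= b expresses the coverage condition).
A = -np.log(1.0 - P_d)   # A_is = -ln P_md(r_i, r_s)
b = -np.log(1.0 - P_pr)  # b_i  = -ln(1 - P_pr(r_i))

# Test a trial placement p: which locations does it cover?
p = np.array([1, 1, 0])
print(A @ p >= b)  # elementwise coverage check per source location
```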
The problem of optimal sensor placement can now be formulated as the following binary linear programming problem:
$$\min_{p} \; c_{K\times 1}^{T}\, p_{K\times 1} \quad \text{subject to} \quad A_{Q\times K}\, p_{K\times 1} \ge b_{Q\times 1}$$
Here, c_K×1 is a cost column-vector which can represent a monetary value, power consumption, or sensor installation time. Setting all values of c to 1 will result in a solution p_0 having a minimal number of sensors required to cover the area with the prescribed probability of detection or greater.
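For small instances, this binary program can be solved exactly off the shelf. The sketch below uses SciPy's mixed-integer solver as one possible way to do so, continuing the A and b above with unit costs; it is a minimal formulation, not the algorithm used in the article.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def solve_placement_exact(A, b, c=None):
    """Exact solution of min c^T p subject to A p >= b, p binary.

    Practical only for modest Q and K; the fast approximate algorithm
    described below is intended for practically relevant sizes.
    """
    K = A.shape[1]
    if c is None:
        c = np.ones(K)  # unit costs: minimize the number of sensors
    res = milp(
        c=c,
        constraints=LinearConstraint(A, lb=b, ub=np.inf),
        integrality=np.ones(K),  # all decision variables integer...
        bounds=Bounds(0, 1),     # ...and restricted to {0, 1}
    )
    return res.x.round().astype(int) if res.success else None
```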
Approximate Solution Algorithm
A rigorous solution of this nondeterministic polynomial-time complete (NP-complete) problem rapidly becomes too computationally intensive for practically relevant cases. The main idea of the fast approximate algorithm is to perform a sequential search, avoiding the need to consider all possible combinations.
At each step, there is a decision vector p whose elements correspond to the spatial locations where sensors can be placed. An entry equal to one means a sensor is placed at that location; an entry of zero means no sensor is placed there. The algorithm then looks for the best position in p at which to add another sensor.
Letting p̂ be a trial vector formed by adding an extra 1, consecutively, at each possible sensor location corresponding to a zero in p, the best position for the next sensor is the one that minimizes the sum of the positive elements of Δp̂ = b − A p̂, i.e., the remaining coverage deficiency.
The algorithm stops either when adding an extra 1 leaves no positive elements in Δp̂ (i.e., coverage is achieved at all required locations), when there are no more candidate sensor locations (p̂ = (1, 1, …, 1)^T), or, for a problem with a finite sensor supply, when there are no more sensors to place. In the latter case, if the coverage is still unsatisfactory, the problem is infeasible. However, the algorithm still produces a sensor placement that can be considered the best possible coverage of the area given the insufficient resources.
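A minimal sketch of this greedy search in the same A, b notation follows; it is a simplified reading of the algorithm with unit costs, and the function name and optional sensor budget are this sketch's own.

```python
import numpy as np

def greedy_placement(A, b, max_sensors=None):
    """Greedy approximate solution of A p >= b with binary p.

    At each step, place one more sensor at the candidate location that
    most reduces the total remaining coverage deficiency, i.e., the sum
    of positive elements of (b - A p).
    """
    Q, K = A.shape
    p = np.zeros(K, dtype=int)
    budget = K if max_sensors is None else min(max_sensors, K)

    def deficiency(pvec):
        return np.clip(b - A @ pvec, 0.0, None).sum()

    while deficiency(p) > 0 and p.sum() < budget:
        candidates = np.flatnonzero(p == 0)
        if candidates.size == 0:
            break  # no remaining candidate locations (p is all ones)
        # Try adding a sensor at each open location; keep the best trial.
        scores = []
        for s in candidates:
            p_try = p.copy()
            p_try[s] = 1
            scores.append(deficiency(p_try))
        p[candidates[int(np.argmin(scores))]] = 1

    return p  # best placement found; may leave gaps if budget is too small
```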
Incorporating Resilience into Sensor Placement
As mentioned earlier, one of the approaches in the paradigm of resilience is resilience by design. In the case of sensor networks, designed resilience can be accomplished by incorporating calculated redundancy into the network design such that if a certain number of sensors are disabled, the remaining sensors would still provide the required coverage, i.e., the desired detection probability would still be satisfied for desired locations.
The locations where redundancy is required will be referred to as resilience points. In addition, the depth of resilience (D) indicates how many sensors in a network can be disabled without loss of coverage. For example, if the depth of resilience is D for all resilience points, each resilience point must be covered by at least R = D + 1 sensors so that any D of them can be disabled without loss of coverage.
This requirement can be incorporated into the existing binary linear programming framework by modifying the rows of A and b corresponding to resilience points. For such a row, each coverage matrix element A_is is replaced by a binary indicator of whether sensor s covers the designated location: the entry is set to 1 if that sensor alone provides the desired probability of detection there, and to 0 otherwise, while the corresponding element of b is set to R. The condition in the optimization problem is then satisfied only if at least R sensors cover the designated location, which is the desired result.
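The following sketch applies this modification for a set of resilience points, under the reading adopted above (a sensor "covers" a resilience point if it alone provides the desired detection probability there); the function and variable names are illustrative.

```python
import numpy as np

def add_resilience(A, b, P_d, P_pr, resilience_points, depth):
    """Modify rows of (A, b) so each resilience point must be covered
    by at least R = depth + 1 sensors.

    For a resilience point i, A[i, s] becomes 1 if sensor s alone
    provides the desired detection probability there (0 otherwise),
    and b[i] becomes R.
    """
    A = A.copy()
    b = b.copy()
    R = depth + 1
    for i in resilience_points:
        A[i, :] = (P_d[i, :] >= P_pr[i]).astype(float)  # indicator row
        b[i] = R  # require at least R covering sensors
    return A, b
```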
Application and Results
The problem considered in this article is to monitor human presence at a workplace in a large building for safety reasons, using commercial infrared sensors with realistic sensor performance characteristics.
2D Problem
For the 2D problem, the candidate sensor locations are set along the walls inside the coverage area, assuming each sensor has an omnidirectional 2D field of view. Figure 3 shows the optimal sensor placement for the 2D problem when only efficiency is required, with the sensor supply limited to 17 sensors.
Figure 4 shows the optimal sensor placement with the depth of resilience D=1, where every location of interest should be covered by at least two sensors so that any one of them can be disabled without affecting critical functionality. The total number of available sensors was limited to 36 in this case.
3D Problem
In the 3D problem, the candidate sensor locations are not limited to walls, and each sensor has a finite field of view in 3D whose footprint in the 2D horizontal plane is omnidirectional. Additionally, the cubicle walls are low enough that a sensor located on the ceiling can see the cubicle interiors.
Figures 5 and 6 show the optimal location of sensors without and with resilience (D=1) for the 3D case, respectively. In the non-resilient case (Fig. 5), 25 sensors are located at intersections and corners, which provide maximum coverage on the floor. In the redundant case (Fig. 6), 45 sensors are located almost uniformly over the possible candidate location area on the office ceiling, which maximizes coverage area and at the same time provides the required depth of resilience.
Conclusion and Future Research Directions
To be able to face the threats of the future, sensor hardware and networks must be designed with resilience in mind. This resilience will require a compromise with the goals of efficiency, given the potential for additional resource costs incurred by adding sensors. By balancing both efficiency and resilience optimization, the framework proposed here seeks to address the challenges of this delicate trade-off.
Further research is required in several areas to advance resilience applications for sensor networks. The described methodology is focused on the absorption of disruptions, so that disabling one or more sensors does not lead to the loss of WSN functionality. Future research efforts could tackle the adaptivity and recovery aspects of resilience in WSNs.
This research should center on ensuring the resilience of cyber-physical systems as a whole. In this context, sensors may not just deliver information to end users but could inform an automated response by the system to automatically recover from a threat. Technologies such as edge computing have the potential to further strengthen system resilience through redundancy in their computing nodes, in the same fashion as redundant sensors.
Furthermore, maintaining the ability to process data collected from sensors through artificial intelligence or digital twin tools will require fast optimization solutions, particularly when a decision-maker response is required during a disruption. In all these related areas, developers must move away from the traditional focus on efficiency and consider how to assess and incorporate resilience into system design.
By incorporating resilience into all levels of cyber-physical monitoring and control systems, users can be better prepared to maintain or recover functionality in the face of disasters. This approach will be crucial for ensuring the reliable and resilient performance of critical infrastructure and services, especially in disaster-prone regions.