The effect of a loss of model structural detail due to network skeletonization on contamination warning system design: case studies

The effect of limitations in the structural detail available in a network model on contamination warning system (CWS) design was examined in case studies using the original and skeletonized network models for two water distribution systems (WDSs). The skeletonized models were used as proxies for incomplete network models. CWS designs were developed by optimizing sensor placements for worst-case and mean-case contamination events. Designs developed using the skeletonized network models were transplanted into the original network model for evaluation. CWS performance was defined as the number of people who ingest more than some quantity of a contaminant in tap water before the CWS detects the presence of contamination. Lack of structural detail in a network model can result in CWS designs that (1) provide considerably less protection against worst-case contamination events than that obtained when a more complete network model is available and (2) yield substantial underestimates of the consequences associated with a contamination event. Nevertheless, CWSs developed using skeletonized network models can provide useful reductions in consequences for contaminants whose effects are not localized near the injection location. Mean-case designs can yield worst-case performances similar to those for worst-case designs when there is uncertainty in the network model. Improvements in network models for WDSs have the potential to yield significant improvements in CWS designs as well as more realistic evaluations of those designs. Although such improvements would be expected to yield improved CWS performance, the expected improvements in CWS performance have not been quantified previously. The results presented here should be useful to those responsible for the design or implementation of CWSs, particularly managers and engineers in water utilities, and encourage the development of improved network models.

have reviewed strategies for placement of sensors in CWSs. Given that a CWS may be able to help reduce consequences associated with contamination events, understanding the factors that can affect the quality of a CWS design is important for those responsible for managing distribution systems. This paper focuses on one important factor, the accuracy with which the network model of a distribution system represents the actual structural details of the network, namely its pipes and junctions.
Lack of structural detail in the network models developed for WDSs is known to reduce the accuracy of estimated adverse health effects associated with potential contamination events in these systems (e.g., Grayman et al., 1991; Grayman and Rhee, 2000; Bahadur et al., 2008; Janke et al., 2007, 2009). Lack of network model detail is also known to affect sensor placement in the design of CWSs (Klise et al., 2013). All network models involve some degree of simplification relative to the actual WDS. Although improvements in network models would be expected to result in improved CWS performance, the relationship between the quality of the network model and CWS performance has not been quantified.
Studies have examined the influence of uncertainty in various factors on the design of CWSs. The most studied factor has been uncertainty in the nature of potential contamination events; Davis et al. (2013) provide a recent review of work in this area. Studies also have considered the influence of uncertainty in water demand (e.g., Berry et al., 2006; Comboul and Ghanem, 2012; Cozzolino et al., 2006, 2011; Mukherjee et al., 2017; Ostfeld and Salomons, 2005a, b; Shastri and Diwekar, 2006) and population density (Rico-Ramirez et al., 2007; Davis et al., 2013). Davis et al. (2013) also considered the influence of uncertainty in the rate of contaminant decay in a network following injection and uncertainty in the nature of the exposure model used to assess the consequences of a contamination event. We are not aware of any studies that have examined the influence on CWS design of uncertainties in the nature of the network itself, specifically the accuracy with which the network model used as the basis for designing a CWS represents the actual structure, the pipes and junctions, of the distribution system being considered.
In addition to the influence of uncertainty in the various factors just discussed on the performance of a CWS, the design objective used for the system can also affect its performance when faced with uncertainties. When the nature of potential contamination events is uncertain and the objective is to minimize worst-case adverse consequences associated with the events, CWSs designed to minimize mean-case consequences are more robust than those designed to minimize worst-case consequences (Davis et al., 2013). These designs are called mean-case and worst-case designs, respectively. Mean-case designs are more effective at reducing consequences over a range of conditions. The relative lack of robustness of worst-case designs is a consequence of the narrow focus of these designs, which handicaps their performance when conditions differ from those assumed as the basis for the design.
The primary goal of this paper is to examine how, and to quantify to what extent, limitations in the detail available on a system's pipes and junctions affect the performance of a CWS design. An additional goal of this paper is to obtain some insight into the robustness of worst- and mean-case designs for a CWS when there are such limitations in the network model used to represent a WDS.
Drink. Water Eng. Sci. Discuss., https://doi.org/10.5194/dwes-2017-39
Contamination in a distribution system has the potential to cause a variety of adverse effects. This paper considers adverse health effects associated with the ingestion of contaminated tap water; quantities of ingested contaminant, ingestion doses, are determined for those individuals who are potentially exposed to contaminated water. The term dose level is used to indicate the quantity of ingested contaminant for which adverse consequences are quantified. For a particular contaminant, dose level can be related to a health-effect level. For example, a dose level could correspond to the median lethal dose or the no-observed-adverse-effect level. Lower dose levels can be related to a particular health-effect level for more toxic chemical contaminants and higher dose levels can be related to the same health-effect level for less toxic contaminants. In this paper when high or low dose levels are discussed, a statement is sometimes added that these can be related to contaminants with relatively low or high toxicity, respectively, to re-emphasize this point. The measure of adverse consequences associated with a contamination event that is used in this paper, called impact, is the number of people who receive a dose of a contaminant above some dose level due to the ingestion of contaminated tap water. In this paper the performance of a CWS is defined by the impact that occurs before the CWS detects the presence of contamination.
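The definition of impact can be illustrated with a minimal Python sketch. The function name `impact` and the dose values are hypothetical and serve only to show the counting rule (people whose ingested dose exceeds the dose level); they are not taken from the study.

```python
from typing import Sequence


def impact(doses_mg: Sequence[float], dose_level_mg: float) -> int:
    """Count the people whose ingested dose is above the dose level."""
    return sum(1 for dose in doses_mg if dose > dose_level_mg)


# Hypothetical ingested doses (mg) for six people when contamination is detected.
doses = [0.0, 0.00005, 0.0002, 0.003, 0.5, 1.2]

# A low dose level can be related to a relatively toxic contaminant,
# a high dose level to a less toxic one.
impact_low_dose_level = impact(doses, 1e-4)
impact_high_dose_level = impact(doses, 1.0)
print(impact_low_dose_level, impact_high_dose_level)
```

For the same set of doses, the impact is larger at the lower dose level, which is why results in the paper are reported as a function of dose level.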
The analysis presented in this paper is based on case studies using two WDSs. The best available network models for the two systems were used to represent the actual distribution systems, and skeletonized versions of these network models were used as proxies for incomplete network models that might be developed for these systems. Network models will always be incomplete to some, generally unknown, degree; using skeletonized network models together with the best available network models allows the potential significance of uncertainties in network models to be studied. CWSs designed to minimize the adverse consequences of ingesting contaminated tap water were developed using the skeletonized models. These CWSs then were utilized in (transplanted into) the complete network models, where their performance was evaluated and compared to the performance of designs developed using the complete network models. This approach allows the influence of uncertainties in network model detail on CWS performance to be evaluated. Developing and transplanting both worst- and mean-case designs allows the relative robustness of these designs to be studied.
considered here is the potential injection of a fixed quantity of contaminant at any one of the nodes in a network or at any of the nonzero demand (NZD) nodes in the network. The adverse effects examined are the impacts (as defined above) associated with an injection at a network node. Contaminant injection, transport, and ingestion were simulated using TEVA-SPOT (U.S. EPA, 2017). TEVA-SPOT uses Version 2.00.12 of EPANET (Rossman, 2000) for calculations involving contaminant transport. Quantities of ingested contaminant were determined using probabilistic models for ingestion timing and volume.

Nodal population in a network was assumed to be proportional to nodal water demand. The methodology used in carrying out these simulations is discussed in detail in . The analysis here assumes 0.5 kg injections of a conservative contaminant over a 1 h period beginning at 0:00 hours local time. All simulations were 168 h in duration, which includes the 1 h injection. The simulations used a 1 s water-quality time step and a 1 h hydraulic time step. One second is the shortest water-quality time step that can be used with EPANET.
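The assumption that nodal population is proportional to nodal water demand can be sketched as follows. This is an illustration only: fixing the proportionality constant through a known total population is an assumption made here, and the node names, demands, and the helper `allocate_population` are hypothetical.

```python
from typing import Dict


def allocate_population(demand: Dict[str, float], total_population: float) -> Dict[str, float]:
    """Distribute a total population over nodes in proportion to nodal demand."""
    total_demand = sum(demand.values())
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return {node: total_population * d / total_demand for node, d in demand.items()}


# Hypothetical nodal demands (arbitrary but consistent units).
demand = {"J1": 2.0, "J2": 0.0, "J3": 6.0}
population = allocate_population(demand, 1000)

# Nonzero demand (NZD) nodes are those with a positive demand.
nzd_nodes = [node for node, d in demand.items() if d > 0]
print(population, nzd_nodes)
```

Under this rule a zero-demand node carries no population, so an injection at such a node affects people only through transport to downstream NZD nodes.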

Contaminant mass imbalances can occur during water-quality simulations with EPANET (Davis et al., 2017). Large imbalances can be associated with elevated estimates for impacts. However, mass imbalances generally can be minimized using short water-quality time steps. The 1 s water-quality time step used here minimizes the potential for any mass imbalances during the water-quality simulations used in this study.
Using TEVA-SPOT, CWSs were designed to minimize worst- and mean-case impacts associated with the design-basis threat subject to a constraint on the number of sensors. Development of CWS designs is discussed in Davis et al. (2013). TEVA-SPOT optimizes sensor placement using a heuristic approach (Berry et al., 2006). Designs were developed for the original and the three skeletonized network models for each WDS for three sensor set sizes (5, 10, and 25 sensors) and for five different dose levels ranging from 10⁻⁴ to 1 mg. A total of 120 designs were developed for each network (two objectives, four network models, three sensor set sizes, and five dose levels).
Sensors in CWSs were assumed to perform perfectly: they detect all contaminants and make no errors. A zero response time was assumed; all water use stops immediately when a contaminant is detected. This paper does not consider the sensitivity of consequences to sensor behavior and response time. These assumptions simplify the analysis. Imperfections in sensor behavior and delays in response will increase consequences relative to those reported here. CWS sensors are arrayed at locations within a network according to designs developed as described above and in the following paragraphs. CWSs alert when any sensor detects contamination during an event. Impact is determined by summing the number of receptors at all nodes who have received doses above some dose level when the system alerts. The worst-case and mean-case performances of a CWS are determined by the largest impact and the mean impact, respectively, associated with a threat before contamination is detected by a sensor.
The performance of the CWS designs developed for the original and skeletonized network models for each WDS was evaluated using the original network model for the system. Worst-case impacts were determined using both worst-case and mean-case designs.
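The evaluation of a given sensor set under the stated assumptions (perfect sensors, zero response time) can be sketched in Python. This is not TEVA-SPOT code: the data layout (`arrival_h`, `impact_by_hour`), the hourly resolution, and the three events are hypothetical, and an undetected event is assumed here to contribute its final impact.

```python
from statistics import mean
from typing import Dict, List, Set, Tuple

Event = Dict[str, object]


def impact_before_detection(event: Event, sensors: Set[str]) -> int:
    """Impact accumulated when the first sensor sees contaminant (zero response time)."""
    arrival: Dict[str, int] = event["arrival_h"]        # node -> hour contaminant first arrives
    curve: List[int] = event["impact_by_hour"]          # cumulative impact at each hour
    times = [arrival[s] for s in sensors if s in arrival]
    if not times:
        return curve[-1]                                # undetected: full impact of the event
    return curve[min(times)]


def evaluate(events: List[Event], sensors: Set[str]) -> Tuple[int, float]:
    """Return (worst-case impact, mean-case impact) over all events."""
    impacts = [impact_before_detection(e, sensors) for e in events]
    return max(impacts), mean(impacts)


# Three hypothetical contamination events.
events = [
    {"arrival_h": {"A": 2, "B": 5}, "impact_by_hour": [0, 10, 40, 90, 120, 150, 150]},
    {"arrival_h": {"B": 1}, "impact_by_hour": [0, 5, 30, 60]},
    {"arrival_h": {"C": 3}, "impact_by_hour": [0, 0, 20, 80, 200]},
]

worst_a, mean_a = evaluate(events, {"A"})
worst_bc, mean_bc = evaluate(events, {"B", "C"})
print(worst_a, mean_a, worst_bc, mean_bc)
```

A worst-case design would choose the sensor set that minimizes the first returned value, and a mean-case design the set that minimizes the second; the optimization itself is not shown here.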
To evaluate the performance of a design developed using a skeletonized network model but applied to the original network model, the locations of the sensors determined for the skeletonized network were used to define a sensor network for the original model, and impacts were determined using this transplanted CWS. TEVA-SPOT has a built-in capability, the Regret Analysis mode, that allows various designs to be easily evaluated and that facilitates the selection of the best sensor design among those being considered (U.S. EPA, 2017).
The approach described yields impacts for CWSs designed using the original network models as well as impacts for CWS designs developed using the skeletonized network models that have been transplanted into the original models. Impacts determined using the transplanted designs were compared with those determined using the original designs to obtain insights into the extent to which CWS performance is adversely affected when designs are developed using incomplete information on a WDS. Comparing the relative worst-case performances of the transplanted worst- and mean-case designs provides insight into the robustness of these designs when there is uncertainty in the network model.
The heuristic method used for sensor placement generally produces optimal designs, but in some cases can produce designs that are suboptimal (Davis et al., 2013). For the original model for Network N1 there were two instances of obvious suboptimality for worst-case designs out of the 15 cases (three sensor set sizes and five dose levels) examined; for Network N3 there was one. A design is suboptimal if larger impacts are obtained when the conditions used in the design and its evaluation are the same than when such conditions differ. Results were corrected to help minimize the effect of such obvious suboptimalities for a particular sensor set size and dose level by using the smallest impact from the five designs developed for different dose levels for that number of sensors. The corrections resulted in reductions in impacts of 6% and 18% for the two instances of suboptimality for Network N1 and a 9% reduction for the single instance for Network N3. The correction does not identify the optimal design; it only helps improve the estimate of impacts that would be obtained with the optimal design.
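The correction for obvious suboptimality can be sketched as follows. The impact values and the intermediate dose levels used as dictionary keys are invented for illustration, and the helper `corrected_impact` is hypothetical; the sketch only shows the rule of taking the smallest impact among the five designs for a given number of sensors.

```python
from typing import Dict, Tuple


def corrected_impact(impact_by_design_dose: Dict[float, float], eval_dose: float) -> Tuple[float, float]:
    """Return (corrected impact, percentage reduction) for one sensor set size.

    impact_by_design_dose maps the dose level used to develop a design to the
    worst-case impact of that design when evaluated at eval_dose.
    """
    matched = impact_by_design_dose[eval_dose]       # design and evaluation conditions agree
    best = min(impact_by_design_dose.values())       # smallest impact among the five designs
    reduction_pct = 100.0 * (matched - best) / matched
    return best, reduction_pct


# Invented worst-case impacts, all evaluated at a dose level of 1e-3 mg.
impacts = {1e-4: 5200, 1e-3: 5000, 1e-2: 4500, 1e-1: 5600, 1.0: 6000}
best, reduction_pct = corrected_impact(impacts, 1e-3)
print(best, reduction_pct)
```

A nonzero reduction signals that the design developed for the matching dose level was suboptimal, since a design developed for a different dose level performed better under the same evaluation conditions.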

Results and discussion
This section considers two topics: (1) CWS performance given uncertainty in the structural details of the network model and (2) the robustness of mean- and worst-case CWS designs when there is such uncertainty in the network model. CWS performance is discussed in terms of the performance of the overall system and in terms of the performance of the individual sensors in a system.

CWS performance: Overall system
CWSs developed using the skeletonized network models generally perform more poorly than do those developed using the original network model. The following paragraphs discuss the behavior of these different CWSs.
The plots in Fig. 1 compare estimated impacts for worst-case CWS designs developed for three sensor set sizes using the original and skeletonized network models for Network N1. In this figure, the designs developed using the skeletonized network models are non-transplanted designs: they are evaluated using the network models for which they were designed.
Note the logarithmic scales on both the vertical and horizontal axes. Results are given for four different CWS designs as a function of the dose level used for the design. The CWSs were evaluated using the same dose level as used for their design. In the figure, a trim of 0 cm corresponds to the original model and the 20, 30, and 40 cm trims correspond to the three levels of skeletonization used. The estimated impacts obtained using the skeletonized network models are similar to those obtained using the original network model. However, if CWSs are designed using a skeletonized (i.e., an incomplete) network model and then implemented, they will be used in the actual system, which is better approximated by the original network model. The performance of transplanted designs is discussed next.
Again using Network N1, the plots in Fig. 2 compare (1)
The plots in Fig. 3 provide a comparison for Network N1 of the impacts obtained using designs developed for the skeletonized network models when they are used in the skeletonized network models (non-transplanted designs) and when they are transplanted into the original network model. The same design is being used, but evaluated using different network models.
Depending on the dose level and number of sensors, the impacts estimated for Network N1 can be two to three times larger when the designs are used in the original network rather than in the skeletonized network where they were developed. In other words, evaluating a CWS using the skeletonized network model for which it was designed can yield results that considerably underestimate the actual consequences that could occur if the design were used in the actual WDS. There is no consistent pattern in impacts or relative impacts related to the level of skeletonization used. Note that the somewhat jagged nature of some of the lines in the plots in Fig. 3 is the result of using only five points to construct the lines in this and other similar figures.
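The comparison of transplanted and non-transplanted results reduces to a ratio of impacts for the same design evaluated in two network models. A minimal sketch, with a hypothetical helper name and invented impact values:

```python
def underestimate_factor(transplanted_impact: float, non_transplanted_impact: float) -> float:
    """Ratio of the impact in the original model to the impact in the skeletonized model."""
    if non_transplanted_impact <= 0:
        raise ValueError("non-transplanted impact must be positive")
    return transplanted_impact / non_transplanted_impact


# Invented impacts for one design: evaluated in the original (transplanted)
# and in the skeletonized (non-transplanted) network model.
factor = underestimate_factor(12000, 4800)
print(factor)
```

A factor above one means that evaluating the design in the skeletonized model alone would underestimate the impact expected in the original model by that factor.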
The estimated percentage reductions in impacts obtained using the CWSs designed for Network N1 relative to the worst-case impacts estimated for the network when no CWS is used are shown in the plots in Fig. 4. (When no CWS is used, the relative reduction in impacts is 0%.) The reduction in impacts for low dose levels (contaminants with relatively high toxicity) can be similar and substantial (generally near or above 90% for dose levels below about 0.01 mg) for the original and transplanted designs. However, at higher dose levels (contaminants with relatively low toxicity), the reduction in impacts obtained with transplanted designs can be considerably smaller than that obtained with the original design. The reduction in impacts does not show a consistent relationship with the level of skeletonization. Percentage reduction in impacts decreases as dose level increases. Consequences associated with less toxic contaminants generally are more localized than those associated with more toxic contaminants because of the larger quantity of contaminant required to produce a similar health effect. CWSs are less effective in providing protection against localized effects than against effects that are more widespread.
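The percentage reduction in impacts relative to the case with no CWS can be written as a short calculation. The helper name and the impact values below are hypothetical and are not results from the study.

```python
def percent_reduction(impact_with_cws: float, impact_without_cws: float) -> float:
    """Percentage reduction in worst-case impact relative to having no CWS."""
    if impact_without_cws <= 0:
        raise ValueError("impact without a CWS must be positive")
    return 100.0 * (impact_without_cws - impact_with_cws) / impact_without_cws


# Invented worst-case impacts with and without a CWS.
reduction = percent_reduction(4000, 50000)
# With no CWS the impact is unchanged, so the reduction is 0%.
no_cws = percent_reduction(50000, 50000)
print(reduction, no_cws)
```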
Impacts estimated for CWSs designed using Network N3 are shown in the plots in Fig. 5, which provides results similar to those in Fig. 2 for Network N1. The results for Network N3 are more consistent than those for Network N1, with impacts generally increasing with increasing level of skeletonization. The plots in Fig. 6 give the ratio of impacts for transplanted and non-transplanted designs for Network N3. The ratios are generally larger than those for Network N1 (Fig. 3) when 10 or 25 sensors are used. (Note the difference between the vertical scales used in Figs. 3 and 6.) When designed and evaluated using the skeletonized network models for Network N3, the results for designs with 10 or 25 sensors underestimate by a factor of two to eight the impacts expected if the design were used in the actual network, a larger underestimate than for Network N1. For five sensors, the underestimate can be as much as a factor of two. As is the case for Network N1, the percentage reductions in impacts at larger dose levels achieved using the transplanted CWS designs are generally considerably less than those obtained using designs developed using the original network model, as shown in Fig. 7.
The relative performance of the transplanted worst- and mean-case CWS designs for Networks N1 and N3 is summarized in Table 3. Performance is relative to the performance of a CWS designed to minimize worst-case impacts using the original
The results in Table 3 indicate that for Networks N1 and N3 the relative performance of the transplanted worst-case designs generally becomes poorer when the dose level is smaller than 1.0 mg. In particular, the median and maximum values of the ratios for the two networks generally increase when the dose level decreases below 1.0 mg. The maximum ratios are generally considerably larger for Network N3 than for Network N1, indicating that the relative performance of the transplanted designs is network dependent. Note that although the relative performance of the transplanted designs is poorer at smaller dose levels, the reduction in impacts, both in percentage and in absolute terms, is considerably better at smaller dose levels than at 1.0 mg.
Table 3 also shows that the relative performance of the transplanted mean-case designs deteriorates when the dose level decreases below 1.0 mg. The results in the table show that the relative worst-case performance of the transplanted mean-case designs is generally similar to the relative worst-case performance of the transplanted worst-case designs: the ratios for the transplanted mean-case designs are generally similar to the corresponding ratios for the transplanted worst-case designs.

CWS performance is influenced by the network nodes considered as possible injection locations. Fig. 8 provides results for Network N1 similar to those shown in Fig. 2 except that only NZD nodes are used as injection locations. Differences in the performances of the transplanted designs obtained with all nodes (Fig. 2) or only NZD nodes (Fig. 8 )     10 or 25 sensors are used. Worst-case impacts with no CWS are somewhat smaller when only NZD nodes are used relative to those obtained when all nodes are considered as possible injection locations.

CWS performance: Individual sensors
The preceding discussion has examined the performance of CWSs as systems. Examining the performance of individual sensors in those systems provides some additional insight into how the overall systems perform. CWS designs were developed considering their performance when challenged by the possible injection of contaminants at any node in the network or at any NZD node. A CWS can detect some of the events but, in general, a CWS with a limited number of sensors will not detect all events.
The worst-case performance of a CWS is determined by the largest impact associated with any event that occurs before an event is detected by a sensor. For Network N1 with a five-sensor CWS and injections at NZD nodes, the sensors detect about 3,500 events out of about 11,000, for a dose level of 10⁻⁴ mg. The number of events detected for the original network model does undetected events are smaller than the worst-case impacts. Therefore, it is not unexpected that the largest undetected impact for the original design (5,000) is somewhat larger than the largest undetected impact for the 20 cm transplanted design (4,800).
The performance of the individual sensors in the five-sensor CWS design developed using the original network and in the five-sensor, 20 and 40 cm transplanted designs is shown in Fig. 9 for a dose level of 10⁻⁴ mg. Note that the vertical scale on the plot for the 20 cm design is different from the vertical scale used in the other two plots. The figure shows results considering NZD nodes as injection locations. The general locations of the five widely spaced sensors in the three designs are similar and the sensors are arbitrarily labeled as Sensors 1 through 5, consistently for all the designs. For the detected events, the impacts at the time the events are detected were sorted in ascending order for each of the five sensors and plotted against event number, starting with the lowest impact event for Sensor 1 and continuing using a cumulative count of events through the highest impact event for Sensor 5. The numbering of events in the three plots in Fig. 9 is independent. The highest impact for any event detected by any sensor is the worst-case impact for the CWS unless a higher impact is associated with any of the undetected events. No undetected events with such higher impacts occur for Network N1 and five sensors, as noted above.
In Fig. 9, the results for each of the five sensors are presented from left to right, with the results labeled with the sensor number in the upper plot. As an example of how to interpret the plots in the figure, in the upper plot the results for Sensor 4 begin at about Event 800 and continue to about Event 2200; about 1,400 events are detected by this sensor. The maximum impact for any event detected by the sensor is about 2,500. The highest impact for any event detected by any sensor is over 5,000 for Sensor 5. This is the worst-case impact for the CWS.
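The construction of the per-sensor curves described for Fig. 9 can be sketched in Python: impacts of detected events are sorted in ascending order within each sensor and numbered cumulatively across sensors. The function name and the impact values are hypothetical; only the ordering rule follows the description above.

```python
from typing import Dict, List, Tuple


def cumulative_event_series(detected: Dict[int, List[float]]) -> List[Tuple[int, int, float]]:
    """Return (event number, sensor, impact) tuples ordered as in the per-sensor plots."""
    series = []
    event_number = 0
    for sensor in sorted(detected):              # Sensor 1 first, highest-numbered sensor last
        for value in sorted(detected[sensor]):   # ascending impact within each sensor
            event_number += 1
            series.append((event_number, sensor, value))
    return series


# Invented impacts at detection time for the events seen by each of three sensors.
detected = {1: [30, 10], 2: [500, 20, 100], 3: [70]}
series = cumulative_event_series(detected)

# The largest detected impact and the sensor that detected it.
_, worst_sensor, worst_impact = max(series, key=lambda row: row[2])
print(series, worst_sensor, worst_impact)
```

The largest value in the series is the worst-case impact for the CWS only if no undetected event has a higher impact, which has to be checked separately.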
The performance of the sensors varies substantially between the original and transplanted designs. The worst-case impact is over 5,000 for the original design, as already noted; over 16,000 for the transplanted 20 cm design; and over 5,000 for the transplanted 40 cm design, which is similar to that for the original design but is associated with a different sensor. The three worst-case impacts in Fig. 9 correspond to the worst-case impacts in the upper plot in Fig. 8 for a dose level of 10⁻⁴ mg. Fig. 8 shows that the worst-case impacts for the original and 40 cm designs for five sensors are similar at that dose level. Fig. 9 shows that these impacts were the result of events observed by different sensors. Although not shown in the figure, the events in the two cases are also different. Fig. 8 suggests that the original and 40 cm, five-sensor designs perform similarly for a dose level of 10⁻⁴ mg. In fact, the similarity results from sensors in different parts of the network detecting different events with the same impacts.
For Network N1 there are 27 points in Fig. 10 that lie above the diagonal line, 9 points that lie on the line, and 9 points that lie below the line. For the comparisons used in Fig. 10, the transplanted mean-case designs yield worst-case impacts that are less than or equal to those yielded by the transplanted worst-case designs in 36 of the 45 cases for Network N1. The transplanted worst-case designs yield worst-case impacts that are less than or equal to those obtained for the transplanted mean-case designs in 18 of the 45 cases. For the 27 instances in which impacts for the worst-case designs exceed those for the mean-case designs, the impacts are about 34% larger on average. For the 9 instances in which the impacts for the mean-case designs are larger, they are about 59% larger on average.
Considering only NZD nodes (not plotted), the transplanted mean-case designs perform as well as or better than the transplanted worst-case designs in 32 of the 45 cases; transplanted worst-case designs perform as well as or better than transplanted mean-case designs in 28 of the 45 cases.
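The tallies quoted for these comparisons can be reproduced with a short sketch. The function `tally` and the small list of impact pairs are hypothetical; the counts 27, 9, 9 and 32, 28, 45 are those given in the text. The sketch assumes that a point above the diagonal is a case in which the worst-case design yields the larger impact, which is consistent with the counts reported for Network N1.

```python
from typing import Iterable, Tuple


def tally(pairs: Iterable[Tuple[float, float]]) -> Tuple[int, int, int]:
    """Count cases above, on, and below the diagonal.

    Each pair is (impact for mean-case design, impact for worst-case design).
    """
    above = on = below = 0
    for mean_design, worst_design in pairs:
        if worst_design > mean_design:
            above += 1
        elif worst_design == mean_design:
            on += 1
        else:
            below += 1
    return above, on, below


# Invented impact pairs, used only to exercise the function.
counts = tally([(100, 150), (200, 200), (300, 250), (50, 80)])

# Counts reported for Network N1 with all nodes as injection locations.
above, on, below = 27, 9, 9
mean_as_good_or_better = above + on      # mean-case impact <= worst-case impact
worst_as_good_or_better = on + below     # worst-case impact <= mean-case impact

# For NZD nodes, 32 and 28 of 45 cases are reported; the overlap gives the ties.
ties_nzd = 32 + 28 - 45
print(counts, mean_as_good_or_better, worst_as_good_or_better, ties_nzd)
```

Because ties are counted in both "as well as or better" totals, the two totals sum to more than 45 whenever some cases lie on the diagonal.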
For Network N3 there are 6 points in Fig. 11
For the two networks studied, the mean-case designs developed using the skeletonized network models yield results that are comparable to those obtained with the worst-case designs developed using the skeletonized network models when the designs are transplanted into the original network models. Mean-case designs perform somewhat better for Network N1 and somewhat poorer for Network N3. As discussed above, mean-case designs are more robust than worst-case designs when the objective is to minimize worst-case impacts and there is uncertainty concerning the conditions of a contamination event. The results presented here for Networks N1 and N3 indicate that transplanted mean-case and worst-case designs can be similarly robust when used to estimate worst-case impacts in the original network models. The small sample size limits the ability to make any more general conclusions about the overall robustness of mean-case designs under conditions of uncertainty in the network model. Evaluations using additional networks would be helpful.

Conclusions
On the basis of the two networks examined, lack of structural detail in the network model results in worst-case CWS designs that perform more poorly than worst-case designs developed using the original "all-pipes" network model. number of sensors; however, for the cases considered in this paper, estimated impacts can increase by a factor of two to eight when the design is evaluated using the complete network model. Although lack of model detail generally has an adverse effect on CWS performance, no simple relationship was found between the degree of skeletonization and loss of performance. For the two networks studied, the relationship depends on the number of sensors used in the CWS.

In spite of the negative effect of loss of network model structural detail on CWS performance, CWSs designed using incomplete network models can provide substantial reductions in adverse consequences compared to results obtained when no CWS is used, except at high dose levels (less toxic contaminants), for which consequences tend to be localized near the injection location. Reductions at low dose levels are generally above 70% for the skeletonized networks and consequences considered.
Proper understanding of the basis for CWS performance requires an understanding of the performance of the individual sensors used in the CWS. As discussed for Network N1, apparently similar overall performance of two different CWS designs can be associated with very different results for the individual sensors in the system.
Mean-case designs developed using incomplete network models can provide worst-case results that are generally comparable to those obtained with worst-case designs developed using the same incomplete models, consistent with a conclusion that mean-case designs can provide robust results under conditions of uncertainty. However, results for more networks are needed before any broader conclusions can be made.
Improvements in network models, by reducing the uncertainty in their structural details, have the potential to yield significantly better performing CWSs. The magnitude of the potential improvement depends on the degree of the improvement in the network model and the nature of the contaminants of most concern. However, the results for the networks examined here suggest that a reduction in worst-case impacts by a factor of as much as about two or more is possible for contaminants whose effects are not localized near the injection location (cf. Table 3). In addition, evaluations of the expected performance of CWSs designed using all-pipes models should provide considerably more realistic results than evaluations of designs developed with incomplete network models, which yielded substantial (two to eight times) underestimates of impacts for the two networks examined.
The results presented here should be useful to those responsible for designing or implementing CWSs, in particular managers and engineers in water utilities. Hopefully, the results will help provide motivation for the improvement of existing network models.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. This paper has been subjected to the U.S. Environmental Protection Agency's (EPA's) review and has been approved for publication. The views expressed in this paper are those of the authors and approval does not signify that the contents necessarily reflect the views of the Agency. Mention of trade names, products, or services does not convey official EPA approval, endorsement, or recommendation.
Because of the confidentiality of the information, the identity of the real WDSs used in this paper and any information that could be used to identify the systems cannot be disclosed.
The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory ("Argonne"). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government