This systematic review of evidence suggests that wearing masks, wearing higher quality masks (respirators), and mask mandates generally reduced the transmission of SARS-CoV-2, although the quality of the studies analyzed was often poor and did not account for the highly transmissible Omicron variants.
- Testing, contact tracing, and isolation can substantially reduce the transmission of infection and improve public health outcomes.
- The majority of studies suggest a considerable impact of testing, followed by isolation and treatment of detected individuals.
- Contact tracing is a method of identifying potential secondary cases and can reduce the population-level growth rate or levels of infection.
- There is a clear need for more robustly designed experimental studies to inform TTI design and quantify TTI impact across diverse populations, over different levels of compliance, over different time periods, and for different epidemiological characteristics.
- More research is required to fully elucidate the epidemiological consequences of TTI under different scenarios, as well as the broader costs and benefits of different approaches to TTI.
This is from Royal Society Publishing in 2023 at https://royalsocietypublishing.org/doi/10.1098/rsta.2023.0133.
Top five keywords: face masks, SARS-CoV-2, COVID-19, transmission, systematic review
Abstract
This rapid systematic review of evidence asks whether (i) wearing a face mask, (ii) one type of mask over another and (iii) mandatory mask policies can reduce the transmission of SARS-CoV-2 infection, either in community-based or healthcare settings. A search of studies published 1 January 2020–27 January 2023 yielded 5185 unique records. Due to a paucity of randomized controlled trials (RCTs), observational studies were included in the analysis. We analysed 35 studies in community settings (three RCTs and 32 observational) and 40 in healthcare settings (one RCT and 39 observational). Ninety-five per cent of studies included were conducted before highly transmissible Omicron variants emerged. Ninety-one per cent of observational studies were at ‘critical’ risk of bias (ROB) in at least one domain, often failing to separate the effects of masks from concurrent interventions. More studies found that masks (n = 39/47; 83%) and mask mandates (n = 16/18; 89%) reduced infection than found no effect (n = 8/65; 12%) or favoured controls (n = 1/65; 2%). Seven observational studies found that respirators were more protective than surgical masks, while five found no statistically significant difference between the two mask types. Despite the ROB, and allowing for uncertain and variable efficacy, we conclude that wearing masks, wearing higher quality masks (respirators), and mask mandates generally reduced SARS-CoV-2 transmission in these study populations.
This article is part of the theme issue 'The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence'.
1. Introduction
Recommendations and mandatory policies to use medical or surgical masks, respirators such as N95, KN95 or FFP2 masks, and other facial coverings such as cloth masks have been commonly implemented among non-pharmaceutical interventions (NPIs) during the COVID-19 pandemic. Initially implemented in healthcare settings, mask recommendations and mandates for members of the public became more common globally as the pandemic progressed through 2020 and 2021. Although mask mandates had been discontinued in most jurisdictions by 2023, there remains a need to critically examine the role of masks among other NPIs in reducing the transmission of SARS-CoV-2 and other respiratory pathogens in preparation for future surges of COVID-19 or new epidemics.
Previous systematic reviews have examined evidence of the effectiveness of mask-wearing against respiratory viruses with varied conclusions, reflecting a lack of high-quality or conclusive data and heterogeneous methods of investigation. A living rapid review by Chou et al. [1] included randomized controlled trials (RCTs) and observational studies with a direct focus on COVID-19. At its last (eighth) update, the authors found low-to-moderate strength evidence supporting the benefit of mask use to prevent SARS-CoV-2 infection in the community. However, the review also concluded that there was insufficient evidence to recommend masks in healthcare settings. In an early-pandemic systematic review of COVID-19 and non-COVID-19-focused observational studies, Chu et al. [2] found evidence of low certainty to suggest that ‘face mask use could result in a large reduction in risk of infection’.
Recently, Jefferson et al. [3] completed a fifth revision of their 2006 Cochrane review of physical interventions (screening, isolation, quarantine, physical distancing, face masks and handwashing) to interrupt or reduce the spread of respiratory viruses. The review included 78 RCTs in community or healthcare settings. Many of the trials were conducted during non-epidemic influenza periods, with only six new trials (two focused on masks) conducted during the COVID-19 pandemic. The authors reported low to moderate certainty evidence on the effects of masks on the spread of influenza-like illness or COVID-19-like illness. They concluded that the high risk of bias (ROB) in the trials, variations in outcome measures and adherence to interventions during the studies hindered their ability to draw firm conclusions. In an earlier, pre-COVID-19 edition of their review, which included case–control and other observational studies, Jefferson et al. [4] concluded that implementing transmission barriers, isolation and hygienic measures is effective at containing respiratory virus epidemics, and that surgical masks or N95 respirators were the most consistent and comprehensive supportive measures.
These are the most important and prominent reviews, among a large number of reviews of variable quality, which inevitably include many of the same studies. The review that was limited to RCTs was unable to draw firm conclusions about the effectiveness of masks, whereas those that included observational studies were able to draw low to moderate strength conclusions that generally favoured the effectiveness of masks.
Due to the difficulty in performing rigorous RCTs to assess NPIs and the relative urgency to collect data of all kinds during the pandemic, observational studies have been the most common type of research to address the effectiveness of masks. Therefore, as in Chou et al. and Chu et al.'s work, this review includes a synthesis of observational data in addition to RCT data on the effectiveness of masks, different types of masks and mask mandates for reducing transmission of SARS-CoV-2. We have taken an inclusive approach, acknowledging the reduced certainty of conclusions drawn from observational studies, particularly studies done during a time of polarized opinion on NPIs, and when pressures on peer review led to greater variability in the quality of published studies [5,6].
(a) Research question
Our primary question was: What is the best available evidence about the effectiveness of masks in reducing transmission of SARS-CoV-2 in community-based and healthcare settings?
We also asked two subsidiary questions:
1. What is the best available evidence about which types of masks (respirators, surgical masks or other face coverings such as cloth masks) are the most effective at reducing transmission of SARS-CoV-2 in community-based and healthcare settings?
2. What is the best available evidence about the effectiveness of mandatory masking policies in reducing transmission of SARS-CoV-2 in community-based and healthcare settings?
2. Methods
This rapid review follows guidelines set out by the Joanna Briggs Institute for systematic reviews [7], with adjustments made due to its rapid timeline guided by Straus et al.'s guide to rapid reviews for policymakers [8].
In this review a ‘surgical mask’ is a multi-layer polypropylene mask as used in medical and surgical healthcare settings, a ‘cloth mask’ is a face covering of variable manufacture that covers the mouth and nose, and a ‘respirator’ is a polypropylene mask manufactured for higher filtration efficiency which is usually intended to be fitted to the wearer.
(a) Search strategy
The search was designed by the review team in consultation with a health sciences librarian, who finalized, translated and executed the search in PubMed (NCBI), iCite (NCBI), Embase (Ovid), CINAHL (EBSCOhost) and ERIC (ProQuest). The search terms limited results to studies published from 1 January 2020 in English. Keywords and controlled vocabulary terms related to SARS-CoV-2, COVID-19 and masks were combined with search filters to limit results to reviews, RCTs, quasi-experimental studies or cohort studies of SARS-CoV-2 transmission, and to exclude animal-only studies. Relevant preprints indexed in PubMed, iCite and Embase were retrieved by the search; no other preprint servers or trial registers were searched. Searches were initially executed on 25 November 2022 and updated weekly up to and including 27 January 2023. Search results were exported to EndNote for deduplication, then imported to Covidence (Veritas Health Innovation, Melbourne, Australia; www.covidence.org) for screening by the review team. The full search strategy in PubMed is included in the electronic supplementary material, Appendix S1.
(b) Source selection criteria
We included studies if they were written in English, reported on COVID-19 alone or in combination with other respiratory infectious diseases, and reported on masks as a method of reducing or preventing transmission of COVID-19. Included studies could be based either in the community or in a healthcare setting. RCTs, non-RCTs, quasi-experimental studies and observational studies (e.g. case–control studies, prospective or retrospective cohort studies) with a comparison group were included, whether published or preprint.
We excluded studies if they did not involve human participants, if they did not report mask-related data separately from other interventions, or if they relied on self-reported SARS-CoV-2 status of participants. We excluded studies that compared mask-wearing and COVID-19 infection in large groups (sometimes called ‘ecological studies’), for example across whole countries or across multiple countries, because of the high risk of confounding factors. Modelling studies, mechanical studies, laboratory-based studies, descriptive studies, prevalence studies, conference abstracts and non-studies (e.g. press releases, commentaries and letters to the editor) not reporting original data were excluded. Evidence syntheses were excluded but their lists of included studies were screened for relevant constituent studies, to cross-check our own literature search.
Sources were screened in Covidence at the title/abstract and full-text level by two independent reviewers. Conflicts were resolved by a third reviewer or by consensus. Studies excluded at the full-text level are listed in the electronic supplementary material, Appendix S2, with reasons for exclusion.
(c) Data extraction, risk of bias assessment and data analysis
Data extraction was completed in Microsoft Excel using a standardized data extraction form that is included in the electronic supplementary material, Appendix S3. Data were extracted by a single reviewer and verified independently by a second.
ROB was assessed using the ROB-2 tool for RCTs [9]. Observational studies were assessed using a modified version [10] of the ROBINS-I tool [11], which was appropriate for this rapid systematic review. The modified instrument is presented in the electronic supplementary material, Appendix S4. Each study was assessed by one reviewer and checked for accuracy by a second independent reviewer. Once a study met one criterion that made it ‘critical’ for ROB the assessment was stopped without completing it in full. ROB was not an exclusion criterion, but rather used as a tool for interpreting results of studies. The ROB tools are methods for identifying weaknesses in study design that fail to rule out bias, but they do not prove that bias exists.
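The stopping rule described above (assessment ends as soon as one domain is judged 'critical') can be illustrated with a short Python sketch. This is not the review team's instrument: the domain names in the example, the four-level severity scale and the convention that the overall judgement equals the worst domain judgement seen are assumptions made for illustration only.

```python
# Assumed ordering of judgements, from least to most severe.
SEVERITY = ["low", "moderate", "serious", "critical"]


def assess_study(domain_judgements):
    """Walk through (domain, judgement) pairs in order and stop at the first
    'critical' judgement.

    Returns the worst judgement seen and the list of domains actually assessed.
    Assumption: the overall judgement is the worst single-domain judgement.
    """
    worst = "low"
    assessed = []
    for domain, judgement in domain_judgements:
        assessed.append(domain)
        if SEVERITY.index(judgement) > SEVERITY.index(worst):
            worst = judgement
        if judgement == "critical":
            break  # remaining domains are left unassessed
    return worst, assessed


# Hypothetical study: the second domain is critical, so the third is never assessed.
example = [
    ("confounding", "serious"),
    ("classification of interventions", "critical"),
    ("missing data", "low"),
]
overall, domains_assessed = assess_study(example)
print(overall, domains_assessed)
```

A consequence visible in the sketch is that a study stopped early carries no information about its later domains, which is why the review does not fully grade the certainty of the evidence.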
We synthesized data narratively and in summary tables. We considered presenting the results as a meta-analysis, but ultimately deemed this inappropriate due to heterogeneity in study design, the variety of outcome measures, and poor reporting across the included studies. We did not carry out a formal GRADE assessment [12], but we considered the design of each study included in the review, the ROB, and the precision (confidence intervals) and direction of reported effects. Blobbograms were created to graphically display unpooled, unadjusted odds ratios (ORs) and 95% confidence intervals (CIs) for those studies reporting detailed data of SARS-CoV-2 infection events in those who wore a mask versus those who were unmasked, in those who wore a mask ‘sometimes’ versus those who were unmasked, or in those who wore a respirator versus a surgical mask. Blobbograms were created using Microsoft Excel.
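For readers who want to reproduce the kind of quantity plotted in the blobbograms, the following Python sketch computes an unadjusted odds ratio and a Wald-type 95% CI from a two-by-two table of infection events. The review states only that Microsoft Excel was used, so the function name, the Wald interval on the log-odds scale and the 0.5 continuity correction for empty cells are assumptions rather than the authors' documented procedure, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(events_masked, total_masked, events_unmasked, total_unmasked, z=1.96):
    """Unadjusted odds ratio (masked vs unmasked) with a Wald confidence interval.

    Assumption: if any cell of the 2x2 table is zero, 0.5 is added to every
    cell (Haldane-Anscombe correction) so that the log odds ratio is defined.
    """
    a = events_masked
    b = total_masked - events_masked
    c = events_unmasked
    d = total_unmasked - events_unmasked
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 10 infections among 100 masked, 20 among 100 unmasked.
or_, lower, upper = odds_ratio_ci(10, 100, 20, 100)
print(f"OR {or_:.2f} [95% CI: {lower:.2f}, {upper:.2f}]")
```

An interval that spans 1 corresponds to a bar crossing the line of no effect in the figures.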
3. Results
There were 5593 records identified by the database search, from which 408 duplicates were removed before screening. A total of 5185 records were screened at the title/abstract level, plus an additional 60 studies identified for screening from existing evidence syntheses. Of the 284 studies screened at the full-text level, 75 were included in this review. Figure 1 illustrates the screening process using a PRISMA 2020 Flow Diagram [13]. Table 1 summarizes the characteristics of included studies, with further detail including key findings presented in the electronic supplementary material, Appendix S5. Table 2 summarizes the number of studies addressing each type of comparison studied and the direction of their conclusions.
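The screening counts reported above can be cross-checked with simple arithmetic. The Python sketch below uses only the numbers given in this paragraph; the variable names are illustrative.

```python
# Counts as reported in the text.
identified = 5593         # records identified by the database search
duplicates = 408          # duplicates removed before screening
from_syntheses = 60       # extra studies identified from existing evidence syntheses
full_text_screened = 284  # studies screened at the full-text level
included = 75             # studies included in the review

# Records from the database search that reached title/abstract screening.
screened_db = identified - duplicates

# Studies excluded at the full-text stage.
excluded_full_text = full_text_screened - included

print(screened_db)         # matches the 5185 unique records reported
print(excluded_full_text)  # 209 studies excluded at full text
```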
Figure 1. PRISMA 2020 Flow Diagram.
Table 1.
High-level characteristics of included studies.
aNon-peer-reviewed preprint.
Table 2.
Effect of mask-wearing on SARS-CoV-2 transmission (number of studies of each type).
aNo statistically significant effect.
There were 35 studies set in the community (three RCTs [14,20,47] and 32 observational studies [15–19,21–46,48]) and 40 studies situated in healthcare settings including hospitals and long-term care facilities (LTCFs) (one RCT [72] and 39 observational studies [49–71,73–88]).
The United States was the most frequent host country, presenting 18 community-based studies and 10 healthcare-based studies, totalling 37% of all included studies. In community-based studies, Germany was the next most frequent host country (n = 4 studies), with the remainder coming from a variety of other countries. In healthcare-based studies, China (n = 5), France (n = 4) and India (n = 4) were common settings, followed by Italy (n = 3), Turkey (n = 2), Iran (n = 2) and two multinational studies. One study did not report its host country.
Studies with data collection taking place early in the pandemic (defined in this review as December 2019–July 2020) were far more common in healthcare settings (n = 30/40; 75%) than in community settings (n = 9/35, 26%). Most community-based studies took place from mid-2020 to late 2021 (n = 25/35; 71%). Only four studies (5%), three in the community and one in healthcare, reported on data collected during the Omicron era of the pandemic (November 2021–January 2023).
PCR was the most common method of testing for SARS-CoV-2 infection (n = 47/75; 63%), followed by serology (n = 18/75; 24%). Thirteen studies (17%) were not specific about the testing method used, only stating that COVID-19 cases were laboratory-confirmed or drawn from a database of confirmed infections. Enzyme-linked immunosorbent assay (ELISA) was the most common immunoassay used (n = 9/18; 50% of studies using serology; table 1).
Most observational studies of mask-wearing relied on self-reported mask-wearing data from participants. Thirty-three studies (44%) used questionnaires to collect this information, while 13 (17%) used interviews and seven (9%) used contact tracing methods that were not further described. Studies of mask mandates typically used publicly available data about whether mask mandates were in effect (n = 14/18; 78%). Five studies in healthcare settings were able to draw information about mask-wearing from COVID-19 incident tracking systems.
Of the 22 community-based observational studies of masks in general and types of masks, 14 studies (64%) asked whether participants wore masks but did not ask whether masks were worn by potential sources of infection. Four studies asked if either party had been masked, three asked if both parties had been masked and one focused solely on the mask-wearing of the COVID-19-positive contact. In healthcare settings, mask-wearing by healthcare workers (HCWs) and not by patients was reported in all but one study, which focused instead on mask-wearing by LTCF residents [50]. Six studies reported on mask use by both HCWs and patients, one of which [57] focused specifically on COVID-19 patients. The majority of studies evaluated whether individual mask wearers were protected from SARS-CoV-2 infection, but studies that measured effects in whole populations (e.g. cluster randomized trials, communities under mask mandates) did not distinguish whether transmission was reduced from infected mask wearers, to uninfected mask wearers, or both (electronic supplementary material, Appendix S5).
ROB was consistently high across all included study designs. All included RCTs were at ‘high’ ROB. All observational studies in healthcare settings (n = 39) and most studies in community settings (n = 29/32; 91%) were at critical ROB in at least one domain. Of the remaining three observational studies, two were at serious ROB [30,32] and one was at moderate ROB [15]. Critical ROB in observational studies was often related to study authors' inability to definitively relate outcomes to masks or mask mandates alone (n = 30/68; 44% of critical assessments) or due to a failure to adjust for other COVID-19 protective interventions either before or during the study period (n = 11/68; 16%).
(a) Masks for reducing transmission of SARS-CoV-2
Forty-seven studies reported on the effectiveness of masks for reducing transmission of SARS-CoV-2 as their primary outcome: 24 studies (two RCTs, 22 observational) in community settings [14–18,20,21,23,25–27,29,33,34,36–41,43,44,46,48] and 23 observational studies in healthcare settings [50,53–61,63–66,68,69,75–77,81,82,84,85].
Of the two RCTs conducted in community settings, one cluster RCT [14] found a 9.5% reduction in symptomatic seroprevalence (n = 105 fewer symptomatic seropositives; adjusted prevalence ratio [aPR] 0.91 [95% CI: 0.82, 1.00]) and an estimated 11.6% reduction in the proportion of individuals with COVID-19-like symptoms (n = 1541 fewer people reporting symptoms; aPR 0.88 [95% CI: 0.83, 0.93]) in communities where masks were distributed and their use promoted compared with control communities. In the other RCT [20], infection with SARS-CoV-2 occurred in 1.8% of participants in the mask group (recommended to wear medical masks) versus 2.1% in the control group (OR 0.82 [95% CI: 0.54, 1.23]), with the authors concluding that the difference was not significant. Of note, the latter RCT took place early in the pandemic (April–June 2020) and only 95 out of 4862 (2%) of participants who completed the study were infected with SARS-CoV-2, and self-reported adherence to mask-wearing among participants was poor.
Of the 45 observational studies, 39 (87%) found that mask-wearing was associated with a reduction in SARS-CoV-2 transmission. These 45 studies had a wide variation in study design and intervention characteristics, and almost all (n = 42; 91%) relied on self-reported mask-wearing data from participants. Five studies concluded that masks had no significant effect on transmission, and one favoured the control group. Notably, the study favouring controls [18] is one of the three preprints included in this review; it had not been peer reviewed nor subsequently published, and the data presented are inconsistent with the authors' own conclusion that masks were protective.
A subset of 24/47 studies (two RCTs, 22 observational; 14 in community settings, 10 in healthcare settings) reported the number of SARS-CoV-2 infection events in those who wore a mask versus those who were unmasked. The unpooled, unadjusted ORs and 95% CIs of these studies are graphically displayed in figure 2. The blobbogram illustrates that there were only two RCTs, and their effect sizes were smaller and within a closer range than most observational studies. The figure helps to illustrate between-study comparisons in direction of effect while also making explicit the existing sampling imbalance and wide CIs across many of the studies.
Figure 2. ORs and 95% confidence intervals of a subset of eligible included studies comparing masked versus unmasked. *Although Abaluck et al. [14] is a cluster RCT, the sample sizes presented in this figure represent events at the individual level.
A further subset of four studies, all observational and set in the community, reported the number of SARS-CoV-2 infection events in those who self-reported wearing a mask ‘sometimes’ versus those who were unmasked. The unpooled, unadjusted ORs and 95% CIs of these studies are visualized in figure 3. Two studies showed a positive association and two showed a negative association of sometimes wearing a mask with risk of SARS-CoV-2 infection.
Figure 3. ORs and 95% confidence intervals of a subset of eligible included studies comparing sometimes masked versus unmasked.
In summary, the great majority of studies found that masks (n = 39/47; 83%) reduced transmission, although the magnitude of measured effects was variable and the quality (precision and ROB) of evidence in both community and healthcare settings was low.
(b) Comparative effectiveness of different types of masks
Nineteen studies compared the effectiveness of different types of masks for reducing transmission of SARS-CoV-2: four studies (two RCTs, two observational) in community settings [14,16,23,47] and 15 studies (one RCT, 14 observational) in healthcare settings [50,52,57,68,70–74,78,80,82–84,88].
Two RCTs compared different types of masks in community settings. In the implementation cluster RCT cited above [14], surgical masks were associated with an 11.1% reduction in symptomatic seroprevalence (aPR 0.89 [95% CI: 0.78, 1.00]) compared with no statistically significant effect of cloth masks (aPR 0.94 [95% CI: 0.78, 1.10]). The other RCT [47] compared use of a closed face shield with surgical face mask to using a surgical mask alone to prevent SARS-CoV-2 infection. Only one participant tested positive for SARS-CoV-2 in the mask-only group versus three participants in the mask plus shield group. In the per-protocol analysis, the absolute risk difference was –1.4% (95% CI: –4.2%, 1.4%), indicating non-inferiority of the mask plus shield.
Of the observational studies in community settings, one study [16] found that N95 or KN95 masks and surgical masks were effective while cloth masks were not, but the other [23] found that type of mask was not significantly associated with infection risk. Both studies relied on self-reported mask-wearing data from participants without any attempts to observe adherence.
In healthcare settings, one non-inferiority RCT [72] compared different types of masks (fitted N95 respirator versus surgical mask) used by HCWs in hospital settings in Canada, Egypt, Israel and Pakistan. The authors identified an infection rate across all sites of 10.5% in the medical mask group compared with 9.3% in the N95 group (hazard ratio 1.14 [95% CI: 0.77, 1.69]), indicating that surgical masks were non-inferior to N95 respirators. Among participants who contracted RT-PCR-confirmed COVID-19, two participants were hospitalized in the medical mask group compared with one in the N95 group. Notably, data collection in Egypt took place when the highly transmissible Omicron variant was in heavy circulation. This contrasts with data collection in Canada that took place early in the pandemic. Furthermore, mask-wearing outside of work was not measured.
Of the 14 observational studies comparing different types of masks in healthcare settings, seven found that respirators were associated with a decreased risk of transmission compared with other masks and five found that there was no statistically significant difference between the two types. One study [88] compared respirator versus surgical mask use during aerosol-generating procedures (AGPs) for COVID-19 versus non-COVID-19 patients. It found that respirators were associated with lower infection rates than surgical masks in HCWs performing AGPs and non-AGPs for COVID-19 patients, but the opposite was true for those caring for non-COVID-19 patients, for whom wearing a respirator was associated with a higher risk of infection than wearing a surgical mask.
A subset of six studies in healthcare settings (one RCT, five observational) reported the number of SARS-CoV-2 infection events in those who wore respirators versus surgical masks. The unpooled, unadjusted ORs and 95% CIs of these studies are visualized in figure 4. The figure illustrates the variation in this evidence, with several 95% CIs crossing the line of no effect.
Figure 4. ORs and 95% confidence intervals of a subset of eligible included studies comparing respirator versus surgical masks.
In summary, where significant effects were reported, they favoured wearing higher quality rather than lower quality masks. However, the majority of studies suffered from a critical ROB in at least one domain, and effects were uncertain in magnitude and variable between studies.
(c) Mask mandates for reducing transmission of SARS-CoV-2
Eighteen observational studies reported on the effectiveness of mask mandates for reducing transmission of SARS-CoV-2: 10 in community settings [19,22,24,28,30–32,35,42,45] and eight in healthcare settings [49,51,62,67,77,79,86,87]. Sixteen of these studies (89%) found that mask mandates were associated with a reduction in transmission, while two found that they had no significant effect on transmission.
Six of the 10 community-based studies were set in schools. The majority of studies (n = 9/10; 90%) in the community found mask mandates to be associated with a reduced risk of SARS-CoV-2 transmission. The only community-based study to find mask mandates had no significant effect on transmission [35] studied transmission on airplanes before and after the implementation of a universal mask mandate. This retrospective study was carried out before Omicron and included 95 participants from 46 flights, among which there were only four instances of probable in-flight transmission (two before the mandate, and two after). Adherence to the mandate was not measured in this study, nor was it measured in any other study of mask mandates.
All hospital-based studies of mask mandates (n = 7) found mask mandates to be associated with lower incidence of SARS-CoV-2 infection. By contrast, findings from the only study based in LTCF indicated no significant effect of mask mandates on transmission [79].
In summary, the majority of included studies found that mask mandates reduced transmission of SARS-CoV-2, albeit with effects of variable magnitude and low precision. The quality of the evidence supporting mask mandates was low based on ROB assessments and heterogeneity in study designs.
4. Discussion
Most of the studies included in this review favoured wearing masks, wearing higher quality masks (respirators) and mask mandates to reduce SARS-CoV-2 transmission (table 2). Sample size, intervention implementation and measurement, measures of infection, and effect size all varied greatly across studies, and 95% (n = 71/75) were performed before the emergence of highly transmissible Omicron variants, which began in November 2021. RCTs addressing the three review questions were rare and were all assessed to be at high ROB. Although few in number, the two RCTs assessing the effectiveness of mask use produced smaller effect sizes (unadjusted OR 0.82 to 0.88, consistent with a 12–18% reduction) than observational studies, which had a wide range (unadjusted OR 0.08 to 1.28, consistent with a 92% decrease to 28% increase). Observational studies made up 95% of included studies and were almost all at critical ROB in at least one domain. Laboratory-confirmed SARS-CoV-2 was a required outcome measure for inclusion in this review, and while most studies were specific about the type of test used, 17% did not report this information. How mask-wearing was recorded varied among studies and typically relied on self-reporting by study participants. The results of all included studies apply under the conditions in which the studies were carried out and caution is needed when generalizing from them. For instance, findings related to the effectiveness of mask mandates do not mean that mask mandates are always effective or appropriate.
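The percentage changes quoted alongside the odds ratios in this paragraph follow from a one-line conversion, sketched below in Python. The function name is illustrative, and the result is a change in the odds of infection, which approximates a change in risk only when infection is uncommon in the study population.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds of infection implied by an odds ratio.

    Negative values indicate a reduction, positive values an increase.
    """
    return (odds_ratio - 1.0) * 100.0


# Odds ratios quoted in the text: RCT range, then observational range.
for odds_ratio in (0.82, 0.88, 0.08, 1.28):
    print(odds_ratio, round(percent_change_in_odds(odds_ratio)))
```

Applied to the quoted values, this gives reductions of 18% and 12% for the RCTs and a range from a 92% reduction to a 28% increase for the observational studies, in line with the figures in the text.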
Studies carried out in healthcare settings were among the first to contribute to the body of literature examining whether masks reduce the transmission of SARS-CoV-2 infection, with 75% of included healthcare studies occurring prior to July 2020. This is likely due to the implementation of masks in these settings at the very beginning of the pandemic before recommendations and mandates were extended to the public. While the timing makes these early studies less subject to confounding due to vaccines and other pharmaceutical interventions, they pre-date the emergence of several SARS-CoV-2 variants of concern, so it is not clear whether masks are effective, for example against highly transmissible Omicron variants. Additionally, masks in these settings were usually introduced in conjunction with other NPIs including personal protective equipment, visitor restrictions, strict quarantine and isolation rules, and in some cases, increased ventilation requirements in COVID-19 wards. Likewise, masks were rarely the only NPIs implemented in community settings, particularly early in the pandemic when people were encouraged to limit mobility, self-isolate when ill or in contact with cases and seek frequent testing. Regarding bias, there is a risk of overstating the effectiveness of masks if, for example, mask wearers were more cautious about meeting others in their communities. On the other hand, the potential effectiveness of masks could be understated if the risk of infection has been reduced by other concurrent NPIs, as suspected by Bundgaard et al. [20].
Because it is difficult to monitor mask-wearing in practice during a health emergency, the observational studies included in this review have relied on self-reported mask-wearing as a measure of mask use. Some studies made efforts to guard against recall bias by following participants prospectively and following up on positive COVID-19 results as soon as possible, thereby reducing the chance of such bias. This approach was more common in healthcare-based studies than those set in the community. Standardized methods for recording and reporting adherence to masking are needed, as is increased rigour in reporting the exact designs and outcomes of observational research.
There are a number of considerations and challenges inherent in designing high quality trials at the individual or population level to study the effectiveness of health interventions such as masking or adherence to mask mandates. This is particularly true when different actors with different experiences and varying environmental contexts behave in unpredictable ways that are difficult to measure consistently. Understanding if face masks are effective in reducing the transmission of SARS-CoV-2 is contingent on the intervention (masks or mask mandates) being delivered in a similar way and on study participants wearing masks in a similar way. Repeated-occurrence behaviours such as masking are known to vary from day to day at the individual level in response to environment/contextual and intra- and inter-individual factors [89]. These factors can influence recruitment, retention and outcome measures. Attention to and evaluation of trial process interventions and anticipation of potential barriers to adherence could enhance future trials [90].
Our review expands upon existing reviews by including more recent studies (released up to 27 January 2023), by broadening inclusion criteria to encompass a range of study designs and interventions, and by including evidence from community and healthcare settings. It also differs from Jefferson et al.'s and Chu et al.'s reviews in being restricted to COVID-19-related studies. The rates of severe infection and mortality from COVID-19 infection were significantly higher than those of influenza across age ranges, and initial estimates of the epidemiologic reproductive number established that SARS-CoV-2 is more transmissible than influenza. We therefore decided against directly comparing COVID-19 studies with studies of other illnesses and excluded non-COVID-19 studies from this review.
As this was a rapid systematic review, we abbreviated established systematic review methodology. We did not attempt to classify studies according to whether the masks reduced the transmission of infection from infected mask wearers to others, to uninfected mask wearers from other infected people, or both. We stopped our ROB assessments once a study received a critical judgement in at least one domain. Thus, we did not fully grade the certainty or validity of the evidence, and this limits our ability to estimate the impact of these studies' ROB on the certainty of their findings. However, it is important to note that a critical assessment in one domain did not necessarily mean that bias was detected, but rather that it could not be ruled out. Additionally, although not typically recommended for systematic reviews, the search for this rapid review included keywords related to outcomes of interest (e.g. ‘transmission’; ‘spread’). While this kept the number of results at a feasible level for rapid screening, relevant articles not reporting on these concepts in the title, abstract or subject headings were not captured. We compensated for this limitation by thoroughly screening the included studies of other systematic reviews related to our topic.
5. Conclusion
Most of the studies included in this rapid systematic review were observational rather than experimental. Study designs commonly suffered from a critical ROB. The effects measured in each study were variable in magnitude and generally of low precision. Nevertheless, taking together the evidence from all studies, we conclude that wearing masks, wearing higher quality masks (respirators), and mask mandates generally reduced the transmission of SARS-CoV-2 infection.
(a) Patient-identified key messages
Patient Partners on this project were invited to identify important findings from this synthesis work. Patients and families, particularly those with compromised health, worried about how the limited level of evidence supporting the use of masks to reduce transmission of SARS-CoV-2 will impact adherence, particularly in community settings.
Data accessibility
Supporting information and the data are provided in the electronic supplementary material [91].
Authors' contributions
L.B.: data curation, formal analysis, investigation, resources, visualization and writing—original draft; J.A.C.: conceptualization, funding acquisition, methodology, project administration, resources, supervision and writing—review and editing; A.G.: data curation, formal analysis, visualization and writing—review and editing; H.W.: data curation, formal analysis, visualization and writing—review and editing; C.J.: data curation, formal analysis, investigation, visualization and writing—review and editing; A.D.P.: data curation, formal analysis and writing—review and editing; L.S.: formal analysis, investigation, validation and writing—review and editing; D.C.: validation and writing—review and editing; J.C.: investigation, validation and writing—review and editing; T.F.: resources; J.Cl.: resources; C.D.: conceptualization, funding acquisition, investigation, methodology, project administration, resources, supervision, validation and writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
This project began as a living evidence synthesis of studies in community settings commissioned by the Office of the Chief Science Officer, Public Health Agency of Canada. The initial development and continued updating of that living evidence synthesis was funded by the Canadian Institutes of Health Research (CIHR) and the Public Health Agency of Canada. The addition of studies in healthcare settings and the preparation of this version of the manuscript were commissioned and funded by the Royal Society's Non-Pharmaceutical Interventions Programme. The opinions, results and conclusions are those of the team that prepared the evidence synthesis, and independent of the Government of Canada, CIHR, the Public Health Agency of Canada, and the Royal Society. No endorsement by the Government of Canada, Public Health Agency of Canada, CIHR, or Royal Society is intended or should be inferred.
Acknowledgements
We acknowledge the work of Dr John Lavis and the COVID-END Public Health and Social Measures Living Evidence Synthesis Working Group, who developed guidelines for the protocol of this work and the template on which the tables are based. Dr Lori Linkins developed the modified version of the ROBINS-I instrument on which our instrument is directly based. Additionally, the Steering Group of the Royal Society's NPI Programme contributed to the manuscript's format and methods. We acknowledge the work of Dr Andrea Tricco and the support of the SPOR Evidence Alliance in completing this review. We acknowledge the work of our research assistants who contributed to study screening and data extraction: Nawal Fatima, Erin McConnell, Tamar Gazit, Chloe Flynn, Madison Hickey, Stephanie Rowe and Jennifer Lane. We also acknowledge the work of Maritime SPOR SUPPORT Unit staff members Alice Bruce, Kristy Hancock and Amy Grant. Finally, we acknowledge with great appreciation Tamara Navarro-Ruan, health sciences librarian and Research Coordinator at the Health Information Research Unit, McMaster University, for leading development of the search and for executing all subsequent updates.
Footnotes
One contribution of 7 to a theme issue ‘The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence’.
Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.6688729.
© 2023 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
Abstract
Social distancing measures (SDMs) are community-level interventions that aim to reduce person-to-person contacts in the community. SDMs were a major part of the responses first to contain, then to mitigate, the spread of SARS-CoV-2 in the community. Common SDMs included limiting the size of gatherings, closing schools and/or workplaces, implementing work-from-home arrangements, or more stringent restrictions such as lockdowns. This systematic review summarized the evidence for the effectiveness of nine SDMs. Almost all of the studies included were observational in nature, which meant that there were intrinsic risks of bias that could have been avoided were conditions randomly assigned to study participants. There were no instances where only one form of SDM had been in place in a particular setting during the study period, making it challenging to estimate the separate effect of each intervention. The more stringent SDMs such as stay-at-home orders, restrictions on mass gatherings and closures were estimated to be most effective at reducing SARS-CoV-2 transmission. Most studies included in this review suggested that combinations of SDMs successfully slowed or even stopped SARS-CoV-2 transmission in the community. However, individual effects and optimal combinations of interventions, as well as the optimal timing for particular measures, require further investigation.
This article is part of the theme issue 'The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence'.
1. Introduction
Social distancing measures (SDMs) are interventions applied to individuals in the community that aim to reduce transmission by reducing person-to-person contacts or the chance of transmission when contact occurs, regardless of their infection or exposure status. The use of SDMs as a means to reduce community transmission of infectious diseases dates back to the 1918 influenza pandemic when school closures and restrictions on mass gatherings were implemented in the United States (US)—a policy decision that was later estimated to have saved thousands of lives [12]. School closures were also implemented during the 2009 influenza A(H1N1) pandemic. However, as infections were generally of mild-to-moderate severity and antiviral treatments and vaccines became quickly available, SDMs were implemented for less than a year. These SDMs were also less restrictive and measures to restrict human mobility more generally were not implemented. During the COVID-19 pandemic, restrictions on mass gatherings, school closures, business closures and restrictions on human mobility (both within and across national borders) were implemented in most countries and in many cases for prolonged periods of time.
SDMs may be applied to specific community settings where there is thought to be a higher risk of disease transmission or a higher impact of outbreaks than in the general community. With the help of rigorous contact tracing in some locations, it was identified early in the COVID-19 pandemic that clusters of cases were occurring in settings that involved close interpersonal interactions or industries that require their employees to work directly with clients or the public [3,4]. Social and dining activities that occur in restaurants, bars and weddings were associated with more secondary cases than households for individuals of the same age [5]. Many social settings, and environments where personal care services are delivered with face masks removed, may carry a higher risk of disease transmission, especially when they are in enclosed spaces. As such, SDMs that limited group sizes for dine-in restaurants, reduced the capacity of venues or limited opening hours were implemented. Outbreaks in long-term care facilities were particularly concerning because infections in frail older adults were often more severe than those in younger individuals. SDMs in care homes during the COVID-19 pandemic generally took the form of restricting visitors or cohorting residents and staff, where sick or exposed residents were grouped together and/or dedicated staff were assigned to only work within those groups. This is challenging as care homes often rely on agencies to provide staff on an on-call basis due to well-documented pre-existing staffing challenges [6]. In 2009, it was estimated that up to 60% of nursing homes in the US relied on agency staff [7]. This reliance meant staff members would commonly work in multiple care homes, which contributed to the introduction of infections in some care homes during the pandemic. A study carried out in the US in 2020 found that the risk of infection in nursing homes increased ninefold if the homes hired staff through agencies [8].
Lockdowns, also known as stay-at-home orders, were the most restrictive SDM where the majority of the population were required to stay-at-home with exceptions granted only for exercise and essential shopping. The stringency of this measure varied across the world due to the need to balance preventing transmission that could lead to numerous hospitalizations and deaths while also avoiding large contractions of the economy or the breakdown of essential services. Therefore, some stay-at-home orders were targeted at some segments of the community rather than community-wide, for example, allowing construction sites or factories to remain open. Here, we focused on reviewing the effect of SDMs in community settings, including those applied in high-risk settings.
2. Methodology
Systematic literature searches, each with its own search terms, were performed to obtain studies reporting the effectiveness of nine specific SDMs (school closures; school measures; workplace closures; workplace measures; catering, fitness and personal care service measures; care home measures; restrictions on mass gatherings; physical distancing and stay-at-home orders) on the transmission of SARS-CoV-2 in the community (electronic supplementary material, appendix B).
Database searches were carried out in Web of Science and Scopus from 1 January 2020 to 1 December 2022. All study designs were considered for inclusion in this review. However, if there were more than 10 observational studies or studies of higher quality of evidence such as randomized controlled trials or quasi-experimental studies, simulation studies (defined as modelled scenarios without fitting to any observed data) were excluded. Throughout this review, the term ‘ecological study’ refers to the investigation of an association using population-level rather than individual-level data, and may, therefore, be vulnerable to ecological fallacy [9]. Studies that use statistical or mathematical modelling methods that were fitted to observed data will be referred to in this review as ‘modelling studies’ while studies that only simulate hypothetical epidemics based on parameter estimates or assumptions will be referred to as ‘simulation studies’. Preprints were excluded from this review. Broad Google searches were also conducted to find any existing systematic reviews, and relevant studies included in these reviews were also included here.
We included in this review quantitative studies that estimated the effect of SDMs implemented in the community setting that were aimed at reducing SARS-CoV-2 transmission or disease severity by reducing person-to-person contacts or making such contacts safer. Community settings were defined as non-healthcare settings where medical care by health professionals is not usually delivered, including homes, schools, workplaces and long-term care facilities in the community. Interventions designed to reduce transmission in the community through other mechanisms, such as improved ventilation or face masks, are not considered in this review.
The titles, abstracts and full texts of search results were screened by two reviewers. Data were extracted from included studies, and the quality of the evidence was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework, a tool for evaluating the evidence available for an intervention based on eight criteria, four of which are based on the risk of biased estimates due to study design or measurement errors [10,11]. In this framework, the certainty or quality of evidence is categorized into four levels: very low, low, moderate or high. The quality of evidence from randomized trials is initially rated as high, and evidence from observational studies is initially rated as low. This initial rating is then downgraded when there are potential risks of bias (e.g. selection or misclassification bias), inconsistencies of findings with published literature, indirectness of reported outcomes compared with true outcomes (e.g. the use of non-specific outcomes), imprecise measurement of exposure or outcomes and the likelihood of publication bias; or upgraded if the reported effect size is large despite plausible residual confounding that may reduce or nullify the effect, and if there is an appreciable dose–response relationship between intervention and outcomes [11]. Thus, findings from well-conducted randomized controlled trials would generally be considered high-quality evidence with this tool. By contrast, observational studies are often classified as either low or very-low quality.
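The rating logic described above can be illustrated with a minimal sketch. It assumes, for simplicity, that each concern moves the rating by exactly one level and that concerns are simply counted; in practice GRADE ratings involve reviewer judgement, so this is not the procedure used in this review. The function and argument names are hypothetical.

```python
# Simplified illustration of GRADE-style certainty rating (hypothetical helper).
# Assumption: each downgrade or upgrade shifts the rating by one level.
LEVELS = ["very low", "low", "moderate", "high"]

# Initial rating by study design, as described in the text.
START_LEVEL = {"randomized": "high", "observational": "low"}


def grade_certainty(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return a certainty level after applying downgrades and upgrades.

    design     -- "randomized" or "observational"
    downgrades -- number of concerns (e.g. risk of bias, inconsistency,
                  indirectness, imprecision, publication bias)
    upgrades   -- number of upgrading factors (e.g. large effect,
                  dose-response relationship)
    """
    start = LEVELS.index(START_LEVEL[design])
    # Clamp the result so it stays within the four defined levels.
    index = max(0, min(len(LEVELS) - 1, start - downgrades + upgrades))
    return LEVELS[index]


print(grade_certainty("randomized"))                   # well-conducted trial
print(grade_certainty("observational", downgrades=1))  # observational with one concern
```

Under these assumptions an observational study with a single concern falls to "very low", which is consistent with the statement that observational studies are often classified as low or very-low quality.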
Data extraction and GRADE assessments were conducted using Microsoft Excel, and further analyses of extracted data were conducted using R v. 4.2.1 (R Foundation for Statistical Computing, Vienna, Austria).
3. Results
The nine systematic reviews included 338 studies, among which 48 reported effectiveness estimates for more than one SDM (figure 1; electronic supplementary material, appendix A, Table S3) [12–59]. Most studies analysed population-level data and examined SARS-CoV-2 transmission and/or COVID-19 mortality and morbidity (including hospitalizations and deaths) in the presence or absence of the intervention. The main reasons studies were excluded were that the intervention was not evaluated in a community setting or the outcome was unrelated to the effect of the intervention on SARS-CoV-2 transmission (figure 1). As most of the evidence identified in this review came from observational studies, the quality or certainty of the evidence was rated as low or very low for most studies (electronic supplementary material, appendix A) based on the GRADE framework, indicating that the true effect may differ from the estimated effect. However, with over 300 studies included in this review, patterns indicative of the effectiveness of SDMs have emerged. When modelling studies or simulation studies were included due to a lack of randomized or observational studies, their quality of evidence was not assessed because the GRADE framework was not designed for their assessment. In general, simulation studies that did not fit models to data would typically be considered to provide a lower quality of evidence than observational studies or modelling studies that did fit mathematical models to epidemic curves.
Owing to the high volume of papers and concerns over the quality of evidence presented in some studies, we highlight here studies that were conducted more thoroughly (attempted estimating a causal relationship while adjusting for confounders and quantifying uncertainty) and examined whether other studies supported their findings. Details of the included studies are provided in electronic supplementary material, appendix A, and the visualization of their study period and reported effects are included in electronic supplementary material, appendix C.
(a) Stay-at-home orders
During the COVID-19 pandemic, stay-at-home orders were also referred to as lockdowns, shelter-in-place, mandatory control orders or in some locations as circuit-breaking measures. Italy was the first European country to implement stay-at-home orders, on 9 March 2020, and they lasted over 60 days. As infections spread, the UK also announced a lockdown on 23 March 2020 and began a phased reopening by mid-May that year. Here, we included 151 studies estimating the effectiveness of stay-at-home orders (electronic supplementary material, appendix A, Table S13), 119 of which found a substantial benefit resulting in a reduction of the reproduction number (Rt) [16,23,33,35,38,45,48,60–97], incidence of SARS-CoV-2 infection [29,50,52,98–129] and mortality [107,116,130–143].
Among the studies that reported a relative reduction in Rt, most estimated substantial reductions of around 50%, although there was a wide range of effects (6–81%). These studies had different study designs, populations and definitions for stay-at-home orders. They were mainly carried out at a national scale within the first year of the pandemic (electronic supplementary material, appendix C). Definitions differed in stringency, where a stay-at-home order may include lockdown-type measures such as restricting internal travel and imposing limitations on gatherings versus the most stringent where individuals were unable to leave their homes for anything other than exercise or essential shopping. A modelling study that estimated the effects of 17 non-pharmaceutical interventions across two waves in seven European countries estimated that a lockdown (banning all gatherings and closing all non-essential businesses) reduced the Rt by 52% (95% CrI: 47%, 56%) [45]. Two studies in the US in 2020 found similar reductions, estimating a reduction of Rt by 51% (95% CI: 46%, 57%) after stay-at-home orders were implemented in some states [38,80]. However, another study carried out on a multi-national scale in early 2020 concluded that stay-at-home orders had a relatively small additional effect (on top of business closures, school closures and gathering restrictions that were already in place), reducing the Rt by 13% (95% prediction interval (PI): 5%, 31%) [16]. A study in Europe in 2020 also estimated a smaller additional effect of lockdowns when implemented on top of other measures [28].
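To make the meaning of a relative reduction in Rt concrete, the following minimal sketch applies the point estimates quoted above (52% and 13%) to a baseline Rt. The baseline value of 2.0 is purely a hypothetical assumption for illustration and is not taken from any included study; the function names are likewise illustrative.

```python
# Illustrative arithmetic only; the baseline Rt of 2.0 is a hypothetical value.

def apply_relative_reduction(rt_before: float, reduction: float) -> float:
    """Rt after an intervention that reduces it by the given fraction."""
    return rt_before * (1.0 - reduction)


def reduction_needed_for_control(rt_before: float) -> float:
    """Smallest relative reduction that brings Rt down to 1."""
    return 1.0 - 1.0 / rt_before


baseline_rt = 2.0  # assumed for illustration

# Point estimates of relative reduction quoted in the text.
for label, reduction in [("lockdown, 52%", 0.52), ("additional effect, 13%", 0.13)]:
    rt_after = apply_relative_reduction(baseline_rt, reduction)
    status = "below 1" if rt_after < 1 else "above 1"
    print(f"{label}: Rt {baseline_rt:.2f} -> {rt_after:.2f} ({status})")

print(f"Reduction needed at baseline: {reduction_needed_for_control(baseline_rt):.0%}")
```

With this assumed baseline, a 52% reduction would bring Rt just below 1, whereas a 13% reduction on its own would not, which illustrates why the size of the additional effect matters for epidemic control.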
Estimates of the impact of stay-at-home orders on COVID-19 incidence varied across settings. A study in Australia from 2020 to 2021 estimated that the lockdown in Victoria decreased the incidence of COVID-19 two weeks after its implementation (incidence rate ratio (IRR): 0.88; 95% CI: 0.86, 0.91) [113]. In comparison, a multi-national analysis that looked at 210 countries in early 2020 found that stay-at-home orders reduced the incidence of COVID-19 by 11.2% [52]. Three studies did not find a significant association between stay-at-home orders and COVID-19 cases [14,28,144]. The evidence on the effectiveness of stay-at-home measures in reducing mortality was mixed, with 16 studies [107,116,130–141] reporting reductions, and nine studies reporting no significant associations [19,28,30,47,144–148]. Notably, one study in Europe [28] and another in the US [144] concluded that social distancing behaviours had already changed substantially before stay-at-home orders were implemented in early 2020, and therefore little additional benefit was observed after the stay-at-home orders were issued.
(b) School measures and closures
Historically, children have played an important role in influenza transmission due to their susceptibility to influenza virus infection and prolonged viral shedding, which facilitates transmission to family members and, by extension, the community. Based on this prior understanding, schools in many countries were proactively closed during the pandemic. In some locations, however, interventions to reduce within-school transmission were used as an alternative to complete closures of schools. These include rotating school schedules (e.g. children from different grades may be in school at different times), reducing the number of consecutive school days, physical distancing or limiting classroom capacities. Some classes were dismissed early to avoid having meals together or if a case was identified in that class.
Eighteen included studies estimated the effectiveness of school measures (not including closures) to reduce the impact of COVID-19. Six were observational studies [149–154] and 12 were simulation studies [155–166]. Most studies were carried out in 2020 at a sub-national scale (electronic supplementary material, appendix C). Seven of these studies estimated the effect of school measures in combination with the mandatory or recommended use of face masks [149,151–153,155,157,160].
Four of the six observational studies assessed individual schools between the end of 2020 and early 2021. Three did not quantify the effect but observed minimal transmission in schools with SDMs and universal masking interventions in place despite substantial community transmission [150,151,153]. The remaining study, which examined the effect of multiple distancing measures in 36 schools in Italy (including limited capacity, distanced student desks and minimized crowding at entries and exits), observed that the overall secondary transmission rate was 3.8%, although there was no comparator without the interventions in place [152]. Two other observational studies had similar findings [150,154]. One that used data from 35 school outbreaks across 12 countries from 2020 to July 2021 suggested that distancing and masking were both associated with a lower risk of SARS-CoV-2 infection in schools (adjusted odds ratio (aOR): 0.30, 95% CI: 0.25, 0.37) [154]. Another ecological study examining schools in North Carolina and Wisconsin, US, from 2020 to 2021 did not observe an increase in the secondary transmission rate in schools after distancing measures were relaxed, indicating that the measures had no effect on transmission in these schools [149]. The remaining simulation studies found that school measures were associated with reductions in public health impacts of COVID-19, both in the schools [156–158,160,162–166] and the community [155,159,161].
For the review of school closures, 104 studies were included. All the studies were observational; about 89% estimated the impact of proactive school closures during the COVID-19 pandemic. The remaining 11% estimated the impact of reopening of schools. Forty-eight studies were conducted on a national scale, and 35 were on a multi-national scale. Over half of these studies showed strong evidence for the effectiveness of this intervention. Compared with the other individual interventions, school closures were examined for longer into the pandemic, with 29 studies assessing their effectiveness in 2021 (electronic supplementary material, appendix C).
Thirty-two studies estimated the effects of school closures on a country's community epidemics [13,17,19,23–26,28,31,34,36,38,39,41,42,44,46,51–54,56,58,167–175] and 26 studies estimated the impact of closures on students, staff or specific age groups in the population [40,176–200]. Ten studies estimated the relative reduction of Rt in the population [23,26,34,38,39,42,51,53,170,171] with school closures implemented alone or in combination with other interventions. One study that attempted to discern the individual effect of school closures estimated they were responsible for reducing the Rt of SARS-CoV-2 in the US in the first half of 2020 by 37% (95% CI: 33%, 40%). This was followed by daycare closures (31%, 95% CI: 26%, 35%) [95]. In comparison, another study in the US carried out from January to May 2020 found school closures were associated with a 10% reduction in the daily Rt [38].
Several studies estimated the effect of proactive school closures internationally and estimated reductions in the incidence [13,25,28,34,54,58] and transmission [31,39,41,42,44] of SARS-CoV-2, and their associated hospitalization [46] and mortality [19,172–174]. The impact of school closures varied, potentially due to differences in study populations, study period and the timing of implementation of school closures across settings. A study that estimated the independent contribution of school closures across 30 European countries from January to April 2020 reported that the IRRs for COVID-19 cases were estimated at 52% 22–28 days after school closures (compared with a pre-intervention baseline). This estimate was 14% after more than 36 days of school closures [28], which indicated negative associations between school closures and the IRR for COVID-19 cases. These associations were unclear in six studies conducted on a multi-national scale in 2020 [17,24,36,52,56,175]. One study that looked at the general population across 90 countries did not observe statistically significant effects of school closures on daily confirmed COVID-19 cases during the first global wave of the pandemic but estimated significant reductions in the second and third waves [175].
Twelve studies examined the impact of reopening schools [21,201–211]. Nine showed an increasing trend in the number of daily new confirmed COVID-19 cases, growth rate or Rt of COVID-19. One study in South Carolina, US, estimated the Rt increased by 12.3% (95% CrI: 10.1%, 14.4%) after schools reopened at the end of August 2020 [21]. However, one study found that the reopening of schools did not immediately impact SARS-CoV-2 incidence, and an increase in incidence was not observed until 13 weeks after reopening [203]. The remaining four studies observed that reopening schools did not generate a substantial increase in transmission within the community when other interventions to prevent SARS-CoV-2 transmission were in place, including staff wearing masks, hand sanitation and limiting the in-person capacity in schools [198,208–210].
Sixteen studies also examined different school closure strategies [12,20,37,47,48,55,59,212–220]. For instance, two studies estimated that the delay of school closures in March 2020 was associated with more deaths across 50 states in the US [212,213]. Similarly, two more studies estimated the impact of the timing of the implementation of school closures [55,214]. One study in the US estimated that every additional day of delay from a county's first case until implementation of school closures was associated with 1.5–2.4% higher cumulative COVID-19 deaths per capita (980–1972 deaths) for a county with median population and deaths per capita [214]. In Pakistan, a city that implemented complete school closures for 10 days saw a greater decline in incidence for the overall city population compared with a city that partially closed schools [217]. In the remaining 18 studies [22,27,36,45,49,50,221–232], the effects of school closures were estimated alongside other interventions, making it challenging to isolate the individual effect of school closure strategies.
(c) Workplace measures and closures
Twelve observational studies were included in the review for workplace measures [17,21,41,55,233–240]. These were carried out at a national and sub-national scale, with nine studies conducted within the first year of the pandemic (electronic supplementary material, appendix C). Ten studies examined the effectiveness of workplace measures using population-level data [17,21,55,234,236–238,240–243]. Six studies that used workplace mobility to explain the variation in the COVID-19 growth rate, case numbers or Rt [17,55,234,236,237,240] consistently showed that reduced workplace mobility was significantly associated with reduced SARS-CoV-2 incidence or transmission [234,236,237,240]. Some analyses were based on data from a single country [17,55], and others considered data from different countries [41,235].
Two retrospective cohort studies estimated the effectiveness of individual measures instead of a combination of measures at a population level [233,239]. Seven companies in Spain implemented a digital application in May 2020 that monitored workers in real-time to enable the quick identification and isolation of workers. Over a seven-month period, the proportion of symptomatic employees continuously decreased [233]. Another study on temperature screening implemented by 20 multi-national companies in February 2021 found that the detection of COVID-19 cases using this measure alone was very rare, and approximately 2000 workers who were diagnosed with COVID-19 during the study period were not identified [239].
Thirty-seven studies were included in the review for workplace closures, 90% of which estimated effects in 2020. It is, therefore, unclear whether the levels of effectiveness would be similar for newer SARS-CoV-2 variants. All the included studies were observational, and they estimated the effect of workplace closures at a population level alongside other interventions. Most studies (92%) observed a beneficial effect of workplace closures alone or in combination with other interventions to reduce incidence [15,24,25,27,29,32,46,52,244–250] of COVID-19 and transmission [18,23,24,33,35,38,39,43,48,49,251] of SARS-CoV-2. A study that examined the effect of workplace closures alone estimated that non-essential workplace closures across 13 countries in Europe from March to May 2020 reduced the change in deaths by 4 percentage points (95% CI: 0.5, 7.4 pp). Another study based on county-level data in the US in early 2020 estimated that had SDMs such as non-essential business closures and stay-at-home orders been implemented a day earlier, the COVID-19 death rate could have been lowered by 1.9% [252]. However, a study using data from 10 countries (England, France, Germany, Iran, Italy, Netherlands, Spain, South Korea, Sweden and the US) that compared more stringent SDMs (mandatory business closures and stay-at-home orders) in England with less stringent policies in Sweden and South Korea in early 2020 did not observe additional effects with more stringent measures on the growth rate of cases in any country [14].
(d) Catering, fitness and personal care service measures
Nine studies estimated the effectiveness of measures in catering, fitness and personal care service settings, including one randomized controlled trial [253], one cross-sectional study [254], three ecological studies [57,255,256] and four simulation studies [257–260]. One study examined fitness centres [253], and another looked at the reopening of theatres [260], while the remaining seven studies estimated the effectiveness of SDMs in restaurants and bars. Six studies were conducted on a national or sub-national scale, while three simulation studies had an unclear setting or study period [258–260].
Catering measures appeared to be effective at reducing SARS-CoV-2 infection and Rt, although the effect varied by specific distancing measures. A study in Spain estimated that shortened bar and restaurant business hours and restricted outdoor seating capacity from August 2020 to January 2021 were associated with significant reductions in Rt of 0.14 and 0.11, respectively [57]. Similarly, in Norway, SARS-CoV-2 infection among bartenders and waiters fell by 60%, from 2.8 (95% CI: 2.0, 3.6) per 1000 workers to 1.1 (95% CI: 0.5, 1.6) per 1000 workers, four weeks after the implementation of a ban on serving alcohol in late 2020 to early 2021. The partial ban decreased infections among bartenders and waiters by 50%, from 2.5 (95% CI: 1.5, 3.5) per 1000 workers to 1.3 (95% CI: 0.4, 2.2) per 1000 workers [254]. However, this was not supported by two observational studies that were both carried out in 2020, in Hong Kong and in Tokyo, Japan [255,256]. The ban on dine-in services after 18.00 in restaurants in Hong Kong may not have influenced Rt after capacity reductions had already been considered [255]. Similarly, the randomized controlled trial carried out in Norway in 2020 suggested that there was no significant difference in laboratory-confirmed SARS-CoV-2 infections between the intervention (access to a fitness centre implementing prevention control measures) and control (no access to fitness centres) groups after 14 days. However, there were concerns about several aspects of the trial, such as the low incidence of COVID-19 during the trial period and the risk of transmission by exercising in groups, whether in a fitness centre or not [261].
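The relative reductions quoted for Norway can be checked against the reported rates per 1000 workers. The following is a minimal sketch of that arithmetic using only the point estimates given in the text; the function name `percent_reduction` is ours, and the calculation ignores the confidence intervals, so it illustrates the rounding rather than reproducing the study's analysis.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Point estimates (infections per 1000 workers) as quoted in the text [254].
full_ban = percent_reduction(2.8, 1.1)
partial_ban = percent_reduction(2.5, 1.3)

print(f"ban on serving alcohol: {full_ban:.1f}% reduction")
print(f"partial ban: {partial_ban:.1f}% reduction")
```

Under these inputs the reductions come out at roughly 61% and 48%, which is consistent with the rounded figures of 60% and 50% reported above.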
(e) Care home measures
Sixteen studies, including 11 epidemiological studies [53262–271] and five simulation studies [272–276] examined the effect of SDMs on SARS-CoV-2 transmission in care homes or the effect of care home SDMs on population-level transmission. Nine of the 16 studies were carried out at a sub-national scale (electronic supplementary material, appendix C).
Two studies are discussed in depth here, as both reported the effect of cohorting of staff and/or residents while taking confounding factors into consideration. One study on long-term care facilities in the south-west of France in early 2020 estimated that if staff were organized into smaller groups to work in different areas of the facility with no physical connection to other groups, there was a reduced risk of infection (odds ratio (OR): 0.19, 95% CI: 0.07, 0.48) [268]. This was not the case if residents were similarly compartmentalized (OR: 3.01, 95% CI: 0.51, 18.51) or if they were restricted to their rooms (OR: 1.67, 95% CI: 0.49, 5.76) [268]. However, the 95% confidence intervals for compartmentalizing residents were very wide owing to the small sample that reported they implemented this intervention (less than 20% of the 124 long-term care facilities examined). A second study carried out in the UK during mid-2020 reported that the risks of infection in both residents (aOR: 1.30, 95% CI: 1.23, 1.37) and staff (aOR: 1.20, 95% CI: 1.13, 1.29) were significantly higher in long-term care facilities in which staff often or always cared for both infected and uninfected residents compared with those that always cohorted staff [269].
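As a reading aid for the odds ratios above, the sketch below converts each point estimate into a percentage change in odds and flags whether the 95% confidence interval excludes 1. The helper `describe_odds_ratio` and the dictionary labels are illustrative names of ours, and the rule that an interval excluding 1 indicates significance at the 5% level is the usual convention rather than something taken from the cited studies.

```python
def describe_odds_ratio(or_point: float, ci_low: float, ci_high: float):
    """Return (percent change in odds, whether the 95% CI excludes 1)."""
    change = 100.0 * (or_point - 1.0)
    excludes_one = ci_low > 1.0 or ci_high < 1.0
    return change, excludes_one


# Estimates as quoted in the text; labels are shorthand for this example.
estimates = {
    "staff compartmentalized [268]": (0.19, 0.07, 0.48),
    "residents compartmentalized [268]": (3.01, 0.51, 18.51),
    "residents restricted to rooms [268]": (1.67, 0.49, 5.76),
    "residents, staff not cohorted [269]": (1.30, 1.23, 1.37),
    "staff, staff not cohorted [269]": (1.20, 1.13, 1.29),
}

for label, (point, low, high) in estimates.items():
    change, excludes_one = describe_odds_ratio(point, low, high)
    note = "CI excludes 1" if excludes_one else "CI includes 1"
    print(f"{label}: {change:+.0f}% odds ({note})")
```

Read this way, only the staff compartmentalization estimate from the French study and the two UK estimates have intervals that exclude 1, which matches the interpretation given in the paragraph above.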
Rigorous modelling, fitted to surveillance data in England, examined a different care home measure: restricting the number of visitors per nursing home resident [265]. It estimated the impact of reducing the contact rate between care home residents and the general population by 50% in 2020. However, compared with a baseline scenario with no reduction in contacts, the study did not find a substantial difference in care home deaths [265]. It was suggested that this may have been due to other routes of transmission into care homes at the time, such as patients being discharged from the hospital to care homes without being tested, indicating the importance of understanding the routes of transmission to increase the effectiveness of interventions and reduce the impact of COVID-19. This was supported by another simulation study in English nursing care homes [275].
Two population-based studies [264,266] ranked the US state-level restrictions by the stringency of their measures and compared COVID-19 incidence across states. Using data from just over 6800 nursing homes, one study estimated that the states with more stringent measures had an 11% reduction in the new cases in residents (IRR: 0.89, 95% CI: 0.83, 0.97) compared with states with less stringent restrictions [266]. Similarly, the risk of infection for those residing in assisted living communities was lower in states with more stringent measures [264]. In the first half of 2020, the US implemented a ban on visits to nursing homes to reduce contact with the community when the prevalence was high. This was estimated to reduce the weekly effective reproduction number of SARS-CoV-2 across 3035 US counties by 26% (95% CI: 23%, 29%) [53].
Other than cohorting and restricting visitors, staff working in multiple care homes was considered a large source of SARS-CoV-2 introductions and subsequent outbreaks [277]. The role of these connections in spreading COVID-19 cases was examined among nursing homes in the US in 2020 by analysing smartphone location data from 50 million smartphones over an 11-week period. Results indicated that if a nursing home adds one neighbour (a home with at least one shared contact) the expected number of COVID-19 cases increases by 15.2% and further suggests that 49% of cases in nursing home residents were attributed to staff transmitting SARS-CoV-2 across numerous homes [262].
(f) Physical distancing and restriction of mass gatherings
Physical distancing is a measure taken by individuals to keep a recommended distance from one another (usually a minimum of 1 m) to limit transmission [278]. However, distancing oneself from others is also the key to many of our community-wide measures, and therefore, the term physical distancing also encapsulates measures including restrictions on mass gatherings, working from home, staying at home or adaptations to schools and workplaces [278]. Thirty-four studies were identified in the review of physical distancing (electronic supplementary material, appendix A) [13,18,26,29,32,33,42,45,51,56,279–302], 19 of which were observational studies. Most studies were carried out in 2020 and at a national scale, and 33 studies found physical distancing effective (electronic supplementary material, appendix C).
A rigorous modelling study that examined the individual effects of interventions in seven European countries from 2020 to 2021 found business closures were particularly effective with a 35% (95% CI: 29%, 41%) reduction in Rt, while closures of gastronomy venues and of nightclubs were each estimated to reduce the Rt by 12% (95% CI: 8%, 17%), and the closure of leisure centres, entertainment venues, zoos and museums had minimal effects. Considered collectively, banning all gatherings, including one-to-one meetings, had a large effect with a 26% (95% CI: 18%, 32%) reduction in Rt [45]. Seven other studies [26,29,33,42,56,283,286] attempted to estimate individual effects. The remaining studies examined the combined effects of different packages of measures due to the differing interventions included in the term physical distancing. For example, another study across 50 states in the US during the first five months of 2020 estimated social distancing was associated with a 15.4% daily reduction in COVID-19 cases where the SDMs included limits on the size of group gatherings, closures of public schools and non-essential business and stay-at-home orders [32]. Only one study assessed physical distancing at an individual level between students in schools. The study compared public schools in Massachusetts, US, that implemented greater than or equal to 6-feet distancing to those with greater than or equal to 3-feet distancing between students. However, likely due to a small sample size, it did not capture any additional effects that distancing by 6 feet may have [301].
Two studies suggested that many infections and deaths could have been avoided had physical distancing measures been implemented earlier [279,296]. One study in New York, US, estimated that if the interventions were implemented a week earlier, the total number of COVID-19 cases would have been reduced by nearly 162 000 as of 31 May 2020. If there was a one-week delay, the total number of cases could have increased from 203 261 to 1 407 600 [279]. Cases and deaths were also estimated to fall by 35.2% and 30.8%, respectively, in Iran from January to September 2020 if physical distancing interventions (including school and border closures) and self-isolation were implemented a week earlier [296].
The effectiveness of restricting mass gatherings for reducing the impact of COVID-19 was examined in 28 studies [12,16,18,19,23,28,33–35,38,39,41,43–45,52,57,59,303–312], and 26 reported a substantial reduction in the impact of COVID-19. The most common outcome was the Rt [16,18,23,33,35,38,39,43,45,57,307–310] of SARS-CoV-2. Only two studies did not find statistically significant effects of restrictions on mass gatherings on SARS-CoV-2 transmission [19,23]. The majority of studies were carried out in 2020 on a multi-national scale (electronic supplementary material, appendix C). Half of the studies (14) made use of the Oxford COVID-19 Government Response Tracker [16,18,19,33–35,39,44,59,303–305].
A comprehensive modelling study, using data from 41 countries, estimated the effects of different levels of stringency for restrictions on gatherings in the first half of 2020 [16]. The associated reduction in Rt for restricting gatherings to 1000 people or fewer was 23% (95% PI: 0%, 40%); limiting gatherings to 100 people or fewer was 34% (95% PI: 12%, 52%) and limiting gatherings to 10 people or fewer was 42% (95% PI: 17%, 60%) [16]. The finding that the effectiveness of restricting mass gatherings increased with stringency was supported by six other studies [16,38,44,52,59,305] across the world (e.g. 37 OECD countries, 30 Asian countries and 50 states in the US) during varying periods in 2020. Five studies only found a significant impact on SARS-CoV-2 transmission when the mass gathering restrictions reached the maximum limit of 10 people or fewer [16,35,45,310,311]. The effectiveness of restrictions on mass gatherings also seemed to increase over time. For instance, one study estimated that the restriction of public gatherings of more than 10 people was associated with a reduction in Rt by 6% on day 7, 13% on day 14 and 29% on day 28 across 131 countries in the first half of 2020 [35]. A similar pattern of effect on daily cases was also found in two other studies [28,34]. However, the period of assessment was short (the first half of 2020) and therefore did not consider the long-term implementation of such restrictions, where adherence could have waned in subsequent waves.
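To make the stringency gradient concrete, the sketch below applies the three point estimates from [16] to a baseline Rt. The baseline value of 1.5 is a hypothetical chosen purely for illustration, the function name `rt_after_reduction` is ours, and the calculation assumes each percentage acts multiplicatively on Rt with no other measures in place; it ignores the prediction intervals, which are wide.

```python
def rt_after_reduction(baseline_rt: float, reduction_pct: float) -> float:
    """Rt after a multiplicative percentage reduction is applied to a baseline."""
    if not 0.0 <= reduction_pct <= 100.0:
        raise ValueError("reduction_pct must be between 0 and 100")
    return baseline_rt * (1.0 - reduction_pct / 100.0)


BASELINE_RT = 1.5  # hypothetical baseline, not taken from the review

# Point estimates of the reduction in Rt by gathering limit, as quoted from [16].
reductions = {1000: 23.0, 100: 34.0, 10: 42.0}

for limit, pct in reductions.items():
    rt = rt_after_reduction(BASELINE_RT, pct)
    status = "below 1" if rt < 1.0 else "at or above 1"
    print(f"limit of {limit} people: Rt = {rt:.2f} ({status})")
```

With this assumed baseline, only the limits of 100 and 10 people would bring Rt below 1, which illustrates why the level of stringency can matter for epidemic control even when every level produces some reduction.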
4. Discussion
This review identified 338 studies that assessed the impact of SDMs on reducing the transmission of SARS-CoV-2 in community settings. Nearly half of the studies included in this review estimated the effectiveness of stay-at-home orders, 79% of which were found to substantially reduce transmission. The main response variable assessed was the Rt of SARS-CoV-2 in national and multi-national settings. As the effectiveness of interventions differed across populations and across demographics within a population, as well as potentially over time, examining the experiences in multiple countries was beneficial to understand the average effect and variations across the world. Further research could examine the drivers of policy impact, such as the degree to which human behaviour changed, especially across time, as most studies for the stay-at-home orders review were carried out in 2020. The potential for interactions with individual protective measures is also worthy of further study. Logically, as contacts are reduced, the Rt of SARS-CoV-2 becomes smaller, which should result in a decline in transmission and, by extension, a decline in COVID-19-associated morbidity and mortality. While in the minority, nine studies found no significant associations between stay-at-home orders and morbidity and mortality, suggesting that other voluntary behavioural changes had occurred before more stringent interventions were implemented. This highlights one of the key considerations for SDM policies, namely that a policy will only have an effect if it changes the behaviours or actions of individuals in the community. While we reviewed the impact of various SDMs, most studies did not have quantitative data on the degree to which behavioural changes occurred in response to the SDMs.
Estimating the effects of individual interventions proved challenging. By 30 May 2020, five SDMs (closures of schools, workplaces and public transport, restrictions on mass gatherings and public events and restrictions on movement) were already implemented in 118 of the 149 countries examined [29]. Therefore, a common difficulty for the studies included in this review was disentangling the effects of any one of the SDMs from other interventions applied at the same time or further understanding the incremental benefit of each. High-quality randomized controlled trials are also infeasible for many SDMs and as such observational evidence and modelling studies guided decision makers on appropriate measures. With that said, studies have made use of different statistical models to reliably estimate the effects of individual measures, indicating that the more stringent interventions such as stay-at-home orders, followed by restrictions on mass gatherings and school closures, were likely to be the most effective at reducing the transmission of SARS-CoV-2. Thus, the combined effects of numerous SDMs, including school closures, workplace closures and restricting mass gatherings, were highly successful at reducing transmission in the community. This was also evident from the lack of other respiratory viruses circulating during the COVID-19 pandemic and the subsequent increase in influenza and respiratory syncytial virus transmission upon relaxation of all non-pharmaceutical interventions [170,313–315].
Human mobility became a common indicator for estimating the impact of non-pharmaceutical interventions. Mobility data can arise from public transportation data or aggregated mobile device data to estimate the movements and mixing of a population. When SDMs are implemented, we would expect to see corresponding reductions in mobility and, therefore, contacts. Eighteen studies [13,17,18,27,30,36,55,188,198,210,219,234,237,240,245,246,248,283] included in this review used human mobility as a proxy for estimating the effect of interventions, although we did not explicitly include human mobility in the search terms of our review. Searching all abstracts that were reviewed for the term ‘mobility’ yielded 267 results. Some of these may not have been relevant to our outcomes; nonetheless, because mobility was not included as a search term, it is likely that many relevant studies were missed. Sixteen of the eighteen studies used Google mobility data (mobile phone devices), which have been shown to accurately explain the transmission of SARS-CoV-2 [316–319]. Mobility data could also be used to assess the level of adherence to interventions in place. Adherence, or changes in a population's behaviour, can impact the effectiveness of SDMs. Unintended changes in mobility and population behaviour can be prompted by the initial measures put in place. For example, when schools close, some parents may require work-from-home arrangements, reducing the potential for transmission in the workplace.
Stringent community-wide measures come at a high cost to society. However, studies that examined unintended consequences of non-pharmaceutical interventions, such as mental health, were not included in the scope of this review. Nonetheless, we recognize that due to unintended consequences of school and workplace closures, measures to reduce person-to-person contact and thereby reduce the risk of SARS-CoV-2 transmission within both settings may be preferred to substantially limit viral transmission while limiting the adverse educational and socioeconomic effects. More evidence is needed to identify the optimal combination of measures to be implemented in these settings. Most studies in this review were population-based studies. While observational evidence has its limitations, it will remain an important source of information to guide decision-making in future epidemics or pandemics. Subgroup analyses were not frequently assessed and may be a focus for future research. Overall, countries adopted different approaches to managing COVID-19, and many of the SDMs were implemented together. Even though different SDMs may have varied acceptability and feasibility in different socioeconomic and cultural settings, the studies included in this review suggested that the combination of measures was successful in slowing or even stopping the spread of COVID-19, even though some individual effects and optimal combinations are unclear. When considering SDMs in future pandemics, the potential effectiveness of individual or combination of SDMs needs to be assessed in the context of pathogen transmission dynamics and balanced against the socioeconomic impacts of such interventions.
Data accessibility
The data are provided in the electronic supplementary material [320].
Authors' contributions
C.Mu.: data curation, methodology, writing—original draft, writing—review and editing; W.W.L.: writing—review and editing; C.Mi.: visualization, writing—review and editing; J.Y.W.: methodology, supervision, writing—review and editing; D.C.: investigation; Y.X.: investigation; M.L.: investigation; S.G.: investigation; H.X.: investigation; J.K.C.: investigation; S.B.: writing—review and editing; B.J.C.: conceptualization, methodology, supervision, writing—review and editing; C.A.D.: conceptualization, methodology, supervision, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
This study was financially supported by a grant from the Royal Society. C.Mi. is supported by a studentship from the UK's Engineering and Physical Sciences Research Council. B.J.C. is supported by the Theme-based Research Scheme (Project no. T11-705/21-N) of the Research Grants Council of the Hong Kong SAR Government and the Collaborative Research Scheme (Project no. C7123-20G) of the Research Grants Council of the Hong Kong SAR Government. C.A.D. is supported by the UK National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emerging and Zoonotic Infections in partnership with Public Health England (PHE), Grant Number: HPRU200907.
Acknowledgements
The authors thank Ian Boyd, Charles Godfray, Mark Walport and two anonymous reviewers, for helpful feedback on an earlier draft. The authors thank Julie Au, Matthew Barnbrook and Kyle Bennett for technical support.
Footnotes
†Joint senior authors.
One contribution of 7 to a theme issue ‘The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence’.
Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.6677632.
© 2023 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
Abstract
We conducted a systematic literature review of general population testing, contact tracing, case isolation and contact quarantine interventions to assess their effectiveness in reducing SARS-CoV-2 transmission, as implemented in real-world settings. We designed a broad search strategy and aimed to identify peer-reviewed studies of any design provided there was a quantitative measure of effectiveness on a transmission outcome. Studies that assessed the effect of testing or diagnosis on disease outcomes via treatment, but did not assess a transmission outcome, were not included. We focused on interventions implemented among the general population rather than in specific settings; these were from anywhere in the world and published any time after 1 January 2020 until the end of 2022. From 26 720 titles and abstracts, 1181 were reviewed as full text, and 25 met our inclusion criteria. These 25 studies included one randomized control trial (RCT) and the remaining 24 analysed empirical data and made some attempt to control for confounding. Studies included were categorized by the type of intervention: contact tracing (seven studies); specific testing strategies (12 studies); strategies for isolating cases/contacts (four studies); and ‘test, trace, isolate' (TTI) as a part of a package of interventions (two studies). None of the 25 studies were rated at low risk of bias and many were rated as serious risk of bias, particularly due to the likely presence of uncontrolled confounding factors, which was a major challenge in assessing the independent effects of TTI in observational studies. These confounding factors are to be expected from observational studies during an on-going pandemic, when the emphasis was on reducing the epidemic burden rather than trial design. Findings from these 25 studies suggested an important public health role for testing followed by isolation, especially where mass and serial testing was used to reduce transmission. 
Some of the most compelling analyses came from examining fine-grained within-country data on contact tracing; while broader studies which compared behaviour between countries also often found TTI led to reduced transmission and mortality, this was not universal. There was limited evidence for the benefit of isolation of cases/contacts away from the home environment. One study, an RCT, showed that daily testing of contacts could be a viable strategy to replace lengthy quarantine of contacts. Based on the scarcity of robust empirical evidence, we were not able to draw any firm conclusions about the quantitative impact of TTI interventions in different epidemic contexts. While the majority of studies found that testing, tracing and isolation reduced transmission, evidence for the scale of this impact is only available for specific scenarios and hence is not necessarily generalizable. Our review therefore emphasizes the need to conduct robust experimental studies that help inform the likely quantitative impact of different TTI interventions on transmission and their optimal design. Work is needed to support such studies in the context of future emerging epidemics, along with assessments of the cost-effectiveness of TTI interventions, which was beyond the scope of this review but will be critical to decision-making.
This article is part of the theme issue ‘The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence’.
1. Introduction
Preventing people who have an infectious disease from making contact with others has been one of the cornerstones of public health practice since antiquity [1]. During the COVID-19 pandemic substantial efforts were made by many countries to identify people who were potentially infectious through the use of polymerase chain reaction (PCR) tests or lateral flow viral antigen tests, to isolate these people from others, and to trace and quarantine their contacts—who were also potentially infected. In theory, test, trace and isolate (TTI) methods can work well when the incidence of disease is low, and the generation time is long [2]. The low level of infection means that limited public health resources can be targeted to where they are most needed, and the long generation time means that isolation of the infected individual can occur before many secondary infections are generated. In addition, contact tracing—finding likely secondary infections due to their contact with the identified case—may help to more rapidly identify and prevent further onward transmission from these new infections and further target public health resources. Such testing has been common within public health responses to sexually transmitted infections [3], which often conform to the ideal of low incidence, slow transmission and ease of identifying sexual contacts (although the recent outbreak of Mpox is a counterexample due to the difficulty of identifying the sexual contacts of some of those who were affected [4]).
Testing followed by isolation of the identified infected individual has the potential to break chains of transmission and limit the spread of SARS-CoV-2; when coupled with contact tracing to form TTI there is the potential to accelerate this process. Compared to interventions that aim to reduce transmission through blanket reductions to social contacts, such as stay-at-home orders or workplace and educational closures, TTI interventions aim to identify and prevent contacts between infectious and susceptible people only. They can therefore be seen as more specific strategies compared to population wide social contact reductions, but which could come at a cost to sensitivity and thus epidemic control, if identifying and preventing these contacts is incomplete or delayed. However, as with many public health interventions, there can be a complex range of practical and contextual issues that may affect the strength of the effect. These can be broadly broken down into three categories that relate to: (i) the specific nature of the TTI practices that are employed, (ii) how members of the public engage with these practices and (iii) the current status of the pandemic. In the first category, we include factors such as the speed and sensitivity of the testing mechanism, as well as the speed and accuracy of any contact tracing. The second category includes the public's ability and willingness to follow testing guidance (e.g. testing on early symptoms or testing before entering specific settings) and the extent to which the population is able to adhere to isolation rules [5]. The third category includes the prevalence of infection and the transmissibility of the specific variants that are circulating, as well as the presence of other non-pharmaceutical interventions (NPIs) that reduce the risk of transmission.
(a) Uses of testing, tracing and isolating
During the COVID-19 pandemic, testing and/or TTI were used in a complex and overlapping range of situations (figure 1), which changed as the epidemic progressed:
1) Focused testing of symptomatic and high-risk individuals especially during the early phase of an in-country outbreak, when infection is rare, to contain the spread and prevent transmission among the wider population. This was the case in the UK during the early outbreak, when symptomatic individuals, in combination with other specific travel histories and contact-based criteria, were tested and manual contact tracing performed [6]. In some countries, an intensive TTI system was implemented that aimed to fully contain re-introduced epidemics following periods of zero or very low transmission.
2) Mass testing of those with symptomatic respiratory disease to identify SARS-CoV-2 infection, such that individuals could take appropriate action including home isolation and authorities could organize contact tracing. In the UK, this was deployed from May 2020 to April 2022, with minor differences across the four nations [7]; prior to an increase in testing capacity, the UK had adopted a less specific approach of asking everyone with a new onset cough or fever to isolate [8].
3) Mass testing of those entering healthcare or other vulnerable settings such as care homes [9]. Here, the aim was to block the interaction between vulnerable susceptible people and those who were infectious, thereby reducing transmission to individuals most likely to suffer severe consequences.
4) Mass asymptomatic testing of all individuals, in an attempt to identify both symptomatic and asymptomatic infections and hence dramatically reduce the amount of circulating infection [10].
5) Regular asymptomatic testing to identify all infectious individuals in workplace and educational settings. In the UK, this process required healthcare workers [11], care home workers [9,12] and secondary school children [13] to perform regular (twice weekly) tests using lateral flow devices (LFDs) in attempts to minimize spread.
6)
7) Testing of identified at-risk individuals. While the previous testing protocols are concerned with identifying infections, tests could also be used to reduce the time spent in isolation; either to inform the safe early release from isolation [19] or testing at-risk contacts rather than asking them to isolate [20]. In January 2022, the UK advised that people could end self-isolation after 5 days, if they had two negative lateral flow tests taken on consecutive days [21].
Two main forms of diagnostic test were used throughout the COVID-19 pandemic: LFDs and PCR tests [22,23]. PCR and LFD have different targets for detecting SARS-CoV-2: PCR is a laboratory-based system that amplifies and detects viral RNA; LFDs detect the viral antigen [24]. Extensive studies have shown that in general, PCR tests have a higher sensitivity at lower viral loads, thus detecting infections early, as well as beyond the period of high infectiousness, while LFDs are most sensitive when sampled viral load is highest, which is at times of high transmission risk [25–27]. While LFDs have lower sensitivity over the full course of infection, they have the substantial advantage that they provide a rapid assessment (within 15–30 min) that can be performed in the home environment. Here, we do not focus on the differences between PCR- and LFD-based testing, which will be context-dependent, but instead consider the impact that all forms of testing may have on reducing transmission.
When investigating the impact of testing, it is important to consider the wider context of the outbreak, including the presence of other control measures, the incidence of infection and the characteristics of the dominant variant. Control interventions such as mask-use, social distancing or other NPIs aim to reduce the level of transmission from infectious to susceptible individuals, while vaccination aims to reduce the number of susceptible individuals in the population. All of these control measures might help to reduce transmission, such that TTI can be less effective and still maintain containment or control. PCR testing and manual contact-tracing are highly intensive activities, such that when there is a high incidence of infection resources can become saturated, leading to longer delays and weakening the impact of TTI [28]. As the dominant global variants have evolved from wild-type to Alpha, Delta and Omicron there has been an increase in transmission, implying that a greater number of secondary cases need to be prevented to contain the infection. The Omicron variants also have a noticeably shorter generation time [29], meaning that testing, isolation and tracing also need to be achieved more rapidly to maintain their effectiveness.
Another key challenge to assessment of TTI interventions is that the intervention is often also what is used to measure the outcome, e.g. reported confirmed cases. This presents a challenge to evaluation, because a short-term rise in infections detected post-implementation of a TTI intervention could indicate a lack of effectiveness or even harmful effect of the intervention, or it could simply reflect TTI doing what it is intended to do in improving detection of cases.
(b) Testing and tracing policies
While there have been many different testing policies and practices in different countries [30] and over the course of the pandemic (as knowledge and resources have changed), the impact of any testing policy is fundamentally reliant on human behaviour [31], which might change over time [32], and the extent to which the population is supported to take up and adhere to policies [30,33,34]. Even the most theoretically rigorous TTI intervention is bound to fail if members of the public do not engage with it. For example, if accessing a testing site requires the use of public transport or if information about testing policies is not available in a suitable range of languages, adherence might be lower [35]. Within the UK, concern specifically focused on whether people received adequate financial compensation to enable them to take time off work in order to isolate or to care for children who needed to isolate [36] and whether people recognized that their symptoms necessitated taking a test [37].
Contact tracing policies may also be designed in different ways. In the UK, the National Health Service (NHS) Test and Trace employed a ‘forwards tracing’ approach that sought to identify and quarantine people who might have been exposed to a confirmed case during a time when that case would have been infectious. For SARS-CoV-2, this ‘window’ was often defined as 2 days before symptom onset (or test date if asymptomatic) and 14 days after, based on what was understood about the duration of infectiousness and timing in relation to symptom onset. A ‘backwards tracing’ approach seeks to also identify the source of a confirmed case's infection, which would then enable forwards tracing of additional chains of transmission that might have otherwise been missed. Theoretical modelling work showed that this approach could be particularly effective for an infection like SARS-CoV-2, which was observed to have a highly skewed secondary case distribution: that is, a small proportion of cases led to a disproportionately large number of secondary cases, while many cases led to none or few. Backwards tracing could help to identify those relatively few cases who had transmitted to a high number of secondary cases [38,39]. However, there are challenges that might limit effectiveness in practice, such as recall of contacts further back in time [40] and the resources required to implement the approach.
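The skewed secondary case distribution described above can be illustrated with a short simulation. This is a minimal sketch, not taken from the studies cited: it assumes secondary cases follow a negative binomial distribution (drawn as a gamma-Poisson mixture), and the mean of 2.5 and dispersion of 0.2 are illustrative values chosen only to produce strong skew. The function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def secondary_cases(rng, n_cases, mean_r, dispersion_k):
    """Negative binomial secondary case counts via a gamma-Poisson mixture."""
    scale = mean_r / dispersion_k
    return [poisson(rng, rng.gammavariate(dispersion_k, scale))
            for _ in range(n_cases)]


def share_from_top(offspring, top_fraction):
    """Fraction of all transmission caused by the most infectious cases."""
    ranked = sorted(offspring, reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


rng = random.Random(1)
# Assumed, illustrative parameters: mean 2.5 secondary cases, dispersion 0.2.
offspring = secondary_cases(rng, 20000, mean_r=2.5, dispersion_k=0.2)
top_share = share_from_top(offspring, 0.2)
zero_share = sum(1 for x in offspring if x == 0) / len(offspring)
print(f"top 20% of cases cause {top_share:.0%} of transmission")
print(f"{zero_share:.0%} of cases infect nobody")
```

Under these assumptions most simulated cases infect nobody, so a randomly chosen confirmed case is more likely to have been infected by one of the few high-transmission cases than to be one itself. This is the intuition behind tracing backwards to the source.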
The probability of identifying cases by forward tracing and quarantining them before they become infectious could be further enhanced by tracing and quarantining contacts-of-contacts or secondary contacts, as was implemented in Vietnam [41]. While theoretically more effective at reducing transmission, this approach to tracing and quarantining becomes less specific, and could become so extensive throughout the contact network as to lead to effective ‘mini lockdowns' within local communities, diluting the targeted nature of the isolation measures [42].
Another approach to contact tracing which is similar in practice to backwards and contacts-of-contacts tracing was based on setting. Rather than attempting to trace individual contact links which might be too numerous or difficult to individually determine, cases were instead linked to settings they had attended for a sufficient period during the time that they had been infectious. Contacts were considered to be all other people present at this time and place. While this approach varied over time and was not extended to all types of settings in the UK—not least because of practicality and privacy concerns—at some points during the pandemic in the UK, whole school classes were quarantined without attempting to trace children's individual contacts [43].
One fundamental difference between the COVID-19 pandemic and most previous outbreaks has been the widespread use of electronic apps to automate contact tracing [44,45], speeding up the process, eliminating delays occurring during manual contact tracing and potentially identifying individuals that would otherwise be hard to trace. Most contact tracing apps operate through mobile phones by monitoring local Bluetooth signals. This allows them to record the presence of other phones in the vicinity and hence log ‘close’ interactions (e.g. interactions lasting at least 15 min within 2 m). This log of close interactions is only transferred to a centralized system if the phone owner reports a positive test, helping to maintain privacy and security. Phones listed in the log can then be contacted and their users advised to take appropriate action, usually precautionary isolation and testing.
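The logging rule described above can be sketched as follows. This is an illustration only and does not reproduce any real app's implementation: the `Sighting` record, the function names and the choice to sum exposure time per contact are all assumptions. The 15 min and 2 m thresholds are the example values given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """One hypothetical proximity record stored locally on the phone."""
    other_id: str       # anonymous identifier of the nearby phone
    minutes: float      # duration of the sighting
    distance_m: float   # estimated distance, e.g. from Bluetooth signal strength


def close_contacts(log, min_minutes=15.0, max_distance_m=2.0):
    """Return IDs whose cumulative time within range meets the threshold."""
    time_in_range = {}
    for s in log:
        if s.distance_m <= max_distance_m:
            time_in_range[s.other_id] = time_in_range.get(s.other_id, 0.0) + s.minutes
    return sorted(pid for pid, total in time_in_range.items() if total >= min_minutes)


def contacts_to_notify(log, tested_positive):
    """The log is only shared if the phone owner reports a positive test."""
    return close_contacts(log) if tested_positive else []


log = [
    Sighting("A", 10, 1.0), Sighting("A", 6, 1.5),  # 16 min within 2 m
    Sighting("B", 30, 5.0),                          # long, but too far away
    Sighting("C", 5, 1.0),                           # close, but too brief
]
print(contacts_to_notify(log, tested_positive=True))
print(contacts_to_notify(log, tested_positive=False))
```

Keeping the log on the device and releasing it only after a positive test mirrors the privacy-preserving design described in the text.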
Although testing is primarily a tool for identifying and then isolating infectious individuals, the testing data is also extremely useful for public health monitoring of the outbreak, and hence projection of the likely short- and medium-term behaviour. These analyses and projections can then help a range of government agencies, including health care providers, to plan for future changes in incidence [46]. Additionally, samples sent to laboratories for PCR testing can also undergo genomic sequencing allowing the rapid identification of new variants of concern and highlighting growing risks to the appropriate government agencies.
Here, we have conducted a thorough literature review to identify peer-reviewed publications assessing the real-world effectiveness of testing, contact tracing and isolation/quarantine interventions on the transmission of SARS-CoV-2 infection. Previous reviews of contact tracing for infectious diseases more broadly have included relatively little evidence related to COVID-19 specifically and/or have not included studies looking only at testing and/or isolation intervention components [47,48]. The evidence presented in this article has accumulated over the course of the COVID-19 pandemic; it was not present at the start of the outbreak to directly inform policy makers. Similarly, data specific to any future outbreak will take time to generate, although some of the general principles shown here are likely to hold.
2. Methods
This review focused on studies that quantitatively assessed the real-world impact on SARS-CoV-2 transmission outcomes of testing, contact tracing and isolation/quarantine interventions during the COVID-19 pandemic. The review forms one part of a larger project focusing on a range of non-pharmaceutical measures to contain the COVID-19 pandemic, funded by the Royal Society. To enhance the quality of this rapid review, we followed the Cochrane rapid review guidance [49], with a modification noted below.
(a) Inclusion and exclusion criteria
We included studies that provided a quantitative measure of real-world TTI effectiveness on reducing SARS-CoV-2 transmission. These included but were not restricted to the following study designs: randomized controlled studies (RCTs); prospective cohort studies; retrospective cohort studies; controlled before-and-after studies; interrupted time series; case-control studies; cross-sectional studies; and mixed methods studies (if a quantitative measure of impact could be extracted). Concentrating on real-world effectiveness studies, we excluded mathematical modelling studies; however, we did identify them during abstract and full paper screening and so are able to provide a list of mathematical modelling studies otherwise fitting our inclusion criteria as ‘supporting studies', listed in electronic supplementary material, appendix 3. For observational studies, we anticipated that many measured and unmeasured confounding factors could be present, including but not limited to differences in population demographics, immunity, infection variants, other transmission control interventions in place at the time, epidemic dynamics and test-seeking behaviour (assuming this is not an intervention under study). We therefore restricted full extraction and inclusion to studies that attempted to adjust for such confounding factors. Potential for risk of bias due to confounding was still assessed as part of study methodological appraisal, as described in more detail below. Uncontrolled observational studies were listed as supporting studies in electronic supplementary material, appendix 3.
We included only studies published in peer-reviewed journals, a restriction that provides an additional element of quality control to our review; studies published only on preprint servers were excluded. The following types of study were also excluded: studies with no quantitative measure of impact; purely qualitative studies; diagnostic studies focusing solely on sensitivity and specificity of testing; non-empirical studies (e.g. commentaries, editorials, literature reviews); systematic reviews; and conference abstracts or reports.
We included studies assessing transmission and TTI effectiveness in the general population only. This general population could occur in any global location, with no restrictions relating to age, gender or health conditions. Any studies in which interventions were delivered in specific settings only were excluded. These included, but were not limited to: care homes; hospitals and other healthcare settings; schools; specific workplace types; and prisons. Given the high degree of vulnerability in some of these settings and the degree of close clustered contact, results in these settings may amplify the impacts of controls meaning that their findings might not be applicable outside the setting.
Studies were included if they focused on any of the following four components, individually or in combination:
1. SARS-CoV-2 testing in combination with a transmission reduction element based on the results of the test, including: isolation initiation; contact tracing; prevention from attending setting (e.g. care home, workplace, school, healthcare setting); prevention from attending events; and release from isolation. Studies focusing on diagnostic testing to identify clinical treatment and pathways with no explicit transmission outcome were excluded. This means that we did not include studies whose primary outcome was an assessment of test sensitivity and specificity, which did not go on to assess the use of these tests in an intervention to reduce transmission.
2. Contact tracing, if this is in combination with transmission reduction elements, including: quarantine initiation; and SARS-CoV-2 testing.
3. Isolation of probable infectious cases (PCR/antigen tested or symptom-based), including interventions to facilitate and enable successful contact reduction while in isolation (e.g. advice, support payments, accommodation provision).
4. Quarantine of exposed contacts of infectious cases, including interventions to facilitate successful contact reduction for the duration of quarantine (e.g. advice, support payments).
Studies of interventions employing TTI components which did not measure transmission outcomes were excluded; this may include interventions aiming to improve treatment outcomes among those infected or studies for which a transmission outcome is not reported. Studies where TTI was employed as part of an international border policy were also excluded from our analysis because these studies were included in a separate review on border policies. To make a meaningful assessment of the impact of TTI components requires a comparator or counterfactual; in our analysis, eligible comparators included populations, settings or time periods with no TTI intervention or a variation in intensity to the intervention under study. Intervention versus non-intervention assessment were included, for instance, the random assignment of individuals to testing or non-testing (control) groups, or the controlled comparison of a time period in which contact tracing was implemented to a time period when it was not. Variation in intervention intensity included examples of differential numbers of tests per capita in different regions or changes to contact tracing (as measured by a standardized scale) over time. Although the focus of this review was on the reduction in transmission due to TTI interventions, transmission was rarely measured directly, so other outcomes of interest were assessed. These transmission-related outcomes are listed below:
- positive SARS-CoV-2 tested infections (PCR or LFDs)
- secondary/tertiary attack rates
- estimated incidence
- estimated infections averted
- growth rate of confirmed cases or deaths
- the effective reproductive ratio R or Rt
- peak height/incidence per day
- deaths
- rates of hospitalization
- intensive care unit (ICU) bed usage
Deaths, hospitalizations and ICU bed usage outcomes were also included, but only if these were used as indicators of changes in rates of infection rather than changes in rates of treatment conditional on infection.
Inclusion and exclusion criteria are summarized in table 1.
Table 1.
Review inclusion and exclusion criteria. NGO, non-governmental organization; TTI, test trace isolate.
(b) Search strategy
We searched the literature using four main search components: SARS-CoV-2/COVID-19; testing; contact tracing; and isolation/quarantine, details of which can be found in electronic supplementary material, appendix 1. Systematic searches were run on the 6 January 2023 in Ovid MEDLINE (1946 to present) electronic databases. Results were restricted to 2020 onwards and limited to English language only. We also searched two COVID-19-specific databases on 9 December 2020. Firstly, the Cochrane COVID-19 Study Register (covid-19.cochrane.org), which contains study references from ClinicalTrials.gov, WHO International Clinical Trials Registry Platform (ICTRP), PubMed, Embase, CENTRAL, medRxiv and other handsearched articles from publishers' websites; and secondly the WHO COVID-19 Research Database [50], restricted to specific primary sources.
(c) Screening
Search results underwent deduplication using Endnote 20. After this, title and abstract screening was undertaken using Rayyan software [51]. First, title and abstract screening was calibrated, whereby all reviewers screened the same 20 records, before comparing results and clarifying the application of inclusion and exclusion criteria. All titles and abstracts were then screened by at least one reviewer, with 20% being screened by two reviewers (‘dual-screening’) to enhance screening quality. Any conflicts were resolved by a third reviewer. This was a pragmatic modification to the Cochrane Rapid Review guidance that at least 20% of abstracts are dual-screened with a recommendation that a second review is conducted for all excluded abstracts. Of the 20% of dual-screened abstracts, 373 (7% of dual-screened abstracts) were given a second opinion due to low confidence of the initial reviewer as to whether the abstract should be included or excluded and 4868 (93% of dual-screened abstracts) were not individually selected but were assigned in batches prioritizing reviewers with less reviewing and/or TTI research experience, though covering all reviewers.
Full text screening began with calibration, whereby all reviewers screened the same 10 full texts, before discussing and clarifying the application of inclusion and exclusion criteria. All full texts were screened by at least one reviewer, with 317 (27% of all full texts) dual-screened by two reviewers. Any conflicts were resolved by a third reviewer. Full texts for dual screening included those for which the first reviewer had low confidence in their judgement and those which were not individually selected but were drawn in batches disproportionately from reviewers with less experience. Reasons for exclusion were recorded at this stage and full text screening was managed using a shared Excel database.
(d) Data extraction
The data extraction form was piloted before study characteristics and data were extracted by a single reviewer and checked by a second reviewer. This stage was managed using a shared database. Data extraction consisted of the following categories: study information; methods; population and setting; intervention type; findings; and contextual information. The data extraction form headings can be viewed in electronic supplementary material, appendix 2.
(e) Risk of bias assessment
Risk of bias assessments were undertaken by one reviewer and all were reviewed by another for consistency. The Cochrane RoB 2 tool [52] was employed for randomized controlled trials. We used ROBINS-I for the assessment of non-randomized studies of interventions [53], guided by the Cochrane Handbook for Systematic Reviews of Interventions for assessing risk of bias of different types of non-randomized studies [54]. ROBINS-I requires the identification of potentially important domains of confounding a priori, as well as imagining a target trial of best practice for each study to be compared to. The domains identified were differences in: sociodemographic characteristics between intervention and comparison groups; epidemic dynamics over the time period of the study; population immunity; dominant SARS-CoV-2 variant; co-occurring interventions during the study period; and means of measuring the outcome. Additional domains might be identified in individual studies. Each study was assessed as either ‘low’, ‘moderate’, ‘serious’ or ‘critical’ risk of bias. Any study rated at critical risk of bias was excluded from the review, in alignment with Cochrane Handbook guidance. For RoB 2, studies were rated as ‘low’ risk, as having ‘some concerns’ or as having a ‘high’ risk of bias. Risk of bias ratings for each study are summarized in electronic supplementary material, appendix 4.
(f) Synthesis
Data were narratively synthesized using the synthesis without meta-analysis (SWiM) guidance [55]. Studies were separated according to the intervention, or combination of test, trace and isolate interventions for which they reported data. Studies were included in multiple sub-sections if they reported on more than one intervention. Extracted data were then summarized narratively, and presented in a table.
3. Results
The PRISMA flow diagram (figure 2) shows the sources and numbers of articles identified at each stage of screening. The database search identified 26 720 records, of which 358 were duplicates. After screening the titles and abstracts of the remaining 26 362 papers, 1070 were taken forward into full text screening, and an additional 111 studies identified via examining the references of related systematic reviews were also screened. From these, we identified 25 studies [10,20,56–78] that met all our inclusion criteria. Reasons for exclusion are included in figure 2. There were two studies which presented results only graphically, and it was not possible to extract quantitative findings [79,80]. There were 374 peer-reviewed studies which were mathematical transmission modelling studies and 18 uncontrolled observational studies, not included in the review but listed in electronic supplementary material, appendix 3.
Figure 2. PRISMA flow diagram showing the source of studies identified during screening stages, reasons for exclusion and the final number of included and supporting studies.
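The screening counts reported in the text can be cross-checked with simple arithmetic. The sketch below uses only figures stated in this article; the variable names are illustrative, and it assumes the 27% dual-screening figure is calculated over all full texts screened (those from the database search plus those from reference lists).

```python
# Counts as reported in the text.
records_identified = 26_720
duplicates = 358
titles_and_abstracts_screened = records_identified - duplicates

full_texts_from_search = 1_070
full_texts_from_reference_lists = 111
full_texts_screened = full_texts_from_search + full_texts_from_reference_lists

# Assumption: the dual-screened share is taken over all full texts screened.
dual_screened_full_texts = 317
dual_screened_share = dual_screened_full_texts / full_texts_screened

print(f"titles and abstracts screened: {titles_and_abstracts_screened}")
print(f"full texts screened: {full_texts_screened}")
print(f"share of full texts dual-screened: {dual_screened_share:.0%}")
```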
Many of the 374 transmission model-based studies used compartmental models in which an isolation or quarantine class was added to the standard susceptible infectious recovered (SIR) model paradigm. Transmission models varied in their sophistication and their real-world applicability, with some models focusing on generic concepts of invasion and stability while others more closely matched the dynamics of SARS-CoV-2 spread in their chosen focal areas. The majority of these model-based studies concluded that adding an isolation class (into which infected individuals can be diverted) reduces the transmission of SARS-CoV-2. The strength of this reduction is critically dependent on model assumptions and parameters, which may be difficult to correctly infer without very detailed individual-level data.
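The general structure of such models can be sketched as below. This is a minimal illustration rather than a reproduction of any of the 374 modelling studies: it assumes a simple SIR model with one added isolation class Q, integrated with a basic Euler scheme, and all parameter values are hypothetical.

```python
def simulate_sir_with_isolation(beta, gamma, isolation_rate,
                                days=300, dt=0.1, initial_infectious=1e-4):
    """Euler integration of an SIR model with an added isolation class Q.

    Returns (peak infectious fraction, final attack rate).
    """
    s, i, q, r = 1.0 - initial_infectious, initial_infectious, 0.0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        infections = beta * s * i        # only non-isolated cases transmit
        isolations = isolation_rate * i  # infectious cases diverted into Q
        recoveries_i = gamma * i
        recoveries_q = gamma * q
        s -= infections * dt
        i += (infections - isolations - recoveries_i) * dt
        q += (isolations - recoveries_q) * dt
        r += (recoveries_i + recoveries_q) * dt
        peak = max(peak, i)
    return peak, 1.0 - s


# Hypothetical parameters: transmission rate 0.3/day, recovery rate 0.1/day.
baseline = simulate_sir_with_isolation(beta=0.3, gamma=0.1, isolation_rate=0.0)
with_isolation = simulate_sir_with_isolation(beta=0.3, gamma=0.1, isolation_rate=0.1)
print(f"no isolation:   peak {baseline[0]:.3f}, attack rate {baseline[1]:.3f}")
print(f"with isolation: peak {with_isolation[0]:.3f}, attack rate {with_isolation[1]:.3f}")
```

In this sketch the size of the reduction depends entirely on the assumed isolation rate, which illustrates why conclusions from such models are sensitive to parameters that are hard to infer from data.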
(a) Description of included studies
Included study characteristics are given in table 2. One included study was an individually randomized controlled trial [67]. Eleven studies used observational time-series designs [56,57,60,62,64,70–73,75,78]. These compared variation across, for instance, geographical units and time periods while variously adjusting for characteristics of the units, other interventions and features of the epidemic. Some studies assumed or investigated time lags between exposures and outcomes, assuming that the effects of the intervention would take time to impact transmission. Two studies used difference-in-difference designs [58,76], comparing the change over time between groups exposed and unexposed to the intervention, accounting for baseline characteristics and thus more robustly controlling for confounding than in observational time-series designs. Three used synthetic control designs [65,69,77], reconstructing a counterfactual ‘unexposed’ scenario for the intervention group, based on the trends in unexposed controls [82] and thus enabling better control for time-varying confounding. One study used a controlled before-and-after design [10]. There were two retrospective [59,68] and two prospective cohort designs [20,61] which compared the transmission outcome over time among individuals who were either exposed or unexposed to the TTI intervention of interest. Three studies were cross-sectional and used cumulative outcomes: cases by a given time [74], drop in growth post peak incidence [63] and per capita mortality [66].
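The logic of a difference-in-difference comparison can be shown with a small worked example. This is a sketch of the basic two-group, two-period estimator only; the included studies used regression-based versions with further adjustment. The case rates below are invented for illustration and are not data from any study.

```python
def mean(values):
    return sum(values) / len(values)


def difference_in_differences(exposed_pre, exposed_post,
                              unexposed_pre, unexposed_post):
    """Change in the exposed group minus change in the unexposed group."""
    exposed_change = mean(exposed_post) - mean(exposed_pre)
    unexposed_change = mean(unexposed_post) - mean(unexposed_pre)
    return exposed_change - unexposed_change


# Hypothetical weekly cases per 100 000 before and after an intervention.
exposed_pre, exposed_post = [50, 54, 52], [40, 38, 42]
unexposed_pre, unexposed_post = [48, 50, 52], [56, 60, 58]

effect = difference_in_differences(exposed_pre, exposed_post,
                                   unexposed_pre, unexposed_post)
print(f"estimated effect: {effect:+.1f} cases per 100 000 per week")
```

Subtracting the change in the unexposed group removes trends shared by both groups, which is why the design relies on the assumption that the two groups would otherwise have followed parallel trends.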
Table 2.
Characteristics of the 25 included studies
Of the 25 studies included, 12 used data from multiple countries across Europe, America, Africa and Asia [56,57,60,63,64,66,69–72,75,78]. From the remaining studies, there were nine studies based in Europe, including five from the UK or England [65,67,74,76,77], and one from each of the following countries: Portugal [68]; Spain [61]; Switzerland [20] and Slovakia [10]. Two studies were based in Asia, one from China [58] and one from South Korea [62], and the final two studies were based in Colombia [59,73].
All but one study examined policies implemented across the general population, while one examined testing conducted among the labour force [20].
The interventions examined in the 25 included studies varied depending on which activities were undertaken during the study period: testing, tracing, isolation or a combination of the three. TTI intervention exposure measures included continuous measures, such as the number of tests performed per capita, as well as ordinal response scales of TTI intervention intensity (e.g. the commonly used Oxford COVID-19 Government Response Tracker (OxCGRT) [83], which used a 4-point scale to characterize testing policy and a 3-point scale for contact tracing), for which studies estimated the effect of a one-category increase in intensity. The RCT compared assignment to the intervention arm (daily contact testing) and assignment to the control arm (contact home isolation) [67].
Outcomes were categorized into three groups: transmission, healthcare usage (as a proxy of transmission) and deaths. Within the transmission category, 12 studies focused on number of cases, three focused on reproduction number, three on onward transmission, two on growth rate and one on relative risk. Several studies reported more than one outcome. Six studies measured number of deaths. One study used hospitalizations, over a given time period, as the transmission outcome because hospitalizations are less affected by changes in levels of case detection than other outcomes [77].
One study was assessed using the Cochrane RoB 2 tool for RCTs and was rated as having ‘some concerns’ [67]. The remaining 24 studies were assessed using the ROBINS-I tool for non-randomized study designs; of these, six were assessed as moderate risk of bias [10,58,65,74,76,77] and the remaining 18 were assessed as serious risk of bias. The risk of bias analysis is summarized in electronic supplementary material, appendix 4.
(b) Intervention effects
Studies were separated into four groups according to the elements of TTI investigated: contact tracing only (seven papers); specific testing strategies, in some cases alongside contact tracing (12 papers); strategies for isolating cases/contacts (four papers); and TTI as part of an intervention package (two papers). The majority of studies used data from 2020; as such, the results are largely unaffected by variants, vaccination or population-level immunity.
Detailed quantitative study findings, including analytical methods and statistical uncertainty, are given in table 3.
Table 3.
Summary findings of the 25 included studies.
(i) Contact tracing (seven papers)
Seven papers considered the impact of contact tracing (as part of a TTI scheme) on the dynamics of the pandemic (figure 3). Six of these studies considered data from single countries and made use of spatial or temporal heterogeneity in contact tracing to assess its population impact [58,59,65,73,74,76]. One study considered contact tracing policies across multiple countries [60].
Figure 3. Graphical representation of the 25 studies showing the associated time interval of the study (x-axis), the scale of interventions and analysis (multi-national black; national dark grey; sub-national light grey) and the reported statistical significance of the results: solid bars are studies where there was a significant reduction in transmission, cases or deaths due to a TTI intervention; diagonal hashed bars are where the results were not statistically significant; and long dashed bars are where findings were conflicted in different regions, times or scales.
Deng et al. [58] showed that across 313 Chinese cities from January to July 2020, the number of new COVID-19 cases was significantly negatively correlated with the stringency of contact tracing at lags of 0, 7 and 14 days.
Fernandez-Nino et al. [59] and Vecino-Ortiz et al. [73] both examined contact tracing in Colombia. Vecino-Ortiz et al. considered the proportion of cases identified through contact tracing between March and July 2020 as a measure of the intensity of contact tracing, and found that the number of deaths was negatively correlated with the intensity of contact tracing 21 days previously, with some evidence that the effect on deaths for a given contact tracing intensity tapered off somewhat as tracing effort increased (a possible ‘saturating' effect). Fernandez-Nino et al., considering data between March 2020 and January 2021, observed that cases that were part of chains of at least five contacts (and hence when contact tracing was more comprehensive) had a statistically significant 48% reduction in fatality.
Three papers focused on data from the UK; Kendall et al. [65] and Wymant et al. [74] considered the NHS COVID-19 contact tracing app in England and Wales, while Fetzer et al. [76] concentrated on delays to tracing. Kendall et al. [65] focused on the pilot of the contact tracing app launched on the Isle of Wight during May and June 2020, and estimated the reproductive ratio Rt on the Isle of Wight and in 150 other Upper-Tier Local Authorities in England. It was found that on the Isle of Wight, Rt had reduced from 1.3 (the third highest in England) before the app to 0.5 (the twelfth lowest) after the piloting of the app. Wymant et al. [74] supported Kendall et al.'s [65] findings; looking at the proportion of the population that used the app in each of 338 Lower-Tier Local Authorities they found a 1.1% (between October and November 2020) and 2.7% (between November and December 2020) reduction in cases for every percentage increase in app usage. Fetzer et al. [76] examined data from September 2020, when data handling issues within the public health authority led to delays of 6–14 days in the time taken to perform contact tracing; compared to such long delays, regression analysis suggested that timely tracing led to 63% fewer infections and 63% fewer deaths over this period.
By contrast, Gianino et al. [60] analysed data from Italy, Germany, Spain and the UK between October 2020 and January 2021, and could find no significant correlation between contact tracing and the incidence of SARS-CoV-2 cases. In its approach and use of the OxCGRT [83], which provides a three-level categorization of contact tracing intensity, the study by Gianino et al. [60] has more in common with the statistical approaches examined below in studies exploring testing strategies.
Overall, six of the seven studies assessing contact tracing found that contact tracing (in addition to other controls in use at the time) led to a significant decline in transmission; all of these studies relied on detailed information about contact tracing strength at an individual or small spatial scale. The only paper not finding a significant relationship [60] used coarse-grained data at the country scale from the OxCGRT [83].
(ii) Testing strategies (12 papers)
Nine papers performed statistical analyses across multiple countries to assess the impact of changing patterns of control [57,63,64,66,70–72,75,78], while three examined testing strategies in single countries [10,20,77]. Many of the cross-country studies used the OxCGRT [83] to inform the type and strength of epidemic controls in each country over time. The studies varied in the number of countries/regions examined, the time scale over which the analysis was performed, and the measure used to assess impact. Below, we consider in turn studies of testing and contact tracing that used early data (up to May 2020), those that used later data when assessing contact tracing, and those that considered the strength of community testing (separately from any measures of contact tracing).
The earliest pandemic data (up to May 2020) showed the smallest impacts of testing and/or contact tracing. Haug et al.'s [64] analysis of 226 countries during March and April 2020 concluded that contact tracing was one of the least effective interventions as measured by a change in the reproductive ratio. Leffler et al. [66] analysed data from five countries (China, Macau, Cameroon, Sierra Leone and Sudan) in April and May 2020 and found only weak statistical evidence for an association between contact tracing completeness and lower per capita mortality (p = 0.06). Across the early epidemic period of 37 OECD countries, Pozo-Martin et al. [70] found no statistical evidence for an association between contact tracing completeness and average daily case growth rate.
Later results from cross-country statistical analyses tend to support a positive impact of contact tracing intensity (together with testing) on transmission. Hong et al. [63] studied 108 countries from January to June 2020 and found in an unadjusted analysis that contact tracing was significantly correlated with a more rapid decline from peak cases. However, adjusting for concurrent interventions, and treating contact tracing as an effect modifier of these interventions, the role of contact tracing was more mixed (contact tracing and school closures led to a significantly faster decline, but contact tracing combined with cancelling public events or workplace closures led to a significantly slower decline). Yalaman et al. [75] compared 138 countries over the period January to October 2020, and found that contact tracing had a strong and significant association with reduced population level mortality rates. Examining the 100 days after the first case was reported among 37 low- and middle-income countries, ranging from January to July 2020, Zamanzadeh and Cavoli [78] found that increased contact tracing intensity (measured using the OxCGRT scale [83]) was associated with lower incidence. Chew et al. [57], using neural networks to assess the impact of testing and contact tracing on growth rates between May 2020 and October 2021, found that while the results at a global and continental scale were mixed, contact tracing was the top measure for reducing growth rate when countries were clustered according to their dynamics. However, Spiliopoulos [71] did not find evidence for an association between contact tracing and either case or death growth rates using data from February 2020 to April 2021.
Seven papers considered the impact of testing separately or independently to any available measures of contact tracing. Both positive and negative impacts of testing on transmission were reported; this may be due to conflicting actions of testing; more testing leads to a higher proportion of infections being detected but if cases are effectively isolated this should lead to fewer subsequent infections. Spiliopoulos [71] showed that testing had a significant negative impact on the growth rate of cases and deaths based on the data from 132 countries between February 2020 and April 2021 and the strength of this impact increased with the strength of the testing. However, studying the early epidemic period of LMICs, Zamanzadeh and Cavoli [78] found that increased testing was associated with higher cases, though they were not able to separate out effects on transmission from the effects of enhanced case detection on the proportion of cases identified. Leffler et al. [66] found no statistical evidence for an association with log per capita testing and per capita mortality across five countries from April to May 2020. Pozo-Martin et al. [70] found varying results depending on study period across 37 countries: from January to September 2020 they found that for every test per 1000 people there was a 0.02% reduction in the weekly growth rate, but later data (October to December 2020) showed that testing was positively correlated with the growth in confirmed cases, possibly indicating a difficulty with case growth rate being an indicator of both case detection and transmission. Ssentongo et al. [72] restricted their analysis to 46 countries in mainland Africa using data between January and August 2020 and found that testing (one week previous) significantly reduced the level of within-country transmission by 18%, but was associated with an increase in between-country transmission. 
Two studies failed to find statistical evidence for an effect of testing on transmission (Yalaman et al. [75] and Chew et al. [57]).
Studies using later (post-May 2020), and therefore more, data tended to find stronger relationships, with both testing levels and contact tracing leading to better public health outcomes, although this was not universal. However, these statistical analyses highlight the difficulties of assessing control measures by comparing across countries and times. By virtue of incorporating data from many countries, and including many different types of mitigation, the classification of control intensity is necessarily coarse; data from OxCGRT [83] for contact tracing has just three classifications: no contact tracing; limited contact tracing, not done for all cases; and comprehensive contact tracing, done for all identified cases. As shown by the study of Wymant et al. [74], heterogeneities exist at relatively small scales and can be informative for understanding the impacts of control. Both testing and contact tracing are likely to lead to an increased proportion of cases that are detected, making it difficult to discern effects on underlying incidence. In addition, such statistical analyses may not be able to fully capture the complex non-linear interactions between epidemiological dynamics, synergies and dependencies between testing and contact tracing, and with other interventions, and behavioural responses.
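The detection effect described above can be illustrated with a small arithmetic sketch. All numbers are hypothetical and chosen purely for illustration; they are not taken from any of the reviewed studies, and the function name `observed_cases` is an assumption introduced here.

```python
def observed_cases(true_infections: float, detection_fraction: float) -> float:
    """Reported cases are the detected share of true infections."""
    return true_infections * detection_fraction

# HYPOTHETICAL scenario (not study data): limited testing detects 20% of infections.
baseline = observed_cases(true_infections=10_000, detection_fraction=0.20)

# HYPOTHETICAL scenario: expanded testing cuts true infections by 20%
# (through isolation of detected cases) but doubles the detection fraction.
expanded = observed_cases(true_infections=8_000, detection_fraction=0.40)

print(f"baseline reported cases: {baseline:.0f}")
print(f"expanded-testing reported cases: {expanded:.0f}")
```

Under these assumed values, reported cases rise even though true infections fall, which is why a positive correlation between testing and confirmed cases need not imply that testing increased transmission.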
Three other studies provide more detailed within-country analyses of the impact of testing strategies. Pavelka et al. [10] examined the imposition of rounds of mass testing to exit lockdowns in October and November 2020 in Slovakia. In total, over 4 million tests were performed (out of a total population size of 4.5 million adults), with 50 thousand testing positive; while the impact was spatially and temporally heterogeneous, it was estimated that mass testing led to a 56% reduction in prevalence of infection between the first and second rounds. Gorji et al. [20] analysed the impact of weekly testing of employees in the Canton Grisons area of Switzerland during February and March 2021; companies enrolled in the testing scheme saw an 18%, 47% and 50% reduction in incidence in employees across three cohorts (with the latter two values being significant at p < 0.05) for those that had been enrolled for one, two or three weeks compared to newly enrolled companies. The Gorji et al. [20] study echoes similar findings where regular testing has been used on specific sectors of the population including school children and healthcare workers. Zhang et al. [77] estimated the effects of an asymptomatic testing pilot using rapid antigen tests on transmission, comparing hospitalizations in Liverpool city middle layer super output areas (MSOA, small geographical areas of around 7200 people) to a synthetic control based on comparative MSOAs without availability of asymptomatic testing in November and December 2020. While the intervention evolved somewhat in its design over time and also included effects of increased publicity around COVID-19 associated with the intervention, the study's primary finding was a reduction in hospitalizations of 25% (35% to 11%) in intervention districts compared to synthetic control.
Taken together, these three studies [10,20,77] show the benefits of large-scale testing of the population, especially when infection is relatively common, as in Pavelka et al. [10] and Zhang et al. [77]. Earlier in the outbreak, when resources were generally more limited, the actions of testing and contact tracing might have been swamped by more stringent control measures such as lockdowns or social distancing.
(iii) Isolation strategies (four papers)
Four publications considered the effects of isolation of (suspected) cases or contacts but considered different aspects of the isolation procedure.
Malheiro et al. [68] considered the time between being identified as a contact of a confirmed case and the subsequent laboratory results, with the intervention group being quarantined while waiting for the laboratory results. This study conducted in Portugal from 1 March to 20 April 2020, when there were a range of other strict mitigation measures in place, found slightly fewer traced individuals having secondary cases from the intervention group (13.3% when compared with 17.2%) but the result was not statistically significant and the study included relatively small sample sizes (98 in the intervention group and 453 in the control group).
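To illustrate why a difference of this size in groups this small need not reach statistical significance, the following is a minimal sketch of a pooled two-proportion z-test. The event counts (13 of 98 and 78 of 453) are back-calculated from the reported percentages and are therefore an assumption, and this is not necessarily the test used by the study authors.

```python
import math

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Two-sided pooled z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# ASSUMED counts reconstructed from the reported percentages:
# 13/98 is about 13.3% (intervention), 78/453 is about 17.2% (control).
z, p_value = two_proportion_z_test(13, 98, 78, 453)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With these reconstructed counts the p-value is well above 0.05, consistent with the non-significant result reported in the text.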
Lopez et al. [61] and Nam et al. [69] considered the location of quarantine, with Lopez et al. [61] examining hotel quarantine in Spain during April to June 2020 and Nam et al. [69] examining the step change in transmission on the introduction of centralized isolation in Japan and China. Lopez et al. [61] report more secondary household contacts with household-based quarantine than with hotel-based quarantine (with an odds ratio of 1.67), but the finding was not statistically significant and included only 229 individuals in the study. By contrast, Nam et al. [69] took a cross-country approach and report a 43% drop in transmission in Japan and a 96% drop in transmission in China when confirmed cases were centrally isolated. However, given the number of other changes occurring in February 2020 (including increased public awareness of the pandemic), any assumptions around causality in statistical analysis should be made with caution.
Finally, Love et al. [67] conducted a non-inferiority RCT on around 55 000 adults identified by contact tracing in England between 29 April and 28 July 2021. This study compared home-based self-isolation of contacts for 10 days (policy at the time) versus daily LFD testing for 7 days and no isolation while the LFD tests were negative. The outcome was the proportion of contacts-of-contacts reporting positive tests to the national testing and contact tracing authority, a proxy for the attack rate among this group of contacts. Individuals from each group that became infected had a similar number of contacts (approx. 2.2 per case), and the percentage of them reporting positive tests was lower in the intervention arm (–1.2%, 95%CI of −2.3% to −0.2%) so the daily contact testing intervention was judged non-inferior to home isolation.
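The non-inferiority logic can be sketched as a comparison of the upper confidence bound against a pre-specified margin. The margins below are hypothetical, since the trial's actual margin is not stated in this review; only the confidence interval is taken from the text.

```python
def is_non_inferior(ci_upper: float, margin: float) -> bool:
    """Non-inferiority holds when the upper CI bound of
    (intervention - control) lies below the pre-specified margin."""
    return ci_upper < margin

# Reported 95% CI for the difference, in percentage points.
ci_lower, ci_upper = -2.3, -0.2

# HYPOTHETICAL margins for illustration; the trial's margin is not given here.
for margin in (0.5, 1.0, 2.0):
    print(f"margin {margin}: non-inferior = {is_non_inferior(ci_upper, margin)}")
```

Because the reported upper bound is itself below zero, the criterion is met for any positive margin.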
Taken together, these studies suggest some uncertainty around the benefit of rapid isolation of contacts (before laboratory results are available [68]) away from the home environment [61,69], with studies either small or unable to disentangle the effects of different concurrent interventions. The findings of the study by Love et al. [67] that daily testing is at least equivalent to quarantine in terms of transmission are important, given that it allows substantially more social and economic freedom to those affected.
(iv) TTI as part of a broader package of measures (two papers)
Two papers analysed the impact of testing and contact tracing as part of a wider package of measures, so the effects of TTI alone cannot be assessed. Closely aligned with the eight population level studies of testing strategies detailed above, Chan et al. [56] considered data from 50 countries and eight suites of control measures from the start of the pandemic to July 2020; a combined package of case identification, contact tracing and related measures led to a net decline in the reproductive ratio but the results were heterogeneous between countries and not statistically significant. Heo et al. [62] analysed controls and daily new cases of SARS-CoV-2 in South Korean data from 20 January to 20 November 2020, and considered testing, contact tracing and public information campaigns as a combined control measure as they are tightly clustered in their implementation. The combined control measures were negatively correlated with cases at a number of lags (from 0 to 10 days) but the statistical significance of these correlations is not given, making the findings difficult to interpret.
4. Discussion
In total, we identified 417 peer-reviewed studies that examined the impact of testing and/or contact tracing and isolation on the transmission dynamics of infection. The overwhelming majority of these were transmission model-based studies, with only 25 empirical studies meeting our criteria of reporting the real-world impact of testing and/or contact tracing together with some adjustments for confounding factors such as changes to other control measures or population characteristics. Of these 25 studies, 11 adopted a broad statistical approach and attempted to link coarse classification of control measures in multiple countries to their epidemiological dynamics [56,57,60,63,64,66,70–72,75,78]; five considered detailed contact tracing data from either England [65,74,76] or Colombia [59,73]; four considered strategies for isolation after testing or notification as contacts [61,67–69]; two considered within-country stringency of TTI-type controls in China [58] and South Korea [62]; with other papers focusing on the impact of mass-testing [10,77] and weekly testing of people without symptoms [20].
In general, these 25 studies showed that testing and/or contact tracing were associated with reductions in transmission, measured from the growth of cases, the number of cases per capita or the level of mortality in the population. Three of the data analysis studies comparing between-country dynamics using data from the earliest stages of the pandemic did not find statistically significant relationships [56,64,66]; this may reflect the strength of other control measures (such as lockdowns) masking the impact of TTI. However, one study focusing on the first 100 days in low- and middle-income countries did find associations between increased intensity of contact tracing and reduced numbers of new infections [78]. Some of the cohort studies [61,68] were considered to have been under-powered to yield significant results. All other studies identified a public health benefit of TTI interventions. Given the variety of ways in which both testing and contact tracing are characterized in different studies, and the different measures adopted for quantifying transmission reduction, it is not possible to pull a coherent metric from the studies for the impact of testing and related controls. There is a clear need for the relationship between TTI and transmission to be framed in a consistent quantitative manner that makes intuitive sense to policy-makers and public health officials. We would highlight Wymant et al. [74] as a good example of intuitive quantification which stated that a 1% increase in contact-tracing app usage was associated with a 2.7% reduction in cases (between November and December 2020).
We found a lack of RCTs of TTI interventions, identifying only one in the general population [67]. There are a limited number of other randomized studies but these were in specific settings, and therefore fell outside the scope of our review [84,85]. One was a trial of daily contact testing in schools in England which also found this approach to be non-inferior to contact isolation, consistent with the later findings among the general population [67]. The two experimental studies of concert event attendance with pre-testing compared to no attendance did not specifically measure the benefits of testing [86,87] and so were not included in our review. Both found no statistically significant difference in infection incidence between event attenders and non-attenders post event, though other measures, such as ventilation, face mask mandates, sanitizing and social distancing were also used at the events, and attributing effectiveness to pre-event testing alone is not possible.
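As a sketch of how a per-unit estimate of the kind reported by Wymant et al. [74] might be translated for policy-makers, the function below extrapolates the 2.7% figure to larger increases in app usage. Two assumptions are made here and are not claims of the original study: that "1% increase" means one percentage point, and that the effect compounds multiplicatively. The function name is hypothetical.

```python
def projected_case_reduction(pp_increase: float, per_point: float = 0.027) -> float:
    """Fractional reduction in cases for a given percentage-point rise in
    app usage, ASSUMING the per-point effect compounds multiplicatively."""
    return 1 - (1 - per_point) ** pp_increase

# Illustrative extrapolation only; the study reports the per-1% association.
for pp in (1, 5, 10):
    print(f"+{pp} points of app usage: {projected_case_reduction(pp):.1%} fewer cases")
```

Extrapolating an observed association in this way should be treated with caution, particularly outside the range of app usage seen in the underlying data.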
The other 24 studies we identified were generally retrospective analyses of data, relying on heterogeneities between cohorts, locations, or times to infer the impact of testing, tracing and isolation methods. The difficulty with such retrospective observational analyses is the role of confounders. These confounders include underlying changes in the incidence of infection, changes to type or intensity of other controls being implemented, changes in the dominant variant, and changes to the public reaction and hence the compliance with mitigation measures. Additionally, statistical analyses, particularly across many diverse countries, might struggle to account for complex nonlinear interactions between epidemiological dynamics, control interventions and diverse behavioural responses; and it may be difficult to provide a consistent measure for the strength of TTI controls. TTI interventions have the additional complication for evaluation that a more efficient testing and tracing scheme will (initially) identify more cases, even though it should reduce the number of infectious individuals in the community.
Our review sought to use a sensitive search strategy with robust screening, review, and appraisal methods to identify the empirical real-world effectiveness of a variety of TTI interventions on SARS-CoV-2 transmission reduction. Our findings make an important and unique contribution to the evidence base; we did not identify previous reviews covering this range of testing, contact tracing and isolation interventions on SARS-CoV-2 transmission outcomes. Two recently published systematic reviews of contact tracing effectiveness, one of infectious diseases more broadly [48] and one specific to COVID-19 [47], did not review testing or isolation strategies. With regards to contact tracing, our review came to similar conclusions as the review of contact tracing for COVID-19 [47] which found a large number of mathematical modelling and simulation studies that predicted the high theoretical effectiveness of contact tracing strategies, alongside a relatively small set of empirical studies, with more mixed results. This difference could indicate implementation challenges with achieving effectiveness from TTI interventions in practice and/or a lack of empirical studies, though the highest quality studies included in both our reviews did find strong evidence of TTI effectiveness on reducing transmission.
Our review has some limitations. Given the large number of search results, it was only feasible to use two reviewers to screen 20% rather than 100% of papers—although the 20% found high levels of agreement between reviewers. While we might have identified more evidence by including preprint papers, this would have risked including lower quality studies or studies that never go on to be peer-reviewed. Future extensions of this review could include key secondary outcomes of TTI interventions hypothesized to facilitate or be proxies for effectiveness, such as proportions of cases identified as contacts, proportion identified prior to symptoms, and testing and tracing delays which could give further insight to the relative effectiveness of different approaches. A number of modelling studies have also considered such fine-grained implementation questions, although generally lack the real-world data to support the policy choices [33]. Synthesis and analysis of evidence relating to barriers to uptake and adherence to TTI measures and effects of social and health inequalities will also be important to inform future intervention development and deployment.
Our review did not consider the costs, either financial to the individuals concerned or to the wider economy nor the potential health, well-being, social and educational harms of different TTI strategies. This is clearly an important and generally overlooked area, as decisions about which control policies to adopt must reflect both costs and benefits. Such a cost-benefit view would also require a more detailed examination of testing using PCR rather than LFD, as PCR tests are considerably more expensive.
While this review sought to identify independent effects of TTI interventions to assess their optimal design and contribution to epidemic control, it is challenging to view each epidemic control intervention as isolated from the others. At the very least, the action of initiating a change in controls may alter the public perception of risks and therefore their level of compliance; the role of enhanced publicity and communication around COVID-19 might have played a role in the effects attributed to contact tracing and asymptomatic testing in this review's included studies, as explicitly mentioned by study authors [65,77]. In addition, important synergies also likely exist. Some countries in Oceania and East Asia used very stringent international border controls to prevent importation, along with extensive contact tracing and physical distancing (when cases were identified) with the aim of local containment and elimination prior to mass vaccination. Of these, Australia and New Zealand are key examples of where combinations of NPIs kept infection to low levels until herd immunity could be gained through vaccination [88,89]. The limited case numbers associated with stringent border controls and other methods are likely to facilitate more effective contact tracing, as the available resources can be focused on far fewer cases [90]. These contexts were very different to countries like the UK in which TTI measures were implemented in a context of varying and sometimes high prevalence. To prepare for future pandemics, these synergies, for which we lack extensive robust empirical evidence, will need to be considered.
There are some clear recommendations that come from this review. First is the need for more robustly designed experimental studies to inform TTI design; such studies need to quantify TTI impact across diverse populations, over different levels of compliance, over different time periods and for different epidemiological characteristics. Second is the need for in-depth studies that can quantify the costs and benefits of different testing and control strategies, and how these might vary between different epidemic phases and between different sectors of society. Finally, there is a need for a better quantification of the interaction between different control measures, which may act either synergistically or may weaken the effect of each other. Associated with all these is the need for a consistent method of reporting findings, such that the results of multiple studies can be compared.
Despite these gaps in our knowledge, two important conclusions can be drawn from the studies we have considered. Firstly, while evidence is imperfect, the majority of studies (including those of highest quality) suggest a considerable impact of testing, followed by isolation and treatment of detected individuals. Pavelka et al. [10] highlight the substantial reduction in infection following rounds of nationwide mass testing, which removed a large proportion of infected individuals from circulation within the general population. Similarly, Gorji et al. [20] show that regular testing within groups of co-workers can reduce the levels of infection, demonstrating both the utility of testing and the proportion of transmission that occurs in the work environment. Increased access to testing for asymptomatic individuals was also found to have reduced hospitalizations in Liverpool in the UK in late 2020 [77]. Finally, the study by Love et al. [67] shows that regular daily testing is non-inferior to isolation of contacts, again emphasizing the strength of testing at detecting cases. Secondly, the studies from the UK [65,74,76] and Colombia [59,73] highlight the benefits of contact tracing as a method of identifying potential secondary cases. These studies of detailed data at a fine spatial scale demonstrate that timely contact tracing in the UK reduced the population-level growth rate or levels of infection; while the analysis of data from Colombia showed how contact tracing reduced mortality, both among infected cases who were identified sooner through contact tracing and at the population scale due to reduced transmission.
We conclude that testing, contact tracing and facilitated isolation can substantially reduce the transmission of infection and improve public health outcomes, and therefore is a key public health tool against future outbreaks—especially before infection-specific pharmaceutical interventions are widely available. However, many aspects of TTI efficacy and real-world effectiveness were unknown at the beginning of the COVID-19 pandemic, and many remain unquantified in general. Considerably more research is required to fully elucidate the epidemiological consequences of TTI under different scenarios (for example, the interaction with variants, vaccination and other controls), as well as the broader costs and benefits of different approaches to TTI.
Data accessibility
The data are provided in electronic supplementary material [91].
Authors' contributions
H.L.: data curation, formal analysis, investigation, methodology, project administration, supervision, writing—original draft, writing—review and editing; C.H.: data curation, formal analysis, investigation, writing—review and editing; J.O.: data curation, formal analysis, investigation, writing—review and editing; LT.C.: data curation, formal analysis, investigation, writing—review and editing; M.K.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, writing—original draft, writing—review and editing; G.J.R.: conceptualization, data curation, formal analysis, investigation, methodology, writing—review and editing; E.F.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, supervision, writing—original draft, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
We acknowledge the Royal Society, who funded this work. M.J.K. was also supported through the UKRI (JUNIPER modelling consortium; grant no. MR/V038613/1) and the National Institute for Health Research (NIHR) (Policy Research Programme, Mathematical and Economic Modelling for Vaccination and Immunisation Evaluation, and Emergency Response; NIHR200411). M.J.K. is affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Gastrointestinal Infections at University of Liverpool in partnership with UK Health Security Agency (UKHSA), in collaboration with University of Warwick. M.J.K. is also affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Genomics and Enabling Data at University of Warwick in partnership with UK Health Security Agency (UKHSA). G.J.R.'s involvement was funded by the National Institute for Health and Care Research Health Protection Research Unit (NIHR HPRU) in Emergency Preparedness and Response, a partnership between the UK Health Security Agency, King's College London and the University of East Anglia. E.F. was also supported by UKRI Medical Research Council (MR/S020462/1). The views expressed are those of the author(s) and not necessarily those of the NIHR, UKHSA or the Department of Health and Social Care. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.
Footnotes
One contribution of 7 to a theme issue ‘The effectiveness of non-pharmaceutical interventions on the COVID-19 pandemic: the evidence’.
Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.6677629.
© 2023 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.