Approach for Aim 3
Integration with Human Health Surveillance
Strain Tracking and Viral Methods Comparison
We have already shown the efficacy of RNA sequencing for viral tracking and strain identification (9a), and even its potential as a replacement for LAMP- and PCR-based assays, but continued methodological comparisons can help guide future efforts. While both LAMP and PCR have lower unit costs than RNA sequencing, they are taxa-specific assays that must be developed for each organism of interest. By contrast, metatranscriptomics could detect essentially any RNA-based organism, with important implications for surveillance, early response, and strain mapping (e.g., Nextstrain).
In this study, we will evaluate how RNA sequencing may supplement or replace PCR and LAMP as RNA virus detection and quantification technologies. RNA sequencing can evaluate RNA viruses in the genomic context of the rest of the sample (co-incidence of other microbes) and give a broader view of the microbiology. It also allows us to reconstruct the strain of environmental SARS-CoV-2 samples, which can indicate a possible city of origin for each strain; this is not currently possible with PCR or LAMP. Detecting strains will help to model and respond to the dynamics of future outbreaks, and strain data can be integrated with clinical data from patients (e.g., recent travel history). Finally, we will compare any detected environmental strains to the strains observed in local cases. This will allow us to determine whether environmental strains are lagging or leading indicators of the strains responsible for infection in a given area.
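As an illustration of the lagging-versus-leading comparison, the following minimal Python sketch computes, for each clade seen in both sources, how many days the first wastewater detection precedes the first clinical detection. The function name, the dictionary layout, and the dates are hypothetical and chosen only for illustration; the proposal does not specify how this comparison will be implemented.

```python
from datetime import date


def first_detection_lead_days(wastewater_first_seen, clinical_first_seen):
    """Days by which wastewater detection precedes clinical detection, per clade.

    Positive values mean the clade appeared in wastewater first (leading);
    negative values mean it appeared in clinical cases first (lagging).
    Clades observed in only one of the two sources are skipped.
    """
    shared = wastewater_first_seen.keys() & clinical_first_seen.keys()
    return {
        clade: (clinical_first_seen[clade] - wastewater_first_seen[clade]).days
        for clade in sorted(shared)
    }


# Hypothetical first-detection dates, for illustration only.
wastewater = {
    "19A": date(2020, 6, 1),
    "20A": date(2020, 6, 10),
    "20C": date(2020, 7, 2),
}
clinical = {
    "19A": date(2020, 6, 5),
    "20A": date(2020, 6, 8),
}

lead_days = first_detection_lead_days(wastewater, clinical)
for clade, days in lead_days.items():
    role = "leading" if days > 0 else "lagging" if days < 0 else "concurrent"
    print(f"{clade}: wastewater is {role} by {abs(days)} day(s)")
```

A comparison of first-detection dates is only one possible summary; comparing clade frequencies over time would be a natural alternative.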
Predictive Modeling that Integrates WBT with COVID-19 Distributions
For modeling, we will conduct three sets of analyses.
- First, we will explore and identify variables that are associated with the relative concentration of SARS-CoV-2 in the wastewater samples at different geographic scales.
- Second, we will assess the association between the number of COVID-19 cases at a given location (building, zip code, and community) and the time-lagged concentration of SARS-CoV-2 in the wastewater samples, adjusting for confounders.
- Third, we will test and validate the model performance and then formulate a basis for developing a predictive model.
The main variables will be:
(a) detection (coded as binary) and relative concentration of SARS-CoV-2 strains (by Nextstrain clade) in the daily wastewater samples collected for the selected residential dorms at the UM campuses, for clusters of buildings on UM campuses, and at the county level (i.e., at the CDWWTP); and
(b) daily new cases of (clinically diagnosed) COVID-19 infection by residential buildings at the UM campuses and by zip code for the Miami-Dade County residents.
Since several factors can influence the variables listed under “a” and “b” above, we will collect additional geospatial data on a wide range of other variables, including: wastewater temperature, presence of viable E. coli, turbidity, conductivity, population density, number of hospital and residential units connected to the wastewater site, and proximity (i.e., physical distance) between the sample site and the source (i.e., building address or zip code). We will also collect data on population demographics, age, and access to hospitals screening for COVID-19. Models will use logistic regression for the binary outcome and log-linear regression for the continuous variables, and will be adjusted for confounding variables and any location-specific effects. For the predictive model, two validation sets will be utilized, and the root mean square error (RMSE) between the observed and predicted values will provide an indicator of the accuracy of COVID-19 case incidence prediction.
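The RMSE criterion can be sketched as follows. This is a minimal Python illustration, not the study's implementation (the analyses are planned in R); the validation-set names and the case counts are invented for the example.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical (observed, predicted) daily case counts for two validation sets.
validation_sets = {
    "validation_1": ([12, 15, 9, 20], [10, 16, 11, 18]),
    "validation_2": ([5, 7, 6], [5, 9, 4]),
}

rmse_by_set = {
    name: rmse(observed, predicted)
    for name, (observed, predicted) in validation_sets.items()
}
for name, value in rmse_by_set.items():
    print(f"{name}: RMSE = {value:.3f}")
```

A lower RMSE indicates predictions closer to the observed case counts; reporting it separately for each validation set shows whether performance is consistent across them.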
Statistical Analysis
A time-lagged cross-correlation (TLCC) model will be used to determine the time between detection of SARS-CoV-2 in wastewater and an uptick in positive COVID-19 cases in patient testing. We expect the correlation to be highest at the lag that separates an increase in wastewater viral load from the corresponding increase in cases. Weekly testing rates and weekly viral loads from wastewater samples will be the two time series that are correlated to determine the amount of time between the two events. The TLCC will estimate the correlation at each lag, and correlations will be considered significant at the 95% confidence level. All analyses will be performed in R version 4.0.0.
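To make the lag search concrete, here is a minimal sketch of a time-lagged cross-correlation. It is written in Python purely for illustration; the study's analyses are planned in R. The function names and the weekly values are hypothetical, and the sketch assumes two equally spaced series of equal length.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def lagged_cross_correlation(wastewater, cases, max_lag):
    """Correlate wastewater[t] with cases[t + lag] for lag = 0..max_lag."""
    if len(wastewater) != len(cases):
        raise ValueError("series must have the same length")
    correlations = {}
    for lag in range(max_lag + 1):
        leading = wastewater[: len(wastewater) - lag]  # drop the last `lag` weeks
        following = cases[lag:]                        # drop the first `lag` weeks
        correlations[lag] = pearson(leading, following)
    return correlations


# Hypothetical weekly series in which cases follow wastewater by two weeks.
weekly_viral_load = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
weekly_cases = [2, 1, 1, 3, 2, 5, 4, 6, 8, 7]

correlations = lagged_cross_correlation(weekly_viral_load, weekly_cases, max_lag=4)
best_lag = max(correlations, key=correlations.get)
print(f"highest correlation at a lag of {best_lag} week(s)")
```

The sketch only locates the lag with the highest correlation. Significance testing at each lag, and any detrending or pre-whitening of the series, are left out and would need to be added for a real analysis.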
Power Analysis
We will use a TLCC to determine the number of lags by which detection of SARS-CoV-2 viral load in wastewater precedes an uptick in detection of COVID-19 cases. Given a power of 80%, an alpha of 0.05, and an expected medium-sized correlation of 0.4, we would require 44 observations to detect such a correlation. This sample size gives us reasonable assurance that a true association between wastewater sampling and COVID-19 detection would be detected. As a secondary approach (in case of non-normality), we are also powered for a Spearman correlation of 0.4 with 46 observations. Since our study will likely have more observations available for each site, a correlation of 0.4 is a conservative estimate of the smallest correlation that we should be able to detect.
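One common way to sanity-check a sample size for a correlation is the Fisher z approximation, sketched below in Python. This is an assumption made for illustration: it is not necessarily the method behind the figures quoted above, and, being an approximation for a two-sided test, it returns a figure close to but not necessarily identical to them. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation r (two-sided test).

    Uses the Fisher z transformation: atanh(r) is approximately normal
    with standard error 1 / sqrt(n - 3).
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    effect = math.atanh(abs(r))                    # Fisher z of the expected correlation
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


n_required = n_for_correlation(0.4, alpha=0.05, power=0.80)
print(f"approximate observations required: {n_required}")
```

Because the required sample size falls as the expected correlation rises, any site with more observations than this has power to detect correlations somewhat smaller than 0.4.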
Ethics
SF-RAD will establish an ethics oversight mechanism to work with other investigators to provide advice regarding surveillance, consent, and privacy. Regular interactions with study leadership will provide both solicited and unsolicited advice regarding best practices in study design, conduct, and dissemination, and in research rigor and reproducibility. For instance, the study might make incidental findings that bear on the health of individuals or (sub)communities; a process will be established to manage and, as appropriate, communicate those data.
Survey Development
The human surveillance portion of this study will incorporate social determinants of health (SDOH) measures as available through the PhenX Toolkit, and relevant factors will be extracted.
Specimen Tracking (Wastewater and Human Samples)
Data collection and specimen tracking will be recorded in OpenSpecimen (OS), a comprehensive biorepository laboratory information management system (LIMS) used by the Sylvester Biospecimen Shared Resource (BSSR). OS tracks collections across multi-site longitudinal studies, sample utilization, and pre- and post-analytical variables, and has a powerful reporting module that is customizable to any study’s requirements. OS also has audit trail capability to track any changes made to the data, including but not limited to specimen data and system metadata. All samples will have a unique identifier that is tracked by OS, including derivatives and aliquots. The built-in reporting module allows for all collection and sample information to be reported in an accurate and standardized manner.
Biosafety Procedures
Additional biosafety procedures will be implemented for handling the sewage. Although wastewater has not yet been reported as a route of infection for COVID-19 (61), samples will be handled in a manner that assumes transmission is possible, requiring the implementation of biosafety procedures throughout sample collection, concentration, and detection; CDC biosafety procedures will be followed (10). Additionally, representatives from UM Environmental Health and Safety (EHS) will accompany the sample collection team to ensure that safe sample collection procedures are followed in the field. Once collected, samples will be handled at the BSSR, which routinely processes human samples, including biohazardous liquids such as blood. Sample concentration will occur within the BSSR, which is equipped with a BSL-2 laminar flow hood and a centrifuge designed to contain aerosols; laboratory disinfection protocols will be applied throughout the process. RNA extraction, purification, detection, quantification, and sequencing in other shared resources (OGSR and MTSR) will follow similar safety measures.
Evaluation Plan
Program evaluation (PE) will be integrated in order to inform the administrative leads, the Aim leads, and the leads of the Human Population and Clinical Patient Surveillance (HPCPS) group about progress made. The process of evaluating progress includes the development of a work plan matrix that summarizes the relationships among the project goals, specific aims, outcome measurement, process objectives, and activities, thus ensuring that the project will deliver the outcomes for which it is funded. PE review activities will cover primary data, validation of data, training needs, data collection protocols, design of experiments, sampling plans, power analyses, reliability and validity issues, and internal and external communications. In addition to establishing the communication responsibility matrix, PE will also contribute to communication, integration of data management and wastewater characterization, and integration of WBT results with COVID-19 cases in Miami and across Florida, including predictive modeling. Outputs and impact will be assessed through publications in peer-reviewed journals, citations, and invited lectures and presentations.
Potential Pitfalls and Alternative Approaches
We recognize that the COVID-19 pandemic is unpredictable. If cases decrease drastically such that background levels of SARS-CoV-2 are no longer detectable in natural wastewater samples, experimentation will continue with spiked samples and with non-COVID microbes to provide comparisons among different sample collection, concentration and detection technologies. We will continue tracking disease transmission among the general and hospital populations to evaluate whether links can be established between wastewater and measures of human illness rates.
Expected Results
This project will develop and deploy experimental and informatics infrastructure and operations, provide a proof-of-concept implementation of wastewater-based infectious disease surveillance, and advance work towards a model capable of predicting local and community-level spread of COVID-19 based upon measurements from wastewater. Specific and measurable key milestones include process development and data sharing, completion of experiments planned for evaluating sample collection and sample concentration strategies, comparison of viral detection technologies, metagenomics comparisons between wastewater and human specimens, and predictive modeling.
Research Translation
The results from Aim 1, and in particular results related to sample collection and concentration, are directly applicable to field and laboratory settings, and optimum methods can be utilized immediately once agreed upon. For detection, all technologies utilize commercially available products, so procedures can be replicated immediately after proof of concept.
______________
9a Butler, D.J.; Mozsary, C.; Meydan, C.; Danko, D.; Foox, J.; Rosiene, J.; Shaiber, A.; Afshinnekoo, E.; MacKay, M.; Sedlazeck, F.J.; Ivanov, N.A.; Sierra, M.; Pohle, D.; Zietz, M.; Gisladottir, U.; Ramlall, V.; Westover, C.D.; Ryon, K.; Young, B.; Bhattacharya, C.; Ruggiero, P.; Langhorst, B.W.; Tanner, N.; Gawrys, J.; Meleshko, D.; Xu, D.; Steel, P.A.D.; Shemesh, A.J.; Xiang, J.; Thierry-Mieg, J.; Thierry-Mieg, D.; Schwartz, R.E.; Iftner, A.; Bezdan, D.; Sipley, J.; Cong, L.; Craney, A.; Velu, P.; Melnick, A.M.; Hajirasouliha, I.; Horner, S.M.; Iftner, T.; Salvatore, M.; Loda, M.; Westblade, L.F.; Cushing, M.; Levy, S.; Wu, S.; Tatonetti, N.; Imielinski, M.; Rennert, H.; Mason, C.E. Shotgun Transcriptome and Isothermal Profiling of SARS-CoV-2 Infection Reveals Unique Host Responses, Viral Diversification, and Drug Interactions. bioRxiv 2020, 2020.04.20.048066. https://doi.org/10.1101/2020.04.20.048066