A living tool for the continued exploration of microplastic toxicity

Throughout the past decade, many studies have reported adverse effects in biota following microplastic exposure. Yet, the field is still emerging as the current understanding of microplastic toxicity is limited. At the same time, recent legislative mandates have required environmental regulators to devise strategies to mitigate microplastic pollution and develop health-based thresholds for the protection of human and ecosystem health. The current publication rate also presents a unique challenge as scientists, environmental managers, and other communities may find it difficult to keep up with microplastic research as it rapidly evolves. At present, there is no tool that compiles and synthesizes the data from these studies to allow for visualization, interpretation, or analysis. Here, we present the Toxicity of Microplastics Explorer (ToMEx), an open access database and accompanying open source R Shiny web application that enables users to upload, search, visualize, and analyze microplastic toxicity data. Though ToMEx was originally created to facilitate the development of health-based thresholds to support California legislation, maintenance of the database by the greater scientific community will be invaluable to furthering research and informing policies globally. The database and web application may be accessed at https://microplastics.sccwrp.org/.


Introduction
Microplastics are a multifaceted contaminant suite, comprising a vast array of polymers, sizes, morphologies, and added and adsorbed chemicals [1]. The environmental compartments contaminated by microplastics are also diverse, as microplastic particles are frequently detected in aquatic environments as well as sediment, air, food, and drinking water. The pervasiveness of microplastics has led to questions regarding their potential impacts on aquatic ecosystems [2-4] and human health [5-8]. Additional concern may also be warranted as microplastic concentrations are predicted to continuously increase unless the management or production of plastic waste undergoes a drastic change [9,10].
Concerns regarding the potential impacts of microplastics have led to legislative mandates aimed at assessing the risks of microplastic exposure [11,12]. However, despite the abundance of studies detailing the occurrence of microplastics in various matrices, our understanding of microplastic toxicity is still emerging. Characterizing the environmental health risks posed by microplastics requires researchers to consider the mixture of physical (e.g., size, morphology), chemical (e.g., polymer composition, chemical additives, sorbed chemicals), and biological (e.g., pathogens, natural biofilms) characteristics of microplastics that may influence toxicological outcomes [13]. Hundreds of studies on the occurrence and effects of microplastics have been published within the last decade to better understand the scale and potential impacts of microplastics as well as support various regulatory initiatives [14]. The available literature describes a multitude of possible adverse biological effects, but the mechanisms by which microplastics may impact both human and aquatic health are not yet completely understood. Given these challenges and complexities, there is a need for tools that facilitate data exploration, summarization, and analysis to inform science-based regulatory and policy-related decisions.
To identify the key drivers of microplastic toxicity and inform risk assessment strategies based on existing knowledge, we created the Toxicity of Microplastics Explorer (ToMEx). ToMEx includes both a microplastic toxicity database and an accompanying R Shiny application (app) to facilitate data visualization and analysis (https://microplastics.sccwrp.org/). The database was created by systematically mining the available microplastic toxicity literature concerning both aquatic organisms and human health. More than 70 unique variables were either extracted or derived from each study, including experimental design, test organisms, biological effects, and particle characteristics of the microplastics used. In addition, information that may be used to select studies meeting various quality criteria was also documented. The accompanying R Shiny app allows users to intuitively interface with the database and provides tools for data summarization, filtering, synthesis, and analysis. ToMEx was originally created and designed, as part of the California Microplastics Health Effects Workshop [15], to facilitate the identification of the key pathways by which microplastics affect biota, prioritize the characteristics (e.g., size, shape, polymer) of microplastics that are of greatest biological concern, and identify critical thresholds for each at which those biological effects become pronounced. However, the potential applications of ToMEx are far-reaching, as scientists, environmental managers, and other communities across the globe may use ToMEx to quickly summarize, synthesize, and analyze data to identify what is known and unknown regarding microplastic toxicity in both humans and aquatic organisms, and to inform the experimental design of studies.

Data sources
The ToMEx database is divided into two distinct parts: Aquatic Organisms and Human Health. Data for the Aquatic Organisms Database were sourced from three recent reviews of the microplastic toxicity literature [2,16,17]. To populate the Human Health Database, an original literature review was conducted in July 2020 in the ProQuest database using the following search string: ((effect OR impact OR endpoint OR toxicity) AND (microplastic(s) OR microbead OR polyethylene (PE) OR polystyrene (PS) OR polyamide (PA) OR polypropylene (PP) OR polyvinyl chloride (PVC)) AND (human OR human health)). Additional studies were added to the Aquatic and Human Health databases until December 2020 and April 2021, respectively.
Each study was screened to ensure that it focused on at least one of the following: 1) the toxicological effects of microplastics, 2) the toxicological effects of microplastic leachates (i.e., chemicals migrating from microplastics), and/or 3) the toxicological effects of microplastics in the presence of chemical contaminants (i.e., chemical co-exposure or chemical transfer studies). Studies exclusively focused on the effects of macroplastics (> 5 mm), field observations or toxicokinetics were excluded. A summary of each database is provided in Table 1, and a complete list of studies included in each database may be found in the Supplementary Information.

Data Mining & Structure
Data were extracted from each study according to the following six broad categories. Each category is briefly described below, and detailed descriptions may be found in Table S1. When specific data were not available or provided by the study, fields were populated as "Not Available" or "NA."
Data types Each data point within the database is categorized according to the study design to distinguish between particle- and chemical-driven effects associated with microplastics. Specifically, data are categorized as particle only (i.e., organisms are exposed only to microplastics with no additional chemicals added by the researchers), chemical co-exposure (i.e., organisms are simultaneously exposed to microplastics and additional chemicals), chemical transfer (i.e., microplastics are incubated with additional chemicals prior to exposures), or leachate (i.e., chemicals are extracted from microplastics using a solvent and organisms are exposed to the resulting leachate only). It should be noted that some studies contain a mixture of different data types, and that data from chemical co-exposure, chemical transfer, or leachate studies are not necessarily relevant for risk assessment.
Test organisms Information regarding the genus, species, taxa, environment, life stage, and sex of test organisms was extracted from each study. For aquatic organisms, the maximum ingestible particle size was based on mouth opening size, which was either sourced from Koelmans et al. [18] or estimated for each species using methods previously developed by Jâms et al. [19].
Experimental parameters Information about the experimental design and test conditions including, but not limited to, exposure duration, exposure media (e.g., water, sediment, food), use of solvents in addition to water (e.g., ethanol), sample size, dose, and the number of treatment groups was extracted from each study.
Biological effects Biological effects data were captured using modified methods from Jacob et al. [17]. Briefly, all study endpoints were identified and hierarchically sorted into "Broad" and "Specific" endpoint categories according to their biological significance (Fig. 1). Each endpoint was also assigned a biological level of organization. The target cell type or organ was recorded for tissue, cellular, and subcellular effects. The occurrence of an effect was defined as a statistically significant effect relative to the control group. For example, if a statistically significant reduction in body length was observed at least once throughout a microplastic exposure, this observation would be recorded as a "YES" for effect. If the differences were not statistically significant, this observation was recorded as a "NO." Effect concentrations were also assigned to each endpoint where applicable (i.e., no observed effect concentration (NOEC), lowest observed effect concentration (LOEC), highest observed no effect concentration (HONEC)) or recorded if provided in the study (e.g., effect concentrations, ECx).
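The assignment of effect concentrations described above can be sketched in a few lines of Python. This is a minimal illustration, not the extraction protocol used for ToMEx: the function name, the input format (one endpoint, one significance flag per tested dose), and the rule that the NOEC is the highest tested dose below the LOEC are all assumptions made for the example.

```python
from typing import Dict, Optional


def assign_effect_concentrations(results: Dict[float, bool]) -> Dict[str, Optional[object]]:
    """Derive NOEC, LOEC, and HONEC for one endpoint.

    `results` maps each tested dose to True if the endpoint differed
    significantly from the control at that dose (hypothetical input format).
    """
    doses = sorted(results)
    effect_doses = [d for d in doses if results[d]]

    if not effect_doses:
        # No significant effect at any dose: record "NO" and report the
        # highest tested dose as the HONEC.
        return {"effect": "NO", "NOEC": None, "LOEC": None,
                "HONEC": doses[-1] if doses else None}

    loec = effect_doses[0]                    # lowest dose with an effect
    below = [d for d in doses if d < loec]    # doses tested below the LOEC
    noec = below[-1] if below else None       # highest no-effect dose below the LOEC
    return {"effect": "YES", "NOEC": noec, "LOEC": loec, "HONEC": None}


# Example: effects first appear at 100 (arbitrary dose units).
example = assign_effect_concentrations({1.0: False, 10.0: False, 100.0: True, 1000.0: True})
```

When the lowest tested dose already shows an effect, the sketch leaves the NOEC empty rather than extrapolating one.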
Particle characteristics Key particle characteristics such as polymer type, size, morphology, density, charge, functionalization, and weathering were extracted. In addition to nominal and (if available) measured particle sizes, data were also binned into broad categories according to the size ranges provided in the California State Definition of Microplastic [20] based on the average particle size used in the study.
Experimental verification Details pertaining to the verification of key particle characteristics, background contamination, and the exposure concentration were recorded. The types of data captured were based on criteria developed by de Ruijter et al. [16] with some modification. For instance, it was recorded whether studies validated the polymer type, morphology, and size of microplastics as well as which methods were used to do so.

Data prioritization screening
In the Aquatic Organisms Database, studies were scored by two independent reviewers according to the technical and risk assessment quality criteria developed by de Ruijter et al. [16] with some modification to address the specific aims of the California Microplastics Health Effects Workshop (Table S2, see [21] for more details). If studies had been previously evaluated, then those scores were used [16]. Only data pertaining to particle only effects were scored; chemical transfer, chemical co-exposure, and leachate data types were outside the scope of this study and were excluded from the scoring exercise. In the Human Health Database, in vivo ingestion studies were scored according to quality criteria pertaining to particle characteristics, experimental design, and risk assessment [22] (Table S3).
No scored study in either database passed all the quality criteria (i.e., received a score of at least 1 for every criterion). Thus, a subset of 14 selected criteria was designated as essential "red criteria." These criteria are intended to reflect the minimum requirements for a study to be considered for a preliminary assessment, with the understanding that the findings carry significant limitations [21,23]. It should be noted that these "red criteria" are not intended to be an acceptable replacement for the full set of criteria presented by de Ruijter et al. [16] or Gouin et al. [22].
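A screen of this kind amounts to a simple filter over per-criterion scores. The sketch below assumes that a criterion is passed with a score of at least 1; the criterion names and the example scores are hypothetical placeholders, not the 14 red criteria used in ToMEx.

```python
from typing import Dict, Iterable

# Hypothetical stand-ins for the essential criteria (the real list has 14).
RED_CRITERIA = ("particle_size", "particle_shape", "polymer_type")


def passes_red_criteria(scores: Dict[str, int],
                        red_criteria: Iterable[str] = RED_CRITERIA) -> bool:
    """True if every red criterion has a score of at least 1.

    A criterion missing from `scores` is treated as failed (score 0).
    """
    return all(scores.get(name, 0) >= 1 for name in red_criteria)


# Made-up scores for two studies.
studies = {
    "study_A": {"particle_size": 2, "particle_shape": 1, "polymer_type": 1},
    "study_B": {"particle_size": 2, "particle_shape": 0, "polymer_type": 2},
}
retained = [name for name, scores in studies.items() if passes_red_criteria(scores)]
```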

Dose conversions
Most studies reported doses in either particle number or mass per volume, rarely both. Calculations to convert doses from numerical to mass-based concentrations as well as estimates of total particle volume and surface area were performed whenever possible as follows.

Particle volume
The average or median particle size (depending on whether authors reported a single average or a range) was used to estimate particle volume according to morphology. Spheres were assumed to be perfect spheres ($V = \frac{4}{3}\pi r^{3}$), where $r$ = sphere diameter/2. Fragment volume was estimated using methods adapted from Koelmans et al. [18] ($V = \frac{\pi}{6} L^{3}\,\mathrm{CSF}^{2}$), where $L$ = particle length and CSF = 0.4 (i.e., the average Corey Shape Factor for environmental microplastic fragments) [24]. Fibers were assumed to be cylinders ($V = \pi r^{2} h$), where $r$ = particle width/2 and $h$ = particle length. In cases where fiber width was not provided (8 out of 11 studies), width was assumed to be 15 μm, which is the average width of environmental microfibers [24]. In the toxicity studies that did report fiber widths, the widths were 11.8 ± 3.9 μm (mean ± standard deviation).
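The three volume estimates above translate directly into code. The following is a minimal sketch using the constants stated in the text (CSF = 0.4, default fiber width of 15 μm); the function names are illustrative, and all dimensions are assumed to be in micrometres so that volumes come out in cubic micrometres.

```python
import math

CSF = 0.4                      # average Corey Shape Factor for fragments (from the text)
DEFAULT_FIBER_WIDTH_UM = 15.0  # width assumed when a study reports none (from the text)


def sphere_volume(diameter_um: float) -> float:
    """Volume of a perfect sphere, V = 4/3 * pi * r^3."""
    r = diameter_um / 2.0
    return 4.0 / 3.0 * math.pi * r ** 3


def fragment_volume(length_um: float, csf: float = CSF) -> float:
    """Fragment volume, V = pi/6 * L^3 * CSF^2."""
    return math.pi / 6.0 * length_um ** 3 * csf ** 2


def fiber_volume(length_um: float, width_um: float = DEFAULT_FIBER_WIDTH_UM) -> float:
    """Fiber volume as a cylinder, V = pi * r^2 * h, with r = width / 2."""
    r = width_um / 2.0
    return math.pi * r ** 2 * length_um
```

For instance, `fiber_volume(1000.0)` uses the 15 μm default width, whereas `fiber_volume(1000.0, width_um=11.8)` would use a reported width instead.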
Particle mass Particle mass was estimated by multiplying the estimated volume by particle density. In cases where density was not provided, median values for the reported polymer from Kooi & Koelmans [24] were used (Table S4).
[...] with a, b, c being equal to 0.5 × length, 0.5 × width, and 0.5 × height, respectively [25]. For fibers, surface area was estimated using the equation for the surface area of a cylinder ($S = 2\pi r h + 2\pi r^{2}$), where $r$ = particle width/2 and $h$ = particle length. For cases in which particle width and height were reported, such values were used. Particle width was not often reported for fragments, in which case widths were estimated to be 0.77 × length, which is the average ratio for microplastic fragments in marine surface waters, and height was assumed to be 0.67 × width [25]. For fibers, width was again assumed to be 15 μm when not reported [24].
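The mass estimate and the count-to-mass dose conversion are mostly a matter of unit bookkeeping, sketched below together with the fiber surface area and the default fragment dimensions. The function names and the chosen units (μm, g/cm³, mg, particles/mL) are assumptions for illustration, and the density in the usage line is an illustrative value rather than one taken from Table S4.

```python
import math
from typing import Optional, Tuple

UM3_PER_CM3 = 1e12  # cubic micrometres in one cubic centimetre


def particle_mass_mg(volume_um3: float, density_g_cm3: float) -> float:
    """Mass of one particle in mg: volume times density."""
    return volume_um3 / UM3_PER_CM3 * density_g_cm3 * 1000.0


def count_to_mass_concentration(particles_per_ml: float, volume_um3: float,
                                density_g_cm3: float) -> float:
    """Convert particles/mL to mg/mL for particles of one size and polymer."""
    return particles_per_ml * particle_mass_mg(volume_um3, density_g_cm3)


def fragment_dimensions(length_um: float, width_um: Optional[float] = None,
                        height_um: Optional[float] = None) -> Tuple[float, float, float]:
    """Fill in unreported fragment width (0.77 x length) and height (0.67 x width)."""
    if width_um is None:
        width_um = 0.77 * length_um
    if height_um is None:
        height_um = 0.67 * width_um
    return length_um, width_um, height_um


def fiber_surface_area(length_um: float, width_um: float = 15.0) -> float:
    """Surface area of a cylinder, S = 2*pi*r*h + 2*pi*r^2, with r = width / 2."""
    r = width_um / 2.0
    return 2.0 * math.pi * r * length_um + 2.0 * math.pi * r ** 2


# Illustrative: 1000 particles/mL of 10 um spheres at an assumed 1.05 g/cm3.
sphere_volume_um3 = 4.0 / 3.0 * math.pi * 5.0 ** 3
dose_mg_per_ml = count_to_mass_concentration(1000.0, sphere_volume_um3, 1.05)
```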

ToMEx app
App development and design
To enable users to explore the database and increase the utility of ToMEx, a web-based app, which may be accessed at https://microplastics.sccwrp.org, was built using Shiny [26], an R package designed to produce interactive web apps with a graphical user interface, so that users need not download R or write commands. The app also depends on numerous R packages, listed in Table S5. The ToMEx app is divided into sections, each with a different purpose and function, aimed at providing the user with a complementary set of tools to form and test hypotheses, explore relationships, visualize hierarchical relationships of toxicity endpoints, and investigate quality criteria of studies in the database. Each section is briefly described below.
Overview This tool displays a series of data visualizations designed to summarize the database. Stacked bar graphs displaying the number of "endpoints" measured are used to visualize the relative abundance, diversity, and bias in the underlying datasets. There is also an interactive graphic to visualize how endpoints are categorized using the collapsibleTree package (v0.1.7 [27]) (Fig. 1).
Search This tab may be used to access selected components of the database using a series of search bars to find and download specific data types of interest.
Exploration This tab generates customizable figures to visualize differences in toxicity by relevant factors (i.e., organismal characteristics, experimental design, particle characteristics) using boxplots, "violin plots," or "beeswarm plots" (v0.6.0; Clarke & Sherrill-Mix 2017 [29]). Users can filter the database and update the plots using a series of widgets. Filtered data may be downloaded to investigate further.
Study screening This tab allows users to visualize the results of the previously described data prioritization exercise in the form of a heatmap. Data may be filtered in a similar manner to the Exploration tab.

Species sensitivity distribution
This tool provides the ability to probabilistically model the variation of species sensitivities to stressors and estimate ecosystem risks due to microplastics by generating species sensitivity distributions (SSDs) using the ssdtools R package (Thorley and Schwarz [30]). Users may quickly filter data from the database and generate SSDs with the flexibility to manipulate model parameters (e.g., selection of effect concentration or dose descriptor, application of assessment factors, data collapsing method, etc.). The default settings of this tool are aligned with the threshold development framework presented by Mehinto et al. [21]. This tool is only applicable to the Aquatic Organisms database.
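To make the idea of an SSD concrete, the sketch below fits a single log-normal distribution to one effect concentration per species and reads off a hazard concentration such as the HC5 (the concentration expected to affect 5% of species). It is a simplified stand-in and does not reproduce ssdtools, which fits distributions by maximum likelihood and supports several distribution families; here the parameters are simply the sample mean and standard deviation of the log10 concentrations. The species names, concentrations, geometric-mean collapsing rule, and assessment factor are all assumptions for illustration.

```python
import math
from collections import defaultdict
from statistics import NormalDist, geometric_mean, mean, stdev
from typing import Dict, Iterable, Sequence, Tuple


def collapse_by_species(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Collapse (species, concentration) pairs to one geometric mean per species."""
    grouped = defaultdict(list)
    for species, concentration in records:
        grouped[species].append(concentration)
    return {species: geometric_mean(values) for species, values in grouped.items()}


def lognormal_hc(concentrations: Sequence[float], p: float = 0.05) -> float:
    """Hazard concentration for a fraction p of species under a log-normal SSD."""
    logs = [math.log10(c) for c in concentrations]
    fitted = NormalDist(mean(logs), stdev(logs))  # needs at least two species
    return 10 ** fitted.inv_cdf(p)


# Made-up effect concentrations (arbitrary units) for three species.
records = [("species_a", 1.0), ("species_a", 100.0),
           ("species_b", 1.0), ("species_c", 100.0)]
per_species = collapse_by_species(records)
hc5 = lognormal_hc(list(per_species.values()), p=0.05)
threshold = hc5 / 10.0  # hypothetical assessment factor of 10
```

With so few species the fit is only a toy; real SSDs are usually built from many more taxa.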
Calculators This tool allows users to 1) simulate realistic distributions of microplastics (including size, shape, density, mass, volume) using Monte Carlo modelling techniques described in Kooi & Koelmans [24], and 2) perform alignments described in Kooi et al. [25]. For the simulated distribution tool, users may modify the particle size power law to simulate distributions of particles relevant to the matrix of interest (e.g., marine surface water, sediment, drinking water, etc.) [25]. The simulated dataset can be downloaded and manipulated in other software to estimate parameters, such as the ratio of mass for particles within a given size range relative to another size range. Users may upload toxicity data from laboratory experiments (polydisperse or monodisperse) and align them to an ecologically relevant metric (ERM) of interest using site-specific microplastic probability distribution parameters and user-defined bioavailability criteria. This tool is only applicable to the Aquatic Organisms database.
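The size component of such a simulation can be sketched with inverse-transform sampling from a power law truncated to a size range. This is a simplified illustration, not the calculator's implementation: it simulates size only, the exponent used below is an illustrative value, and the mass share assumes that mass scales with the cube of size (identical shape and density for all particles), whereas the tool also simulates shape and density.

```python
import random
from typing import List, Optional


def sample_sizes(n: int, alpha: float, x_min: float, x_max: float,
                 seed: Optional[int] = None) -> List[float]:
    """Draw n sizes from a pdf proportional to x**(-alpha) on [x_min, x_max].

    Uses inverse-transform sampling; alpha must not equal 1.
    """
    rng = random.Random(seed)
    k = 1.0 - alpha
    low, high = x_min ** k, x_max ** k
    return [(low + rng.random() * (high - low)) ** (1.0 / k) for _ in range(n)]


def mass_share(sizes: List[float], lower: float, upper: float) -> float:
    """Share of total mass held by particles with lower <= size < upper,
    assuming mass is proportional to size cubed."""
    total = sum(x ** 3 for x in sizes)
    in_range = sum(x ** 3 for x in sizes if lower <= x < upper)
    return in_range / total


# Illustrative run: 10,000 particles between 1 and 5000 um, exponent 2.07.
sizes = sample_sizes(10_000, alpha=2.07, x_min=1.0, x_max=5000.0, seed=1)
share_small = mass_share(sizes, 1.0, 100.0)
```

Because small particles dominate by count but large particles dominate by mass, ratios such as `share_small` can differ greatly from the corresponding count ratios.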
Predictions This tool allows users to predict effect concentrations using the machine-learning model developed by Coffin et al. (Coffin S, Cowger W, Koelmans AA, Thornton Hampton LM. Disentangling the complex relationship between microplastic effects and traits using machine learning. In preparation 2022a). The user may upload a dataset containing relevant information for a real or theoretical laboratory test exposure (e.g., exposure duration and other experimental design parameters, species characteristics, particle characteristics, effect characteristics), and use one of the two optimized random forest models described in that work to predict either the tissue-translocation ERM-aligned or food dilution ERM-aligned exposure concentrations for a given effect (e.g., mortality) and effect concentration (e.g., NOEC, LOEC). The model may be re-trained once additional toxicity data are uploaded to ToMEx, with the anticipation of increasingly accurate predictions. This tool is only applicable to the Aquatic Organisms database.

Additional application features
Data selection based on data type and data prioritization scores
In the Exploration and Species Sensitivity Distribution tabs, the database may be filtered to select specific data types. By default, "particle only" type data are initially selected. Before choosing a different selection (e.g., including leachate data), users should carefully consider the data types most appropriate for their specific aims and/or research questions.

In the Exploration and Species Sensitivity Distribution tabs, users may also filter the database to exclude studies that did not meet the "red criteria. " Scores for all data screening criteria are also available for download in the Search tab.

Dose metric selection
In the Exploration and Species Sensitivity Distribution tabs, radio buttons provide the ability to toggle between exposure metrics of interest, including mass per volume, particles per volume, volume per volume, and surface area per volume.

Alignment to ecologically relevant metrics
In the Exploration and Species Sensitivity Distribution tabs, effect concentrations of monodisperse microplastics may be compared to polydisperse distributions of microplastics that occur in the marine environment. Concentrations may be aligned to an Ecologically Relevant Metric (ERM) (e.g., surface area, volume) as described by Koelmans et al. [18,31]. An ERM is the dose metric that one needs to use to quantify dose in the context of a certain known effect mechanism. For a given ERM, the concentration may be related to both mono- or polydisperse particles interchangeably so long as the total magnitude of ERM remains the same [18]. Following alignment to an ERM, concentrations may be re-scaled to a default size range (e.g., 1-5000 μm) in the app using methods described in Koelmans et al. [18].
To align a monodisperse effect concentration to an ERM, the effect concentration is also corrected for bioavailability. In the ToMEx app, alignments to surface area and volume are assumed to directly relate to effects triggered by translocation and food dilution, respectively; however, all calculations are modular. By default, alignments to volume are adjusted to exclude particles larger than the maximum gape size for a given species. By default, alignments to surface area exclude particles larger than 83 μm, the predicted upper size limit for translocation [21]; however, users may change this value to align with updated findings or to test the sensitivity of this factor. For more details see Koelmans et al. [18], Kooi et al. [25], and Mehinto et al. [21].
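The principle of keeping the total ERM constant, followed by re-scaling to a default size range, can be illustrated under the assumption that environmental particle sizes follow a power law with exponent alpha. The sketch below is not the ToMEx implementation (see the cited methods for that): the function names, the treatment of the ERM as proportional to size raised to a power p (p = 2 for a surface-area-like metric, p = 3 for a volume-like metric), the way the bioavailability limit is applied as the upper bound of the size range, and all numeric inputs other than the 83 μm limit and the 1-5000 μm default range are assumptions.

```python
from typing import Tuple


def count_rescale_factor(alpha: float, measured: Tuple[float, float],
                         default: Tuple[float, float]) -> float:
    """Factor converting a particle count over `measured` = (min, max) sizes
    into the count expected over `default` = (min, max), for pdf ~ x**(-alpha)."""
    k = 1.0 - alpha  # alpha must not equal 1
    return (default[1] ** k - default[0] ** k) / (measured[1] ** k - measured[0] ** k)


def mean_size_power(alpha: float, x_min: float, x_max: float, p: float) -> float:
    """Mean of x**p over a power law truncated to [x_min, x_max]."""
    k = 1.0 - alpha  # requires k != 0 and k + p != 0
    numerator = (x_max ** (k + p) - x_min ** (k + p)) / (k + p)
    denominator = (x_max ** k - x_min ** k) / k
    return numerator / denominator


def align_to_erm(ec_mono: float, size_mono: float, alpha: float,
                 x_min: float, x_max: float, p: float) -> float:
    """Polydisperse particle concentration carrying the same total ERM
    (taken as proportional to size**p) as the monodisperse concentration."""
    return ec_mono * size_mono ** p / mean_size_power(alpha, x_min, x_max, p)


# Illustrative inputs only: alpha and the effect concentration are made up.
ALPHA = 2.07
ec_aligned = align_to_erm(ec_mono=1000.0, size_mono=10.0, alpha=ALPHA,
                          x_min=1.0, x_max=83.0, p=2)  # bioavailable up to 83 um
ec_default = ec_aligned * count_rescale_factor(ALPHA, measured=(1.0, 83.0),
                                               default=(1.0, 5000.0))
```

Keeping the alignment and the re-scaling as separate functions mirrors the modular description above, so that the bioavailability limit or the default size range can be changed independently.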

Results and discussion
Applications
The ToMEx database and accompanying app were originally created to facilitate the development of health-based thresholds and support California legislation by coalescing microplastics toxicity data and providing user-friendly tools for data visualization and analysis. Specifically, ToMEx was designed to assess how toxicity varies with key microplastic characteristics such as size, morphology, and polymer composition, and to observe how effect patterns may shift when data are visualized or analyzed using different exposure metrics (i.e., mass, particles, volume, and surface area per volume). To our knowledge, ToMEx is the largest, most detailed collection of open access microplastics toxicity data, and may continue to be used and updated by scientists and environmental managers alike to efficiently access and analyze microplastic toxicity data for a wide variety of purposes.

Research applications
The ToMEx database and accompanying app enable researchers to quickly identify, extract, visualize, and analyze the data that are most relevant to their research questions and objectives. Similar tools have been developed for other environmental contaminants, such as the United States Environmental Protection Agency's (USEPA) ECOTOX Knowledgebase, an open access database which houses toxicity data pertaining to a wide array of contaminants and organisms [32]. These databases provide a substantial advantage over traditional literature searches or mining data individually. For instance, a scientist seeking to learn more about what is known regarding the effects of polyethylene fragments in crustaceans may quickly use the Search tool to filter and download a list of studies meeting these criteria.
Understudied areas of research may also be easily identified using ToMEx. This was particularly apparent using the Exploration tool. For example, visualizing data by particle morphology in the Human Health database highlights the lack of data for microplastic fragments with only three studies available for in vitro data and only one study for in vivo data (Fig. 2). Even more striking is the complete absence of toxicity data for microplastic fibers, despite their abundance across environmental matrices. These and other similar data gaps should be filled to gain a more complete understanding of microplastic toxicity.
The ToMEx database may also inform future experimental designs, not only by aiding in the identification of research gaps as described above, but also by facilitating the selection of experimental doses and relevant toxicity pathways. Because ToMEx is structured around individual endpoints, the Exploration tool can be used to visualize the doses at which biological effects become prevalent for any given endpoint. This may provide context about the likelihood of detecting adverse biological effects at a given dose. Furthermore, these plots can also provide some information about the relative toxicity of various particle characteristics or sensitivities of various species to microplastics. The Species Sensitivity Distribution tool in the Aquatic Organisms Database can quickly identify which species may be most susceptible to microplastic exposure, and even more powerful is the ability to align these data to an ERM of interest for a polydisperse distribution of particles (Fig. 3).

Management applications
ToMEx may also be employed to address regulatory needs and inform risk assessments for both human health and the environment. Here, it was applied to inform the legislative mandates of California's Senate Bills 1422 and 1263. More specifically, workshop participants were tasked with identifying the primary pathways by which microplastics cause adverse effects, prioritizing the characteristics that are of greatest biological concern [33], and identifying critical thresholds at which adverse effects become pronounced [21,23]. Critical thresholds for the aquatic environment were derived using a combination of tools provided by ToMEx [21], whereas additional tools (e.g., benchmark dose modelling software) were used to assess impacts to human health [23]. Data most appropriate for threshold development were selected based on a variety of factors, including specific screening criteria, and threshold calculations were quickly performed. The efficiency of this process allowed for in-depth discussions and exploration of numerous strategies for threshold derivation, as well as assessments of the sensitivities of derivation parameters [21]. Regarding microplastics in drinking water, ToMEx was used to identify patterns across studies, which then facilitated discussions on the suitability of these studies for potential threshold development. For instance, adverse effects on male reproduction, such as decreases in sperm viability, were detected across multiple studies conducted by independent groups of researchers [23]. During this effort, most data were deemed unfit for threshold development to inform management actions [23].
However, continued curation of the Human Health database could allow for refinements as the field of microplastics research advances. It is anticipated that ToMEx will prove useful in informing similar regulatory decisions regarding microplastics. Besides providing a large, curated dataset for microplastic toxicity, ToMEx also allows users with diverse expertise to intuitively visualize and interact with data to initiate discussions and formulate ideas to address future regulatory challenges without the need for coding expertise. The flexibility of the ToMEx app will also allow regulators to tailor analyses to their specific needs. For example, the USEPA omits algae and plants from SSDs used for deriving water quality criteria, while the European Chemicals Agency requires their inclusion [34]. Users can filter data in ToMEx prior to building SSDs and can customize the statistical models used to fit distributions based on their preferred choices.

ToMEx moving forward
As microplastics toxicity research rapidly expands and evolves, keeping up to date with the latest scientific findings will become increasingly difficult for researchers, environmental managers, and the public alike. ToMEx has the potential to become a valuable tool to quickly explore, summarize, and analyze data from hundreds of scientific studies, lessening the burden on individuals to mine and process these data themselves. In addition, the data visualization tools built into ToMEx provide a variety of customizable graphics that may be used to facilitate scientific communication in research, academic, and management settings. These features can advance scientific collaboration and discussion as well as policy decision-making. Finally, the accessibility of ToMEx is likely its most important feature. By making the ToMEx database and app open access and open source, respectively, individuals can download and interact with a wealth of data that may otherwise be unavailable to them. Overall, prioritizing accessibility and inclusivity will aid in democratizing microplastics research.
Fig. 2 Representative graphic generated using the Exploration tab in the Human Health database. Labels on the right side indicate the number of studies and measured endpoints available in the database. Endpoints where effects were detected are indicated by "Yes" (dark blue bars) whereas endpoints where effects were not detected are indicated by "No" (light blue bars). Data are divided into in vitro and in vivo data as indicated by the labels above. Doses for each endpoint are displayed on the x axis in particles per volume, but other dose metrics such as mass and volume are also available.
Though ToMEx provides a large collection of toxicological microplastics data, it is not exhaustive. In its current state, ToMEx represents an incomplete "snapshot" of the available data up until roughly the end of 2020. However, given the growing number of publications on microplastics toxicity, the database will rapidly become outdated. To maintain its utility, scientists are encouraged to upload data from their peer-reviewed toxicity studies to ToMEx using an online form or template, ensuring that it houses the most current data. These data will be curated for accuracy by at least two independent researchers before being incorporated into the database. As data management practices become more advanced, it is anticipated that the maintenance of ToMEx will evolve as well. The existing analysis tools will also be updated as refinements and improvements are developed with the help of the greater scientific community. For instance, multivariate analysis tools allowing users to explore how
Fig. 3 Representative graphic generated using the Species Sensitivity Distribution tool. Data have been filtered to include "Particle Only" type data which passed "Red Criteria" during study screens. Data have also been aligned to the ecologically relevant metric (ERM) of volume