Trash Taxonomy Tool: harmonizing classification systems used to describe trash in environments

Despite global efforts to monitor, mitigate against, and prevent trash (mismanaged solid waste) pollution, no harmonized trash typology system has been widely adopted worldwide. This impedes the merging of datasets and comparative analyses. We addressed this problem by 1) assessing the state of trash typology and comparability, 2) developing a standardized and harmonized framework of relational tables and tools, and 3) informing practitioners about challenges and potential solutions. We analyzed 68 trash survey lists to assess similarities and differences in classification. We created comprehensive harmonized hierarchical tables and alias tables for item and material classes. On average, the 68 survey lists had 20.8% of item classes in common and 29.9% of material classes in common. Multiple correspondence analysis showed that the 68 surveys were not significantly different regarding organization type, ecosystem focus, or substrate focus. We built the Trash Taxonomy Tool (TTT) web-based application with query features and open access at openanalysis.org/trashtaxonomy. The TTT can be applied to improve, create, and compare trash surveys, and provides practitioners with tools to integrate datasets and maximize comparability. The use of TTT will ultimately facilitate improvements in assessing trends across space and time, identifying targets for mitigation, evaluating the effectiveness of prevention measures, informing policymaking, and holding producers responsible.


Introduction
It is widely recognized that the impacts of mismanaged trash on ecosystems pose substantial risks to the environment. Mismanaged trash can kill wildlife via ingestion and entanglement [1] and transport invasive species [2,3]. We use the term "trash" here and throughout this manuscript, meaning any anthropogenic object that has escaped management and entered the environment. We recognize that in some localities, the terms "litter," "anthropogenic litter," or "marine debris" may be more common synonyms, but we refrain from using them here, recognizing that not all trash comes from the act of littering or is found in marine environments [4]. Trash on land has the potential to be transported through stormwater to other environments (e.g., rivers, the ocean) and contribute to persistent accumulation in the environment [3,5]. In response to this growing threat, several communities have implemented water quality regulations [6,7] to reduce trash delivery to aquatic systems; a common way to comply with these regulations is through trash surveys. Surveys of trash in streams, beaches, oceans, and other environmental compartments are conducted to assess risks [8], plan mitigation [9], determine prevention priorities [10], and inform policymaking [11]. Trash surveys often include trash typology (e.g., bottle, plastic, cigarette), trash abundances, and site descriptions.
Trash typology systems are often developed with specific use cases or objectives in mind, which can hamper comparability. Trash survey developers try to minimize the number of classes they use to reduce complexity (and save time) while focusing on the classes most relevant to their management questions. For example, a trash survey (Survey 1) on a beach may explicitly list derelict fishing gear as a class. In contrast, another survey (Survey 2) may specify unique fishing items, such as fishing poles and fishing wire. In this example, fishing gear quantities from Survey 1 cannot be easily converted into fishing poles and fishing wire classes in Survey 2 without those being explicitly listed on the trash survey [12]. There has yet to be an in-depth analysis of the state of comparability between trash typologies.
Standardization and harmonization between existing trash surveys will allow trash monitoring data to be readily used together [13,14]. Standardization involves prescribing one survey list or a set of survey lists for different use cases. The Joint Research Centre (JRC) and Oslo/Paris Convention (OSPAR) created standardized trash survey lists with regional and ecosystem foci, which are in widespread use. Standardized trash typologies are also used to improve models, such as image classifiers, by reducing image labeling labor and increasing class interpretability. Harmonization involves developing taxonomic frameworks that facilitate operations between established and new surveys. Hierarchical and alias frameworks (including relational tables) are the primary harmonization tools. Morales-Caselles et al. [15] developed a spreadsheet for harmonizing multiple survey lists to a standardized format using aliases. Vriend et al. [16] developed a framework for harmonizing river trash monitoring strategies, outlining six hierarchical levels of trash typology: 1) organic/inorganic, 2) material, 3) polymer classes, 4) polymer type, 5) item type, and 6) the raw sample. The JRC recently developed a hierarchical framework to allow heterogeneous surveys to be combined and analyzed with their standardized typology [17]. There has yet to be a framework proposed for complete harmonization and standardization of all trash survey typologies.
Our study aims to assess trash survey comparability and develop a framework to harmonize and standardize trash surveys. To achieve these aims, we develop and describe the use of a universal trash typology, the Trash Taxonomy Tool (TTT). The TTT is a relational database (which can be queried by hierarchy and alias) and schema matching tool for harmonizing trash surveys. We use the TTT to assess the current state and future of trash typology, trash survey types, and survey comparability. We describe strategies to use the TTT to harmonize and standardize trash surveys. Recognizing that new trash objects are constantly being created, we discuss how to adapt and improve the TTT.

Developing relational tables
Approach and assumptions
We compiled 68 English-language survey lists from various countries and organizations, including government, research, nonprofit, and academic groups that describe trash survey types for freshwater, marine, and terrestrial ecosystems. Throughout this report, we italicize class names when referring to trash typology. Three groups of classes were found across most of the surveys, which describe trash in terms of material (the resource used to make the item, e.g., plastic or paper), item (description of the form of the object, e.g., bottle or fragment), and brand (the logo or manufacturer's name identified on the item) (Fig. 1). We also recognized two relational systems within the data: alias (synonymous words, e.g., cap and lid) and hierarchy (words that are parents or nested as children, e.g., spoon, fork, and knife are nested under utensils). We developed relational tables for comparing words used within and between these structures that originated from the 68 selected survey sheets. To provide potential users with the definitions we operationalized in this study, we present a glossary of terms used (Table S1).
The primary assumption within the TTT is that there are no differences in the definitions of a given class between surveys. An example of a violation of this assumption would be two surveys that define fragment based on size, but with different criteria: such as fragment = particles > 1 mm versus fragment = particles < 5 mm. These surveys would classify different sets of objects using the same word. There are other types of information held within the methodological distinctions in definitions that we did not investigate further (e.g., color, shape, size) unless the methodological limitation was encoded in the class name (e.g., rope diameter < 1 cm). This study compared the relationships between the words used to describe trash and how they relate to one another based on professional experience with trash nomenclature.

Material-item relational table
We compiled a table listing the materials and items described by each organization's classification system that we reviewed for our study (Fig. 2). Each row represents a unique material-item relationship (e.g., plastic and straw being listed in a row together). Sometimes it was unclear whether a class described a material class or an item class (e.g., disposable fork, typically made of plastic). To avoid introducing bias and adding words not used explicitly in the surveys, these classes were placed in the item class, and the material class was not inferred.

Misaligned category table
We defined misaligned classes as classes that did not fit within the material, item, or brand classes. If the class was too ambiguous, did not conform to the standard one-parent rule for hierarchical databases, or did not describe trash in environments, we added it to a separate document called the misaligned class table. Examples of misaligned classes include construction materials, fishing gear, and tree.

Alias and hierarchical tables
We developed alias tables for material, item, and brand classes independent of one another (Fig. 2). For the item and material alias tables, all words that were found to have the same meaning were linked using rows in a table where the first column defined the prime word, which was used as a key for joining to the hierarchy (Fig. 2), while all other columns were defined as aliases (e.g., fork and forks will be under the same alias). Break Free From Plastic, a nonprofit organization promoting a global movement to create a future free from plastic pollution, developed the brand alias table by researching the manufacturers who own the brands found during their annual Brand Audit in 2018 and 2019 [12]. This table was formatted with recurring manufacturer classes in one column corresponding to each brand owned by that manufacturer. In the alias tables, prime words can be merged with the hierarchical tables and vice versa. We established a single alias rule for every word in the alias tables, so that any word could join to only one prime word, to simplify analysis procedures using the tables.
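The single alias rule can be illustrated with a small lookup. The sketch below is written in Python for illustration only (the TTT itself is built in R); the table rows, the function name `build_alias_lookup`, and the choice of which word is the prime word are hypothetical and are not taken from the TTT tables.

```python
# Hypothetical alias rows: the first entry is the prime word, the rest are aliases.
ALIAS_ROWS = [
    ["fork", "forks"],
    ["cap", "lid", "caps", "lids"],
]


def build_alias_lookup(rows):
    """Map every word (prime or alias) to its prime word.

    Raises ValueError if a word would join to two different prime words,
    which would violate the single alias rule.
    """
    lookup = {}
    for prime, *aliases in rows:
        for word in [prime, *aliases]:
            key = word.strip().lower()
            existing = lookup.get(key)
            if existing is not None and existing != prime:
                raise ValueError(f"{key!r} joins to both {existing!r} and {prime!r}")
            lookup[key] = prime
    return lookup


alias_lookup = build_alias_lookup(ALIAS_ROWS)
print(alias_lookup["lids"])   # prime word for "lids"
print(alias_lookup["forks"])  # prime word for "forks"
```

Because each word resolves to exactly one prime word, the lookup can serve directly as a join key into the hierarchy tables.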
Additionally, we developed hierarchy tables for material classes (Fig. 3A) and item classes (Fig. 3B). These tables specify the hierarchical position of prime words through multi-level grouping (e.g., the utensils class encompasses forks, knives, spoons, and straws; plastic in materials includes foam and soft plastic). The hierarchy tables only describe the prime words from the alias tables since those words are equivalent to the other words used to describe trash. Hierarchical groups were sometimes obvious. For example, one survey we reviewed used the class glass/ceramic while another split the classes into glass and ceramic. In other cases, the relationships were more nuanced. For example, organic is a more general material description that includes materials like wood and cloth. We established a single parent rule for the hierarchical tables, where every word could only have up to one parent, to simplify analysis procedures using the tables.
Fig. 1 The object in the center is being classified using the material and item hierarchies on the TTT website. In this example of classifying an unlabeled plastic bottle, we can tell that it is made out of hard plastic and is a beverage bottle. However, we cannot tell what type of beverage bottle it is, so it should be classified generically as a beverage bottle. The classes hard plastic and beverage bottles are chosen to best represent the object in as much detail as possible, without assuming beyond the specificity we can observe.
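A minimal sketch of how the single parent rule might be checked is given below, in Python for illustration only (the TTT is built in R). The child-parent rows reuse the groupings named in the text, but the table layout and the function names `build_parent_map` and `depth` are assumptions.

```python
# Hypothetical (child, parent) rows; the groupings follow examples in the text.
HIERARCHY_ROWS = [
    ("forks", "utensils"),
    ("knives", "utensils"),
    ("spoons", "utensils"),
    ("straws", "utensils"),
    ("foam", "plastic"),
    ("soft plastic", "plastic"),
]


def build_parent_map(rows):
    """Return a child -> parent mapping, enforcing at most one parent per word."""
    parent_of = {}
    for child, parent in rows:
        known = parent_of.get(child)
        if known is not None and known != parent:
            raise ValueError(f"{child!r} has two parents: {known!r} and {parent!r}")
        parent_of[child] = parent
    return parent_of


def depth(word, parent_of):
    """Count the levels from a word up to the top of its branch (top level = 1)."""
    levels, seen = 1, {word}
    while word in parent_of:
        word = parent_of[word]
        if word in seen:
            raise ValueError("cycle detected in hierarchy")
        seen.add(word)
        levels += 1
    return levels


parent_of = build_parent_map(HIERARCHY_ROWS)
print(parent_of["foam"])
print(max(depth(word, parent_of) for word in parent_of))
```

With one parent per word, every class has a single path to the top of its hierarchy, so the number of levels in a branch is well defined.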

Database query tool development
The Trash Taxonomy Tool (TTT) is a database with a set of query tools and all previously mentioned relational tables accessible via an online application (openanalysis.org/trashtaxonomy). The site was created using the shiny [18], dplyr [19], data.table [20], shinyjs [21], shinythemes [22], DT [23], shinyhelper [24], data.tree [25], and collapsibleTree [26] packages in R (4.0.5) and RStudio (1.4.1106). This site allows users to upload a comma-separated value (csv) file of their survey list to process using the alias and hierarchy framework; an example of the exact required formatting is provided in the supplemental information (Table S2). In technical terms, the TTT is a schema matching tool because it matches and maps schemas from trash surveys to a unified format [27]. The TTT first uses an alias lookup to match and map the user-provided survey classes to prime word keys for material and item words. It then locates the prime word in the hierarchy and allows users to display all recognized words that are more or less specific in their item and material columns. It finds all parent words when the less specific function is called and all child words when the more specific function is called. If the user provides a word that is not in the relational tables, a notification is returned for that particular word. More detailed documentation and a video tutorial (https://youtu.be/sqeLaJKyol8) can be found on the TTT website.
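The query flow described above (alias lookup, then a walk up or down the hierarchy) can be sketched as follows. This is an illustrative Python sketch, not the TTT's R source code; the tables are toy data, the alias "cutlery" is invented for the example, and the function names `resolve`, `less_specific`, and `more_specific` are hypothetical.

```python
# Hypothetical tables; in the TTT these come from the alias and hierarchy tables.
ALIAS_TO_PRIME = {
    "fork": "forks", "forks": "forks",
    "spoon": "spoons", "spoons": "spoons",
    "utensils": "utensils", "cutlery": "utensils",  # "cutlery" is an invented alias
}
PARENT_OF = {"forks": "utensils", "spoons": "utensils"}


def resolve(word):
    """Return the prime word for a survey class, or None if it is not recognized."""
    return ALIAS_TO_PRIME.get(word.strip().lower())


def less_specific(word):
    """Return all parent words of a class, nearest parent first."""
    prime = resolve(word)
    if prime is None:
        return None  # the caller should notify the user about this word
    parents = []
    while prime in PARENT_OF:
        prime = PARENT_OF[prime]
        parents.append(prime)
    return parents


def more_specific(word):
    """Return all child words (direct and nested) of a class."""
    prime = resolve(word)
    if prime is None:
        return None
    found, frontier = [], [prime]
    while frontier:
        current = frontier.pop()
        children = sorted(c for c, p in PARENT_OF.items() if p == current)
        found.extend(children)
        frontier.extend(children)
    return found


for query in ["Fork", "cutlery", "hovercraft"]:
    print(query, less_specific(query), more_specific(query))
```

Returning `None` for an unrecognized word keeps "not in the tables" distinct from "recognized, but has no parents or children", which returns an empty list.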

Relational table cleaning and validation
We cleaned the relational tables using several tests. We created basic queries to identify duplicated terms, remove them, and ensure that all relationship links between the tables (Fig. 2) were equivalent in both directions for the alias to hierarchy relationships. The material-item table keys are equivalent to the alias and misaligned prime keys combined. The key column in the 'material alias' table has the same terms as the key column in the 'material hierarchy' table. We created a visualization within our online tool to inspect all the relational tables for nuanced relationships like semantic relationships within and between the tables. We also uploaded the material-item relational table to the query tool, then returned the relational table's results and visually assessed the matches.
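Two of the checks described above, finding duplicated terms and confirming that key columns match in both directions, might look like the following. This is a Python sketch for illustration (the authors' own queries are not shown in the text), and the function names and sample terms are hypothetical.

```python
from collections import Counter


def duplicated_terms(terms):
    """Return terms that appear more than once after trimming and lowercasing."""
    counts = Counter(term.strip().lower() for term in terms)
    return sorted(term for term, n in counts.items() if n > 1)


def key_mismatch(alias_keys, hierarchy_keys):
    """Report keys present in one table but missing from the other."""
    alias_set, hierarchy_set = set(alias_keys), set(hierarchy_keys)
    return {
        "missing_from_hierarchy": sorted(alias_set - hierarchy_set),
        "missing_from_alias": sorted(hierarchy_set - alias_set),
    }


# Hypothetical sample data.
print(duplicated_terms(["Plastic", "plastic ", "glass", "metal"]))
print(key_mismatch(["plastic", "glass", "metal"], ["plastic", "glass", "paper"]))
```

Both lists in the mismatch report being empty corresponds to the condition that the alias key column and the hierarchy key column hold the same terms.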

Assessment of the current state of trash typology
Summary statistics
We calculated summary statistics on each of the relational tables. The total number of classes was assessed by summing the number of unique words used within the survey lists (e.g., fork or spoon and fork/spoon are considered separate words). We assessed the number of unique classes by summing the unique prime aliases in the alias table (e.g., the two previously mentioned categories are joined to the same class). The number of levels of the hierarchies was assessed using the maximum number of levels of any given branch in the hierarchy tables. Diagrams were developed to demonstrate the depth and complexity of the hierarchical tables.

Factor analysis
The similarities between the groups of survey types (organization, ecosystem, substrate) were assessed with multiple correspondence analysis (MCA), using FactoMineR [28] and Factoshiny [29]. MCA is recommended for factor analysis of categorical variables instead of PCA [28]. We expected that the survey lists of similar types (e.g., marine trash surveys) would use similar trash typology since they would have similar study goals. We split survey types by organization into research, nonprofit, and academic; ecosystems were split into marine, riverine, estuary, or land; substrates were split into beach, surface water, underwater, or roadside. First, we joined all classes used in each survey's materials-items table to the alias tables. Second, we converted all classes to a matrix with zero denoting that the survey list did not have the class and one denoting that the survey list had the material or item class (one-hot encoding). The MCA's supplemental information (information not used to inform the model development) included organization, ecosystem, and substrate types (Table S3). V test statistics were assessed for each supplemental category's first and second dimensions. V tests are used to determine whether a supplemental category's coordinate on an MCA dimension differs significantly from zero. We assigned a V test statistic value of 2 as the cutoff for significance (Table S4, Table S5).
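The one-hot encoding step can be sketched as below. This Python sketch covers only the construction of the presence/absence matrix; the MCA itself was run with FactoMineR in R and is not reproduced here. The survey names, class sets, and the function name `one_hot` are hypothetical.

```python
def one_hot(surveys):
    """Build a presence/absence matrix from survey name -> set of prime classes.

    Returns the sorted class names (columns) and a dict of survey name -> row,
    where 1 means the survey list has the class and 0 means it does not.
    """
    classes = sorted(set().union(*surveys.values()))
    matrix = {
        name: [1 if cls in present else 0 for cls in classes]
        for name, present in surveys.items()
    }
    return classes, matrix


# Hypothetical surveys, already joined to the alias table (prime words only).
surveys = {
    "Survey A": {"plastic", "glass"},
    "Survey B": {"plastic", "metal"},
}
classes, matrix = one_hot(surveys)
print(classes)
for name, row in matrix.items():
    print(name, row)
```

Each row is one survey list and each column is one prime class, which is the layout the factor analysis takes as input.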

Comparability analysis
We assessed the comparability of each survey list to all the others by calculating the one-way percentage of classes shared after joining both lists to the alias table:
$$\text{Comparability Metric}_{X,Y} = \frac{\text{Classes in sheet } X \text{ equivalent with classes in sheet } Y}{\text{All classes in sheet } Y} \tag{1}$$
where $\text{Comparability Metric}_{X,Y}$ is a one-way test for how comparable survey X is with survey Y. The metric defines the proportion of the classes in survey list Y that are accounted for by the classes in survey list X after joining the lists to the alias table. The comparability metric helps describe how much one survey accounts for the classes in another survey, a typical operation when merging trash survey lists. We then averaged all comparability metrics for each survey by material and item independently and plotted them to identify the most comparable surveys and discuss strategies for creating a 100% comparable survey list.
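A minimal Python sketch of the comparability metric follows. It makes two assumptions not stated in the text: words missing from the alias table are kept unchanged, and classes are counted as unique prime words. The survey lists, the alias table, and the function name `comparability` are hypothetical.

```python
def comparability(x_classes, y_classes, alias_lookup):
    """Proportion of the classes in list Y that are accounted for by list X.

    Both lists are first joined to the alias table. Assumption: words that are
    not in the alias table are kept as-is rather than dropped.
    """
    x = {alias_lookup.get(c, c) for c in x_classes}
    y = {alias_lookup.get(c, c) for c in y_classes}
    if not y:
        raise ValueError("survey list Y has no classes")
    return len(x & y) / len(y)


# Hypothetical alias table and survey lists.
alias_lookup = {"lid": "cap", "cap": "cap", "forks": "fork", "fork": "fork"}
survey_x = ["cap", "forks", "bottle"]
survey_y = ["lid", "fork", "straw", "bag"]

print(comparability(survey_x, survey_y, alias_lookup))  # X compared with Y
print(comparability(survey_y, survey_x, alias_lookup))  # Y compared with X
```

The two calls give different values because the denominator is the number of classes in the second list, which is why the metric is described as one-way.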
Another way to compare trash surveys is to lump them together using the hierarchy. We used the hierarchy and alias tables to compare the Stormwater Monitoring Coalition (SMC) survey list with the NOAA survey list. First, we added randomly sampled trash counts (a standard trash survey method) between 1 and 10 for each trash typology. We joined both surveys to the alias table and then to the hierarchical tables. We used the data.tree [25] package in R to sum up the hierarchies to demonstrate how the two surveys are related based on the hierarchy.
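The idea of summing counts up the hierarchy can be sketched as follows. The authors used the data.tree package in R; this Python sketch is an independent illustration with a toy hierarchy and invented counts, and the function name `roll_up` is hypothetical.

```python
# Hypothetical child -> parent table; the real item hierarchy has more levels.
PARENT_OF = {"forks": "utensils", "spoons": "utensils", "utensils": "trash"}


def roll_up(counts, parent_of):
    """Add each class's count to itself and to every ancestor in the hierarchy."""
    totals = {}
    for cls, n in counts.items():
        node = cls
        while node is not None:
            totals[node] = totals.get(node, 0) + n
            node = parent_of.get(node)
    return totals


# Invented counts for two surveys that use different levels of detail.
survey_a = {"forks": 3, "spoons": 2}
survey_b = {"utensils": 4}

totals_a = roll_up(survey_a, PARENT_OF)
totals_b = roll_up(survey_b, PARENT_OF)
print(totals_a["utensils"], totals_b["utensils"])
```

After rolling up, the two surveys can be compared at the utensils level even though only one of them lists that class explicitly.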

The state of trash typology
Relational table summaries
Merging the material, item, and brand groups to their alias tables can inform us about the level of detail and potential applications for each group in use today. The alias lists condensed 87 classes to 25 unique materials, 1,138 items to 416 unique item classes, and 3,740 unique brand classes to 1,239 unique manufacturers. It is apparent that "material" is the most generic class used, "item" is more discretized, and "brand" has an even higher degree of subdivision. For survey development applications, reducing class choices to prime words in the alias lists helps to make surveys clearer and data more consistent. If users reduced classes to these alias terms before machine learning classification (e.g., the TACO trash classification routine) [30], they would improve their classification by clearly differentiating object classes and reducing labeling time.
Inspecting the hierarchy tables can provide insight into the depth of information in the trash taxonomy and improve description clarity. There are four levels (parent-child word relationships) for material classes and six levels for item classes in the hierarchical relational tables (Fig. 3A, B). The item hierarchy was more complex than the material hierarchy. We have not yet developed a hierarchy for brands, but we expect that one could be important for future developments. In an ideal hierarchical system, the terminal ends would encompass all possibilities of their higher class. For example, fiberglass would encompass all possibilities of glass (Fig. 3A). However, that is not the case here since we only characterized the surveys' classes, and there are gaps in how trash surveys have characterized trash. Therefore, to accurately interpret the hierarchy, there is an implied other class as a subclass of each class wherever it is not explicitly made. Out of 1,509 material and item classes in the surveys, 392 did not fit our typology. We did not include them in our alias or hierarchical tables and instead made a separate misaligned class table. The main reason for misalignment was the categorization of trash by use, such as fishing-related or construction materials. A major challenge with these descriptors is that they can include a broad range of material, item, and/or brand classes, and thus do not fit within the framework we have developed. For example, the class smoking-related lumps item classes like cigarette ends, tobacco packaging, matches, lighters, and pipes, with material classes like plastic, organic, paper, metal, and glass. Additionally, while one organization may choose to include lighters as a smoking-related item, another may choose to put lighters in a household item use class.
These descriptors could be useful for practitioners who want to conduct rapid assessments with lumped use-based descriptors, but we could not fold them into our tabular format because they defied the single parent rule for our hierarchical tables and the single alias rule for our alias tables. We recommend that future surveys designed to assess trash use classes start by describing the material and item classes and then build their use classes by summing observations of the material and item classes that fit their uses. These descriptors might be brought into alignment within the current system when a framework is developed that relates material-item-brand combinations to the use classes, but this would likely require a non-relational database schema due to the complexity of the relationships. The list of descriptors that did not fit the typology could be used to quickly filter less often used trash types and ambiguous trash typology during data mining routines.
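The recommendation to build use classes by summing material and item observations can be sketched as below. This is an illustrative Python example only; the counts, the membership of each use class, and the function name `summarize_by_use` are hypothetical, chosen to mirror the lighters example in the text.

```python
# Hypothetical observations recorded at the material-item level.
observations = {
    ("paper", "tobacco packaging"): 12,
    ("plastic", "lighters"): 3,
    ("glass", "bottles"): 5,
}

# Hypothetical use classes. A material-item pair may belong to several use
# classes, which is why use classes do not fit a single-parent hierarchy.
USE_CLASSES = {
    "smoking-related": {("paper", "tobacco packaging"), ("plastic", "lighters")},
    "household": {("plastic", "lighters"), ("glass", "bottles")},
}


def summarize_by_use(observations, use_classes):
    """Sum material-item counts into each use class that lists the pair."""
    return {
        use: sum(n for pair, n in observations.items() if pair in members)
        for use, members in use_classes.items()
    }


use_totals = summarize_by_use(observations, USE_CLASSES)
print(use_totals)
```

Because lighters are counted in both use classes here, the use-class totals can sum to more than the total number of observed objects, while the underlying material-item counts stay unambiguous.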

Factor analysis of survey lists
Survey lists are often described as being for a specific type of organization, ecosystem, or substrate. We tested whether those descriptors reflected differences in the suite of material classes (46 survey lists) or item classes (52 survey lists) they describe using MCA (Table S3). No significant differences in material or item classes were found between survey lists by organization, ecosystem, or substrate type using the V test statistic (Table S4, Table S5). This suggests that there was substantial overlap between the classes used in all types of surveys. In practice, a government marine survey list could be used for a nonprofit river survey, or any other combination of survey types, as long as it encompasses the project goals and is highly comparable with other survey lists.

Comparability analysis of survey lists
In total, 4,556 comparability metrics were derived between all combinations of surveys for items and materials (Supplemental Information). The average comparability metric was 29.9% and 20.8% for all materials and items, respectively (Fig. 4 and Table S6). Some pairs of survey lists were 100% comparable for materials (562 pairs) and items (302 pairs). These lists could be compared directly without any inference or interpolation. A majority of the survey comparisons were 0% comparable for material (2,470 pairs) and item (2,104 pairs) classes. In these cases, list pairs are incompatible for data aggregation and combination purposes at their current specificity level, a problem that cannot be rectified by the alias table alone. Only 15 of the 114 surveys were moderately comparable (> 40%) on average, and 33 surveys were somewhat comparable (20-40%) in item classes.
None of the surveys are 100% comparable with all others. However, the surveys produced by NOAA [31], SMC [30], OSPAR [32], Project AWARE [33], JRC [34,35], Marine Conservation Society [36], and van der Velde [37] had the highest comparability values on average for material and item types. These surveys are potential candidates for adoption by new practitioners to enhance comparability. The JRC survey has the highest average for item classes (49% comparability), and the Rech [38] survey has the highest average for material classes (60% comparability). If the material classes from Rech [38] and the item classes from JRC survey were used in a survey, that survey would be more comparable than all others currently presented. The ultimate goal of harmonization is, in part, to achieve as much comparability between surveys as possible (Fig. 4).
The alias lists allow us to reach our goal of attaining more universally comparable surveys. If a survey contains all of the prime words for material and item classes, it will be 100% comparable to all of the surveys included in this study based on Eq. 1. An overlap in definitions between survey classes can be resolved by using the hierarchy as a part of the classification routine. This is done by choosing the most specific term within the hierarchical classification system (Fig. 1). This allows for more general and more specific terms to be used simultaneously. Users can already do this on the TTT website using the collapsible trees on the reference table tab. Figure 1 demonstrates how the hierarchy can be used to drill down to the most specific material and item classes that can describe the trash.
For those survey lists that are comparable, their typologies could be made even more similar by utilizing the hierarchy in addition to the alias to lump and split the classes. However, at present we do not see a clear "best" or "better" path among the several possible options to lump and split datasets. One strategy is to lump the values. All trash typologies will lump to the class trash, but in some cases, it is possible to lump to more specific classes (e.g., forks in one survey and spoons in another survey could lump to utensils) (Table 1). Another strategy to unify data is to split into more refined classes using the hierarchy tables, e.g., if one survey has utensils and the other survey has forks, knives, and spoons, an attempt could be made to split the utensils class into more resolved classes. However, the analyst needs to have a way to infer what proportions the higher-level class should be split by to equal the values of the refined class. This problem has yet to be solved for trash typology. Splitting may often not be possible because it requires additional information beyond the survey lists. A common challenge for lumping or splitting arises when a survey focuses on a particular set of items and materials but does not count the rest of the trash typology in an other class. The analyst then needs to infer the quantity and classes of trash that they did not characterize or only compare the quantities they did characterize to other studies. As a first step toward combining survey lists, we have attempted to solve the lumping problem using Table 1. This example demonstrates lumping counts from multiple surveys using different categories that are related by the hierarchy. We wrote an R script to do this automatically for any of the survey lists in the TTT (supplemental information).
Although this demonstrates that survey lists can be merged by lumping programmatically, limitations due to previously stated method differences are likely to remain.
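Finding the class that two survey classes can be lumped to amounts to finding their most specific shared ancestor in the hierarchy. The Python sketch below illustrates that idea; it is not the authors' R script, the hierarchy is a toy example, and the function names `ancestors` and `common_class` are hypothetical.

```python
# Hypothetical child -> parent table; the real item hierarchy has more levels.
PARENT_OF = {
    "forks": "utensils",
    "spoons": "utensils",
    "utensils": "trash",
    "bottles": "trash",
}


def ancestors(word, parent_of):
    """Return the word followed by its ancestors, most specific first."""
    chain = [word]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def common_class(a, b, parent_of):
    """Return the most specific class that both a and b can be lumped to."""
    b_chain = set(ancestors(b, parent_of))
    for candidate in ancestors(a, parent_of):
        if candidate in b_chain:
            return candidate
    return None  # the two words do not share a branch


print(common_class("forks", "spoons", PARENT_OF))
print(common_class("forks", "bottles", PARENT_OF))
```

Lumping at the most specific shared class keeps as much detail as both surveys can support; falling back to trash is always possible but discards the most information.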

Future of the Trash Taxonomy Tool
Trash taxonomy will continue to evolve as new materials and items are created and enter environments [39] and as researchers create new technologies for collecting data about trash and develop new ways of describing trash [40,41]. For further information on how the TTT will evolve with the addition of new survey data sheets, as well as a current example with tobacco-specific trash data, refer to the Practice Limitations section in the supplemental information. Our framework, relational tables, and source code will assist in developing and expanding the field of trash typology. Extensions to the TTT can be developed by directly collaborating on GitHub (https://github.com/wincowgerDEV/TrashTaxonomy) and submitting requests and feedback to https://github.com/wincowgerDEV/TrashTaxonomy/issues. Incorporating microplastic taxonomy is at the top of our list for taxonomy expansions [42]. The source code and data are licensed open access (CC BY 4.0), attribution only. This analysis will need to be expanded to other languages in the future to accommodate differences in how different languages map the alias and hierarchy relationships. Translations are already being done with cross-country trash databases like the pan-European Marine Litter Database [43]. Future work on database development should prioritize non-relational database structures, develop a reconciliation service in a standardized format [44], and assess the feasibility of incorporating semantic closeness and data value matching routines [27].
There is still much work to be done on the fundamentals of trash typology. Accurate brand identification is critical to ensuring the precise application of the principles of extended producer responsibility [45] to hold manufacturers accountable for large loads of post-consumption trash and substantial environmental impacts that result from the use of their products. We suspect that it will take an ongoing large-scale effort to keep up with the evolution of brand classes. Future work on brand classification within the TTT should include linking items, brands, and material combinations to identify the producers that ultimately should be responsible for the products they design and produce.
Several ongoing projects are using the TTT to assist in future developments and expand the current use cases (Table S7). The NOAA Marine Debris Program referenced the TTT in a recent update to their trash survey classes [46]. A recent study on roadside litter published a hierarchical categorization using the TTT to make their study results comparable to the other harmonized surveys [45]. The harmonized tables developed in this study are already being used to develop machine learning image classification in the Clean Currents Coalition so that the labels on trash items can be as restricted as possible without compromising the harmonizability of the dataset (personal communication). The Trash Monitoring Playbook [47] recommended the TTT to trash survey practitioners.
The widespread adoption of the TTT would harmonize global efforts to measure and document trash loads, trash types, and the extent of trash pollution in environments. Adopting the TTT can also facilitate the aggregation of datasets from trash surveys, improve comparisons of trash risk assessments, and illuminate pathways for future work on trash taxonomy. We hope that the TTT will be used to support research designed to inform mitigation and prevention efforts, particularly in the realm of policymaking. We recommend using the TTT to foster collaborative research that will generate scientific evidence for holding producers accountable, ultimately by supporting "upstream" policy initiatives that reduce trash pollution of environments, promote changes in consumer behaviors, and mandate changes in producer practices.