A multilevel dataset of microplastic abundance in the world’s upper ocean and the Laurentian Great Lakes

A total of 8218 pelagic microplastic samples from the world’s oceans were synthesized to create a dataset composed of raw, calibrated, processed, and gridded data, which are made available to the public. The raw microplastic abundance data were obtained by different research projects using surface net tows or continuous seawater intake. Fibrous microplastics were removed from the calibrated dataset. Microplastic abundance, which fluctuates due to vertical mixing under different oceanic conditions, was standardized. An optimum interpolation method was used to create the gridded data; in total, there were 24.4 trillion pieces (8.2 × 10⁴ to 57.8 × 10⁴ tons) of microplastics in the world’s upper oceans.


Introduction
Microplastics are being reported globally, but it is challenging to compare the data collected when different methods and reporting criteria are followed (e.g., [1]). Harmonized or standardized protocols are therefore recommended for collecting data in the future [2,3]. Data collected by previous studies are still valuable and efforts to critically compare and evaluate these data are urgently needed. Laboratory-based studies on damage to aquatic organisms exposed to microplastics might be inaccurate if microplastic concentration (e.g., weight per unit water volume) estimates are much larger than the reality [4]. Analyzing microplastic abundance by synthesizing observation data from various oceanic basins will be helpful to bridge a gap between the laboratory-based studies and threats in reality. Similarly, real data on microplastic abundance in the oceans is needed to validate the accuracy of numerical models (e.g., [5,6]).
A few studies have synthesized microplastic abundance data for the world's oceans to generate datasets. Eriksen et al. [7] created a publicly available dataset of microplastic abundance based on data obtained from 680 surface net tows conducted by different researchers during 2007-2013. These data were standardized to reduce uncertainty derived from vertical mixing induced by oceanic turbulence, because abundance estimates based on surface net tows are influenced by oceanic conditions: particle counts for light-weight microplastics, which are produced mostly from polyethylene and polypropylene (polymers less dense than seawater, [8]), decrease (or increase) near the sea surface under stormy (or calm) oceanic conditions. They used a formula to estimate the vertical distribution of the particle counts [9], to deduce the total particle count throughout the entire water column under wind speeds measured on the Beaufort scale. However, no description of the significant wave heights required for the formula was provided in Eriksen et al. [7]. Cózar et al. [10] synthesized microplastic abundance data obtained from 841 surface net tows (442 wind-corrected samples), including a circumnavigation cruise of the earth. Published and unpublished microplastic abundance data from 1979 through 2013 (11,632 samples in total) were synthesized by van Sebille et al. [6], although their dataset was not made available to the public. They statistically standardized the data obtained by different researchers using a generalized additive model incorporating the year in which each study was conducted, the geographical locations, and wind speeds given by an atmospheric reanalysis product.
Here, we provide a new dataset of pelagic microplastic abundance in the world's oceans which incorporates different sampling methods. The dataset includes both published and unpublished microplastic abundance data obtained from 2000 to 2019. The number of samples is 10-fold (n = 8218) higher than Eriksen et al. [7] and Cózar et al. [10]. We standardized the data obtained by different researchers in a physical manner. The dataset is publicly available as the Supplementary data in a CSV format.

Methods - description of the dataset
Categorization of data
Unlike the datasets mentioned above, the data in the present study were categorized as raw, calibrated, processed, and gridded data, similar to satellite products (https://climatedataguide.ucar.edu/climatedata/nasa-satellite-product-levels). Raw data (hereinafter referred to as Level-0 data) were mostly obtained by surface net tows and are provided as "particle count per unit seawater volume (partly, per unit area)". First, these raw data were calibrated to the abundance of microplastics (< 5 mm), except fibrous microplastics (filaments and fibers), as a quality control (Level 1). Second, to reduce uncertainty derived from vertical mixing, integrating microplastic abundance vertically from the sea surface to the infinitely deep layer yielded processed data for both the total particle count (Level 2p) and weight (Level 2w), over the entire water column per unit area, where the subscripts 'p' and 'w' represent the particle count and weight, respectively. Third, the Level-2p and -2w data were gridded to obtain the particle counts (Level 3p) and weight (Level 3w) per unit area using an optimum interpolation method (OIM). Last, these gridded data were converted to monthly particle counts (Level 3pm; 'm' represents monthly data) and weights (Level 3wm) per unit seawater volume in the uppermost layer. The present paper describes the detailed procedures to create this multilevel dataset.
Level 0 - raw data
Data from 27 research projects conducted during the period from 2000 through 2019 (Table 1) were used to create the Level-0 data on pelagic microplastic abundance in the world's oceans and the Laurentian Great Lakes. We synthesized the data collected during the past 20 years to represent the 'current status' of microplastic abundance, because a long-term trend is undetectable in such a short period, as shown by Law et al. [26], who provided a time series of plastic-debris abundance from 1986 to 2008, and because long-term change is not a common feature of floating plastics and microplastics [11,26,33-35]. In total, 23 of the 27 projects collected microplastics only by surface net towing, but Projects #13 and #26 (Table 1) collected data via continuous seawater intake at a depth of 3 m (#12 partly included seawater intake; Table 1). Nonetheless, the target of these two projects was microplastics over several tens of μm in size (see 'Mesh size' in Table 1). Thus, as defined in the present study, the surface layer included seawater from the sea surface to a depth of 3 m. Projects #25 and #27 collected data via continuous seawater intake at depths greater than 3 m, so these data were included only in the Level-0 and Level-1 (described next) data. The number of samples obtained after 2014 was smaller than that before 2014, but observations were conducted over all seasons (Supplementary Fig. 1).
Duplicated data (the same location, time/date/year, and observer) were removed because they add nothing to dataset reliability. Otherwise, we used all data obtained by these 27 projects to ensure that the amount thereof was sufficiently large, although parts of these projects adopted procedures that differed from the latest guidelines. Almost all projects adopted a tow net with a mesh size of 0.2-0.3 mm to collect floating objects, including microplastics. The maximum size of the plastic debris was not recorded in the majority of the projects. We here assumed that plastic debris reported in all projects listed in Table 1 was categorized as microplastics (< 5 mm, as per [8]) unless otherwise stated. This assumption is justified because, for instance, more than 90% of the plastic debris particles collected by surface net tows in Project #9 were < 5 mm. Likewise, microplastics (< 5 mm) accounted for > 93.7% of all particles in Project #3 despite the upper size limit of 50 mm in collecting plastic fragments (Supplementary Fig. 2). Nine projects conducted surface net tows without a flowmeter, and measured the seawater volume passing through the net (Table 1). The absence of a flowmeter may have led to alterations in the volume passing through the net by ocean currents at towing speeds of 2-3 knots. However, a large amount of data was averaged, which can be expected to reduce the deviations due to ambient ocean currents flowing in different directions. In addition, attenuated total reflection Fourier transform infrared spectroscopy (ATR-FTIR), μFTIR, or Raman spectroscopy was not used to account for non-plastic materials in 10 projects conducted mostly in the early 2010s. Identification by the naked eye and/or using a stereomicroscope may have led to an overestimation of the particle counts < 2 mm (which accounted for 66.2% of all particles; see Supplementary Fig. 2) by approximately 50% [5].
Meanwhile, identification using a stereomicroscope has also led to a statistically significant underestimation of particle counts < 50 μm [36]. However, the targets of the previous studies in Table 1 were microplastics larger than several hundreds of μm in size; thus, these early projects may have overestimated the particle count by approximately 30% (≈ 66.2% × 50%). Both sizes and surface areas of microplastics show a continuous distribution [37] and, thus, the overestimation of small microplastics could be observed even if equivalent lengths computed from areas (e.g., [38]) were used as a measure of microplastic size. The microplastic abundance metric for the Level-0 data is the particle count per unit seawater volume (pieces m⁻³). Abundance was measured directly using a flowmeter (12 projects) or intake water (4 projects). However, 11 projects measured abundance per unit area, which was computed by converting flowmeter (Projects #5, #6, and #21) or global navigation satellite system data (Projects #1, #4, #7, #9, #14, #15, #16, and #20). The seawater volume for each of these 11 projects was computed by multiplying the area by the tow depth (half the height of the tow net). The abundance in Project #6 was given by weight. For consistency, this was converted into a particle count according to Eqs. (4)-(7) shown later, although Project #6 converted from the weight to a particle count in a statistical manner.
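The conversion from abundance per unit area to abundance per unit seawater volume described above can be sketched as follows. This is a minimal illustration only: the function name, the input unit (pieces km⁻²), and the example net height are assumptions made here and are not taken from the dataset.

```python
def areal_to_volumetric(count_per_km2: float, net_height_m: float) -> float:
    """Convert pieces km^-2 to pieces m^-3.

    The sampled volume is taken as area x tow depth, with the tow depth
    assumed to be half the height of the tow net, as stated in the text.
    """
    if net_height_m <= 0.0:
        raise ValueError("net height must be positive")
    tow_depth_m = 0.5 * net_height_m
    count_per_m2 = count_per_km2 / 1.0e6  # 1 km^2 = 1e6 m^2
    return count_per_m2 / tow_depth_m


# Hypothetical example: 100,000 pieces km^-2 collected with a 0.5-m-high net.
concentration = areal_to_volumetric(100_000.0, 0.5)
print(f"{concentration:.3f} pieces m^-3")
```

A taller net spreads the same areal count over a larger sampled volume, so the resulting concentration is inversely proportional to the assumed tow depth.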

Level 1 - calibration by removal of fibrous microplastics
Including fibrous microplastics can cause a pseudo difference in microplastic abundance estimates obtained from different projects; while one group of projects provided abundance data for microplastics including fiber, another group omitted fibrous microplastics from their estimates. Fibrous microplastics were unlikely to have been quantified precisely, unless clean-air devices were used to prevent airborne contamination during sampling or processing, or airborne contamination was removed by a blank test [39,40]. In addition, sampling gear, such as a tow net made from synthetic fibers, might be a source of contamination. Thus, some of the projects (#2, #3, #5, #7, and #17) excluded fibrous microplastics when creating their datasets. Meanwhile, fibrous microplastics constituted a non-negligible fraction of microplastics collected in the ocean close to the coast (projects #13 and #18), or in an estuary (Project #19).
We excluded the fibrous microplastics from the original data as a data quality control to reduce the pseudo difference in synthesizing the data obtained by the various projects. In total, 21 of 27 projects provided non-fibrous microplastic proportions (Table 1); multiplying the Level-0 data by these proportions yielded the Level-1 data excluding fibrous microplastics (pieces m⁻³). The relatively high ratios in Table 1 suggest that fibrous microplastics were a minor component of all microplastics, particularly in the open ocean; textile fibers made from polyester or polyamide are heavier than seawater and are unlikely to move a long distance from land. Recently, Suaria et al. [41] showed that 79.5% of fibers recorded in the world's ocean are cellulosic, and 12.3% are of animal origin. Therefore, the ratios were assumed to be 100% for all projects in which the ratios of non-fibrous microplastics were not recorded (Projects #1, #4, #11, #20, #22, and #23).
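The Level-1 calibration amounts to a single multiplication per sample. The sketch below illustrates it under the stated rule that a project without a recorded non-fibrous proportion is treated as 100% non-fibrous; the function name and the example values are hypothetical.

```python
from typing import Optional


def level1_abundance(level0: float, nonfibrous_fraction: Optional[float]) -> float:
    """Return Level-1 abundance (pieces m^-3) from Level-0 abundance.

    nonfibrous_fraction is the project-wide proportion of non-fibrous
    microplastics (0 to 1). None means the project did not record it,
    in which case 100% is assumed, following the text.
    """
    fraction = 1.0 if nonfibrous_fraction is None else nonfibrous_fraction
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return level0 * fraction


# Hypothetical samples: one project reporting 90% non-fibrous particles,
# and one project with no recorded proportion.
print(level1_abundance(2.0, 0.9))
print(level1_abundance(2.0, None))
```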
Table 1 footnotes (partially recovered): ... [31] and Isobe et al. [5]; f, fibrous microplastics were discarded by this project; g, with a flowmeter; h, partly published in Isobe et al. [32]; i, WP2 net; j, Manta net; k, the authors stated that the "vast majority" of collected microplastics were fragments; l, the abundance without fibrous microplastics was provided by the coauthor; m, intake seawater; n, the lower size limit in this project; o, 88% of fragments collected in this project were smaller than 10 mm, while fragments between 5 and 10 mm in size account for approximately 5% of all microplastics shown in Supplementary Fig. 2.

Level 2p - processing for wind/wave correction
The Level-1 data were standardized to obtain the total particle count, by vertically integrating microplastic abundance over the entire water column using the wind speed and significant wave heights during each microplastic survey ('wind/wave correction' [5,32]). This processing step was applied because abundance data of buoyant microplastics from surface net tows vary depending on the oceanic turbulence under different ocean conditions [9,42,43]. The vertical distribution of the microplastic concentration ($N$) can be approximated as follows:

$$N = N_0 \exp\left(\frac{w z}{A_0}\right), \qquad (1)$$

where $N_0$ denotes the particle count per unit seawater volume around the sea surface ($z = 0$), which corresponds to the Level-1 data in the present study; $w$ is the terminal rise velocity of the microplastics (5.3 mm s⁻¹), which was obtained experimentally [43]; and $z$ is the vertical axis, measured upward from the sea surface. The vertical diffusivity $A_0$ was calculated as:

$$A_0 = 1.5\, u_* k H_s, \qquad (2)$$

where $u_*$ represents the friction velocity of water ($= \sqrt{C_d \rho_a / \rho_w}\, W_{10}$); $k$ is the von Karman constant (0.4); $H_s$ is the significant wave height; and $W_{10}$ is the wind speed at 10 m from the sea surface [9].
In the present study, the air density ($\rho_a$), the seawater density ($\rho_w$), and the drag coefficient ($C_d$) are set to 1.25 kg m⁻³, 1025 kg m⁻³, and 1.2 × 10⁻³ (4 m s⁻¹ < $W_{10}$ < 11 m s⁻¹ in Large and Pond [44]), respectively, so that $u_* \approx 0.0012\, W_{10}$. The daily wind-speed data, provided by the Japanese Ocean Flux Data Sets with Use of Remote Sensing Observations (J-OFURO [45]), were obtained from multiple satellite observations for the period 1988-2013. In addition, daily wind-speed data acquired by the Advanced Scatterometer (ASCAT) [46] from 2014 to the present were used. Daily significant wave heights were computed using the University of Miami wave model (version 1.0.1 [47]) over the world's oceans within ±80° latitude to reduce assumptions of wave properties (e.g., wave speed of the dominant wave) included in the parameterization (e.g., [9]). However, readers who prefer the parameterization to the wave model can replace the modeled wave heights given in the supplementary data (Level-012.csv) with other choices. The wave model was driven by the wind data obtained from J-OFURO and ASCAT. These wind-speed and wave-height data, which were gridded with a 0.25° horizontal resolution in latitude and longitude, were used in Eq. (2) on the same date and at the same location as the actual observations of each project listed in Table 1.
Vertically integrating Eq. (1) from the sea surface ($z = 0$) to an infinitely deep layer ($z \to -\infty$) yields the total particle count of microplastics per unit area ($M$) as follows:

$$M = \int_{-\infty}^{0} N \, dz = \frac{N_0 A_0}{w}. \qquad (3)$$

The result thus obtained, in pieces km⁻², is independent of oceanic conditions. However, the dependence of the total particle count ($M$) on the terminal rise velocity ($w$) is examined later, in the first subsection of Results and discussion.
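The wind/wave correction can be sketched in a few lines. The code below assumes the exponential profile N = N0 exp(w z / A0) and the diffusivity A0 = 1.5 u* k Hs used in the parameterization cited as [9], together with the constants quoted in the text; the function names and the example survey values (N0 = 1 piece m⁻³, W10 = 5 m s⁻¹, Hs = 1 m) are hypothetical. The closed-form integral is compared with a simple numerical integration as a consistency check.

```python
import math

C_D = 1.2e-3       # drag coefficient
RHO_AIR = 1.25     # air density, kg m^-3
RHO_SEA = 1025.0   # seawater density, kg m^-3
KAPPA = 0.4        # von Karman constant
W_RISE = 5.3e-3    # terminal rise velocity, m s^-1 (5.3 mm s^-1)


def friction_velocity(w10: float) -> float:
    """Friction velocity of water (m s^-1) from the 10-m wind speed."""
    return math.sqrt(C_D * RHO_AIR / RHO_SEA) * w10


def vertical_diffusivity(w10: float, hs: float) -> float:
    """Vertical diffusivity A0 (m^2 s^-1), assumed form of Eq. (2)."""
    return 1.5 * friction_velocity(w10) * KAPPA * hs


def concentration(n0: float, z: float, w10: float, hs: float) -> float:
    """Concentration at height z <= 0 (m), assumed form of Eq. (1)."""
    return n0 * math.exp(W_RISE * z / vertical_diffusivity(w10, hs))


def total_count_per_km2(n0: float, w10: float, hs: float) -> float:
    """Vertically integrated count M = N0 * A0 / w, in pieces km^-2."""
    per_m2 = n0 * vertical_diffusivity(w10, hs) / W_RISE
    return per_m2 * 1.0e6  # 1 km^2 = 1e6 m^2


# Hypothetical survey values.
m_closed = total_count_per_km2(1.0, 5.0, 1.0)

# Midpoint-rule integration from z = -30 m to the surface as a check.
dz = 0.001
m_numeric = 1.0e6 * sum(
    concentration(1.0, -(i + 0.5) * dz, 5.0, 1.0) * dz for i in range(30_000)
)
print(f"closed form: {m_closed:.0f}, numerical: {m_numeric:.0f} pieces km^-2")
```

Note that A0 vanishes for zero wind speed or zero wave height, in which case the exponent in the profile is undefined; a practical implementation would need to guard against such inputs.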
Level 2w - conversion from particle count to weight
The Level-2p particle count was converted to weight in accordance with Isobe et al. [5]. Each microplastic fragment was assumed to be a flat cylinder with a base diameter and height of $\delta$ and $\gamma\delta$, respectively, where $\delta$ is the maximum size of the fragments, and $\gamma$ is an adjustable constant (0.4) selected through trial and error to be consistent with the microplastic weight measured directly using a mass scale [5]. We approximated the size distribution of the total particle count of microplastics as follows:

$$m(\delta) = \beta\, \delta\, e^{-\alpha \delta}, \qquad (4)$$

where $m(\delta)$ is the particle count per unit size interval at size $\delta$; $\alpha$ (0.83 mm⁻¹) represents the reciprocal of the mode size (1.2 mm) obtained by Project #2 across the Southern Ocean and western Pacific; and $\beta$ is calculated from Eq. (4) as follows:

$$\beta = \frac{M}{\left[-\left(\dfrac{\delta}{\alpha} + \dfrac{1}{\alpha^{2}}\right) e^{-\alpha \delta}\right]_{\delta_1}^{\delta_2}}, \qquad (5)$$

where $M$ represents the Level-2p data for each project in Table 1 (Eq. (3)), and the operator $[f(\delta)]_{\delta_1}^{\delta_2}$ corresponds to $f(\delta_2) - f(\delta_1)$.
Then, we calculated the microplastic weight ($W$) for particle sizes between $\delta_1$ (0.3 mm) and $\delta_2$ (5 mm), as follows:

$$W = \int_{\delta_1}^{\delta_2} \rho\, \frac{\pi \gamma \delta^{3}}{4}\, \beta\, \delta\, e^{-\alpha \delta}\, d\delta, \qquad (6)$$

or concisely expressed as:

$$W = -\frac{\pi \gamma \rho \beta}{4} \left[ e^{-\alpha \delta} \sum_{n=1}^{5} \frac{\theta_n\, \delta^{5-n}}{\alpha^{n}} \right]_{\delta_1}^{\delta_2}, \qquad (7)$$

where $\theta_n = \theta_{n-1}(6 - n)$ and $\theta_0 = 0.2$; $\rho$ denotes the plastic density (~1.0 g cm⁻³), close to that of polyethylene and polypropylene, which make up the majority of plastic polymers collected in surface net tows in the ocean [48]; and $W$ is the weight per unit area (g km⁻²). Based on all microplastics collected in Project #2, Isobe et al. [5] estimated that the microplastic weight approximated by Eq. (7) was 85.3% of the actual weight. For comparison, we also created alternative weight data using the statistical relationship given by Project #6 (Eq. (8)), where $M$ represents the Level-2p data as in Eq. (5). The weight obtained by Eq. (8) ($W_{\mathrm{Eq.(8)}}$) is expressed approximately by $W$ in Eq. (7) as follows:

$$\log_{10} W_{\mathrm{Eq.(8)}} = 1.2 \log_{10} W_{\mathrm{Eq.(7)}} - 2.0. \qquad (9)$$

The dataset converted using Eq. (7) is referred to as Level-2w1, while Eq. (8) yielded the Level-2w2 data. The difference between the Level-2w1 and -2w2 data is described in the first subsection of Results and discussion.
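The count-to-weight conversion can be illustrated with a short sketch. It assumes a size distribution proportional to δ exp(-αδ), which has its mode at 1/α and reproduces the recursion θ_n = θ_{n-1}(6 - n) with θ_0 = 0.2 when the flat-cylinder weight is integrated analytically. Function names and the example input are hypothetical; sizes are in mm, so the density of 1.0 g cm⁻³ is expressed as 1.0 × 10⁻³ g mm⁻³.

```python
import math

ALPHA = 0.83       # reciprocal of the mode size, mm^-1
GAMMA = 0.4        # height-to-diameter ratio of the flat cylinder
RHO = 1.0e-3       # plastic density, g mm^-3 (= 1.0 g cm^-3)
D1, D2 = 0.3, 5.0  # lower and upper size limits, mm


def count_antiderivative(d: float) -> float:
    """Antiderivative of d * exp(-ALPHA * d) with respect to d."""
    return -(d / ALPHA + 1.0 / ALPHA**2) * math.exp(-ALPHA * d)


def beta_from_count(m: float) -> float:
    """Scale factor beta such that the distribution integrates to m."""
    return m / (count_antiderivative(D2) - count_antiderivative(D1))


def weight_antiderivative(d: float) -> float:
    """Antiderivative of d**4 * exp(-ALPHA * d), built from theta_n."""
    theta = 0.2
    total = 0.0
    for n in range(1, 6):
        theta *= 6 - n  # theta_1..theta_5 = 1, 4, 12, 24, 24
        total += theta * d ** (5 - n) / ALPHA**n
    return -math.exp(-ALPHA * d) * total


def weight_from_count(m: float) -> float:
    """Weight per unit area (g km^-2) from count per unit area (pieces km^-2)."""
    beta = beta_from_count(m)
    bracket = weight_antiderivative(D2) - weight_antiderivative(D1)
    return math.pi * GAMMA * RHO * beta / 4.0 * bracket


# Hypothetical Level-2p value of 1e5 pieces km^-2.
print(f"{weight_from_count(1.0e5):.1f} g km^-2")
```

Because the conversion is linear in the particle count, the implied mean weight per particle depends only on the assumed size distribution and particle shape, not on the abundance itself.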
Level 3p and 3w - gridded data through OIM
The total particle count (Level 2p) and weight (Level 2w1 and 2w2) per unit area were interpolated to the gridded data (Level 3p, 3w1, and 3w2) using an OIM. Although OIM algorithms have been established by several research projects, the method of Daley [49] and Kako et al. [46] was adopted in the present study as follows:

$$A_g = B_g + \sum_{i=1}^{N} W_i \left(O_i - B_i\right), \qquad (10)$$

where $A_g$ ($B_g$) is an analysis (first guess) value to be interpolated to a grid cell, $g$, 5° × 2° in longitude and latitude; $O_i$ ($B_i$) is an observed (first guess) value given at observation point $i$; $W_i$ denotes a weight function at observation point $i$; and there are $N$ observation points. The optimum weight, computed so that the errors included in the observed ($O$) and first guess ($B$) values in Eq. (10) are unbiased and uncorrelated to generate gridded data free of biases, can be expressed as

$$\sum_{j=1}^{N} \left(\mu^{B}_{i,j} + \mu^{O}_{i,j}\right) W_j = \mu^{B}_{i,g}, \qquad (11)$$

where $\mu_{i,j}$ (or $\mu_{i,g}$) is a coefficient of error correlation between points $i$ and $j$ (or $g$); superscripts $B$ and $O$ denote first guess and observed values, respectively; $\mu^{O}_{i,j}$ is an identity matrix (1 only if $i = j$, otherwise 0); and $\mu^{B}_{i,j}$ is estimated to be

$$\mu^{B}_{i,j} = \exp\left(-\frac{r_z^{2}}{L_z^{2}} - \frac{r_m^{2}}{L_m^{2}}\right), \qquad (12)$$

where $r_z$ ($r_m$) denotes the zonal (meridional) distance between two arbitrary points ($i$-$j$, and $i$-$g$ in Eq. (11)), and $L_z$ ($L_m$) is the decorrelation scale in the zonal (meridional) direction [46,50]. In the present study, decorrelation scales of 1000 and 500 km were chosen for $L_z$ and $L_m$, respectively, through trial and error. Interpolation was not conducted at grid cells having fewer than two observed data points within the decorrelation scales. Zero was used as the first-guess value over the entire domain.
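A minimal sketch of an optimum interpolation at a single grid cell is given below. It is not the implementation used for the dataset: it assumes a Gaussian first-guess error correlation with the decorrelation scales quoted in the text, an identity observation error correlation, positions expressed in km on a local plane (x zonal, y meridional), and observations already restricted to those within the decorrelation scales. All function names and example values are hypothetical.

```python
import math

L_ZONAL_KM = 1000.0  # zonal decorrelation scale
L_MERID_KM = 500.0   # meridional decorrelation scale


def background_corr(rz_km, rm_km):
    """First-guess error correlation (Gaussian form assumed here)."""
    return math.exp(-(rz_km / L_ZONAL_KM) ** 2 - (rm_km / L_MERID_KM) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def oi_analysis(grid_xy, obs_xy, obs_values, first_guess=0.0):
    """Analysis value at one grid cell, or None with fewer than two observations."""
    n = len(obs_values)
    if n < 2:
        return None
    # Left-hand side: first-guess correlation plus identity observation term.
    matrix = [
        [
            background_corr(obs_xy[i][0] - obs_xy[j][0], obs_xy[i][1] - obs_xy[j][1])
            + (1.0 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Right-hand side: correlation between each observation and the grid cell.
    rhs = [background_corr(p[0] - grid_xy[0], p[1] - grid_xy[1]) for p in obs_xy]
    weights = solve(matrix, rhs)
    return first_guess + sum(w * (o - first_guess) for w, o in zip(weights, obs_values))


# Hypothetical observations (pieces km^-2) around a grid cell at the origin.
obs_positions = [(100.0, 0.0), (-300.0, 50.0), (600.0, -200.0)]
obs_counts = [5.0e4, 8.0e4, 2.0e4]
print(oi_analysis((0.0, 0.0), obs_positions, obs_counts))
```

With a zero first guess and equal first-guess and observation error weights, the analysis is damped toward zero where observations are sparse or distant, which is consistent with the estimates being conservative.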
Level 3pm and 3wm - gridded monthly surface concentration data
The total particle count (Level 3p) and weight (Level 3w) of microplastics in the grid cells are available for computing the concentration ($N_0$ in Eq. (3)) under the various wind/wave conditions. For instance, the Level-3p and -3w1 data were converted to the surface concentration for each month, under the average wind speed and wave height for the period 1993-2018. To be sure, the seasonal variation of surface microplastic abundance should be validated by field surveys in the actual ocean, and so this is a subject of future research beyond the present study. Nonetheless, these data should allow for accurate laboratory-based studies on impacts on aquatic organisms exposed to microplastics, so that microplastic concentrations used for exposures are comparable with those in reality. In addition, these data may help to determine in advance appropriate months and locations for a field campaign to collect sufficiently large numbers of microplastics. The wind speed and wave height data used to create the Level-2 dataset were averaged monthly for the period 1993-2018. Using Eqs. (2) and (3), we converted abundance at Level 3p and 3w1 ($M$ in the equations) to the Level-3pm and -3wm surface concentrations, respectively, for each month using the monthly averaged wind speed and wave height. Other parameters, such as terminal rise velocity, were the same as those used in creating the Level-2 dataset.
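The conversion from vertically integrated abundance back to a surface concentration inverts the relation M = N0 A0 / w. The sketch below assumes A0 = 1.5 u* k Hs with the constants quoted in the text; the function name and the monthly mean wind and wave values are hypothetical. The same function applies to weight, giving g m⁻³ from g km⁻².

```python
import math

C_D, RHO_AIR, RHO_SEA = 1.2e-3, 1.25, 1025.0  # drag coefficient, densities (kg m^-3)
KAPPA = 0.4                                    # von Karman constant
W_RISE = 5.3e-3                                # terminal rise velocity, m s^-1


def surface_concentration(m_per_km2: float, w10_mean: float, hs_mean: float) -> float:
    """Surface value N0 = M * w / A0 (per m^3) from M given per km^2."""
    u_star = math.sqrt(C_D * RHO_AIR / RHO_SEA) * w10_mean
    a0 = 1.5 * u_star * KAPPA * hs_mean
    if a0 <= 0.0:
        raise ValueError("wind speed and wave height must be positive")
    return (m_per_km2 / 1.0e6) * W_RISE / a0


# Hypothetical grid cell with 1e6 pieces km^-2 under calm and rough months.
calm = surface_concentration(1.0e6, 5.0, 1.0)
rough = surface_concentration(1.0e6, 10.0, 3.0)
print(f"calm: {calm:.3f}, rough: {rough:.3f} pieces m^-3")
```

For a fixed vertically integrated abundance, stronger winds and higher waves mix particles deeper and therefore lower the surface concentration, which is the mechanism behind the seasonal contrast in the monthly maps.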

Sensitivity of parameter choices on microplastic abundance
Because of limited available knowledge regarding microplastics in the ocean, the present study had to make some parameter choices for processing the data at each level. Here we demonstrate how microplastic abundance depends on these choices, such as the terminal rise velocity ($w$) in Eq. (3) and the formula used to convert from the total particle count to weight. The early projects conducted around the early 2010s may have overestimated the particle count by approximately 30% because of misidentification of small fragments in the absence of spectrometry. To quantify how this overestimation diminished the quality of the dataset, the Level-2p data were recreated from the Level-1 data with the particle counts reduced by 30% in the projects without spectrometry (Table 1). It was found that the total particle count averaged over the world's ocean in the Level-2p data was reduced by approximately 7%.
The weight of microplastics ($W$ in Eq. (7)) depends significantly on the choice of the formula to convert from the total particle count to weight. When the statistical relationship of Eq. (8) was adopted for the conversion, the weight decreased to 2-20% of that in the Level-2w1 data in the range of 10² to 10⁷ g km⁻² (Eq. (9); Fig. 1b). This is probably because the particle counts in smaller microplastic sizes from Project #6 (their Fig. 3) were more abundant than those observed in Project #2 (Supplementary Fig. 2). The size distributions are unlikely to be homogeneous in the world's ocean and, therefore, it should be noted that the current estimate of weight includes uncertainty as shown in Fig. 1b. Therefore, for reference, the present study created Level-2w2 data using Eq. (8) in addition to Level-2w1 data. Likewise, the gridded data obtained through the OIM using Level-2w2 data were created as Level-3w2 data.
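The approximate relation in Eq. (9) can be evaluated directly to see how the two weight estimates diverge across the observed range. The short sketch below uses only Eq. (9); the function name is hypothetical.

```python
import math


def weight_eq8_from_eq7(w_eq7: float) -> float:
    """Approximate Eq. (8) weight (g km^-2) from the Eq. (7) weight via Eq. (9)."""
    if w_eq7 <= 0.0:
        raise ValueError("weight must be positive")
    return 10.0 ** (1.2 * math.log10(w_eq7) - 2.0)


# Ratio of the two estimates for weights of 10^2 to 10^7 g km^-2.
ratios = {k: weight_eq8_from_eq7(10.0**k) / 10.0**k for k in range(2, 8)}
for k, ratio in ratios.items():
    print(f"W_Eq7 = 1e{k} g km^-2 -> W_Eq8 / W_Eq7 = {ratio:.3f}")
```

Because the slope in Eq. (9) exceeds unity, the ratio increases with weight, from a few percent at the low end of the range to a few tens of percent at the high end.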

2D maps and statistics
The present study's objective was to generate a new, publicly available dataset and facilitate microplastic research based on actual and reliable ocean data. Although further and more detailed interpretations, analyses, and processing are expected to be carried out by researchers who download the dataset, we present two-dimensional (2D) maps with brief explanations of the features of the dataset. Figure 2a and b provide 2D maps of the Level-0 and Level-1 data, respectively, including the microplastic abundance obtained by Project #21, conducted in the Great Lakes of the United States. Microplastic surveys have been conducted mainly in the seas around the United States, around European countries (such as the Mediterranean Sea and the eastern North Atlantic), and around Japan. Approximately 46% of microplastic surveys have been conducted in the mid-latitude ocean between 30°N and 60°N, while low-latitude surveys of the Indian Ocean and western Pacific (between 30°S and 30°N, and 40°E and 180°E, respectively) account for only 5% of all data.
Integrating the microplastic abundance over the entire water column yielded 2D maps of the total particle count (Level 2p; Fig. 3a) and weight (Level 2w1; Fig. 3b), after removing effects of winds/waves during the observations. Note that the Great Lakes and 2019 data were excluded because of a lack of wind/wave data among the satellite data. Nonetheless, 679 survey positions were added to Fig. 2, because Project #9 originally provided vertically-integrated microplastic abundance data after the wind/wave correction, and those data are not included among the Levels-0 and -1 data.
The gridded data created by the OIM are displayed in 2D maps of the total particle count (Level 3p; Fig. 4a) and weight (Level 3w1; Fig. 4b), which covered approximately 60% of the entire ocean. Note that the grid cells remain white in Fig. 4 where there were fewer than two observed data points within the decorrelation scales. In addition to the interior of the midlatitude subtropical gyres, including the so-called 'Great Garbage Patch' (e.g., [51]) areas, a large number of pelagic microplastics were detected in the seas around Europe, the East Asian seas, and the eastern Indian Ocean. The sum of the particle count (weight) of microplastics was estimated at 24.4 trillion pieces (8.2 × 10⁴ to 57.8 × 10⁴ tons) (Table 2), which was larger, especially for the particle count, than the conservative estimate of Eriksen et al. [7] (5 trillion pieces and 25 × 10⁴ tons). However, the present estimates are also conservative because gridded data were mostly absent for the western Indian Ocean and South China Sea ... [52]. The surface concentrations, represented by the particle count (weight) per unit seawater volume, are shown in Fig. 5a and b (Fig. 5c and d) for February and August, respectively, as examples of the monthly data. The particle count and weight increased in the Northern Hemisphere during the boreal summer under calm oceanic conditions. At the same time, the seasonality of microplastic abundance was not remarkable in the Southern Hemisphere, probably due to the relatively small amount of pelagic microplastics. The annually averaged abundance (both particle count and weight) and maximum values over the entire domain are listed in Table 3.

Table 2 Microplastic abundance: Level-3p and -3w data (Fig. 4). These values were obtained from grid cells where more than two values existed (i.e., all grid cells except the white areas). Total abundance was computed so that values were representative of each 5°-longitude × 2°-latitude grid cell. The particle count (weight) per unit area was rounded to the nearest 1000 (10).

Conclusion - recommendations for future surveys
Microplastics are oceanic pollutants that have yet to be archived sufficiently for mapping climatological state or variability over the world's oceans, despite observations dating back to the 1970s [53]. The present study attempted to create state-of-the-art 2D maps of microplastic abundance, based on published and unpublished data. However, protocols for microplastic field surveys have only recently become available (e.g., [2,3]), so the sharing and synthesis of observed data, which could facilitate ocean plastic studies, has only just begun. The field campaigns that must be prioritized to further advance marine-plastic-pollution research are discussed below.
First, locations where large amounts of mismanaged plastic waste are discharged should be intensively studied. In particular, a notable shortcoming of the present dataset is the lack of microplastic data for the Indian Ocean and the seas around Southeast Asia (including the South China Sea). Besides waters close to land masses, surveys in the subtropical convergence zones approximately across the 30°-latitude in both hemispheres should be prioritized to determine the total amount of plastics in the world's oceans.
Second, microplastic abundance in the subsurface layer of the ocean should be explored. Recent observations of pelagic microplastics have revealed that a non-negligible fraction of microplastics exists in the subsurface layers of coastal waters [36], and in intermediate and abyssal layers of the open ocean [30,54,55]. It has been suggested that biofouling [56], inclusion within marine aggregates [57-60], and inclusion within fecal pellets [61] allow microplastics lighter than seawater to settle in the abyssal ocean. Thus, microplastic abundance in the ocean is likely to be much greater than estimated. Three-dimensional maps of microplastic abundance, rather than the 2D maps presented here, are required to determine the ultimate fate of marine plastic debris.
Third, field survey protocols for very small microplastics (< 300 μm) urgently require further development and optimization. The lower size limit of ocean microplastics investigated to date is dependent on both the mesh size of tow nets used in field surveys and the operational limitations of the equipment, such as FTIR. However, some studies have reported the existence of very small microplastics down to several tens of μm in the open ocean [38,55,62] and coastal waters [36]. Moreover, the drifting of nanoplastics (< 1 μm) in the ocean has been suggested [63]. It is plausible that very small microplastics and nanoplastics could exist in the marine environment, if degradation and fragmentation proceed continuously in nature. Besides these very small microplastics, Tokai et al. [37] reported that 60% of microplastic particles with sizes between 0.4 mm and 1 mm pass through the 0.333-mm mesh of surface sampling nets. The fate of plastic debris will remain obscure unless these missing plastic particles are quantified in the water column and bottom sediments.

Table 3 Microplastic abundance: Level-3pm and -3wm data (Fig. 5)