TY - JOUR
T1 - A review of the heterogeneous landscape of biodiversity databases: Opportunities and challenges for a synthesized biodiversity knowledge base
AU - Feng, Xiao
AU - Enquist, Brian J.
AU - Park, Daniel S.
AU - Boyle, Brad
AU - Breshears, David D.
AU - Gallagher, Rachael V.
AU - Lien, Aaron
AU - Newman, Erica A.
AU - Burger, Joseph R.
AU - Maitner, Brian S.
AU - Merow, Cory
AU - Li, Yaoqi
AU - Huynh, Kimberly M.
AU - Ernst, Kacey
AU - Baldwin, Elizabeth
AU - Foden, Wendy
AU - Hannah, Lee
AU - Jørgensen, Peter M.
AU - Kraft, Nathan J.B.
AU - Lovett, Jon C.
AU - Marquet, Pablo A.
AU - McGill, Brian J.
AU - Morueta-Holme, Naia
AU - Neves, Danilo M.
AU - Núñez-Regueiro, Mauricio M.
AU - Oliveira-Filho, Ary T.
AU - Peet, Robert K.
AU - Pillet, Michiel
AU - Roehrdanz, Patrick R.
AU - Sandel, Brody
AU - Serra-Diaz, Josep M.
AU - Šímová, Irena
AU - Svenning, Jens Christian
AU - Violle, Cyrille
AU - Weitemier, Trang D.
AU - Wiser, Susan
AU - López-Hoffman, Laura
N1 - Publisher Copyright: © 2022 John Wiley & Sons Ltd.
PY - 2022/7
Y1 - 2022/7
N2 - Aim: Addressing global environmental challenges requires access to biodiversity data across wide spatial, temporal and taxonomic scales. Availability of such data has increased exponentially recently with the proliferation of biodiversity databases. However, heterogeneous coverage, protocols, and standards have hampered integration among these databases. To stimulate the next stage of data integration, here we present a synthesis of major databases, and investigate (a) how the coverage of databases varies across taxonomy, space, and record type; (b) what degree of integration is present among databases; (c) how integration of databases can increase biodiversity knowledge; and (d) the barriers to database integration. Location: Global. Time period: Contemporary. Major taxa studied: Plants and vertebrates. Methods: We reviewed 12 established biodiversity databases that mainly focus on geographic distributions and functional traits at global scale. We synthesized information from these databases to assess the status of their integration and major knowledge gaps and barriers to full integration. We estimated how improved integration can increase the data coverage for terrestrial plants and vertebrates. Results: Every database reviewed had a unique focus of data coverage. Exchanges of biodiversity information were common among databases, although not always clearly documented. Functional trait databases were more isolated than those pertaining to species distributions. Variation and potential incompatibility of taxonomic systems used by different databases posed a major barrier to data integration. We found that integration of distribution databases could lead to increased taxonomic coverage that corresponds to 23 years’ advancement in data accumulation, and improvement in taxonomic coverage could be as high as 22.4% for trait databases. Main conclusions: Rapid increases in biodiversity knowledge can be achieved through the integration of databases, providing the data necessary to address critical environmental challenges. Full integration across databases will require tackling the major impediments to data integration: taxonomic incompatibility, lags in data exchange, barriers to effective data synchronization, and isolation of individual initiatives.
KW - big data
KW - biodiversity informatics
KW - biogeography
KW - database integration
KW - functional trait
KW - taxonomic system
UR - http://www.scopus.com/inward/record.url?scp=85128256664&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128256664&partnerID=8YFLogxK
U2 - 10.1111/geb.13497
DO - 10.1111/geb.13497
M3 - Review article
SN - 1466-822X
VL - 31
SP - 1242
EP - 1260
JO - Global Ecology and Biogeography
JF - Global Ecology and Biogeography
IS - 7
ER -