419-5 Development of a Large Data Repository to Support Integrated Research for Evaluating Agricultural Sustainability for the Pacific Northwest.

See more from this Division: SSSA Division: Pedology
See more from this Session: Pedology: II (includes student competition)
Wednesday, November 5, 2014: 2:05 PM
Long Beach Convention Center, Room 202C

Paul E Gessler (1), Erich Seamon (1), Edward Flathers (1), Stephen Fricke (1), Von Walden (2), Richard A Rupp (2) and Sanford Eigenbrode (1)
(1) University of Idaho, Moscow, ID
(2) Washington State University, Pullman, WA
Secure and accessible data repositories and archives are an important component of large, broad-scale interdisciplinary projects that integrate historical and ongoing measurement and monitoring. These data serve as an important legacy of funded research and enable continued use for advancing knowledge and developing new research. With increased exploration of large datasets, several strategies and methodologies for data access, storage, and interrogation have come to the forefront. Yet with differing levels of use, data volume, user location, and analytical expectations, finding common methods and approaches that best meet the needs of a particular team or individual is challenging.

The Regional Approaches to Climate Change for Pacific Northwest Agriculture (REACCH PNA) project is a five-year USDA NIFA-funded interdisciplinary project examining the sustainability of cereal crop production. This talk will review the process used to evaluate researcher and end-user needs in order to guide the design and provision of tools for data management, exploration, use, and long-term curation. Our system is structured as a virtual multi-server environment (data, applications, web services) that includes: a geospatial database and web servers for web mapping (ArcGIS Server); ESRI’s Geoportal Server for data discovery and metadata tagging (ISO 19115-2 standard); the UCAR Thematic Real-time Environmental Distributed Data Services (THREDDS) for data cataloging; and the interactive Python notebook (IPython) to support data analysis. More than 10 terabytes of project data have been uploaded so far.

Initial results suggest that the success of this hybridized data management methodology depends strongly on a minimum level of technology acceptance. Engagement was low initially but has increased as we have evaluated different methods to assist adoption and use. The diversity of datasets creates challenges for data interaction, analytics, and metadata tagging. Nevertheless, a new paradigm of data management, preservation, and sharing for continued research is upon us.
