312-5 Clustering Soil Profiles Using the Modified Distance Matrix Calculation.

Poster Number 1200

See more from this Division: SSSA Division: Pedology
See more from this Session: Innovations in International Pedology: II

Tuesday, November 17, 2015
Minneapolis Convention Center, Exhibit Hall BC

Vakhtang Shelia, AgWeatherNet, Washington State University, Prosser, WA and Gerrit Hoogenboom, Ag. and Bio. Engineering, University of Florida, Gainesville, FL
Abstract:

The purpose of this study was to maximize the use of soil horizon and soil layer properties when clustering soil profiles, through adjustment techniques based on a modified distance matrix calculation.  A proximity measure, or distance, between vectors or matrices can be calculated only when they have the same dimensions.  In soil profile data, however, the matrices representing different soils and their layer properties usually have different dimensions.  A new approach was explored that allows for adjustment of the soil profile layers.  We assume that if the ith soil layer has a vector of attribute values, then each of its sublayers is characterized by the same attribute values.  Based on this assumption, the thicknesses of the layers in two soil profiles are compared and additional sublayers are created where needed.  We then build matrices with the same dimensions and finally calculate the proximity measure, the Euclidean distance, between the corresponding sublayers.  Based on this approach, a distance matrix calculation algorithm and a corresponding computer program were created that calculate distances for large ("Big Data") sets of soil profiles.  The proposed approach is shown to be effective with existing reliable datasets, such as version 3.1 of the ISRIC-WISE database (WISE3).  Hierarchical clustering was performed using a module based on the original soil profile layer adjustment algorithm, which was further integrated into R.  It was shown that, within limitations, clustering algorithms and parameters have an important influence on the clustering result and should be selected carefully.  The main outcome of this study is a mechanism for applying several clustering methods to soil profile data on a layer-by-layer basis: the modified distance matrix calculation can be combined with different clustering algorithms after surveying a large set of soil profiles.
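
The layer adjustment described above can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the (top, bottom, attributes) layer format, and the sample values are hypothetical, and several details the abstract leaves open are filled in by assumption. Specifically, layers are assumed to be contiguous, profiles are compared only over the depth they share, and the distance is an unweighted Euclidean distance over all sublayer attribute values.

```python
import math

# A profile is a list of layers: (top_cm, bottom_cm, [attribute values]).
# This layout is an assumption made for illustration only.

def merged_boundaries(profile_a, profile_b):
    """Sorted layer boundaries of both profiles, limited to their shared depth."""
    top = max(profile_a[0][0], profile_b[0][0])
    bottom = min(profile_a[-1][1], profile_b[-1][1])
    cuts = {top, bottom}
    for profile in (profile_a, profile_b):
        for layer_top, layer_bottom, _ in profile:
            for depth in (layer_top, layer_bottom):
                if top < depth < bottom:
                    cuts.add(depth)
    return sorted(cuts)

def split_into_sublayers(profile, cuts):
    """One attribute row per sublayer; a sublayer inherits its parent layer's values."""
    rows = []
    for upper, lower in zip(cuts, cuts[1:]):
        midpoint = (upper + lower) / 2
        for layer_top, layer_bottom, attrs in profile:
            if layer_top <= midpoint < layer_bottom:
                rows.append(list(attrs))
                break
    return rows

def profile_distance(profile_a, profile_b):
    """Euclidean distance between two profiles after aligning their sublayers."""
    cuts = merged_boundaries(profile_a, profile_b)
    rows_a = split_into_sublayers(profile_a, cuts)
    rows_b = split_into_sublayers(profile_b, cuts)
    squared = sum((x - y) ** 2
                  for row_a, row_b in zip(rows_a, rows_b)
                  for x, y in zip(row_a, row_b))
    return math.sqrt(squared)

def distance_matrix(profiles):
    """Symmetric matrix of pairwise profile distances."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = profile_distance(profiles[i], profiles[j])
    return matrix

# Hypothetical profiles with two attributes per layer (not WISE3 data).
profile_a = [(0, 20, [1.0, 2.0]), (20, 50, [3.0, 4.0])]
profile_b = [(0, 30, [1.0, 2.0]), (30, 50, [3.0, 7.0])]
matrix = distance_matrix([profile_a, profile_b])
```

In this example the two profiles are split at 0, 20, 30, and 50 cm, giving three sublayers each, so both become 3 x 2 matrices that can be compared directly. In practice the attributes would likely need to be standardized first, and the resulting matrix could be handed to a hierarchical clustering routine, for example R's hclust after conversion with as.dist.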

 
