Honggang Bu, Soils, North Dakota State University, Fargo, ND; David W. Franzen, North Dakota State University, Fargo, ND; and Lakesh Sharma, Cooperative Extension, University of Florida, Gainesville, FL
Yield prediction is important for making in-season agronomic input decisions and for larger-scale logistical decisions. For predicting crop yield from ground-based active optical sensing data, ordinary unweighted linear or nonlinear regression models are the most popular choices. However, these unweighted models may not be accurate enough for practical use because they assume that each data point is obtained with equal precision and contributes equally to model construction. These assumptions are probably not accurate. Evidence indicates that there exist statistically significant differences in nitrogen availability within each small spatial field and that these differences can be reflected in the sensing data. Unweighted models that rely on average sensor readings alone leave some important information, such as the variation of the sensor readings within each subplot, unmined. To improve the performance of the prediction models, the feasibility of developing and applying weighted nonlinear regression models was explored. Novel intensified weights were developed based on the coefficient of variation of the sensing data within each subplot. The experiments involved two crops, spring wheat and corn; two sites for each crop; two ground-based active optical sensors, GreenSeeker™ and Holland Crop Circle™; two NDVI-based crop indices, red NDVI and red edge NDVI; and two general types of regression models, exponential and quadratic, each fitted in weighted and unweighted form. Results indicated that the proposed intensified weighted nonlinear regression models are more robust to outliers and that, for most single site-year cases, they dramatically outperform their corresponding unweighted regression models in terms of R². Results also showed that this methodology did not improve predictions using pooled multiple site-year data.
The reason may be that the pooled data contained a larger number of samples, enabling unweighted regression models to yield a more stable and significant relationship.
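The general idea of weighting each subplot by the variability of its sensor readings can be illustrated with a minimal Python sketch of the quadratic case. The abstract does not give the formula for the intensified weights, so the weight used here (the inverse of the coefficient of variation raised to a hypothetical exponent `power`, normalized to a mean of 1) is only an assumption for illustration, as are the function names and the synthetic data. It is not the authors' method.

```python
import numpy as np

def cv_weights(readings_per_subplot, power=2.0):
    """Assumed weight form: 1 / CV**power, scaled so the weights average 1.

    `power` is a hypothetical "intensifying" exponent; the abstract does
    not specify how the weights are derived from the CV.
    """
    cvs = np.array([np.std(r, ddof=1) / np.mean(r) for r in readings_per_subplot])
    w = 1.0 / cvs ** power
    return w / w.mean()

def fit_quadratic(ndvi, grain_yield, weights=None):
    """Fit yield = c2*ndvi**2 + c1*ndvi + c0 by (weighted) least squares."""
    # np.polyfit applies w to the unsquared residuals, so pass sqrt(weights)
    # to minimise sum(weights * residual**2).
    sqrt_w = None if weights is None else np.sqrt(weights)
    return np.polyfit(ndvi, grain_yield, deg=2, w=sqrt_w)

def r_squared(ndvi, grain_yield, coeffs, weights=None):
    """Weighted coefficient of determination (unweighted if weights is None)."""
    y = np.asarray(grain_yield, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    pred = np.polyval(coeffs, ndvi)
    y_bar = np.average(y, weights=w)
    ss_res = np.sum(w * (y - pred) ** 2)
    ss_tot = np.sum(w * (y - y_bar) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data only: 12 subplots with 20 sensor readings each.
rng = np.random.default_rng(0)
true_mean = np.linspace(0.3, 0.8, 12)
spread = rng.uniform(0.01, 0.08, size=12)
readings = [rng.normal(m, s, size=20) for m, s in zip(true_mean, spread)]
ndvi = np.array([r.mean() for r in readings])
# Yield noise grows with within-subplot spread (an assumption of this demo).
grain = 2.0 + 6.0 * ndvi + 4.0 * ndvi ** 2 + rng.normal(0.0, 10.0 * spread)

w = cv_weights(readings)
unweighted = fit_quadratic(ndvi, grain)
weighted = fit_quadratic(ndvi, grain, w)
print("unweighted R2:", r_squared(ndvi, grain, unweighted))
print("weighted R2:  ", r_squared(ndvi, grain, weighted, w))
```

Subplots with noisy readings (high CV) receive small weights and therefore pull the fitted curve less, which is one plausible way such a model could become more robust to outliers. Weighted and unweighted R² values are computed against different totals, so they should be compared with some care.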