Estimating Yield Distributions From Truncated Data Using Linear Regression Models

Open Access
Zalewski, Erik John
Area of Honors:
Supply Chain and Information Systems
Bachelor of Science
Document Type:
Thesis Supervisors:
  • Saurabh Bansal, Thesis Supervisor
  • John C Spychalski, Honors Advisor
Keywords:
  • Estimation
  • Yield Distributions
  • Truncated Data
  • Linear Regression
  • Models
  • Modeling
  • Censored Demand
This thesis focuses on a problem in the commercial seed industry. Specifically, it investigates how seed producers can estimate the distribution of production yields when only truncated data from some trials are available, i.e., when the numerical values are observable for some but not all data points. There are several ways to address this issue. One is to take the available data points, assume that they constitute the entire data set, and use them to estimate the mean and standard deviation. Another is to acknowledge the presence of unobserved data points and estimate the mean and standard deviation by assigning a weight to each available observation. The thesis presents a mathematical development of the second approach and shows that it is superior to the first. Using an illustrative example, the thesis also calculates the corresponding increase in dollar value for a representative problem in the commercial seed industry.
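The contrast between the two approaches can be illustrated with a minimal sketch. It is not the derivation developed in the thesis. It assumes normally distributed yields, assumes that the unobserved trials are the lowest-yielding ones, and assumes that the total number of trials is known. The second estimator swaps in a standard technique, ordinary least squares on a normal probability plot with Blom plotting positions: the intercept estimates the mean and the slope estimates the standard deviation, and both are weighted linear combinations of the observed values. The function names and the simulated parameters (mean 100, standard deviation 15) are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def naive_estimate(observed):
    """First approach: treat the observed values as the entire data set."""
    return mean(observed), stdev(observed)


def regression_estimate(observed, n_total):
    """Second approach (illustrative stand-in): regress the observed order
    statistics on standard normal scores.

    Assumption: the observed values are the largest len(observed) of
    n_total trials, so their ranks in the full sample are known.
    """
    k = len(observed)
    values = sorted(observed)
    std_normal = NormalDist()
    # Blom plotting positions for ranks n_total-k+1 .. n_total.
    scores = [
        std_normal.inv_cdf((rank - 0.375) / (n_total + 0.25))
        for rank in range(n_total - k + 1, n_total + 1)
    ]
    z_bar, x_bar = mean(scores), mean(values)
    s_zz = sum((z - z_bar) ** 2 for z in scores)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in zip(scores, values))
    slope = s_zx / s_zz               # estimate of the standard deviation
    intercept = x_bar - slope * z_bar  # estimate of the mean
    return intercept, slope


def simulate_truncated_yields(n_total, n_observed, mu, sigma, seed=0):
    """Draw n_total normal yields and keep only the n_observed largest."""
    rng = random.Random(seed)
    yields = sorted(rng.gauss(mu, sigma) for _ in range(n_total))
    return yields[n_total - n_observed:]


if __name__ == "__main__":
    observed = simulate_truncated_yields(500, 300, mu=100.0, sigma=15.0)
    print("naive:     ", naive_estimate(observed))
    print("regression:", regression_estimate(observed, n_total=500))
```

Under these assumptions the naive estimate overstates the mean and understates the standard deviation, because the low yields are missing from the sample, while the regression estimate uses the known ranks of the observed values to correct for the missing tail.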