A method for least squares convex-concave data smoothing is presented that may find application to problems where the data suggest that the shape of the underlying function is sigmoid. The author has developed an algorithm that, by using the structure of the calculation, gives all solutions by using at most 2n-2 quadratic programs on subranges of the data. First, it obtains the best convex approximation of the first k points, and the best concave approximation of the last n-k points, for k = 2,3,...,n-1. Then, it chooses the k that gives the least sum of squares of the residuals. The calculation is made quite efficient by using a relation between the components of the best convex approximations to consecutive data, a B-spline representation of the convex components, and a suitable quadratic program that is employed in an inner loop of the procedure. The numerical results suggest that the method is able to calculate the required fit in O(n) active set replicates over subranges of the data, which means that the method requires about O(n2) computer operations. The author observes that concave-convex smoothing can be achieved by changing the sign of the data components.
This method is based on restricting the sign changes of the second divided differences of the smoothed data. For this special case of a single sign change, the key result is that the optimal index for the transition from convex to concave can be found by determining the best convex approximation to the first k data points, and the best concave approximation to the last n-k. The calculation is made more efficient by a result that shows that the best convex approximation to the first k+1 points can be easily derived from that for the first k. Representing the solution as a linear spline, with knots at points that are not active constraints (the second divided difference is nonzero), reduces the required number of convex programs. The author references a Fortran 77 package implementing the method, which is evidently used to produce the included numerical results.
The first paragraph above is a paraphrase of the author’s conclusion at the end of the paper. It is unfortunate that the presentation that preceded it was not as clear. The text contains many typographical errors, including the use of different forms of the lower-case Greek letter theta, a few occurrences of the letter “a” instead of alpha, and an outright sign error in the statement of Lemma 2. Table 1 summarizes the behavior of the method on constructed data with increasing amounts of added noise. However, the description of the table’s contents makes it hard to confirm the author’s claims. The final result applies the algorithm to real data from an economic substitution problem, which supposedly illustrates its practical utility.