Accuracy and error tolerance are important requirements for most software. A common practice for achieving the desired accuracy and avoiding potential numerical problems, such as overflow and underflow, is to use high precision for all of the floating-point variables in a program. High precision comes at a cost: increased execution time. The cost can be substantial when the extra precision, such as double-double, is unavailable in hardware and must be implemented in software. Very often, only a portion of the variables and computations need to be in high precision to reach the desired accuracy. However, manually identifying the variables and computations that can use lower precision while satisfying the given error tolerance is impractical, if not impossible.
The tool presented in this paper, Precimonious, automates the tuning of the floating-point precision of the variables of a program. Specifically, given a high-precision program that satisfies the error tolerance, the tool identifies the variables that can be declared in lower precision while still meeting that tolerance and without degrading performance (execution time).
The basic idea behind the tool is as follows. It searches for the program variables whose precision (types) can be lowered, for example, from double-double to double; suggests a type configuration mapping variables to types; transforms the original program by applying the type configuration; tests the transformed program to determine whether the type configuration is valid in terms of the two criteria, accuracy (error tolerance) and performance (execution time); and repeats until a valid type configuration is found.
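The search loop described above can be illustrated with a minimal sketch. It is not the tool's implementation: a simple one-pass greedy search stands in for the tool's search strategy, the program under tuning (run_program), its two tunable variables, and the cost model (cost) are hypothetical, and single precision is simulated by rounding Python doubles through a 4-byte float rather than by transforming LLVM bitcode.

```python
import struct


def to_single(x):
    """Round a Python float (IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def run_program(config, n=1000):
    """Hypothetical program under tuning: sums 1/k^2 for k = 1..n.

    `config` maps each tunable variable name to "single" or "double".
    """
    cast = {name: to_single if prec == "single" else float
            for name, prec in config.items()}
    total = cast["total"](0.0)
    for k in range(1, n + 1):
        term = cast["term"](1.0 / (k * k))
        total = cast["total"](total + term)
    return total


def cost(config):
    """Assumed stand-in for measured execution time: fewer doubles is cheaper."""
    return sum(1 for prec in config.values() if prec == "double")


def tune(variables, run, cost, tolerance):
    """Greedy sketch: try lowering each variable once, keep the change if valid."""
    config = {v: "double" for v in variables}
    reference = run(config)  # result of the all-high-precision program
    for v in variables:
        candidate = dict(config)
        candidate[v] = "single"
        result = run(candidate)
        accurate = abs(result - reference) <= tolerance * abs(reference)
        no_slower = cost(candidate) <= cost(config)
        if accurate and no_slower:
            config = candidate  # valid on both criteria, so accept it
    return config


best = tune(["term", "total"], run_program, cost, tolerance=1e-4)
print(best)
```

A looser tolerance lets more variables be lowered, while a tighter one keeps more of them in double precision. As with the tool itself, the accepted configuration is only checked against the inputs that were actually run.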
The authors tested the tool on programs that use the GNU Scientific Library and on the NAS Parallel Benchmarks. They reported speedups ranging from 0 to 41.7 percent, with more cases of significant speedup at a larger error threshold (10^-4) than at a smaller one (10^-10).
Tuning floating-point precision to improve performance is of great interest. This work is an ongoing research project rather than a user-friendly software package. Source code (C programs in the tests) is first translated into LLVM bitcode. Mapping the type configuration, which is expressed in terms of the LLVM bitcode, back to the source code is performed manually. The type configuration suggested by the tool is valid only for the test input data and is not guaranteed for all possible inputs, so the user should provide a set of representative input data. The tool only changes variable declarations and switches function calls. Tuning the precision of intermediate results remains challenging future work.
Despite its limitations, the tool is of great interest to researchers and developers. The Precimonious source code and all of the data and results presented in the paper are available under a BSD license at https://github.com/corvette-berkeley/precimonious.