Cui et al. present an extension of the vector approximation file (VA-file) for processing k-nearest neighbor algorithm (k-NN) searches in high-dimensional databases.
The main idea consists of mapping multidimensional points to one dimension using principal component analysis, and then indexing the approximation file of the VA-file according to such a principal component in a B+ tree; detailed experimentation is reported. Even though Cui et al. use four and six bits per dimension instead of eight bits, which is the best value for this parameter, the merits of the new method are obvious in terms of input/output (I/O) cost. The main advantage of this approach is that it avoids the sequential scan of the whole approximation file during the query processing for k-NN by using the B+ tree built on top of the VA-file. On the other hand, the presented method is not a completely dynamic structure; this is a negative aspect for a highly changeable image database. Also, it lacks experimental results regarding the time required for constructing the index.
In conclusion, despite some drawbacks, the paper proposes an interesting extension of the VA-file, with good I/O performance in k-NN queries.