To adopt functional programming, we must know that programmers prefer it and that its overhead is acceptable. This paper addresses the latter issue and shows that the hardware overhead can be acceptably small: in particular, the penalty for write misses in the cache can be near zero. This is achieved by adopting a cache architecture (available on at least one current workstation) that features “write allocate with subblock placement.” Programs in functional languages tend to write to memory only to initialize newly allocated objects. With the recommended hardware approach, a write miss in the cache allocates a cache block and marks every word invalid except the one written. Subsequent initializing writes to other words of the allocated object simply replace invalid contents, so no interaction with main memory occurs. When objects are allocated sequentially, almost all write misses are handled without the penalty of a reference to main memory. The investigation is carefully done, using lengthy instruction traces from eight large programs written in SML/NJ.
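
The mechanism described above can be illustrated with a toy simulation. The sketch below is not the paper's simulator or its trace-driven methodology. It assumes a direct-mapped cache with one valid bit per word, ignores write-back of evicted data, and uses hypothetical names (`SubblockCache`, `allocate_and_initialize`) and arbitrary sizes (64 blocks of 4 words). It compares write allocate with subblock placement against a baseline that fetches the block from memory on every write miss, under sequential allocation.

```python
class SubblockCache:
    """Toy direct-mapped cache with one valid bit per word (subblock).

    Assumption: eviction and write-back traffic are ignored; only fetches
    from main memory caused by misses are counted.
    """

    def __init__(self, num_blocks=64, words_per_block=4, fetch_on_write=False):
        self.num_blocks = num_blocks
        self.words_per_block = words_per_block
        self.fetch_on_write = fetch_on_write  # True selects the baseline policy
        self.tags = [None] * num_blocks
        self.valid = [[False] * words_per_block for _ in range(num_blocks)]
        self.write_misses = 0
        self.memory_fetches = 0

    def _locate(self, word_addr):
        # Split a word address into cache index, tag, and word offset.
        block_addr, offset = divmod(word_addr, self.words_per_block)
        return block_addr % self.num_blocks, block_addr // self.num_blocks, offset

    def write(self, word_addr):
        index, tag, offset = self._locate(word_addr)
        if self.tags[index] != tag:
            # Write miss: allocate the block for the new tag.
            self.write_misses += 1
            self.tags[index] = tag
            if self.fetch_on_write:
                # Baseline: fetch the whole block from main memory first.
                self.memory_fetches += 1
                self.valid[index] = [True] * self.words_per_block
            else:
                # Subblock placement: no fetch, all words start invalid.
                self.valid[index] = [False] * self.words_per_block
        # The written word becomes valid in either policy.
        self.valid[index][offset] = True

    def read(self, word_addr):
        """Return True on a hit; on a miss, fetch the block and return False."""
        index, tag, offset = self._locate(word_addr)
        if self.tags[index] == tag and self.valid[index][offset]:
            return True
        self.memory_fetches += 1
        self.tags[index] = tag
        self.valid[index] = [True] * self.words_per_block
        return False


def allocate_and_initialize(cache, num_objects, words_per_object):
    """Bump-pointer (sequential) allocation: initialize each word of each object."""
    next_free = 0
    for _ in range(num_objects):
        for offset in range(words_per_object):
            cache.write(next_free + offset)
        next_free += words_per_object
    return next_free


subblock = SubblockCache()
baseline = SubblockCache(fetch_on_write=True)
allocate_and_initialize(subblock, num_objects=100, words_per_object=3)
allocate_and_initialize(baseline, num_objects=100, words_per_object=3)
print("write misses:", subblock.write_misses, baseline.write_misses)
print("memory fetches:", subblock.memory_fetches, baseline.memory_fetches)
```

In this toy model both policies incur the same number of write misses, but only the baseline pays a main-memory fetch for each one. Sequential allocation means every word of a freshly allocated block is written before it is read, so the invalid words are never observed.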