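A rough explanation first. In "buffer cache", a buffer is the structure the block device layer used to use to cache disk contents; one buffer is the size of one disk block. The "page cache" is the cache the filesystem layer uses for data being read and written. Because this layer sits above the device layer, it is managed in units of pages, like the rest of the kernel.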

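The "merge" means that the structures of the two layers were unified into the page cache, with each page containing a series of buffer structures. Cache management was optimized as well: a page cached at the filesystem layer no longer needs a second copy at the block device layer. Instead, the block-device buffer simply holds a pointer into the filesystem's page cache data. Data that is cached only on behalf of the block device (inode metadata, or direct reads and writes of the block device) is still created only at the block device layer.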
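To make the "a page contains buffers, and a buffer points into the page" idea concrete, here is a minimal toy model in Python. It is only a sketch, not kernel code: the names `Page` and `BufferHead` and the sizes (4096-byte pages, 1024-byte blocks) are assumptions chosen for illustration, loosely modeled on how the kernel's `struct buffer_head` refers to memory inside a page rather than owning a copy.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096   # assumed page size for this sketch
BLOCK_SIZE = 1024  # assumed filesystem block size for this sketch


@dataclass
class BufferHead:
    """Toy per-block descriptor: it holds a view into a page, not a copy."""
    block_nr: int     # on-disk block number this buffer maps (hypothetical)
    data: memoryview  # window onto the owning page's memory


class Page:
    """Toy page-cache page that carries one buffer per block it covers."""

    def __init__(self, first_block: int):
        self.data = bytearray(PAGE_SIZE)
        view = memoryview(self.data)
        self.buffers = [
            BufferHead(first_block + i,
                       view[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
            for i in range(PAGE_SIZE // BLOCK_SIZE)
        ]


page = Page(first_block=100)

# A file-level write lands in the page ...
page.data[1024:1029] = b"hello"

# ... and is visible through the buffer for block 101 without a second copy.
second = page.buffers[1]
print(second.block_nr, bytes(second.data[:5]))
```

Because each buffer is a `memoryview` slice, a write through either the page or the buffer changes the same bytes, which is the point of the unified design: one cached copy, two ways of addressing it.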


What is the major difference between the buffer cache and the page cache? Why were they separate entities in older kernels? Why were they merged later on?


The page cache caches pages of files to optimize file I/O. The buffer cache caches disk blocks to optimize block I/O.

Prior to Linux kernel version 2.4, the two caches were distinct: Files were in the page cache, disk blocks were in the buffer cache. Given that most files are represented by a filesystem on a disk, data was represented twice, once in each of the caches. Many Unix systems follow a similar pattern.

This design is simple to implement, but it is obviously inelegant and inefficient. Starting with Linux kernel version 2.4, the contents of the two caches were unified. The VM subsystem now drives I/O, and it does so out of the page cache. If cached data has both a file and a block representation (as most data does), the buffer cache simply points into the page cache, so only one instance of the data is cached in memory. The page cache is what you picture when you think of a disk cache: it caches file data from a disk to make subsequent I/O faster.

The buffer cache remains, however, because the kernel still needs to perform block I/O in terms of blocks, not pages. As most blocks represent file data, most of the buffer cache is represented by the page cache. But a small amount of block data is not file-backed (metadata and raw block I/O, for example) and is therefore represented solely by the buffer cache.
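On a running Linux system this split is roughly visible in `/proc/meminfo`, where `Buffers` reports memory used for block-device data and `Cached` reports file-backed page cache. The sketch below parses text in that format. The helper name `parse_meminfo` is hypothetical, and the sample numbers are made up for illustration rather than measured on any machine.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Values are returned as the file prints them (kB for most fields).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(rest.split()[0])
    return fields


# Made-up sample in the /proc/meminfo layout (not real measurements).
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         2048000 kB
Buffers:          256000 kB
Cached:          8192000 kB
"""

info = parse_meminfo(SAMPLE)
print("Buffers (block-device data):", info["Buffers"], "kB")
print("Cached  (file-backed pages):", info["Cached"], "kB")
```

On Linux, the same function can be fed the real file with `parse_meminfo(open("/proc/meminfo").read())`. `Buffers` is typically much smaller than `Cached`, which matches the point that only a small amount of block data is not file-backed.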

