DB #2 : [MySQL][InnoDB] About DISK I/O

InnoDB uses simulated asynchronous disk I/O: InnoDB creates a number of threads to take care of I/O operations, such as read-ahead.
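To make "simulated asynchronous I/O" concrete, here is a minimal Python sketch of the general idea: blocking reads are handed to a small pool of worker threads, so the caller can queue requests and carry on. This is not InnoDB's code. The names `read_page`, `submit_reads`, and `demo` are made up for illustration, and the 16 KiB page size and the pool of 4 threads are assumptions.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 16 * 1024  # assumption: 16 KiB pages


def read_page(fd, page_no):
    """Blocking read of one page; runs inside an I/O worker thread."""
    return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)


def submit_reads(pool, fd, page_nos):
    """Queue reads without waiting for them; returns {page_no: Future}."""
    return {n: pool.submit(read_page, fd, n) for n in page_nos}


def demo():
    """Write 8 pages to a scratch file, then read them back via the pool."""
    with tempfile.TemporaryFile() as f:
        for n in range(8):
            f.write(bytes([n]) * PAGE_SIZE)  # page n is filled with byte n
        f.flush()
        with ThreadPoolExecutor(max_workers=4) as pool:  # hypothetical I/O threads
            pending = submit_reads(pool, f.fileno(), range(8))
            # The caller is free to do other work here while reads complete.
            return {n: fut.result()[0] for n, fut in pending.items()}
```

From the caller's point of view the reads look asynchronous, even though each worker thread performs an ordinary blocking `os.pread`.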

There are two read-ahead heuristics in InnoDB:

In sequential read-ahead, if InnoDB notices that the access pattern to a segment in the tablespace is sequential, it posts in advance a batch of reads of database pages to the I/O system.

In random read-ahead, if InnoDB notices that some area in a tablespace seems to be in the process of being fully read into the buffer pool, it posts the remaining reads to the I/O system.
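The two heuristics above can be sketched roughly as follows. This is only an illustration of the decision logic as described in the text, not InnoDB's implementation: the class and function names are hypothetical, the extent size and thresholds are illustrative parameters, and the random variant simply counts cached pages in an extent as a stand-in for "seems to be in the process of being fully read".

```python
class SequentialReadAhead:
    """Detects a run of consecutive page accesses and suggests a prefetch."""

    def __init__(self, threshold=56, extent_pages=64):  # illustrative defaults
        self.threshold = threshold
        self.extent_pages = extent_pages
        self.last_page = None
        self.run_length = 0

    def on_access(self, page_no):
        """Record an access; return the pages to prefetch (possibly empty)."""
        if self.last_page is not None and page_no == self.last_page + 1:
            self.run_length += 1
        else:
            self.run_length = 1  # the run was broken, start counting again
        self.last_page = page_no
        if self.run_length >= self.threshold:
            self.run_length = 0
            # Pattern looks sequential: post a batch of reads for the next extent.
            start = (page_no // self.extent_pages + 1) * self.extent_pages
            return list(range(start, start + self.extent_pages))
        return []


def random_read_ahead(buffer_pool, page_no, extent_pages=64, threshold=13):
    """If enough pages of page_no's extent are cached, return the missing ones."""
    start = (page_no // extent_pages) * extent_pages
    extent = range(start, start + extent_pages)
    cached = sum(1 for p in extent if p in buffer_pool)
    if cached >= threshold:
        # The extent is mostly in the buffer pool: post the remaining reads.
        return [p for p in extent if p not in buffer_pool]
    return []
```

In both cases the returned page numbers would be handed to the I/O threads rather than read synchronously.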

InnoDB uses a novel file flush technique called doublewrite. It adds safety to recovery following an operating system crash or a power outage, and improves performance on most varieties of Unix by reducing the need for fsync() operations.

Doublewrite means that before writing pages to a data file, InnoDB first writes them to a contiguous tablespace area called the doublewrite buffer.

Only after the write and the flush to the doublewrite buffer have completed does InnoDB write the pages to their proper positions in the data file. If the operating system crashes in the middle of a page write, InnoDB can later find a good copy of the page in the doublewrite buffer during recovery.
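The write ordering and the recovery step can be sketched in Python as below. This is a toy model under stated assumptions, not InnoDB's on-disk format: the 4 KiB page size, the CRC32-in-the-first-4-bytes page layout, the separate doublewrite file, and all function names are invented for illustration.

```python
import os
import struct
import zlib

PAGE_SIZE = 4096             # assumption: small pages for the demo
ENTRY = struct.Struct(">Q")  # page number stored before each doublewrite copy


def make_page(payload):
    """Build a page: 4-byte CRC32 of the body, followed by the body."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return struct.pack(">I", zlib.crc32(body)) + body


def page_ok(page):
    """True if the page is complete and its checksum matches its body."""
    return (len(page) == PAGE_SIZE
            and struct.unpack(">I", page[:4])[0] == zlib.crc32(page[4:]))


def write_doublewrite(dw_path, pages):
    """Step 1: write all pages contiguously to the doublewrite area and flush."""
    with open(dw_path, "wb") as dw:
        for page_no, page in sorted(pages.items()):
            dw.write(ENTRY.pack(page_no) + page)
        dw.flush()
        os.fsync(dw.fileno())


def write_in_place(data_path, pages):
    """Step 2: write the pages to their proper positions in the data file."""
    mode = "r+b" if os.path.exists(data_path) else "w+b"
    with open(data_path, mode) as data:
        for page_no, page in pages.items():
            data.seek(page_no * PAGE_SIZE)
            data.write(page)
        data.flush()
        os.fsync(data.fileno())


def flush_pages(dw_path, data_path, pages):
    write_doublewrite(dw_path, pages)  # must complete before step 2 starts
    write_in_place(data_path, pages)


def recover(dw_path, data_path):
    """Replace torn data-file pages with good copies from the doublewrite area."""
    restored = []
    with open(dw_path, "rb") as dw, open(data_path, "r+b") as data:
        while True:
            header = dw.read(ENTRY.size)
            if len(header) < ENTRY.size:
                break
            (page_no,) = ENTRY.unpack(header)
            copy = dw.read(PAGE_SIZE)
            data.seek(page_no * PAGE_SIZE)
            current = data.read(PAGE_SIZE)
            if page_ok(copy) and not page_ok(current):
                data.seek(page_no * PAGE_SIZE)
                data.write(copy)
                restored.append(page_no)
        data.flush()
        os.fsync(data.fileno())
    return restored
```

If a crash interrupts step 2, a page in the data file may be half old and half new, so its checksum fails; `recover` then copies the intact version back from the doublewrite area. If the crash happens during step 1 instead, the data file has not been touched yet and still holds the old, consistent pages.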

InnoDB uses pseudo (simulated) asynchronous I/O (AIO).
It creates multiple I/O threads to handle I/O operations such as read-ahead.

InnoDB has two read-ahead heuristics.

Sequential read-ahead: when InnoDB judges that the access pattern to a segment in the tablespace is sequential, it submits a batch of database page reads to the I/O system in advance.

Random read-ahead: when InnoDB judges that some area of the tablespace is in the process of being read completely into the buffer pool, it submits the remaining reads to the I/O system.

InnoDB uses a new file flush technique called doublewrite. It makes recovery after an OS crash or a power outage safer, and it improves performance on most varieties of Unix by reducing the need for fsync() operations.

Doublewrite means that, before writing pages to the data file, InnoDB first writes them to a contiguous tablespace area called the doublewrite buffer.

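Only after the write and the flush to the doublewrite buffer have completed does InnoDB write the pages to their correct positions in the data file. If the OS crashes in the middle of a page write, InnoDB can find a valid copy of the page in the doublewrite buffer during recovery.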