While looking around the Redis data directory, you may have noticed several files, among them a file with the .aof extension.
[email protected]:~# ls -alh /var/lib/redis/
total 54M
drwxr-x---  2 redis redis 4.0K Jul  1 10:40 .
drwxr-xr-x 39 root  root  4.0K Jun 17 11:32 ..
-rw-r-----  1 redis redis  39M Jun 25 09:56 appendonly.aof
-rw-rw----  1 redis redis  16M Jul  1 10:36 dump.rdb
You may wonder what it is and what its role is. Actually, it is quite an important file for your Redis installation. Let’s have a quick look at it and see what it is for.
First, the fundamentals. Redis is an in-memory data store, which means that all the data you store in it resides in memory. As we all know, memory is volatile storage and cannot be trusted with any critical data. If we use Redis for storing data that we can easily recreate (for example, as a caching layer), this may be acceptable (even though, as we mentioned in one of our earlier blogs, it is still better to have backups of your cache nodes). Generally speaking, though, we would like to persist our data on disk so that it can survive a restart of the nodes, no matter if it is planned maintenance or a crash.
Luckily, Redis comes with a mechanism for snapshotting the data to disk. It can be invoked by hand through the SAVE or BGSAVE commands in Redis.
127.0.0.1:6379> SAVE
OK
127.0.0.1:6379> BGSAVE
Background saving initiated
The former happens immediately and blocks the server while the dump is written, interfering with operations on the database. The latter spawns a child process that performs the dump, minimizing the impact on the performance of the Redis datastore.
Redis may also be configured to snapshot the data automatically.
127.0.0.1:6379> CONFIG GET save
1) "save"
2) "10 1000"
Here the snapshot will be performed every 10 seconds if at least 1000 changes to the dataset were made. You can reconfigure this setting to your liking:
127.0.0.1:6379> CONFIG SET save "5 1000"
OK
Here we have increased the frequency of the snapshotting: a snapshot is now taken every 5 seconds, as long as at least 1000 writes have happened.
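To make the "seconds changes" rule more concrete, here is a minimal Python sketch of how such a save point could be evaluated. The function names parse_save and snapshot_due are our own, purely illustrative, and the exact comparison Redis uses internally may differ slightly; treat this as a simplified model rather than the server's implementation.

```python
def parse_save(value):
    """Turn a save string such as "10 1000" into [(10, 1000)].

    Several save points can be listed in one string, e.g. "5 1000 60 1".
    """
    numbers = [int(token) for token in value.split()]
    return list(zip(numbers[0::2], numbers[1::2]))


def snapshot_due(seconds_since_last_save, changes_since_last_save, save_points):
    """Return True if at least one (seconds, changes) save point is satisfied.

    Simplified model: a save point fires when enough time has passed
    AND enough changes have accumulated since the last snapshot.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

With the "10 1000" setting, snapshot_due(12, 1500, parse_save("10 1000")) is true, while a node that saw only a handful of writes in the same 12 seconds would not be snapshotted.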
This is OK, but it is not ideal. Even with snapshots executed every few seconds, you may still lose the writes that happened after the last snapshot was taken. When using only RDB snapshots you do not really have proper durability.
Enter the Append Only File
Given that RDB snapshots can’t deliver proper durability, the Append Only File (AOF) was created. The idea behind it is to store all of the changes that happen in the database in a file. If you are familiar with other database systems like PostgreSQL or MySQL, you can think of it as a WAL or binary log. New entries are always appended (hence the name), so the writes are always sequential. This helps with performance: even on SSDs, sequential access is faster than random access.
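The append-and-replay idea can be illustrated with a toy example. The sketch below is not how Redis stores its AOF (the class TinyStore, the tab-separated line format and the file name toy.aof are all assumptions made for illustration); it only shows that a log of write commands is enough to rebuild the dataset after a restart.

```python
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every write to an append-only file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()                        # rebuild state from the log
        self.log = open(path, "a", encoding="utf-8")

    def _apply(self, command, key, value=None):
        if command == "SET":
            self.data[key] = value
        elif command == "DEL":
            self.data.pop(key, None)

    def _replay(self):
        """Re-run every logged command, oldest first."""
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                self._apply(*line.rstrip("\n").split("\t"))

    def _write(self, *parts):
        # Append to the end of the log first, then change the in-memory state.
        self.log.write("\t".join(parts) + "\n")
        self.log.flush()
        self._apply(*parts)

    def set(self, key, value):
        self._write("SET", key, value)

    def delete(self, key):
        self._write("DEL", key)

    def close(self):
        self.log.close()


def demo():
    """Write a few commands, 'restart', and return the recovered data."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "toy.aof")
        store = TinyStore(path)
        store.set("a", "1")
        store.set("b", "2")
        store.delete("a")
        store.close()

        restored = TinyStore(path)            # simulates a restart
        recovered = dict(restored.data)
        restored.close()
        return recovered
```

Calling demo() returns the dataset as it was before the simulated restart. Keys or values containing tabs or newlines would break this toy format, which is one reason a real log needs a proper encoding.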
AOF has to be enabled in the Redis configuration by setting the appendonly directive to yes.
Once enabled, it takes the role of the main source of truth regarding the state of the data. What this means is that whenever Redis needs to load the data from disk after a restart, it will use the AOF rather than the RDB snapshot.
The Redis configuration file has a couple more settings that govern durability. The first, appendfsync, defines when fsync is executed on the AOF. There are three options. ‘no’ means that Redis never calls fsync itself, so when the data is persisted on disk depends on the settings of the operating system: data is persisted only when the filesystem cache is flushed to disk and then written to the device. This is the fastest option, but it does not provide proper durability in cases where the whole node crashes or is restarted. The second option, ‘everysec’, means that fsync is performed every second so, theoretically, assuming that the disk persists data immediately after receiving the write, it is possible to lose up to a second of data. The third option, ‘always’, means that fsync is performed after every write. This is the most expensive option performance-wise, but it gives the best durability.
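The difference between the three policies can be sketched in a few lines of Python. The AofWriter class below is hypothetical and deliberately simplified: it only checks the one-second timer when a write arrives, whereas a real server would not depend on incoming traffic to trigger the periodic fsync. The fsync_calls counter exists only so the behaviour can be observed.

```python
import os
import time


class AofWriter:
    """Toy log writer illustrating the 'no', 'everysec' and 'always' policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("no", "everysec", "always"):
            raise ValueError("policy must be 'no', 'everysec' or 'always'")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0                  # for observation only

    def append(self, entry):
        self.file.write(entry)
        self.file.flush()                     # hand the bytes to the OS page cache
        if self.policy == "always":
            self._fsync()                     # durable before we return
        elif self.policy == "everysec":
            if time.monotonic() - self.last_fsync >= 1.0:
                self._fsync()                 # at most about one second at risk
        # policy "no": the OS decides when the data reaches the disk

    def _fsync(self):
        os.fsync(self.file.fileno())          # force the data to the device
        self.last_fsync = time.monotonic()
        self.fsync_calls += 1

    def close(self):
        self.file.close()
```

With ‘always’, every append pays the cost of an fsync; with ‘everysec’, a burst of writes inside the same second shares a single fsync; with ‘no’, the writer never calls fsync at all.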
The AOF will eventually have to be rewritten, otherwise it will grow indefinitely. When this happens depends on the configuration: auto-aof-rewrite-percentage and auto-aof-rewrite-min-size define when exactly the AOF should be rewritten. To reduce the size of the AOF it is also possible to combine RDB and AOF into one file. The setting aof-use-rdb-preamble, when enabled, means that the rewritten AOF consists of two parts: an RDB section followed by an AOF tail. The RDB part contains a snapshot of the database at a given moment, and the AOF tail then keeps track of the changes made after it.
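How the two rewrite settings interact is easier to see in code. The following is a sketch of the trigger logic as we understand it, not a copy of the server's implementation: the function name should_rewrite_aof and the base_size parameter (the size of the AOF right after the previous rewrite) are our own naming.

```python
def should_rewrite_aof(current_size, base_size, rewrite_percentage, rewrite_min_size):
    """Decide whether an automatic AOF rewrite should start.

    current_size        -- size of the AOF now, in bytes
    base_size           -- size of the AOF after the last rewrite, in bytes
    rewrite_percentage  -- value of auto-aof-rewrite-percentage (0 disables)
    rewrite_min_size    -- value of auto-aof-rewrite-min-size, in bytes
    """
    if rewrite_percentage == 0:
        return False                          # automatic rewrite disabled
    if current_size <= rewrite_min_size:
        return False                          # file still too small to bother
    base = base_size or 1                     # avoid division by zero
    growth = current_size * 100 // base - 100 # growth over the base, in percent
    return growth >= rewrite_percentage
```

For example, with a percentage of 100 and a minimum size of 64 MB, an AOF that was 64 MB after the last rewrite is rewritten once it has roughly doubled in size, while a 32 MB file is left alone no matter how fast it grew.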
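As we mentioned, AOF is first and foremost a way to persist the data stored in Redis. Note that replication does not read from it: when a replica reaches out to the master and asks for the missing data, the master either streams the missing commands from its in-memory replication backlog (partial resynchronization) or generates a fresh RDB snapshot and sends that (full resynchronization). What AOF adds in a replicated setup is that every node, master or replica, can recover its own dataset from disk after a restart.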
You can clearly see that the Append Only File plays an important role. While not a must-have, it is the only way to obtain proper durability in Redis.