Exploring MySQL BinLogs: Comprehensive Guide on Management, Formats, and Tools

JAIGANESH T S
6 min readMay 26, 2024

--

Introduction

A BinLog is a log file that tracks all operations performed on a database, including data modifications and structural changes. It does not contain select statements or show statements because they are not going to change the data anyway. The binlogs is also referred as Binary logs. The main two use case of the binary logs is for replication purposes and Point In Time Recovery (PITR). PITR means recovering a data from a particular period/past.

Software : MacOS 🖥️🍏 Ventura 13.6

MySQL Version : 5.7.18

Enabling / Disabling BinLogs

  • The system variable log_bin controls the enabling/disabling binlogs.
  • The log_bin variable scope is global level. It is a boolean variable.
  • From MySQL 8.0, binlogs are enabled by default.

To check whether binlog is enabled/disabled,
select @@log_bin;

To enable the binary logs
* Open the configuration file with superuser permission to edit (sudo nano etc/my.cnf)

* Locate the mysqld section in the cnf file. The configuration file contains many sections in it. Sections are usually denoted by [] symbol. Each section is reserved for a particular use. [mysqld] section is reserved for configurations related to mysql server. If [mysqld] section is not present create mysqld section as [mysqld]. Now you have created mysqld section. Now we have to enable binlog under this section. In the next line add the below entry

* Add the following entry log_bin = prefix. (Example : log_bin = jai)

* The prefix can contain any string. This string will be appended as the prefix to all the bin-log files created. The binlog files will be created under the data directory. To know the location of the data directory you can use the command select @@datadir;

* And also it is necessary to set server_id in mysqld section. server_id is a system variable and is set to 0 by default (5.7.18 version). It has to be set in the range of 1 to 2^³²-1. Example: server_id = 1

* Restart MySQL server and check whether binlog is enabled using select @@log_bin;

To disable the binary logs
* Remove the log_bin line or comment using # symbol in the cnf file or add skip-log-bin in the cnf file under mysqld section

* Restart server and check whether binlog is disabled

To see the list of binlogs created
The see all the binlogs created we can use this command show binary logs

Purging Binary Logs

  • It’s is genrally necessary to clean up the bin log files as it would accumulate your storage medium. All the currently available binary log file names will be present in the prefix.index file available in the same location of the log files. When a new bin log file is created a entry will be made on to this file and when purging automatically it will be removed from the file.
  • Purging/Deleting can be done upto a particular file/particular time
    Upto Particular File Command : PURGE BINARY LOGS TO ‘prefix.000020’ - will delete from ‘prefix.000001’ to ‘prefix.000019’
    Upto Particular Time Command : PURGE BINARY LOGS BEFORE ‘2024–05–26 00:09:25’ (YYYY-MM-DD hh:mm:ss)
    Upto Now : PURGE BINARY LOGS BEFORE NOW();

Flushing Binary Logs

  • The flush binary logs (or) flush logs command is used to flush the logs.
  • Here flushing the logs means closing a currently writing file and creates a new file and starts writing into it. So everytime flush is done a new binlog file will be created. You can check it using show binary logs

Expiry Time for Binary Logs

  • The expiry time for Binary logs can be configured using expire_logs_days. When a log file is available after this number of days it will be automatically purged.
  • In MySQL 8 version this expire_logs_days variable is deprecated and binlog_expire_logs_seconds is introduced where value can be in seconds. This is more adaptable and feasible to avoid disk space accumulation.

Decoding Binary Logs

  • The binary logs cannot be read directly since it is available in the binary format. Hence decoding is necessary to read its contents. This is possible with the help of mysqlbinlog. This mysqlbinlog is a utility provided by mysql to read and display the binary logs in a human readable format.
  • The mysqlbinlog is available in the bin folder of mysql directory.

Interpreting MySQL Event Binary Log Contents

  • Now for interpretation, i have created a table and inserted a record
  • The changes i have made will be present in the latest binlog file created so to identify it we can use show binary logs command. The last row will be the latest binlog so we have to check it. Here for me i have only one binlog file. So my contents will be in jai.000001 file
  • So now to decode this file, will be using the mysqlbinlog utility tool
  • A sample binlog file is shown above. For demo, inserted a record in the student table. A start of a event is generally indicated by # at followed by a number. A end of that event is generally indicated by end_log_pos. The red-block indicates a event. It’s start is present at #541 and it’s ending is represented at 586.

BinLog Formats

The binlog format plays a very important role during the replication process. During replication the slave reads from the binlog file and persists the changes into its node so row format is very important inorder to avoid data inconsistency. The row format is controlled by the binlog_format variable. It’s possible values are {ROW | STATEMENT | MIXED}. It’s a dynamic variable. Each row format has its own advantages and disadvantages. Based upon our use-cases and analysis, you can chose the row format. Now let’s consider a query and let’s check how it is present in the binlog in all three binlog formats.

Query : insert into sample values (uuid());

  • Row Based - The row based format is the default format. The row based format makes the binlog file to grow enormous, so it occupies lot of space compared to other row formats. The row based format contains the actual values of the record in the binlog file, so it avoids data inconsistency. In the below case if you see the actual value of the uuid() is present in the binlog so that replicas too will have the same value.
Row-Based BinLog Format
  • Statement Based - The statement format is more compact. The statement format contains the actual query executed in the master/source node. It mostly causes data inconsistency problems between master and replicas, so you have to ensure that your cases may not cause data inconsistency. In the below if you see uuid() is present which when replica calls uuid() it is going to get another new one. In this way consistency is not acheived.
Statement-Based Binlog Format
  • Mixed Based - The mixed format is a combination of row based and statement based. Based upon the query it chooses row or statement based format. It is based upon the non-deterministic values. When it is deterministic it chooses statement and when non-deterministic it opts for row based. The mixed format makes the binlog file compact. In some cases the mixed row format can also cause data inconsistency problems(timezone,etc). In the below case since it was a uuid based query. MySQL internally used row based to avoid inconsistency.
Mixed-Based Binlog Format

Note: Running a server with binlog enabled will cause slowness in the performance but it’s use-cases outweigh its drawbacks and challenges.

--

--