The world's most popular open source database
Each binary logging format has advantages and disadvantages. For most users, the mixed replication format should provide the best combination of data integrity and performance. If, however, you want to take advantage of the features specific to the statement-based or row-based replication format when performing certain tasks, then you can use the information in this section, which provides a summary of their relative advantages and disadvantages, to determine which is best for your needs.
Advantages of statement-based replication:
Proven technology that has existed in MySQL since 3.23.
Less data written to log files. When updates or deletes affect many rows, this results in much less storage space required for log files. This also means that taking and restoring from backups can be accomplished more quickly.
Log files contain all statements that made any changes, so they can be used to audit the database.
Tables replicated using statement-based replication are not required to have explicit primary keys.
Disadvantages of statement-based replication:
Not all UPDATE statements can
be replicated: Any non-deterministic behavior (for example,
when using random functions in an SQL statement) is hard to
replicate when using statement-based replication. For
statements that use a non-deterministic user-defined
function (UDF), it is not possible to replicate the result
using statement-based replication, whereas row-based
replication simply replicates the value returned by the UDF.
A statement cannot be replicated properly if it uses a UDF that is non-deterministic, since the value returned by such a UDF depends on factors other than the parameters supplied to the UDF.
Statements using any of the following functions cannot be replicated properly:
SYSDATE() (unless the
server is started with the
--sysdate-is-now option)
All other functions are replicated correctly (including
RAND(),
NOW(),
LOAD DATA
INFILE, and so forth).
INSERT ... SELECT requires a greater
number of row-level locks than with row-based replication.
UPDATE statements that
require a table scan (because no index is used in the
WHERE clause) must lock a greater number
of rows than with row-based replication.
For InnoDB: An
INSERT statement that uses
AUTO_INCREMENT blocks other
non-conflicting INSERT
statements.
For complex statements, the statement must be evaluated and executed on the slave before the rows are updated or inserted. With row-based replication, the slave only has to run the statement to apply the differences, not the full statement.
Stored functions execute with the same
NOW() value as the calling
statement. However, this is not true of stored procedures.
Deterministic UDFs must be applied on the slaves.
If there is an error in evaluation on the slave, particularly when executing complex statements, then using statement-based replication may slowly increase the margin of error across the affected rows over time.
Tables have to be (almost) identical on master and slave.
Advantages of row-based replication:
All changes can be replicated. This is the safest form of replication.
For MySQL versions earlier than 5.1.14, DDL (data definition
language) statements such as CREATE
TABLE are replicated using statement-based
replication, while DML (data manipulation language)
statements, as well as GRANT
and REVOKE statements, are
replicated using row-based-replication.
In MySQL 5.1.14 and later, the mysql
database is not replicated. The mysql
database is instead seen as a node-specific database.
Row-based replication is not supported on tables in this
database. Instead, statements that would normally update
this information — such as
GRANT,
REVOKE and the manipulation
of triggers, stored routines (including stored procedures),
and views — are all replicated to slaves using
statement-based replication.
For statements like CREATE ... SELECT, a
CREATE statement is generated from the
table definition and replicated using the statement-based
format, while the row insertions are replicated using the
row-based format.
The technology is the same as in most other database management systems; knowledge about other systems transfers to MySQL.
Fewer locks are needed (and thus higher concurrency) on the master for the following types of statements:
Fewer locks are required on the slave for any
INSERT,
UPDATE, or
DELETE statement.
Disadvantages of row-based replication:
RBR tends to generate more data that must be logged. This is because, when using row-based replication to replicate a DML statement, each changed row must be written to the binary log. (When using statement-based replication, only the statement is written to the binary log.) This means that, if the statement changes many rows, row-based replication may write significantly more data to the binary log; this is true even for statements that are rolled back. This also means that taking and restoring from backup can require more time. In addition, the binary log is locked for a longer time to write the data, which may cause concurrency problems.
All tables replicated using row-based replication must have explicit primary keys.
When using row-based replication to replicate a statement
(for example, an UPDATE or
DELETE statement), each
changed row must be written to the binary log. In contrast,
when using statement-based replication, only the statement
is written to the binary log. If the statement changes many
rows, row-based replication may write significantly more
data to the binary log. In these cases the binary log will
be locked for a longer time to write the data, which may
cause concurrency problems.
Deterministic UDFs that generate large
BLOB values take longer to
replicate with row-based replication than with
statement-based replication. This is because, when using
row-based replication, the
BLOB column data is itself
logged, rather than the statement generating the data.
You cannot examine the logs to see what statements were executed, nor can you see on the slave what statements were received from the master and executed.
However, beginning with MySQL 5.1.29, you can see what data
was changed using mysqlbinlog with the
options --base64-output=DECODE-ROWS and
--verbose.
When performing a bulk operation that includes non-transactional storage engines, changes are applied as the statement executes. With row-based replication logging, this means that the binary log is written while the statement is running. On the master, this does not cause problems with concurrency, because tables are locked until the bulk operation terminates. On the slave server, however, tables are not locked while the slave applies changes, because the slave does not know that those changes are part of a bulk operation.
In such cases, if you retrieve data from a table on the
master (for example, using SELECT * FROM
table_name), the server waits for the bulk
operation to complete before executing the
SELECT statement, because the
table is read-locked. On the slave, the server does not wait
(because there is no lock). This means that, until the bulk
operation on the slave has completed, you obtain different
results for the same SELECT
query on the master and on the slave.
This behavior is expected to change in a future MySQL release; however, until it changes, you may prefer to use statement-based replication when your application requires concurrent large bulk inserts and selects.


User Comments
Add your own comment.