Subject: SOLVED: Problem with *very* slow replication, FreeBSD 6.2

2007-11-03 - By Christopher E. Brown


An update for those actually paying attention.

I have been fighting unusual performance issues with replication between
FreeBSD 6.2 machines.

The unusual part is that while replication would never top 10 writes per
second (even while the master was taking hundreds of writes per second),
the slave always reported zero seconds behind.

This is on servers with less than 1% CPU used.

The actual problem was not with writing the binlog, or the slave SQL
thread, but the actual transfer of the binlog across the network.

After days of running, the slave would be many gigabytes behind the master.
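
Since the slave's seconds-behind value only measures the SQL thread against the relay log, a byte count of what the I/O thread has not yet fetched is a more useful number here. A minimal sketch, assuming you have already read File and Position from SHOW MASTER STATUS on the master and Master_Log_File and Read_Master_Log_Pos from SHOW SLAVE STATUS on the slave. The function names and the "mysql-bin" file prefix are made up for illustration, and the result is only an estimate because binlog files do not always roll at exactly the configured size (256M in our setup).

```python
import re


def binlog_index(name: str) -> int:
    """Return the numeric suffix of a binlog file name such as 'mysql-bin.000650'."""
    match = re.search(r"\.(\d+)$", name)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {name!r}")
    return int(match.group(1))


def io_thread_backlog_bytes(master_file: str, master_pos: int,
                            slave_file: str, slave_read_pos: int,
                            binlog_size: int = 256 * 1024 * 1024) -> int:
    """Estimate how many binlog bytes the slave I/O thread has not fetched yet.

    master_file / master_pos    : File and Position from SHOW MASTER STATUS
    slave_file / slave_read_pos : Master_Log_File and Read_Master_Log_Pos
                                  from SHOW SLAVE STATUS
    binlog_size                 : assumed size at which the binlog rolls
    """
    files_behind = binlog_index(master_file) - binlog_index(slave_file)
    return files_behind * binlog_size + (master_pos - slave_read_pos)


if __name__ == "__main__":
    # Hypothetical values: master on file 650, slave still reading file 625.
    backlog = io_thread_backlog_bytes("mysql-bin.000650", 1000,
                                      "mysql-bin.000625", 500)
    print(f"approx. {backlog / 1024**3:.2f} GiB not yet transferred")
```

A slave can show zero seconds behind while this number keeps growing, which is exactly what I was seeing.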

While debugging I tried many things including updating from 5.1.19 to
5.1.22, rebuilding with WITH_PROC_SCOPE_PTH=yes, and even rebuilding using
linuxthreads.


None of this worked.



The problem was rfc1323 (the TCP window scaling extensions).  Window
scaling *SHOULD* have improved performance given that this is a jumbo
frame GigE network.

For reasons I don't understand, with rfc1323 enabled the data transfer
rate for replication is limited to about 200 Kbyte/sec (I do not see the
same slowdown for http or scp transfers).
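
For a sense of scale, a single TCP connection cannot move more than one window of data per round trip, so throughput is bounded by window / RTT. The sketch below works through that arithmetic. The 0.5 ms round-trip time is an assumed figure for a LAN, not something measured on these machines, and the function names are only for illustration.

```python
# Largest window a plain 16-bit TCP window field can advertise (no scaling).
UNSCALED_MAX_WINDOW = 65535


def max_tcp_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one connection's throughput in bytes/sec (window / RTT)."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


def window_needed(rate_bytes_per_sec: float, rtt_seconds: float) -> float:
    """Window size in bytes needed to sustain a given rate at a given RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return rate_bytes_per_sec * rtt_seconds


if __name__ == "__main__":
    rtt = 0.0005  # ASSUMPTION: 0.5 ms LAN round trip
    bound = max_tcp_throughput(UNSCALED_MAX_WINDOW, rtt)
    print(f"unscaled window bound: {bound / 1e6:.0f} MB/sec")
    print(f"window matching 200 KB/sec: {window_needed(200_000, rtt):.0f} bytes")
```

Under that assumed RTT, even the unscaled 64K window allows roughly GigE line rate, so turning rfc1323 off should cost little on a low-latency LAN. It also shows that 200 Kbyte/sec is far below anything the window size alone would explain, scaled or not.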


To verify, I rebuilt both systems back to default (native threads),
re-initialized the Master<->Master replication loop, shut down one of the
servers and inserted several million records on the live system (about
1.8Gbyte of binlog).


On restarting the second system, it read the binlog into the relay log at
20 - 25 Mbyte/sec.  The seconds-behind-master value showed sane values,
and it processed the relay-log backlog at about 6600 writes/sec until
finished.
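
To put the two transfer rates side by side, here is the catch-up time for the 1.8 Gbyte test backlog at each rate. This is plain arithmetic on the figures above, assuming decimal units (1 Gbyte = 1e9 bytes); the helper name is just for illustration.

```python
def transfer_seconds(total_bytes: float, bytes_per_sec: float) -> float:
    """Time in seconds to move total_bytes at a steady bytes_per_sec."""
    if bytes_per_sec <= 0:
        raise ValueError("bytes_per_sec must be positive")
    return total_bytes / bytes_per_sec


if __name__ == "__main__":
    backlog = 1.8e9  # about 1.8 Gbyte of binlog, decimal units assumed

    slow = transfer_seconds(backlog, 200e3)  # rate seen with rfc1323 enabled
    fast_low = transfer_seconds(backlog, 20e6)   # low end of the rate after the fix
    fast_high = transfer_seconds(backlog, 25e6)  # high end of the rate after the fix

    print(f"at 200 Kbyte/sec: {slow / 3600:.1f} hours")
    print(f"at 20-25 Mbyte/sec: {fast_high:.0f} to {fast_low:.0f} seconds")
```

At the old rate the same backlog would have taken hours just to cross the wire, before the SQL thread could even start on it.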


Further testing included 3,000 inserts/sec to each of the servers
(6,000/sec total) with the master/master replication loop active.  During
a run of 10,000,000 inserts to each server replication was never more than
2 seconds behind.


On Tue, 30 Oct 2007, Christopher E. Brown wrote:

> On Thu, 25 Oct 2007, Bob@(protected) wrote:
>
>> Not sure that I get the whole picture.
>>
>> We have been running replication since about 4.0 and we have been through
>> several upgrades and are now at 5.0.27.
>>
>> The 'show slave status' always gives us an accurate reflection of where it
>> is at which is usually 0 seconds behind.
>>
>> Occasionally, it falls behind if the master is really busy (>2200 q/s with
>> about 70% being updates/deletes/inserts).
>>
>> At those times the slave tops out at about 1200 q/s of which most are db
>> mods of some kind and some selects since we have reports running against
>> the replica and it will fall behind temporarily.
>>
>> Can you send show slave status and show master status as well and typical
>> mytop outputs for master and slave?
>>
>> That might let me be able to provide more help.
>>
>>
>> Bob
>
> Unfortunately I had to tear down replication as it was causing problems with
> the master.  (The master will not delete binlogs that a slave is still
> loading; when the slave is 40 files behind, disk space gets short.)
>
> CPU load was near zero on both systems (98% idle or better).
>
> Disk load is minimal.
>
> The slave is always up to date with relay file processing and reporting zero
> seconds behind.
>
> In short, everything looks fine.
>
>
> What happens is that the master -> slave binlog feed runs very slowly (no
> more than about 10 writes/sec).
>
>
> So, after a few days the slave is still reporting zero seconds behind, and it
> is zero seconds behind the relay log.
>
> The problem is that while the master is currently writing binlog 650, the
> slave is actually zero seconds behind the feed, but the binlog feed has
> fallen 20 - 30 files behind (our binlog rolls at 256M).
>
>
> Since there is no load issue, I expect there is a timing or trigger issue
> with the master side proc doing the binlog dump, or the slave side receiving
> it.
>
>
> I can stop/start replication and/or reload both servers, it still holds.
>
> I see the replication restart, with the slave running zero seconds behind the
> relay log, the binlog feed starts up right where it left off but the feed
> only runs at about 10 writes a second.
>
>
> Are you running native or LinuxThreads?  This is smelling like a threading
> issue to me (we are running FreeBSD 6.2 with native threading and 5.1.19).
>
> The exact same setup was pre-built on Linux systems (2.6.x Slackware) before
> being built out on the production systems (FreeBSD 6.2).
>
> During the testing 1000 writes/sec were no problem (small/simple table, fits
> in memory).  When I forced a backlog of approx 2GB by shutting down the
> slave, on restart the binlog -> relay log feed ran at over 25MB/sec until
> caught up.
>
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=mysql@(protected)