- urgent 2007-10-03 - By Jan Kirchhoff
Ratheesh K J wrote:
> Thanks,
>
> It helped me a lot. I wanted to know
>
> 1. what are the various scenarios where my replication setup can
> fail? (considering even issues like network failure and server
> reboot etc). What is the normal procedure to correct the failure
> when something unpredicted happens?
>

You should first read the relevant parts of the manual at https://dev.mysql.com/doc before asking such questions. Basically:

- Use good hardware with ECC RAM and RAID controllers to minimize trouble from faulty hardware.
- Never write to the slaves without knowing what this could do to your replication setup.
- Watch the disk space and make sure there is always enough room for the binlogs. Otherwise a full disk can leave you with half-written binlogs on either the slave or the master, which causes trouble and takes some work to get up and running again.
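The disk-space point above can be checked automatically. A minimal sketch in Python, assuming the binlogs live on the filesystem you pass in; the function name `has_enough_space` and the threshold are illustrative and not part of MySQL:

```python
import shutil


def has_enough_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` of free space left."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


# Example call (path and threshold are assumptions, adjust to your setup):
#   has_enough_space("/var/lib/mysql", 5 * 1024**3)   # require 5 GiB free
```

Run from cron on both master and slave, a False result would be the cue to send an alert before the disk actually fills up.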
When a master goes down or the network connection is lost, the slave automatically tries to reconnect once a minute or so. Restarting the master or exchanging some network equipment is no problem. When the slave reboots, it tries to reconnect on startup, too.
This is the "out-of-the-box" behaviour. You can modify it in the my.cnf (e.g. with the "skip-slave-start" option).
> 1. What are the scenarios where the SQL THREAD stops running and
> what are the scenarios where the IO THREAD stops running?
>

The SQL thread stops when it can't run an SQL query from the binlogs for any reason, as you experienced when the table already existed.
The IO thread only stops when it hits an error reading a binlog from the master. When it's only a lost connection, it automatically reconnects. Other problems (e.g. being unable to read a binlog) should never happen as long as you don't delete binlogs on the master that have not yet been copied over to the slave by the IO thread (compare the output of the "show master status" and "show slave status" commands) and you don't have faulty hardware (I/O errors on the hard disk or similar).
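To make the "don't delete binlogs the slave still needs" rule concrete, here is a rough sketch in Python. It assumes you have already collected the Master_Log_File value from "show slave status" on every slave, and that all binlogs share one base name with a numeric suffix (e.g. mysql-bin.000012). The helper names are made up for illustration:

```python
def binlog_index(name):
    """Numeric suffix of a binlog file name, e.g. 'mysql-bin.000012' -> 12."""
    return int(name.rsplit(".", 1)[1])


def purge_statement(slave_master_log_files):
    """Build a PURGE statement that keeps every binlog a slave still reads.

    `slave_master_log_files` holds one Master_Log_File value per slave.
    PURGE BINARY LOGS TO 'x' removes only the logs older than 'x', so the
    oldest file any slave's IO thread is still on is the safe target.
    Returns None when no slave positions are known (then purge nothing).
    """
    if not slave_master_log_files:
        return None
    oldest_needed = min(slave_master_log_files, key=binlog_index)
    return "PURGE BINARY LOGS TO '%s'" % oldest_needed
```

For two slaves at mysql-bin.000012 and mysql-bin.000009, the statement targets mysql-bin.000009, so nothing the slower slave still has to fetch gets removed.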
> 1. Does SQL_SLAVE_SKIP_COUNTER skip the statement of the master
> binlog from being replicated to the slave relay log OR Has the
> statement already been copied into the slave relay log and has
> been skipped from the relay log?
>

It skips the entry in the local copy of the binlog (the relay log). The IO thread replicates the whole binlog, and the SQL thread skips an entry in it when you use sql_slave_skip_counter.

>
> 1. How do I know immediately that replication has failed? (
> have heard that the enterprise edition has some technique for
> this )?
>

Watch the logfile, it is written there. Or run a cronjob once a minute that checks that both threads report "Yes", with something like

[ "$(mysql -e 'show slave status\G' | grep -c '_Running: Yes')" -eq 2 ] || bash my_alarm_script_that_sends_mail_or_whatever.sh
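If you would rather have a message that says which thread stopped, the same check can be done in a small script. This is only a sketch: it assumes you feed it the text output of mysql -e 'show slave status\G', and the function names are hypothetical. Only the Slave_IO_Running and Slave_SQL_Running fields are taken from the real output.

```python
def parse_slave_status(text):
    """Turn the 'Key: value' lines of SHOW SLAVE STATUS\\G output into a dict."""
    status = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skips the '*** 1. row ***' header and blank lines
        key, _, value = line.partition(":")  # split at the first colon only
        status[key.strip()] = value.strip()
    return status


def replication_problems(text):
    """Return a list of human-readable problems; an empty list means OK."""
    status = parse_slave_status(text)
    problems = []
    for field in ("Slave_IO_Running", "Slave_SQL_Running"):
        value = status.get(field, "missing")
        if value != "Yes":
            problems.append("%s is %s" % (field, value))
    return problems
```

An empty result from the mysql client (for example because the server is down or is not configured as a slave) is reported as both fields missing, so that case raises the alarm too.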
regards Jan
-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=mysql@(protected)