DRBD split-brain

I'm not a fan of Aixa and the other CRMs; I'd rather use a bit of heartbeat and my own scripts (your own is your own). Still, I'm interested in your experience with DRBD, automatic failover, and avoiding the split-brain state.

The configuration is as follows:

1x primary, 1x secondary DRBD instance

So far failover has been done manually. I also threw together a quick script for it, because I'm sick of hopping from one ssh session to another:
http://jasminklipic.blogspot.de/2012/06/manual-fail-over-script-drbd.html

Of course everything works great while both nodes are up. It also works if one of them drops out, especially the secondary. Even if the primary drops out, I change the secondary's status to primary and it's fine, the DRBD disk is accessible. However, when the secondary (i.e. the former primary) comes back, the state is split-brain.

Until now I liked recovering manually, because if a node dies there is a reason for it and I have to take a look anyway. But since I have no time any more, or less and less of it, I'm looking for an automatic solution. Before I get annoyed and build one myself, I'd rather ask those who already have experience:

  1. Manual or automatic?
  2. If automatic, how, in the setup described above?

Optionally, I'm thinking of writing an init script that checks the state and then, on reboot or restart, if the node was primary and that is not wanted, runs:

primary disconnect data

secondary#
drbdadm -- --discard-my-data connect resource

primary connect data

and then swap the roles if I want node A to be primary by default
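The steps above could be wrapped roughly like this. It is only a sketch, assuming the DRBD 8.3-style syntax used in this thread and a resource named `data`; the function names and the `DRBDADM` override are made up for illustration. On DRBD 8.4 the option moves after the subcommand (`drbdadm connect --discard-my-data data`).

```shell
#!/bin/sh
# Sketch only: manual split-brain recovery, DRBD 8.3-style syntax.
# DRBDADM is a hypothetical override, e.g. DRBDADM="echo drbdadm" for a dry run.
DRBDADM="${DRBDADM:-drbdadm}"

# Run on the node whose changes will be thrown away (the "victim").
sb_victim_rejoin() {
    res="$1"
    # The victim must be Secondary before it can discard its data.
    $DRBDADM secondary "$res" || return 1
    # Reconnect and accept the peer's data as the good copy.
    $DRBDADM -- --discard-my-data connect "$res"
}

# Run on the surviving node; only needed if it dropped to StandAlone.
sb_survivor_reconnect() {
    res="$1"
    $DRBDADM connect "$res"
}
```

With `DRBDADM="echo drbdadm"` set, calling `sb_victim_rejoin data` only prints the commands it would run, which is a cheap way to check the order before pointing it at a real node.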

that's all :slight_smile:

As far as I understand, heartbeat's drbddisk script only switches the secondary to primary (when the other node drops out), but when the former primary comes back it does not resolve the split-brain problem. Or am I wrong?

You can do it however you like, but then you need some kind of fencing mechanism, so that one of the nodes can always check on the other and "kill" it if necessary, so that split-brain never occurs.
Split-brain will only happen if at some point you have both nodes disconnected and in primary mode.
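For an init-script style check, that condition can be tested per node before anything gets promoted. A minimal sketch, assuming the DRBD 8.x output format of `drbdadm role` (`Local/Peer`) and `drbdadm cstate`; the function name and the three result strings are invented here, and only two disconnected states are handled:

```shell
#!/bin/sh
# Sketch: classify this node from two strings.
# $1 = output of `drbdadm role <res>`   (e.g. "Primary/Unknown")
# $2 = output of `drbdadm cstate <res>` (e.g. "StandAlone")
sb_risk() {
    local_role="${1%%/*}"          # part before the slash is the local role
    case "$2" in
        StandAlone|WFConnection)   # not talking to the peer
            if [ "$local_role" = "Primary" ]; then
                echo "at-risk"     # Primary while cut off from the peer
            else
                echo "disconnected-secondary"
            fi
            ;;
        *)
            echo "connected-or-transient"
            ;;
    esac
}
```

On a real node it would be called as `sb_risk "$(drbdadm role data)" "$(drbdadm cstate data)"`. It only looks at the local side, so it cannot prove the peer is not Primary too; that is what fencing is for.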

For fencing I use, for example, STONITH over iLO2. Also take a look at this: http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

:slight_smile: I don't get it, explain a bit more … I'm getting old

I'll write an init script that tries to do this, plus a check process that switches the roles. But I can't believe there isn't a simpler solution. Maybe there is; I don't feel like reinventing the wheel again.

Well, look at the link I gave you above. It lists the cases for what should happen when DRBD detects split-brain. You can even run custom scripts.

Here is what it says in that link:

DRBD invokes the split-brain handler, if configured, at any time split brain is detected. 
...
The DRBD distribution contains a split brain handler script that installs as /usr/lib/drbd/notify-split-brain.sh. It simply sends a notification e-mail message to a specified address.
...

after-sb-0pri. Split brain has just been detected, but at this time the resource is not in the Primary role on any host. For this option, DRBD understands the following keywords:

    disconnect: Do not recover automatically, simply invoke the split-brain handler script (if configured), drop the connection and continue in disconnected mode.
    discard-younger-primary: Discard and roll back the modifications made on the host which assumed the Primary role last.
    discard-least-changes: Discard and roll back the modifications on the host where fewer changes occurred.
    discard-zero-changes: If there is any host on which no changes occurred at all, simply apply all modifications made on the other and continue. 

...

after-sb-1pri. Split brain has just been detected, and at this time the resource is in the Primary role on one host. For this option, DRBD understands the following keywords:

    disconnect: As with after-sb-0pri, simply invoke the split-brain handler script (if configured), drop the connection and continue in disconnected mode.
    consensus: Apply the same recovery policies as specified in after-sb-0pri. If a split brain victim can be selected after applying these policies, automatically resolve. Otherwise, behave exactly as if disconnect were specified.
    call-pri-lost-after-sb: Apply the recovery policies as specified in after-sb-0pri. If a split brain victim can be selected after applying these policies, invoke the pri-lost-after-sb handler on the victim node. This handler must be configured in the handlers section and is expected to forcibly remove the node from the cluster.
    discard-secondary: Whichever host is currently in the Secondary role, make that host the split brain victim. 

after-sb-2pri. Split brain has just been detected, and at this time the resource is in the Primary role on both hosts. This option accepts the same keywords as after-sb-1pri except discard-secondary and consensus.
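In config terms, the quoted options go into the resource's `net` section and the handler into `handlers`. A sketch of one possible combination, wrapped in a small shell function (the name `print_sb_policy` is made up) so it can be printed and reviewed before being merged by hand. The resource name `data` and the mail recipient `root` are assumptions, and the policy choice is an example, not a recommendation: `discard-secondary` throws away whatever the current Secondary wrote.

```shell
#!/bin/sh
# Sketch: print a candidate split-brain policy fragment for review.
# Nothing is written to /etc; merge it into the resource file by hand.
print_sb_policy() {
    cat <<'EOF'
resource data {
  # existing device/disk/on sections stay as they are
  handlers {
    # mail a notification whenever split brain is detected
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
  }
  net {
    after-sb-0pri discard-zero-changes;  # no Primary: keep the side that changed
    after-sb-1pri discard-secondary;     # one Primary: the Secondary is the victim
    after-sb-2pri disconnect;            # two Primaries: no automatic recovery
  }
}
EOF
}
```

Leaving `after-sb-2pri` at `disconnect` means the both-Primary case still ends in a manual decision.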

Basically you only need an init script to start DRBD; everything else can be done from the config …

Cheers, mahic :slight_smile: I'll give it a try :wink: