All about DRBD

Because I'm an idiot.

I'll try everything with heartbeat3 and report back. Thanks!

One more question, though: is it better to use heartbeat-2 or Pacemaker?

I can install heartbeat-2 directly, whereas I would have to compile Pacemaker myself.

Can I control resources with heartbeat-2?

EDIT:

Does anyone have links to a good Pacemaker/DRBD howto or tutorial? The configuration is completely different from heartbeat-1.

EDIT2:

Pacemaker is out because the server is Xen-Ubuntu 8.04. Updating Ubuntu is also out because XenServer doesn't support newer Ubuntu versions.
In short: crap.

I'll give heartbeat-2 a try.

If you have no other option besides heartbeat2, then use heartbeat2. On one cluster I use the Debian package from lenny and it works just as it should.

http://www.clusterlabs.org/wiki/Documentation

I'm still struggling with cib.xml.

With heartbeat1 I had the following configuration in /etc/ha.d/haresources:

So heartbeat assigned the virtual IP address, mounted /storage (the DRBD disk), and brought up apache, mysql, nfs, tomcat and activemq.

The mysql directory holding the databases also lived in /storage, i.e. in /etc/mysql/my.cnf, datadir pointed to /storage/mysql.

Let's say that, for now, I only want the virtual IP address and mysql under heartbeat2.

Is this enough:

Resource:

<resources> <master_slave id="ms_drbd_mysql"> <meta_attributes id="ms_drbd_mysql-meta_attributes"> <attributes> <nvpair name="notify" value="yes"/> <nvpair name="globally_unique" value="false"/> </attributes> </meta_attributes> <primitive id="drbd_mysql" class="ocf" provider="heartbeat" type="drbd"> <instance_attributes id="ms_drbd_mysql-instance_attributes"> <attributes> <nvpair name="drbd_resource" value="mysql"/> </attributes> </instance_attributes> <operations id="ms_drbd_mysql-operations"> <op id="ms_drbd_mysql-monitor-master" name="monitor" interval="29s" timeout="10s" role="Master"/> <op id="ms_drbd_mysql-monitor-slave" name="monitor" interval="30s" timeout="10s" role="Slave"/> </operations> </primitive> </master_slave> <group id="rg_mysql"> <primitive class="ocf" type="Filesystem" provider="heartbeat" id="fs_mysql"> <instance_attributes id="fs_mysql-instance_attributes"> <attributes> <nvpair name="device" value="/dev/drbd0"/> <nvpair name="directory" value="/storage/mysql"/> <nvpair name="type" value="ext3"/> </attributes> </instance_attributes> </primitive> <primitive class="ocf" type="IPaddr2" provider="heartbeat" id="ip_mysql"> <instance_attributes id="ip_mysql-instance_attributes"> <attributes> <nvpair name="ip" value="192.168.4.200"/> <nvpair name="nic" value="eth1"/> </attributes> </instance_attributes> </primitive> <primitive class="lsb" type="mysqld" provider="heartbeat" id="mysqld"/> </group> </resources>
Constraints:

<constraints>
  <rsc_order id="mysql_after_drbd" from="rg_mysql" action="start" to="ms_drbd_mysql" to_action="promote" type="after"/>
  <rsc_colocation id="mysql_on_drbd" to="ms_drbd_mysql" to_role="master" from="rg_mysql" score="INFINITY"/>
</constraints>
EDIT:

/etc/ha.d/ha.cf looks like this:

use_logd on
keepalive 500ms
deadtime 2
warntime 1
initdead 8
udpport 695
bcast eth2
auto_failback off
node node1 node2
crm yes
autojoin any
Is crm yes enough, or do I have to use crm respawn?

Also, I have two eth interfaces on the servers.

eth1 is in the 192.168.4.x network and is connected to a switch, while eth2 is used only for DRBD sync and replication and sits in the 192.168.200.x network (direct link).
Can I also use eth2 for heartbeat communication (as it already stands in ha.cf)?
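And if the answer is that I should rather use both links for heartbeat, I assume ha.cf can simply list more than one communication path, something like this (just the relevant lines, untested on my setup):

[code]
# heartbeat allows multiple communication paths; eth2 is the DRBD direct link,
# eth1 is the interface on the 192.168.4.x LAN
bcast eth2
bcast eth1
[/code]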

This configuration is going really poorly for me (hence the working Sunday).

Anyway, I tried a few variants and none of them worked the way I imagined (it seems XML and I are not on good terms).

Here's the current situation:

We have two servers that should be configured in an active/passive model.

node1 - eth1: 192.168.4.201
node2 - eth1: 192.168.4.202
Additionally, both nodes have one more NIC each:

node1 - eth2: 192.168.200.1
node2 - eth2: 192.168.200.2
This is a direct link and is used only for DRBD sync and DRBD replication. I intended to use it for heartbeat communication as well.

DRBD is currently configured successfully:

/dev/drbd0

which gets mounted to /storage via drbddisk.

The /storage directory is actually a separate HDD that should hold all the content (www, mysql, nfs) as well as certain configuration files (my.cnf, httpd.conf, exports, sites-available, workers.properties), simply to avoid maintaining duplicate configuration. In place of these configuration files, symbolic links have been created on both nodes.
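Just to illustrate the symlinking, using the MySQL config as an example (paths are the ones from my setup as described in this thread; adjust as needed):

[code]
# on both nodes: keep the original aside and point the system path into /storage
mv /etc/mysql/my.cnf /etc/mysql/my.cnf.orig
ln -s /storage/config/mysql/my.cnf /etc/mysql/my.cnf
[/code]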

Contents of the /storage directory:

root@node1:/# ls /storage
config  mysql  nfs  www
The config files mentioned above are located in /storage/config.

/dev/drbd0 is not mounted automatically; that part is instead defined as a resource in heartbeat. Automatic mounting (via fstab) does not work for me because in a DRBD configuration only one node may have it mounted at a time.

Furthermore, heartbeat R1 is configured. These are the config files:

root@node1:/# cat /etc/ha.d/ha.cf
logfile /var/log/ha.log
keepalive 500ms
deadtime 2
warntime 1
initdead 8
udpport 695
bcast eth2
auto_failback off
node node1 node2

root@node1:/# cat /etc/ha.d/authkeys
auth 3
3 md5 nekipassword

root@node1:/# cat /etc/ha.d/haresources
node1 IPaddr::192.168.4.200/24/eth1 drbddisk::drbd1 Filesystem::/dev/drbd0::/storage::ext3 apache2 mysql nfs-kernel-server tomcat activemq
What heartbeat does here is:

  • assigns the virtual IP address 192.168.4.200 to interface eth1
  • mounts the DRBD disk /dev/drbd0 to /storage
  • brings up apache2, mysql, nfs, tomcat and activemq

This works great and has been tested: if node1 dies, all resources fail over to node2. The whole failover takes about 25 seconds. Not ideal, but acceptable.

The problem arises when, say, any of the services defined in haresources dies. In other words, if I kill apache2, it will not be brought up on node2.
For that I need heartbeat2.

And from this point on, I just can't move forward. I modified ha.cf so that it now looks like this:

root@node1:/# cat /etc/ha.d/ha.cf
use_logd on
keepalive 500ms
deadtime 2
warntime 1
initdead 8
udpport 695
bcast eth2
auto_failback off
node node1 node2
crm yes
autojoin any
What I need now is the cib.xml.

The cluster should work so that if any of the services defined in haresources dies, heartbeat shuts down all remaining services, turns node1 into the backup node, and brings everything up on node2: the virtual IP address, mounting /dev/drbd0 to /storage, and starting apache2, mysql, nfs, tomcat and activemq.
On top of this, stonith should also be defined.

I've been struggling with this for three days already and it's not working.

If anyone has the time, knowledge and will to put together a cib.xml for me based on the information above, I would be immensely grateful.

Remove heartbeat1 and install heartbeat2. Here's my ha.cf:

use_logd yes
ucast eth0 xx.xx.xx.82
ucast eth2 xx.xx.xx.22
keepalive 4
warntime 15
deadtime 30
initdead 60
auto_failback off
node node1 node2
crm on
^-- this is the ha.cf on machine node1; on the other machine the config looks like this:

use_logd yes
ucast eth0 xx.xx.xx.81
ucast eth2 xx.xx.xx.21
keepalive 4
warntime 15
deadtime 30
initdead 60
auto_failback off
node node1 node2
crm on
Now, you must not edit cib.xml directly; you only work on sections of that config file via the "cibadmin" utility, e.g.:

^-- this will show you what you currently have in the resources section of cib.xml. You save that section to a file and then update or replace it directly in cib.xml with cibadmin (cibadmin -U -o resources -x myresources.xml).
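A minimal sketch of that workflow (the file name myresources.xml is just a placeholder):

[code]
# dump the current resources section of cib.xml to a file
cibadmin -Q -o resources > myresources.xml

# edit the file, then push the section back into the CIB
cibadmin -U -o resources -x myresources.xml
[/code]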

I recommend you put the following into your /etc/hosts file on both machines:

xx.xx.xx.21 node1
xx.xx.xx.22 node2
xx.xx.xx.81 node1-ext
xx.xx.xx.82 node2-ext
xx.xx.xx.85 cluster-sharedip
drbd has to be a master-slave resource, and in order to mount it you also need a "primitive" resource using the Filesystem OCF agent at the end, e.g.:

<master_slave id="master-slave-drbd0"> <meta_attributes id="ma-master-slave-drbd0"> <attributes> <nvpair id="ma-master-slave-drbd0-1" name="clone_max" value="2"/> <nvpair id="ma-master-slave-drbd0-2" name="clone_node_max" value="1"/> <nvpair id="ma-master-slave-drbd0-3" name="master_max" value="1"/> <nvpair id="ma-master-slave-drbd0-4" name="master_node_max" value="1"/> <nvpair id="ma-master-slave-drbd0-5" name="notify" value="yes"/> <nvpair id="ma-master-slave-drbd0-6" name="globally_unique" value="false"/> <nvpair name="target_role" id="ma-master-slave-drbd0-7" value="#default"/> </attributes> </meta_attributes> <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd"> <instance_attributes id="instance-attr-drbd0"> <attributes> <nvpair id="instance-attr-drbd0-1" name="drbd_resource" value="mysql"/> </attributes> </instance_attributes> <operations> <op id="ms-drbd0_monitor" name="monitor" interval="10" timeout="20" start_delay="1m" role="Started" disabled="false" on_fail="restart"/> </operations> </primitive> <instance_attributes id="master-slave-drbd0"> <attributes> <nvpair id="master-slave-drbd0-target_role" name="target_role" value="started"/> </attributes> </instance_attributes> </master_slave>
^-- in the line above where it says value="mysql", you have to change that to your own resource name from drbd (e.g. my drbd.conf contains the line "resource mysql { … }").
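For context, a matching drbd.conf resource block would look roughly like this; the backing disk and port below are just placeholders, not my actual config (the 192.168.200.x addresses are taken from the direct-link setup described earlier in this thread):

[code]
resource mysql {
  protocol C;
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;              # backing device - placeholder
    address   192.168.200.1:7788;     # port 7788 is just an example
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;              # backing device - placeholder
    address   192.168.200.2:7788;
    meta-disk internal;
  }
}
[/code]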

this is how you mount the filesystem:

<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0"> <meta_attributes id="ma-fs0"> <attributes> <nvpair name="target_role" id="ma-fs0-1" value="#default"/> </attributes> </meta_attributes> <operations> <op id="fs0_1" name="monitor" interval="30s" timeout="10s"/> </operations> <instance_attributes id="ia-fs0"> <attributes> <nvpair id="ia-fs0-1" name="fstype" value="ext3"/> <nvpair id="ia-fs0-2" name="directory" value="/var/lib/mysql"/> <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/> </attributes> </instance_attributes> <meta_attributes id="fs0-meta-options"> <attributes> <nvpair id="fs0-meta-options-timeout" name="timeout" value="10s"/> </attributes> </meta_attributes> <instance_attributes id="fs0"> <attributes> <nvpair name="target_role" id="fs0-target_role" value="started"/> </attributes> </instance_attributes> </primitive>
And finally, to start the mysql server you need to add a resource for it, i.e. an OCF resource, because you must not use the LSB script (/etc/init.d/mysql): with the LSB script heartbeat cannot monitor mysql, so it won't notice when you kill or stop mysql and won't restart it on its own.

Since I wrote my own script for starting mysql, giving you its config wouldn't mean anything to you. So instead, here is a primitive resource for apache, just so you can see how it should go:

<primitive class="ocf" provider="heartbeat" id="apache_resource" type="apache"> <operations> <op id="apache_mon" interval="60s" name="monitor" timeout="30s"/> </operations> <instance_attributes id="apache_res_attr"> <attributes> <nvpair name="configfile" value="/etc/apache2/apache2.conf" id="apache_res_attr_0"/> <nvpair name="httpd" value="/usr/sbin/apache2" id="apache_res_attr_1"/> <nvpair name="statusurl" value="http://localhost/server-status" id="apache_res_attr_2"/> </attributes> </instance_attributes> </primitive>
Besides all that, there is one more section in the cib.xml file called "crm_config" (cibadmin -Q -o crm_config); you have to set that up as well. Here is what I have:

<nvpair id="cib-bootstrap-options-symmetric-cluster" name="symmetric-cluster" value="true"/> <nvpair name="no-quorum-policy" id="cib-bootstrap-options-no-quorum-policy" value="ignore"/> <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/> <nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/> <nvpair name="default-resource-stickiness" id="cib-bootstrap-options-default-resource-stickiness" value="100000"/> <nvpair name="default-resource-failure-stickiness" id="cib-bootstrap-options-default-resource-failure-stickiness" value="-10000"/> <nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/> <nvpair id="cib-bootstrap-options-stop-orphan-resources" name="stop-orphan-resources" value="true"/> <nvpair id="cib-bootstrap-options-stop-orphan-actions" name="stop-orphan-actions" value="true"/> <nvpair id="cib-bootstrap-options-remove-after-stop" name="remove-after-stop" value="false"/> <nvpair id="cib-bootstrap-options-short-resource-names" name="short-resource-names" value="true"/> <nvpair id="cib-bootstrap-options-startup-fencing" name="startup-fencing" value="true"/> <nvpair id="cib-bootstrap-options-pe-error-series-max" name="pe-error-series-max" value="-1"/> <nvpair id="cib-bootstrap-options-pe-warn-series-max" name="pe-warn-series-max" value="-1"/> <nvpair id="cib-bootstrap-options-pe-input-series-max" name="pe-input-series-max" value="-1"/> <nvpair id="cib-bootstrap-options-transition-idle-timeout" name="transition-idle-timeout" value="60s"/> <nvpair id="cib-bootstrap-options-default-action-timeout" name="default-action-timeout" value="15s"/> <nvpair id="cib-bootstrap-options-start-failure-is-fatal" name="start-failure-is-fatal" value="true"/>
^-- I recommend you check what each option does and read the documentation in detail before doing anything.
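Note that in cib.xml these nvpairs sit inside the cluster_property_set of the crm_config section; the wrapper is roughly this skeleton (skeleton only, the nvpair list above goes where the comment is):

[code]
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <attributes>
      <!-- the nvpair list from above goes here -->
    </attributes>
  </cluster_property_set>
</crm_config>
[/code]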

And finally you have to create the "constraints" according to your needs, i.e. where and when each service should start; otherwise everything you've done so far is worthless (because the services have to start where your drbd is primary).

I can hardly help you here unless you take the documentation and read it in detail; I pasted the documentation link above.

@maher_

Thanks a lot. It's much clearer to me now how cib.xml works. I'll try everything and report back.

@maher_

What would you recommend as an external fencing device?

That is, would HP iLO work on a home-made server with Xen-Ubuntu 8.04?

[quote=Amar]@maher_

What would you recommend as an external fencing device?

That is, would HP iLO work on a home-made server with Xen-Ubuntu 8.04?[/quote]
You need a device with which you can do a hard reset or hard shutdown of a machine. If it's a home-made server, I don't see many options other than switching off specific power outlets on a UPS (USV in German :D) that the two machines are plugged into, or buying a PDU (power distribution unit) that lets you power devices off over the network. For the latter you will probably have to write a script that does the fencing, i.e. a stonith script, whereas for a UPS (if it's an APC) a ready-made script probably already exists.

HP iLO will only work for you on HP servers that have an iLO device (http://h18013.www1.hp.com/products/servers/management/remotemgmt.html?jumpid=servers/lights-out).

It's probably the same story with DRAC.

For instance, I found this card:

http://cgi.ebay.de/Dell-8N289-Remote-Access-PCI-Card-DRAC-IIIXT-1600SC-E10-/370449873029?pt=Controller&hash=item5640887885#ht_2585wt_1139

If it won't work with an external fencing device, is it really that bad if I do stonith over SSH?

We're currently not in a position to buy a PDU.

[quote=Amar]It's probably the same story with DRAC.

For instance, I found this card:

http://cgi.ebay.de/Dell-8N289-Remote-Access-PCI-Card-DRAC-IIIXT-1600SC-E10-/370449873029?pt=Controller&hash=item5640887885#ht_2585wt_1139

If it won't work with an external fencing device, is it really that bad if I do stonith over SSH?

We're currently not in a position to buy a PDU.[/quote]
I'm not sure that will work in any machine other than the Dell that card is intended for. You can use stonith over ssh for testing; for production use I don't recommend it.

http://www.42u.com/raritan-eric-g4.htm

Unfortunately, $499 is an unreachable amount for us at the moment.

Can the management be done via a separate node?

Say we have server01 and server02 in active/passive mode, with DRBD as the shared resource. Could we add a third node that would, using home-made scripts, monitor server01 and server02 and shut them down, reset them, etc. as needed? Can stonith be configured that way? Unfortunately, we can currently only afford some cheaper solution.

cib.xml so far. If anyone spots any mistakes, please point them out.

[code]

<crm_config>
<cluster_property_set id="cib-bootstrap-options">





















</cluster_property_set>
</crm_config>






<master_slave id="master-slave-drbd0">
<meta_attributes id="ma-master-slave-drbd0">









</meta_attributes>

<instance_attributes id="instance-attr-drbd0">



</instance_attributes>




<instance_attributes id="master-slave-drbd0">



</instance_attributes>
</master_slave>

		<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
			<meta_attributes id="ma-fs0">
				<attributes>
					<nvpair name="target_role" id="ma-fs0-1" value="#default"/>
				</attributes>
			</meta_attributes>
			<operations>
				<op id="fs0_1" name="monitor" interval="30s" timeout="10s"/>
			</operations>
			<instance_attributes id="ia-fs0">
				<attributes>
					<nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
					<nvpair id="ia-fs0-2" name="directory" value="/storage"/>
					<nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
				</attributes>
			</instance_attributes>
			<meta_attributes id="fs0-meta-options">
				<attributes>
					<nvpair id="fs0-meta-options-timeout" name="timeout" value="10s"/>
				</attributes>
			</meta_attributes>
			<instance_attributes id="fs0">
				<attributes>
					<nvpair name="target_role" id="fs0-target_role" value="started"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive class="ocf" id="IP1" provider="heartbeat" type="IPaddr">
			<operations>
				<op id="IP1_mon" interval="10s" name="monitor" timeout="5s"/>
			</operations>
			<instance_attributes id="IP1_inst_attr">
				<attributes>
						<nvpair id="IP1_attr_0" name="ip" value="192.168.4.200"/>
						<nvpair id="IP1_attr_1" name="netmask" value="24"/>
						<nvpair id="IP1_attr_2" name="nic" value="eth1"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive id="mysql" class="ocf" type="mysql" provider="heartbeat"> 
			<instance_attributes id="attr_mysql"> 
				<attributes> 
					<nvpair id="binary"	name="binary"	value="/usr/bin/mysqld_safe"/> 
					<nvpair id="config"	name="config"	value="/storage/config/mysql/my.cnf"/> 
					<nvpair id="datadir"	name="datadir"	value="/storage/mysql"/> 
					<nvpair id="log"	name="log"	value="/storage/log/mysql.log"/> 
					<nvpair id="pid"	name="pid"	value="/storage/config/mysql/mysql.pid"/> 
					<nvpair id="socket"	name="socket"	value="/storage/config/mysql/mysql.sock"/> 
					<nvpair id="my_max_con"	name="additional_parameters"	value="--max_connections=1000"/> 
				</attributes> 
			</instance_attributes> 
			<operations> 
				<op id="my_mon" name="monitor" interval="30s" timeout="59s" on_fail="restart"/> 
				<op id="my_start" name="start" timeout="59s" on_fail="restart"/> 
				<op id="my_stop" name="stop" timeout="59s" on_fail="restart"/> 
			</operations> 
		</primitive> 
		
		<primitive class="ocf" provider="heartbeat" id="apache_resource" type="apache">
			<operations>
				<op id="apache_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
			<instance_attributes id="apache_res_attr">
				<attributes>
					<nvpair name="configfile_1" value="/storage/config/apache2/apache2.conf" id="apache_res_attr_0"/>
					<nvpair name="configfile_2" value="/storage/config/apache2/httpd.conf" id="apache_res_attr_1"/>
					<nvpair name="httpd" value="/usr/sbin/apache2" id="apache_res_attr_2"/>
					<nvpair name="statusurl" value="http://localhost/server-status/index.html" id="apache_res_attr_3"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive class="lsb" id="nfs_resource" provider="heartbeat" type="nfs-kernel-server">
			<operations>
				<op id="nfs_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
  </primitive>
		
		<primitive class="lsb" id="activemq_resource" provider="heartbeat" type="activemq">
			<operations>
				<op id="activemq_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
		</primitive>
	</group>

	<clone id="stonithclone">
<instance_attributes>
  <attributes>
    <nvpair name="clone_max" value="2"/>
    <nvpair name="clone_node_max" value="1"/>
  </attributes>
</instance_attributes>
<primitive id="child_stonithclone" class="stonith" type="ssh">
  <operations>
    <op name="monitor" interval="5s" timeout="20s"
    prereq="nothing"/>
    <op name="start" timeout="20s" prereq="nothing"/>
  </operations>
  <instance_attributes>
    <attributes>
      <nvpair name="hostlist" value="node1 node2"/>
    </attributes>
  </instance_attributes>
</primitive>
<constraints>

<rsc_location id="stonith-cons:0" rsc="stonithclone">
  <rule id="stonith-cons-rule-node1" score="100000">
    <expression id="stonith-cons-rule-expression-node1" attribute="#uname" operation="eq" value="node1"/>
  </rule>

</rsc_location>

<rsc_location id="stonith-cons:1" rsc="stonithclone">
  <rule id="stonith-cons-rule-node2" score="100000">
    <expression id="stonith-cons-rule-expression-node2" attribute="#uname" operation="eq" value="node2"/>
  </rule>

</rsc_location>


[/code]
Tomcat (two instances) is still missing, and of course the constraints part.

Tomcat is what's left.

So, I have two Tomcat installations; the catalina scripts are located in:

/opt/tomcat/apache-tomcat-6.0.20_1/bin/catalina.sh
/opt/tomcat/apache-tomcat-6.0.20_2/bin/catalina.sh

I would like to configure two OCF resources for them.

Which parameters should I define with regard to the location of catalina.sh?

What I found as a possible config is:

primitive tomcat ocf:heartbeat:tomcat \
  params java_home="/usr/lib/jvm/java-6-sun/jre" \
    catalina_home="/opt/tomcat/apache-tomcat-6.0.20_1" \
  op monitor interval="10s" timeout="30s" depth="0" \
  meta target-role="Started"

Converted to XML, that would look roughly like this:

<primitive class="ocf" provider="heartbeat" id="tomcat_1_resource" type="tomcat"> <operations> <op id="tomcat_1_mon" interval="10s" name="monitor" timeout="30s"/> </operations> <instance_attributes id="tomcat_1_res_attr"> <attributes> <nvpair name="catalina_home" value="/opt/tomcat/apache-tomcat-6.0.20_1" id="tomcat_1_res_attr_0"/> <nvpair name="java_home" value="/usr/lib/jvm/java-6-sun/jre" id="tomcat_1_res_attr_1"/> </attributes> </instance_attributes> </primitive>
Is this OK?

And what does meta target-role="Started" look like in XML code?
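From the earlier snippets, I'm guessing it would be something like this (just my guess, please correct me if it's wrong):

[code]
<meta_attributes id="tomcat_1_meta">
  <attributes>
    <nvpair id="tomcat_1_meta-target_role" name="target_role" value="started"/>
  </attributes>
</meta_attributes>
[/code]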

<rsc_order id="order_master_slave_drbd0_before_fs0" from="master-slave-drbd0" type="before" to="fs0"/> <rsc_order id="order_fs0_before_IP1" from="fs0" type="before" to="IP1"/> <rsc_order id="order_IP1_before_mysql_resource" from="IP1" type="before" to="mysql_resource"/> <rsc_order id="order_mysql_resource_before_apache_resource" from="mysql_resource" type="before" to="apache_resource"/> <rsc_order id="order_apache_resource_before_tomcat_1_resource" from="apache_resource" type="before" to="tomcat_1_resource"/> <rsc_order id="order_tomcat_1_resource_before_tomcat_2_resource" from="tomcat_1_resource" type="before" to="tomcat_2_resource"/> <rsc_order id="order_tomcat_2_resource_before_nfs_resource" from="tomcat_2_resource" type="before" to="nfs_resource"/> <rsc_order id="order_nfs_resource_before_activemq_resource" from="nfs_resource" type="before" to="activemq_resource"/> <rsc_order id="order_activemq_resource_before_stonithclone" from="activemq_resource" type="before" to="stonithclone"/>
Constraints/order.

Let me save you some trouble. Go to the following folder: /usr/lib/ocf/resource.d/heartbeat, open the "tomcat" script there, and in the comments at the top of the script you will see this:

[code]
# OCF_RESKEY_tomcat_name - The name of the resource. Default is tomcat
# OCF_RESKEY_script_log - A destination of the log of this script. Default /var/log/OCF_RESKEY_tomcat_name.log
# OCF_RESKEY_tomcat_stop_timeout - Time-out at the time of the stop. Default is 5
# OCF_RESKEY_tomcat_suspend_trialcount - The re-try number of times awaiting a stop. Default is 10
# OCF_RESKEY_tomcat_user - A user name to start a resource. Default is root
# OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
# OCF_RESKEY_java_home - Home directory of the Java. Default is None
# OCF_RESKEY_catalina_home - Home directory of Tomcat. Default is None
# OCF_RESKEY_catalina_pid - A PID file name of Tomcat. Default is OCF_RESKEY_catalina_home/logs/catalina.pid
[/code]

These are basically the parameters you can specify in cib.xml. You can also get to them in the following way:

node2:~# cd /usr/lib/ocf/resource.d/heartbeat
node2:/usr/lib/ocf/resource.d/heartbeat# OCF_ROOT=/usr/lib/ocf/ ./tomcat meta-data
And this is how the primitive resource is constructed:

<primitive class="ocf" provider="heartbeat" id="tomcat1_rsc" type="tomcat"> <operations> <op id="tomcat_mon" interval="120s" name="monitor" timeout="60s"/> <op id="tomcat_start" name="start" timeout="120s"/> <op id="tomcat_stop" name="stop" timeout="120s"/> </operations> <instance_attributes id="tomcat_res_attr"> <attributes> <nvpair name="tomcat_name" value="tomcat1" id="tomcat_res_attr_0"/> <nvpair name="tomcat_user" value="tomcat" id="tomcat_res_attr_1"/> <nvpair name="java_home" value="/usr/lib/java" id="tomcat_res_attr_2"/> <nvpair name="catalina_home" value="/opt/tomcat/apache-tomcat-6.0.20_1" id="tomcat_res_attr_3"/> <nvpair name="catalina_pid" value="/var/run/tomcat.pid" id="tomcat_res_attr_4"/> <nvpair name="tomcat_stop_timeout" value="45" id="tomcat_res_attr_5"/> <nvpair name="statusurl" value="http://127.0.0.1:8080/someapp/" id="tomcat_res_attr_6"/> </attributes> </instance_attributes> <instance_attributes id="tomcat_resource"> <attributes> <nvpair id="tomcat_resource-target_role" name="target_role" value="started"/> </attributes> </instance_attributes> </primitive>
I defined java_home as /usr/lib/java, which is a symbolic link "java -> jdk1.6.0_21". Use the "jdk", not the "jre"…
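If it helps, the link itself is created like this (assuming the JDK is unpacked under /usr/lib; adjust the path to wherever yours actually lives):

[code]
# /usr/lib/java -> jdk1.6.0_21
ln -s /usr/lib/jdk1.6.0_21 /usr/lib/java
[/code]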

You don't need the meta attributes now; you can set them on the fly, just like the "instance_attributes" (using cibadmin and crm_resource).
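For example, with crm_resource something along these lines should do it (I'm going from memory on the exact switches, so double-check crm_resource --help on your version):

[code]
# set the target_role attribute on the tomcat resource
crm_resource -r tomcat1_rsc -p target_role -v started

# read it back to verify
crm_resource -r tomcat1_rsc -g target_role
[/code]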

[quote=Amar]<rsc_order id="order_master_slave_drbd0_before_fs0" from="master-slave-drbd0" type="before" to="fs0"/> <rsc_order id="order_fs0_before_IP1" from="fs0" type="before" to="IP1"/> <rsc_order id="order_IP1_before_mysql_resource" from="IP1" type="before" to="mysql_resource"/> <rsc_order id="order_mysql_resource_before_apache_resource" from="mysql_resource" type="before" to="apache_resource"/> <rsc_order id="order_apache_resource_before_tomcat_1_resource" from="apache_resource" type="before" to="tomcat_1_resource"/> <rsc_order id="order_tomcat_1_resource_before_tomcat_2_resource" from="tomcat_1_resource" type="before" to="tomcat_2_resource"/> <rsc_order id="order_tomcat_2_resource_before_nfs_resource" from="tomcat_2_resource" type="before" to="nfs_resource"/> <rsc_order id="order_nfs_resource_before_activemq_resource" from="nfs_resource" type="before" to="activemq_resource"/> <rsc_order id="order_activemq_resource_before_stonithclone" from="activemq_resource" type="before" to="stonithclone"/>
Constraints/order.[/quote]
Hmm, I don't see an rsc_order for the drbd promote there. If your resources will always run on one machine (the one where drbd is primary), then group the resources, i.e. create a "resource group". The order in which you define them in the group is the order in which they start, and they stop in the reverse order. That also makes the constraints easier to write; with a group it would look like this, for example:

<rsc_order id="promote_drbd0_before_group" action="start" from="my_rsc_group" type="after" to_action="promote" to="ms-drbd0"/> <rsc_colocation id="fs0_on_drbd0-stopped" to="ms-drbd0" to_role="stopped" from="fs0" score="-infinity"/> <rsc_colocation id="fs0_on_drbd0-slave" to="ms-drbd0" to_role="slave" from="fs0" score="-infinity"/> <rsc_colocation id="group_where_drbd0_is" to="ms-drbd0" to_role="master" from="my_rsc_group" score="infinity"/>
If you don't want to create a group, then you have to define constraints for starting and stopping every single resource.

Well, I did create a group. All my resources are in it, except stonith.

Here is the whole cib.xml:

[code]

<crm_config>
<cluster_property_set id="cib-bootstrap-options">





















</cluster_property_set>
</crm_config>






<master_slave id="master-slave-drbd0">
<meta_attributes id="ma-master-slave-drbd0">









</meta_attributes>

<instance_attributes id="instance-attr-drbd0">



</instance_attributes>




<instance_attributes id="master-slave-drbd0">



</instance_attributes>
</master_slave>

		<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
			<meta_attributes id="ma-fs0">
				<attributes>
					<nvpair name="target_role" id="ma-fs0-1" value="#default"/>
				</attributes>
			</meta_attributes>
			<operations>
				<op id="fs0_1" name="monitor" interval="30s" timeout="10s"/>
			</operations>
			<instance_attributes id="ia-fs0">
				<attributes>
					<nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
					<nvpair id="ia-fs0-2" name="directory" value="/storage"/>
					<nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
				</attributes>
			</instance_attributes>
			<meta_attributes id="fs0-meta-options">
				<attributes>
					<nvpair id="fs0-meta-options-timeout" name="timeout" value="10s"/>
				</attributes>
			</meta_attributes>
			<instance_attributes id="fs0">
				<attributes>
					<nvpair name="target_role" id="fs0-target_role" value="started"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive class="ocf" id="IP1" provider="heartbeat" type="IPaddr">
			<operations>
				<op id="IP1_mon" interval="10s" name="monitor" timeout="5s"/>
			</operations>
			<instance_attributes id="IP1_inst_attr">
				<attributes>
						<nvpair id="IP1_attr_0" name="ip" value="192.168.4.200"/>
						<nvpair id="IP1_attr_1" name="netmask" value="24"/>
						<nvpair id="IP1_attr_2" name="nic" value="eth1"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive id="mysql_resource" class="ocf" type="mysql" provider="heartbeat"> 
			<instance_attributes id="attr_mysql"> 
				<attributes> 
					<nvpair id="binary"	name="binary"	value="/usr/bin/mysqld_safe"/> 
					<nvpair id="config"	name="config"	value="/storage/config/mysql/my.cnf"/> 
					<nvpair id="datadir"	name="datadir"	value="/storage/mysql"/> 
					<nvpair id="log"	name="log"	value="/storage/log/mysql.log"/> 
					<nvpair id="pid"	name="pid"	value="/storage/config/mysql/mysql.pid"/> 
					<nvpair id="socket"	name="socket"	value="/storage/config/mysql/mysql.sock"/> 
					<nvpair id="my_max_con"	name="additional_parameters"	value="--max_connections=1000"/> 
				</attributes> 
			</instance_attributes> 
			<operations> 
				<op id="my_mon" name="monitor" interval="30s" timeout="59s" on_fail="restart"/> 
				<op id="my_start" name="start" timeout="59s" on_fail="restart"/> 
				<op id="my_stop" name="stop" timeout="59s" on_fail="restart"/> 
			</operations> 
		</primitive> 
		
		<primitive class="ocf" provider="heartbeat" id="apache_resource" type="apache">
			<operations>
				<op id="apache_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
			<instance_attributes id="apache_res_attr">
				<attributes>
					<nvpair name="configfile_1" value="/storage/config/apache2/apache2.conf" id="apache_res_attr_0"/>
					<nvpair name="configfile_2" value="/storage/config/apache2/httpd.conf" id="apache_res_attr_1"/>
					<nvpair name="httpd" value="/usr/sbin/apache2" id="apache_res_attr_2"/>
					<nvpair name="statusurl" value="http://localhost/server-status/index.html" id="apache_res_attr_3"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive class="ocf" provider="heartbeat" id="tomcat_1_resource" type="tomcat">
			<operations>
				<op id="tomcat_1_mon" interval="10s" name="monitor" timeout="30s"/>
			</operations>
			<instance_attributes id="tomcat_1_res_attr">
				<attributes>
					<nvpair name="catalina_home" value="/opt/tomcat/apache-tomcat-6.0.20_1" id="tomcat_1_res_attr_0"/>
					<nvpair name="java_home" value="/usr/lib/jvm/java-6-sun/jre" id="tomcat_1_res_attr_1"/>
				</attributes>
			</instance_attributes>
		</primitive>
					
		<primitive class="ocf" provider="heartbeat" id="tomcat_2_resource" type="tomcat">
			<operations>
				<op id="tomcat_2_mon" interval="10s" name="monitor" timeout="30s"/>
			</operations>
			<instance_attributes id="tomcat_2_res_attr">
				<attributes>
					<nvpair name="catalina_home" value="/opt/tomcat/apache-tomcat-6.0.20_2" id="tomcat_2_res_attr_0"/>
					<nvpair name="java_home" value="/usr/lib/jvm/java-6-sun/jre" id="tomcat_2_res_attr_1"/>
				</attributes>
			</instance_attributes>
		</primitive>
		
		<primitive class="lsb" id="nfs_resource" provider="heartbeat" type="nfs-kernel-server">
			<operations>
				<op id="nfs_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
  </primitive>
		
		<primitive class="lsb" id="activemq_resource" provider="heartbeat" type="activemq">
			<operations>
				<op id="activemq_mon" interval="60s" name="monitor" timeout="30s"/>
			</operations>
		</primitive>
	</group>

	<clone id="stonithclone">
<instance_attributes>
  <attributes>
    <nvpair name="clone_max" value="2"/>
    <nvpair name="clone_node_max" value="1"/>
  </attributes>
</instance_attributes>
<primitive id="child_stonithclone" class="stonith" type="ssh">
  <operations>
    <op name="monitor" interval="5s" timeout="20s" prereq="nothing"/>
    <op name="start" timeout="20s" prereq="nothing"/>
  </operations>
  <instance_attributes>
    <attributes>
      <nvpair name="hostlist" value="node1 node2"/>
    </attributes>
  </instance_attributes>
</primitive>
<constraints>

<rsc_location id="stonith-cons:0" rsc="stonithclone">
  <rule id="stonith-cons-rule-node1" score="100000">
    <expression id="stonith-cons-rule-expression-node1" attribute="#uname" operation="eq" value="node1"/>
  </rule>

</rsc_location>

<rsc_location id="stonith-cons:1" rsc="stonithclone">
  <rule id="stonith-cons-rule-node2" score="100000">
    <expression id="stonith-cons-rule-expression-node2" attribute="#uname" operation="eq" value="node2"/>
  </rule>

</rsc_location>

<rsc_location id="group_placement_node1" rsc="cluster_1">
	<rule id="group_rule_node1" score="100000">
		<expression id="group_expression_node1" attribute="#uname" operation="eq" value="node1"/>
	</rule>
</rsc_location>

<rsc_location id="group_placement_node2" rsc="cluster_1">
	<rule id="group_rule_node2" score="-100000">
		<expression id="group_expression_node2" attribute="#uname" operation="eq" value="node2"/>
	</rule>
</rsc_location>

<rsc_order id="order_master_slave_drbd0_before_fs0" from="master-slave-drbd0" type="before" to="fs0"/>
<rsc_order id="order_fs0_before_IP1" from="fs0" type="before" to="IP1"/>
<rsc_order id="order_IP1_before_mysql_resource" from="IP1" type="before" to="mysql_resource"/>
<rsc_order id="order_mysql_resource_before_apache_resource" from="mysql_resource" type="before" to="apache_resource"/>

<rsc_order id="order_apache_resource_before_tomcat_1_resource" from="apache_resource" type="before" to="tomcat_1_resource"/>
<rsc_order id="order_tomcat_1_resource_before_tomcat_2_resource" from="tomcat_1_resource" type="before" to="tomcat_2_resource"/>
<rsc_order id="order_tomcat_2_resource_before_nfs_resource" from="tomcat_2_resource" type="before" to="nfs_resource"/>
<rsc_order id="order_nfs_resource_before_activemq_resource" from="nfs_resource" type="before" to="activemq_resource"/>
<rsc_order id="order_activemq_resource_before_stonithclone" from="activemq_resource" type="before" to="stonithclone"/>


[/code]
@maher_ if you have time, it would be great if you could go through the config and correct whatever isn't configured the way it should be.

The stonith problem remains. For now we'll work over SSH, but I spoke with my boss today and we'll order a hardware solution very soon. We also have one more server that we could use as a management node. The problem is that it has no management card at all. The board is an Intel S5520HCR. I was thinking of ordering an IPMI controller that's compatible with the board. Any ideas?

@maher_ thanks a lot for the help.

I managed to get almost the whole config into cib.xml with:

Only the DRBD resource is still left:

<resources> <master_slave id="master-slave-drbd0"> <meta_attributes id="ma-master-slave-drbd0"> <attributes> <nvpair id="ma-master-slave-drbd0-1" name="clone_max" value="2"/> <nvpair id="ma-master-slave-drbd0-2" name="clone_node_max" value="1"/> <nvpair id="ma-master-slave-drbd0-3" name="master_max" value="1"/> <nvpair id="ma-master-slave-drbd0-4" name="master_node_max" value="1"/> <nvpair id="ma-master-slave-drbd0-5" name="notify" value="yes"/> <nvpair id="ma-master-slave-drbd0-6" name="globally_unique" value="false"/> </attributes> </meta_attributes> <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd"> <instance_attributes id="instance-attr-drbd0"> <attributes> <nvpair id="instance-attr-drbd0-1" name="drbd_resource" value="drbd1"/> </attributes> </instance_attributes> <operations id="master-slave-drbd0-operations"> <op id="master-slave-drbd0-monitor-master" name="monitor" interval="29s" timeout="10s" role="Master"/> <op id="master-slave-drbd0-monitor-slave" name="monitor" interval="30s" timeout="10s" role="Slave"/> </operations> </primitive> </master_slave> </resources>
When I try:

I get the following error:

Call cib_modify failed (-47): Update does not conform to the DTD in /usr/share/heartbeat/crm.dtd <null>
I've made a mistake somewhere in the XML code, but I just can't find where.

HELP!!!
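In the meantime, one thing I can still try is validating the snippet against the DTD directly with xmllint, something like this (the file name is just a placeholder, and I'm not sure how much it will catch on a standalone fragment):

[code]
xmllint --noout --dtdvalid /usr/share/heartbeat/crm.dtd drbd-resource.xml
[/code]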