Pgsql HA resource agent

Here I am again with high availability cluster problems.

So, I'm trying to add postgres to the cluster. This is what has been done so far:

  • postgres installed on both nodes (webnode01, webnode02)
  • configuration moved to the DRBD shared disk - /etc/postgres and /etc/postgres-common copied to the DRBD disk, deleted from /etc, and replaced with symlinks on both nodes
  • pgdata moved to the DRBD shared disk - /var/lib/postgresql copied to the DRBD disk, deleted from /var/lib, and replaced with a symlink on both nodes

This should be enough.
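The relocation steps above could be sketched roughly like this. It is only a sketch: relocate_to_shared is a made-up helper name, /mnt/drbd is a hypothetical mount point, and it assumes the DRBD filesystem is mounted on the node where you run it.

```shell
#!/bin/sh
# relocate_to_shared SRC SHARED_ROOT
# Copies SRC onto the shared mount (if it is not there yet), removes the
# local copy and leaves a symlink in its place.
# Assumption: SHARED_ROOT is the mounted DRBD filesystem on this node.
relocate_to_shared() {
    src=$1
    shared_root=$2
    dest="$shared_root/$(basename "$src")"

    # Already a symlink: nothing left to do.
    if [ -L "$src" ]; then
        return 0
    fi
    # Copy only when the shared copy does not exist yet.
    if [ ! -e "$dest" ]; then
        cp -a "$src" "$dest" || return 1
    fi
    rm -rf "$src" || return 1
    ln -s "$dest" "$src"
}

# Example (hypothetical mount point, run as root):
#   relocate_to_shared /var/lib/postgresql /mnt/drbd
```

Running it a second time on the same path is harmless, since an existing symlink is left alone.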

When I start postgres with the init script, everything is fine on both nodes. In the first case webnode01 is the master, and the DRBD disk, the virtual IP address and nginx are up on it. With /etc/init.d/postgres start, postgres starts without problems, so it reads everything from the shared disk. Then I take the first node down, the second one gets promoted to master, I do the same thing on the second node, and everything is fine.

Now, when I try to do all of this with the OCF startup script, I run into a problem.

This is my config:

primitive ocf:heartbeat:pgsql \
    params psql="/bin/psql" \
    pgdate="/var/lib/postgresql/8.4/main" logfile="/var/log/postgres/postgres.log" \
    op start timeout=120s \
    op stop timeout=120s \
    op monitor depth=0 interval=30s timeout=30s

Verify doesn't show any error.
However, when I start the cluster, pgsql cannot be started.

crm_mon shows me the following error:

[code]============
Last updated: Tue Oct 11 12:23:16 2011
Stack: openais
Current DC: webnode02 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.

Online: [ webnode02 webnode01 ]

Master/Slave Set: drbd_cluster
Masters: [ webnode01 ]
Slaves: [ webnode02 ]
Resource Group: cluster_1
fs_res (ocf::heartbeat:Filesystem): Started webnode01
ClusterIP (ocf::heartbeat:IPaddr2): Started webnode01
nginx_res (ocf::yorxs:nginx2): Started webnode01
postgres_res (ocf::heartbeat:pgsql): Stopped

Failed actions:
postgres_res_start_0 (node=webnode01, call=84, rc=5, status=complete): not installed
postgres_res_start_0 (node=webnode02, call=66, rc=5, status=complete): not installed[/code]
Funny thing is, the logs show nothing. messages, syslog and postgres.log all show nothing.

Help would be welcome :slight_smile:
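For anyone reading the crm_mon output above: rc=5 is the OCF return code OCF_ERR_INSTALLED, which crm_mon prints as "not installed". A small lookup like the one below can help when running agents by hand. Note that ocf_rc_name is a made-up helper for illustration, not something shipped with resource-agents.

```shell
#!/bin/sh
# ocf_rc_name RC
# Prints the symbolic OCF name for a numeric resource agent exit code.
# (Hypothetical helper, not part of the resource-agents package.)
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo UNKNOWN ;;
    esac
}

# Example: after running an agent manually, translate its exit code:
#   /usr/lib/ocf/resource.d/heartbeat/pgsql start; ocf_rc_name $?
```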

@maher, what's up with you? :slight_smile:

From the pgsql ocf script:

[code]# OCF parameters:

OCF_RESKEY_pgctl - Path to pg_ctl. Default /usr/bin/pg_ctl

OCF_RESKEY_start_opt - Startup options, options passed to postgress with -o

OCF_RESKEY_ctl_opt - Additional options for pg_ctl (-w, -W etc…)

OCF_RESKEY_psql - Path to psql. Default is /usr/bin/psql

OCF_RESKEY_pgdata - PGDATA directory. Default is /var/lib/pgsql/data

OCF_RESKEY_pgdba - userID that manages DB. Default is postgres

OCF_RESKEY_pghost - Host/IP Address where PostgreSQL is listening

OCF_RESKEY_pgport - Port where PostgreSQL is listening

OCF_RESKEY_pgdb - database to monitor. Default is template1

OCF_RESKEY_logfile - Path to PostgreSQL log file. Default is /dev/null

OCF_RESKEY_stop_escalate - Stop waiting time. Default is 30[/code]

Somehow I'm missing a parameter for postgres.conf…

One of these days this Amar is going to slip up a bit and post some passwords on the forum :smiley:

so you're waiting for that day too, just so you can tease him about it on the forum afterwards :stuck_out_tongue:

just you wait :slight_smile:

anyway, does anyone have a suggestion for my problem?

@Amar: try starting the script manually from the console with all the env variables you need.

e.g.:
export OCF_ROOT="/usr/lib/ocf"

export OCF_RESKEY_start_opt="-c config_file=/etc/postgresql/8.3/main/postgresql.conf"
export OCF_RESKEY_pgdata="/var/lib/postgresql/8.3/main"
export OCF_RESKEY_psql="/usr/bin/psql"
export OCF_RESKEY_pgdba="postgres"

you absolutely have to specify pg_ctl, since that is what starts postgres

etc…
and finally

/usr/lib/ocf/resource.d/heartbeat/pgsql start

then you can see clearly what the output is and what is happening …

Ok, I'll try it tomorrow at work. Thanks!

You mean I have to specify start_opt? So that it reads the config file? That does seem logical to me.

@maher,

I tried this, nothing happens.

I entered all the variables and ran it. No reply. ps aux says the process isn't there…

On ha-lists I was told the problem could be the version of the resource-agents package, which supposedly has some conflicts with the fuser tools.

Anyhow, this whole thing is getting on my nerves. I'll try for another hour or two, and if it doesn't work I'll start it with the lsb script and be done with it.

There are just 4 scenarios in which pgsql returns OCF_ERR_INSTALLED:

  • The resource agent is not installed or is not executable (unlikely);
  • pgctl or psql are not installed or not executable;
  • the configuration file does not exist or is not readable during a
    non-probe;
  • the username identified by the “pgdba” resource parameter does not
    resolve to a uid.

All of those do log error messages to the log though. You can grep for
ERROR in your logs, it should turn up what went wrong.

This is what I got on ha-lists. Off to investigate a bit…
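The four cases from that reply can also be checked by hand before involving the cluster. The following is only a rough sketch of such a check, not what the pgsql agent itself runs: check_pgsql_prereqs is a made-up name, and the paths in the example are guesses based on the parameters quoted earlier in the thread.

```shell
#!/bin/sh
# check_pgsql_prereqs AGENT PGCTL PSQL CONFIG PGDBA
# Walks through the four "not installed" cases and reports each failure.
# Returns 5 (OCF_ERR_INSTALLED) if any check fails, 0 otherwise.
# (Hypothetical helper; it only approximates the agent's own checks.)
check_pgsql_prereqs() {
    agent=$1
    pgctl=$2
    psql=$3
    config=$4
    pgdba=$5
    rc=0

    [ -x "$agent" ]  || { echo "agent not executable: $agent"; rc=5; }
    [ -x "$pgctl" ]  || { echo "pgctl not executable: $pgctl"; rc=5; }
    [ -x "$psql" ]   || { echo "psql not executable: $psql"; rc=5; }
    [ -r "$config" ] || { echo "config not readable: $config"; rc=5; }
    id -u "$pgdba" >/dev/null 2>&1 || { echo "no uid for user: $pgdba"; rc=5; }

    return $rc
}

# Example (paths are assumptions, adjust to your install):
#   check_pgsql_prereqs /usr/lib/ocf/resource.d/heartbeat/pgsql \
#       /usr/bin/pg_ctl /usr/bin/psql \
#       /var/lib/postgresql/8.4/main/postgresql.conf postgres
```

Each failed check prints its own line, so more than one problem shows up in a single run.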

Sorted out resource-agents.

I'll do one more config, hopefully it will work.

@maher, if you have time, drop by #lugbih@irc.freenode