Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

guilm · 29/12/2010 15:05:06

Bonjour à tous,

Je suis un nouvel utilisateur de postgresql 8.4 que je viens tout juste d'installer sur deux serveurs couplé à deux pgpool II-2.2.4 dans le but de faire de la replication et du load balancing.

Après avoir terminé la configuration de pgpool et postgres pour la réplication, je démarre pour la première fois les services pgpool et postgres, afin de vérifier mon travail. Dans postgres sur le serveur1 je créer une nouvelle base de donnée avec une table, puis je vérifie si la réplication ce fait bien sur le serveur 2. Et ce n'est pas le cas.

J'ai donc cherché dans la log, afin de touver des indices.

Log de pgpool :

pid 639: pgpool successfully started

Tout semble normal.

Log de postgres :

UTC LOG: database system is ready to accept connections
UTC LOG: autovacuum launcher started
UTC LOG: incomplete startup packet

J'ai lu sur ce forum que ce n'était pas une erreur et que ca n'avait pas vraiment d'importance.
Mais alors pourquoi ma réplication ne fonctionne pas ?
Il y a t il quelque chose à faire pour corriger le "incomplete startup packet" ?

En lisant des tutos ca me paraissait simple, mais la sans plus d'indications dans les log je ne vois pas comment continuer.
J'ai donc besoin d'un oeil exterieur pour vérifier que rien ne manque.

Voici ma configuration pgpool sur le serveur 1 :

#
# pgpool-II configuration file sample
# $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.26 2009/02/15 05:26:28 t-ishii Exp $


# Host name or IP address to listen on: '*' for all, '' for no TCP/IP
# connections

listen_addresses = '*'


# Port number for pgpool

port = 9999


# Port number for pgpool communication manager

pcp_port = 9898


# Unix domain socket path.  (The Debian package defaults to
# /var/run/postgresql.)

socket_dir = '/var/run/postgresql'


# Unix domain socket path for pgpool communication manager.
# (Debian package defaults to /var/run/postgresql)

pcp_socket_dir = '/tmp'


# Unix domain socket path for the backend. Debian package defaults to /var/run/postgresql!

backend_socket_dir = '/var/run/postgresql'


# pgpool communication manager timeout. 0 means no timeout, but strongly not recommended!

pcp_timeout = 300


# number of pre-forked child process

num_init_children = 32


# Number of connection pools allowed for a child process

max_pool = 4


# If idle for this many seconds, child exits.  0 means no timeout.

child_life_time = 300


# If idle for this many seconds, connection to PostgreSQL closes.
# 0 means no timeout.

connection_life_time = 360


# If child_max_connections connections were received, child exits.
# 0 means no exit.

child_max_connections = 0


# If client_idle_limit is n (n > 0), the client is forced to be
# disconnected whenever after n seconds idle (even inside an explicit
# transactions!)
# 0 means no disconnect.

client_idle_limit = 0



# Maximum time in seconds to complete client authentication.
# 0 means no timeout.

authentication_timeout = 60


# Logging directory

logdir = '/var/log/pgpool'


# pid file name

pid_file_name = '/var/run/pgpool/pgpool.pid'


# Replication mode

replication_mode = true
replication_strict = true
replication_timeout = 5000


# Load balancing mode, i.e., all SELECTs are load balanced.

# This is ignored if replication_mode is false.

load_balance_mode = false


# if there's a data mismatch between master and secondary
# start degeneration to stop replication mode

replication_stop_on_mismatch = false


# If true, replicate SELECT statement when load balancing is disabled.
# If false, it is only sent to the master node.

replicate_select = false


# Semicolon separated list of queries to be issued at the end of a session
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
# for 8.3 or newer PostgreSQL versions DISCARD ALL can be used as
# follows. However beware that DISCARD ALL holds exclusive lock on
# pg_listener so it will be a serious performance problem if there are
# lots of concurrent sessions.

reset_query_list = 'ABORT; DISCARD ALL'


# If true print timestamp on each log line.
print_timestamp = true


# If true, operate in master/slave mode.

master_slave_mode = false


# If true, cache connection pool.

connection_cache = true


# Health check timeout.  0 means no timeout.

health_check_timeout = 20


# Health check period.  0 means no health check.

health_check_period = 30


# Health check user

health_check_user = 'postgres'


# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#

failover_command = 'echo host:%h, node id:%d, port number:%p, database cluster path:%D, new master id:%m, old master id:%M > /var/log/failover.log'


# Execute command by failback.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#

failback_command = 'echo host:%h, node id:%d, port number:%p, database cluster path:%D, new master id:%m, old master id:%M > /var/log/failback.log'


# If true, automatically locks a table with INSERT statements to keep
# SERIAL data consistency.  If the data does not have SERIAL data
# type, no lock will be issued. An /*INSERT LOCK*/ comment has the
# same effect.  A /NO INSERT LOCK*/ comment disables the effect.

insert_lock = true


# If true, ignore leading white spaces of each query while pgpool judges
# whether the query is a SELECT so that it can be load balanced.  This
# is useful for certain APIs such as DBI/DBD which is known to adding an
# extra leading white space.

ignore_leading_white_space = true


# If true, print all statements to the log.  Like the log_statement option
# to PostgreSQL, this allows for observing queries without engaging in full
# debugging.

log_statement = true


# If true, incoming connections will be printed to the log.

log_connections = true


# If true, hostname will be shown in ps status. Also shown in
# connection log if log_connections = true.
# Be warned that this feature will add overhead to look up hostname.

log_hostname = true


# if non 0, run in parallel query mode

parallel_mode = false


# if non 0, use query cache

enable_query_cache = false


#set pgpool2 hostname 

pgpool2_hostname = ''


# system DB info

system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'postgres'
system_db_password = ''


# backend_hostname, backend_port, backend_weight
# here are examples
#backend_hostname0 = 'host1'
#backend_port0 = 5432
#backend_weight0 = 1
#backend_data_directory0 = '/data'
#backend_hostname1 = 'host2'
#backend_port1 = 5433
#backend_weight1 = 1
#backend_data_directory1 = '/data1'


backend_hostname0 = 'localhost'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/8.4/main'
backend_hostname1 = 'pgpool2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/8.4/main'


# - HBA -


# If true, use pool_hba.conf for client authentication. In pgpool-II
# 1.1, the default value is false. The default value will be true in
# 1.2.

enable_pool_hba = false


# - online recovery -
# online recovery user

recovery_user = 'postgres'


# online recovery password

recovery_password = ''


# execute a command in first stage.

recovery_1st_stage_command = 'copy_base_backup'


# execute a command in second stage.

recovery_2nd_stage_command = 'pgpool_recovery_pitr'


# maximum time in seconds to wait for the recovering node's postmaster
# start-up. 0 means no wait.
# this is also used as a timer waiting for clients disconnected before
# starting 2nd stage

recovery_timeout = 90


# If client_idle_limit_in_recovery is n (n > 0), the client is forced
# to be disconnected whenever after n seconds idle (even inside an
# explicit transactions!)  0 means no disconnect. This parameter only
# takes effect in recovery 2nd stage.
client_idle_limit_in_recovery = 0

Voici ma configuration pgpool sur le serveur 2 :

#
# pgpool-II configuration file sample
# $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.26 2009/02/15 05:26:28 t-ishii Exp $


# Host name or IP address to listen on: '*' for all, '' for no TCP/IP
# connections

listen_addresses = '*'


# Port number for pgpool

port = 9999


# Port number for pgpool communication manager

pcp_port = 9898


# Unix domain socket path.  (The Debian package defaults to
# /var/run/postgresql.)

socket_dir = '/var/run/postgresql'


# Unix domain socket path for pgpool communication manager.
# (Debian package defaults to /var/run/postgresql)

pcp_socket_dir = '/tmp'


# Unix domain socket path for the backend. Debian package defaults to /var/run/postgresql!

backend_socket_dir = '/var/run/postgresql'


# pgpool communication manager timeout. 0 means no timeout, but strongly not recommended!

pcp_timeout = 300


# number of pre-forked child process

num_init_children = 32


# Number of connection pools allowed for a child process

max_pool = 4


# If idle for this many seconds, child exits.  0 means no timeout.

child_life_time = 300


# If idle for this many seconds, connection to PostgreSQL closes.
# 0 means no timeout.

connection_life_time = 360


# If child_max_connections connections were received, child exits.
# 0 means no exit.

child_max_connections = 0


# If client_idle_limit is n (n > 0), the client is forced to be
# disconnected whenever after n seconds idle (even inside an explicit
# transactions!)
# 0 means no disconnect.

client_idle_limit = 0


# Maximum time in seconds to complete client authentication.

# 0 means no timeout.

authentication_timeout = 60


# Logging directory

logdir = '/var/log/pgpool'


# pid file name

pid_file_name = '/var/run/pgpool/pgpool.pid'


# Replication mode
replication_mode = true
replication_strict = true
replication_timeout = 5000


# Load balancing mode, i.e., all SELECTs are load balanced.
# This is ignored if replication_mode is false.

load_balance_mode = false


# if there's a data mismatch between master and secondary
# start degeneration to stop replication mode

replication_stop_on_mismatch = false


# If true, replicate SELECT statement when load balancing is disabled.
# If false, it is only sent to the master node.

replicate_select = false


# Semicolon separated list of queries to be issued at the end of a session
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
# for 8.3 or newer PostgreSQL versions DISCARD ALL can be used as
# follows. However beware that DISCARD ALL holds exclusive lock on
# pg_listener so it will be a serious performance problem if there are
# lots of concurrent sessions.

reset_query_list = 'ABORT; DISCARD ALL'


# If true print timestamp on each log line.

print_timestamp = true


# If true, operate in master/slave mode.

master_slave_mode = false


# If true, cache connection pool.

connection_cache = true


# Health check timeout.  0 means no timeout.

health_check_timeout = 20


# Health check period.  0 means no health check.

health_check_period = 30


# Health check user

health_check_user = 'postgres'


# Execute command by failover.

# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#

failover_command = 'echo host:%h, node id:%d, port number:%p, database cluster path:%D, new master id:%m, old master id:%M > /var/log/failover.log'


# Execute command by failback.

# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#

failback_command = 'echo host:%h, node id:%d, port number:%p, database cluster path:%D, new master id:%m, old master id:%M > /var/log/failover.log'


# If true, automatically locks a table with INSERT statements to keep
# SERIAL data consistency.  If the data does not have SERIAL data
# type, no lock will be issued. An /*INSERT LOCK*/ comment has the
# same effect.  A /NO INSERT LOCK*/ comment disables the effect.

insert_lock = true


# If true, ignore leading white spaces of each query while pgpool judges
# whether the query is a SELECT so that it can be load balanced.  This
# is useful for certain APIs such as DBI/DBD which is known to adding an
# extra leading white space.

ignore_leading_white_space = true


# If true, print all statements to the log.  Like the log_statement option
# to PostgreSQL, this allows for observing queries without engaging in full
# debugging.

log_statement = true


# If true, incoming connections will be printed to the log.

log_connections = true


# If true, hostname will be shown in ps status. Also shown in
# connection log if log_connections = true.
# Be warned that this feature will add overhead to look up hostname.

log_hostname = true


# if non 0, run in parallel query mode

parallel_mode = false


# if non 0, use query cache

enable_query_cache = false


#set pgpool2 hostname 

pgpool2_hostname = ''


# system DB info

system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'postgres'
system_db_password = ''


# backend_hostname, backend_port, backend_weight
# here are examples
#backend_hostname0 = 'host1'
#backend_port0 = 5432
#backend_weight0 = 1
#backend_data_directory0 = '/data'
#backend_hostname1 = 'host2'
#backend_port1 = 5433
#backend_weight1 = 1
#backend_data_directory1 = '/data1'




backend_hostname0 = 'pgpool1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/8.4/main'
backend_hostname1 = 'localhost'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/8.4/main'


# - HBA -


# If true, use pool_hba.conf for client authentication. In pgpool-II
# 1.1, the default value is false. The default value will be true in
# 1.2.

enable_pool_hba = false


# - online recovery -
# online recovery user

recovery_user = 'postgres'


# online recovery password

recovery_password = ''


# execute a command in first stage.

recovery_1st_stage_command = 'copy_base_backup'


# execute a command in second stage.

recovery_2nd_stage_command = 'pgpool_recovery_pitr'


# maximum time in seconds to wait for the recovering node's postmaster
# start-up. 0 means no wait.
# this is also used as a timer waiting for clients disconnected before
# starting 2nd stage

recovery_timeout = 90


# If client_idle_limit_in_recovery is n (n > 0), the client is forced
# to be disconnected whenever after n seconds idle (even inside an
# explicit transactions!)  0 means no disconnect. This parameter only
# takes effect in recovery 2nd stage.
client_idle_limit_in_recovery = 0

Merci de votre aide.

gleu · 30/12/2010 00:56:28

Il y a t il quelque chose à faire pour corriger le "incomplete startup packet" ?

À part corriger le client qui se connecte en ne fournissant pas toutes les informations, non. Mais vraiment, aucun intérêt. Ça n'a aucun rapport avec votre problème de réplication.

Concernant la réplication, je ne vois pas l'intérêt d'avoir deux serveurs pgPool. Vous avez besoin de deux serveurs PostgreSQL et d'un serveur pgpool qui peut se connecter aux deux serveurs PostgreSQL.

Quand vous dites « je crée une base de données sur un serveur et une table dans ce serveur », comment le faites vous réellement ? vous connectez-vous bien au serveur pgPool et non pas à PostgreSQL directement ? il ne faut pas oublier de passer par le port 9999, qui est le port de pgPool, sinon en effet la réplication ne peut pas fonctionner.

guilm · 30/12/2010 17:25:02

Bonjour Gleu,

gleu a écrit :

Quand vous dites « je crée une base de données sur un serveur et une table dans ce serveur », comment le faites vous réellement ?

De mon serveur pgpool sur lequel postgresql est installé, je me connecte en "su postgres" puis je passe la commande de création d'un base de données, mais lorsque je cherche à passer par le port 9999 cela me renvoi le message,

 psql: n'a pas pu se connecter au serveur : Aucun fichier ou dossier de ce type
        Le serveur est-il actif localement et accepte-t-il les connexions sur la
        socket Unix « /var/run/postgresql/.s.PGSQL.9999 » ?

ce que je ne comprend pas d'ailleur. Par contre lorsque je passe par le port 5432 le port de postgresql, cela fonctionne mais la réplication ne se fait pas (je comprend maintenant pourquoi grâce à ton explication).

Je pense maintenant que c'est effectivement la manière dont je me connecte qui pose problème.

Peux tu m'expliquer ce que je ne fais pas correctement.

Merci

Dernière modification par guilm (30/12/2010 18:27:15)

gleu · 31/12/2010 00:34:29

Par défaut, sans l'option -h, il utilise une connexion par socket de domaine unix. Sous Debian, il va chercher ça dans /var/run/postgresql. Il devrait la trouver là, or ça n'a pas l'air d'être le cas. Le mieux est certainement de lui passer l'option -h en indiquant l'adresse IP.

guilm · 31/12/2010 15:48:43

Salut,

Ok, le problème est résolu en faisant ma connexion comme suite :

psql -p 9999 -h host1 -U postgres base_de_test

Apparemment il je suis obligé de passer par cette commmande pour que la réplication se fasse. Ca parait logique en même temps.

Merci gleu pour ton aide ++

Dernière modification par guilm (31/12/2010 15:49:17)

Forums PostgreSQL.fr

#1 29/12/2010 15:05:06

Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

#2 30/12/2010 00:56:28

Re : Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

#3 30/12/2010 17:25:02

Re : Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

#4 31/12/2010 00:34:29

Re : Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

#5 31/12/2010 15:48:43

Re : Réplication qui ne fonctionne pas, mais pas d'erreurs dans la log

Pied de page des forums