PostgreSQL Server Configuration

See the PostgreSQL documentation for more information about how to change the PostgreSQL configuration in general.

Performance tuning

The default configuration of PostgreSQL uses system resource very sparingly. To improve the performance of the ANNIS service it is necessary to change a few settings in the PostgreSQL configuration file postgresql.conf as shown in the excerpt below.

Most of the options below are commented out in the postgresql.conf file. This means that PostgreSQL will use the default value, i.e. the value as it appears in the postgresql.conf file, for this option. To make your changes take effect you have to uncomment it.

The values below are for machine with 2 GB RAM that is exclusively dedicated to running PostgreSQL. If you're running ANNIS on your local machine and don't have large corpora, you should use lower values as explained in the comments.

Changes in postgresql.conf:

max_connections = 100        # expected maximum number of connections (users) at peak load

shared_buffers = 512MB      # RAM cache shared across all sessions
                            # 25% of available RAM
                            # use lower value on a desktop system, i.e. 128MB or 256MB

work_mem = 128MB            # RAM for *one* sort, aggregate or hash operation inside a query plan 
                            # RAM / (2 x max_connections)
                            # (many operations can run in parallel!)
                            # increase for large corpora, i.e. 256MB, 512MB
                            # decrease if you have many users

maintenance_work_mem = 256MB    # RAM for maintenance operations during corpus import
                                # CREATE INDEX, VACUUM etc.
                                # increase for large corpora

effective_cache_size = 1536MB   # estimated size of disk cached used by the OS
                                # 75% of available RAM
                                # use lower value on a desktop system, where other applications are running
                                # e.g. 512MB for a desktop with 2 GB RAM

default_statistics_target = 100     # size of value histogram for each table column
                                    # use maximum value

# The following only applies to versions of PostgreSQL prior to 9.5:
checkpoint_segments = 20    # affects how quickly buffers are written
                            # to disk inside a transaction
# For version 9.5 and higher, you can set the parameter max_wal_size to
# approx. 3 * checkpoint_segments * 16MB, but since the default for
# max_wal_size is much higher than the default for checkpoint_segments
# used to be, this may no longer be necessary.

autovacuum = off        # VACUUM is done automatically during corpus import

More information on these settings can be found in the PostgreSQL manual:

Logging

If you want to log the duration of SQL statements you should also set the following options in postgresql.conf:

log_min_duration_statement = 0

Remote access

If the PostgreSQL server runs on a separate machine, remote access has to be enabled.

Changes in postgresql.conf:

listen_adresses = 'localhost,192.168.1.1'

Where 192.168.1.1 is the IP address of the machine running PostgreSQL.

Changes in pg_hba.conf:

host annis_db annis_user 192.168.1.2/0 md5

Where 192.168.1.2 is the machine running the ANNIS service that is connecting to the remote PostgreSQL server.

Configuration of System Resources

PostgreSQL needs to access large areas of continuous RAM which can easily exceed the maximum size allowed by the operating system. PostgreSQL will check the OS resource settings during startup and exit with an error if they are not adequate.

Reproduced below are the commands to change the resource settings on Linux and OS X. More information can be found in the PostgreSQL manual: [Managing Kernel Resources](http://www.postgresql.org/docs/8.3/interactive/kernel-resources.html 17.4.).

Linux

sysctl -w kernel.shmmax=536870912   # bytes; corresponds to 512MB

This command takes effect immediately. To make the change permanent across system reboots, add it to the file /etc/sysctl.conf.

Mac OS X

sysctl -w kern.sysv.shmmax=536870912        # bytes; corresponds to 512MB
sysctl -w kern.sysv.shmall=131072           # measured in 4 kB pages

OS X has to be rebooted for the command to take effect. To make the change permanent across system reboots, add it to the file /etc/sysctl.conf.