PostgreSQL Server Configuration

See the PostgreSQL documentation for more information about how to change the PostgreSQL configuration in general.

Performance tuning

The default configuration of PostgreSQL uses system resource very sparingly. To improve the performance of the ANNIS service it is necessary to change a few settings in the PostgreSQL configuration file postgresql.conf as shown in the excerpt below.

Most of the options below are commented out in the postgresql.conf file. This means that PostgreSQL will use the default value, i.e. the value as it appears in the postgresql.conf file, for this option. To make your changes take effect you have to uncomment it.

The values below are for machine with 2 GB RAM that is exclusively dedicated to running PostgreSQL. If you're running ANNIS on your local machine and don't have large corpora, you should use lower values as explained in the comments.

Changes in postgresql.conf:

max_connections = 100        # expected maximum number of connections (users) at peak load

shared_buffers = 512MB      # RAM cache shared across all sessions
                            # 25% of available RAM
                            # use lower value on a desktop system, i.e. 128MB or 256MB

work_mem = 128MB            # RAM for *one* sort, aggregate or hash operation inside a query plan 
                            # RAM / (2 x max_connections)
                            # (many operations can run in parallel!)
                            # increase for large corpora, i.e. 256MB, 512MB
                            # decrease if you have many users

maintenance_work_mem = 256MB    # RAM for maintenance operations during corpus import
                                # CREATE INDEX, VACUUM etc.
                                # increase for large corpora

effective_cache_size = 1536MB   # estimated size of disk cached used by the OS
                                # 75% of available RAM
                                # use lower value on a desktop system, where other applications are running
                                # e.g. 512MB for a desktop with 2 GB RAM

default_statistcs_target = 100      # size of value histogram for each table column
                                    # use maximum value

checkpoint_segments = 20    # affects how quickly buffers are written
                            # to disk inside a transaction

autovacuum = off        # VACUUM is done automatically during corpus import

More information on these settings can be found in the PostgreSQL manual:


If you want to log the duration of SQL statements you should also set the following options in postgresql.conf:

log_min_duration_statement = 0

Remote access

If the PostgreSQL server runs on a separate machine, remote access has to be enabled.

Changes in postgresql.conf:

listen_adresses = 'localhost,'

Where is the IP address of the machine running PostgreSQL.

Changes in pg_hba.conf:

host annis_db annis_user md5

Where is the machine running the ANNIS service that is connecting to the remote PostgreSQL server.

Configuration of System Resources

PostgreSQL needs to access large areas of continuous RAM which can easily exceed the maximum size allowed by the operating system. PostgreSQL will check the OS resource settings during startup and exit with an error if they are not adequate.

Reproduced below are the commands to change the resource settings on Linux and OS X. More information can be found in the PostgreSQL manual: [Managing Kernel Resources]( 17.4.).


sysctl -w kernel.shmmax=536870912   # bytes; corresponds to 512MB

This command takes effect immediately. To make the change permanent across system reboots, add it to the file /etc/sysctl.conf.

Mac OS X

sysctl -w kern.sysv.shmmax=536870912        # bytes; corresponds to 512MB
sysctl -w kern.sysv.shmall=131072           # measured in 4 kB pages

OS X has to be rebooted for the command to take effect. To make the change permanent across system reboots, add it to the file /etc/sysctl.conf.