Monday, September 18, 2006

Performance Tuning PostgreSQL

Introduction

PostgreSQL is the most advanced and flexible Open Source SQL database today. With this power and flexibility comes a problem. How do the PostgreSQL developers tune the default configuration for everyone? Unfortunately the answer is they can't.

The problem is that every database is not only different in its design, but also its requirements. Some systems are used to log mountains of data that is almost never queried. Others have essentially static data that is queried constantly, sometimes feverishly. Most systems however have some, usually unequal, level of reads and writes to the database. Add this little complexity on top of your totally unique table structure, data, and hardware configuration and hopefully you begin to see why tuning can be difficult.

The default configuration PostgreSQL ships with is a very solid configuration aimed at everyone's best guess as to how an "average" database on "average" hardware should be setup. This article aims to help PostgreSQL users of all levels better understand PostgreSQL performance tuning.

Understanding the process

The first step to learning how to tune your PostgreSQL database is to understand the life cycle of a query. Here are the steps of a query:

  1. Transmission of query string to database backend
  2. Parsing of query string
  3. Planning of query to optimize retrieval of data
  4. Retrieval of data from hardware
  5. Transmission of results to client

The first step is the sending of the query string ( the actual SQL command you type in or your application uses ) to the database backend. There isn't much you can tune about this step, however if you have a very large queries that cannot be prepared in advance it may help to put them into the database as a stored procedure and cut the data transfer down to a minimum.

Once the SQL query is inside the database server it is parsed into tokens. This step can also be minimized by using stored procedures.

The planning of the query is where PostgreSQL really starts to do some work. This stage checks to see if the query is already prepared if your version of PostgreSQL and client library support this feature. It also analyzes your SQL to determine what the most efficient way of retrieving your data is. Should we use an index and if so which one? Maybe a hash join on those two tables is appropriate? These are some of the decisions the database makes at this point of the process. This step can be eliminated if the query is previously prepared.

Now that PostgreSQL has a plan of what it believes to be the best way to retrieve the data, it is time to actually get it. While there are some tuning options that help here, this step is mostly effected by your hardware configuration.

And finally the last step is to transmit the results to the client. While there aren't any real tuning options for this step, you should be aware that all of the data that you are returning is pulled from the disk and sent over the wire to your client. Minimizing the number of rows and columns to only those that are necessary can often increase your performance.

General Tuning

There are several postmaster options that can be set that drastically affect performance, below is a list of the most commonly used and how they effect performance:

  • max_connections = — This option sets the maximum number of database backend to have at any one time. Use this feature to ensure that you do not launch so many backends that you begin swapping to disk and kill the performance of all the children. Depending on your application it may be better to deny the connection entirely rather than degrade the performance of all of the other children.
  • shared_buffers = — Editing this option is the simplest way to improve the performance of your database server. The default of 1000 is very low for most modern hardware. Each buffer is typically 8192 bytes unless changed at compile time. General wisdom says that this should be set to roughly 10-15% of available RAM on the system. Like most of the options I will outline here you will simply need to try them and see how well it works on your system. I would recommend playing in the 10,000 to 20,000 shared_buffers range. Much more than that and you can actually degrade performance.
  • work_mem = — This option is used to control the amount of memory using in sort operations and hash tables. While you may need to increase the amount of memory if you do a ton of sorting in your application, care needs to be taken. This isn't a system wide parameter, but a per operation one. So if a complex query has several sort operations in it it will use multiple work_mem units of memory. Not to mention that multiple backends could be doing this at once. This query can often lead your database server to swap if the value is too large. This option was previously called sort_mem in older versions of PostgreSQL.
  • max_fsm_pages = — This option helps to control the free space map. When something is deleted from a table it isn't removed from the disk immediately, it is simply marked as "free" in the free space map. The space can then be reused for any new INSERTs that you do on the table. If your setup has a high rate of DELETEs and INSERTs it may be necessary increase this value to avoid table bloat.
  • fsync = — This option determines if all your WAL pages are fsync()'ed to disk before a transactions is committed. Having this on is safer, but can reduce write performance. If fsync is not enabled there is the chance of unrecoverable data corruption. Turn this off at your own risk.
  • commit_delay = and commit_siblings = — These options are used in concert to help improve performance by writing out multiple transactions that are committing at once. If there are commit_siblings number of backends active at the instant your transaction is committing then the server waiting commit_delay microseconds to try and commit multiple transactions at once.
  • effective_cache_size = — This value helps PostgreSQL's optimizer to determine how effective the operating system's disk cache is when determining if it should use an index or not. The larger the value increases the likely hood of using an index. You should play with this value based on the amount of disk pages your hardware/operating system combo typically has as disk cache.
  • random_page_cost = — random_page_cost controls the way PostgreSQL views non-sequential disk reads. A higher value makes it more likely that a sequential scan will be used over an index scan indicating that your server has very fast disks.

Note that many of these options consume shared memory and it will probably be necessary to increase the amount of shared memory allowed on your system to get the most out of these options.

Hardware Issues

Obviously the type and quality of the hardware you use for your database server drastically impacts the performance of your database. Here are a few tips to use when purchasing hardware for your database server:

  • CPUs — The more CPUs the better, however if your database does not use many complex functions your money is best spent on a better disk subsystem. Also, avoid Intel Xeon processors with PostgreSQL as there is a problem with the context switching in these processors that gives sub-par performance. Opterons are generally accepted as being a superior CPU for PostgreSQL databases.
  • RAM — The more RAM you have the more disk cache you will have. This greatly impacts performance considering memory I/O is thousands of times faster than disk I/O.
  • Disk types — Obviously fast Ultra-320 SCSI disks are your best option, however high end SATA drives are also very good. With SATA each disk is substantially cheaper and with that you can afford more spindles than with SCSI on the same budget.
  • Disk configuration — The optimum configuration is RAID 1+0 with as many disks as possible and with your transaction log (pg_xlog) on a separate disk all by itself. RAID 5 is not a very good option for databases unless you have more than 6 disks in your volume. With newer versions of PostgreSQL you can also use the tablespaces option to put different tables, databases, and indexes on different disks to help optimize performance. Such as putting your often used tables on a fast SCSI disk and the less used ones slower IDE or SATA drives.

In general the more RAM and disk spindles you have in your system the better it will perform. This is because with the extra RAM you will access your disks less. And the extra spindles help spread the reads and writes over multiple disks to increase throughput and to reduce drive head congestion.

Another good idea is to separate your application code and your database server onto different hardware. Not only does this provide more hardware dedicated to the database server, but the operating system's disk cache will contain more PostgreSQL data and not other various application or system data this way.

For example, if you have one web server and one database server you can use a cross-over cable on a separate ethernet interface to handle just the web server to database network traffic to ensure you reduce any possible bottlenecks there. You can also obviously create an entirely different physical network for database traffic if you have multiple servers that access the same database server.

Useful Tuning Tools

The most useful tool in tuning your database is the SQL command EXPLAIN ANALYZE. This allows you to profile each SQL query your application performs and see exactly how the PostgreSQL planner will process the query. Let's look at a short example, below is a simple table structure and query.

CREATE TABLE authors (
id int4 PRIMARY KEY,
name varchar
);

CREATE TABLE books (
id int4 PRIMARY KEY,
author_id int4,
title varchar
);

If we use the query:


EXPLAIN ANALYZE SELECT authors.name, books.title FROM books, authors
WHERE books.author_id=16 and authors.id = books.author_id ORDER BY books.title;

You will get output similar to the following:

                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Sort (cost=29.71..29.73 rows=6 width=64) (actual time=0.189..16.233 rows=7 loops=1)
Sort Key: books.title
-> Nested Loop (cost=0.00..29.63 rows=6 width=64) (actual time=0.068..0.129 rows=7 loops=1)
-> Index Scan using authors_pkey on authors (cost=0.00..5.82 rows=1 width=36) (actual time=0.029..0.033 rows=1 loops=1)
Index Cond: (id = 16)
-> Seq Scan on books (cost=0.00..23.75 rows=6 width=36) (actual time=0.026..0.052 rows=7 loops=1)
Filter: (author_id = 16)
Total runtime: 16.386 ms

You need to read this output from bottom to top when analyzing it. The first thing PostgreSQL does is do a sequence scan on the books table looking at each author_id column for values that equal 16. Then it does an index scan of the authors table, because of the implicit index created by the PRIMARY KEY options. The finally the results our sorted by books.title.

The values you see in parenthesis are the estimated and actual cost of that portion of the query. The closer together the estimate and the actual costs are the better performance you will typically see.

Now, let's change the structure a little bit by adding an index on books.author_id to avoid the sequence scan with this command:

CREATE INDEX books_idx1 on books(author_id);

If you rerun the query again, you won't see any noticeable change in the output. This is because PostgreSQL has not yet re-analyzed the data and determined that the new index may help for this query. This can be solved by running:

ANALYZE books;

However, in this small test case I'm working with the planner still favors the sequence scan because there aren't very many rows in my books table. If a query is going to return a large portion of a table then the planner chooses a sequence scan over an index because it is actually faster. You can also force PostgreSQL to favor index scans over sequential scans by setting the configuration parameter enable_seqscan to off. This doesn't remove all sequence scans, since some tables may not have an index, but it does force the planner's hand into always using an index scan when it is available. This is probably best done by sending the command SET enable_seqscan = off at the start of every connection rather than setting this option database wide. This way you can control via your application code when this is in effect. However, in general disabling sequence scans should only be used in tuning your application and is not really intended for every day use.

Typically the best way to optimize your queries is to use indexes on specific columns and combinations of columns to correspond to often used queries. Unfortunately this is done by trial and error. You should also note that increasing the number of indexes on a table increases the number of write operations that need to be performed for each INSERT and UPDATE. So don't do anything silly and just add indexes for each column in each table.

You can help PostgreSQL do what you want by playing with the level of statistics that are gathered on a table or column with the command:

ALTER TABLE ALTER COLUMN  SET STATISTICS ;

This value can be a number between 0 and 1000 and helps PostgreSQL determine what level of statistics gathering should be performed on that column. This helps you to control the generated query plans without having slow vacuum and analyze operations because of generating large amounts of stats for all tables and columns.

Another useful tool to help determine how to tune your database is to turn on query logging. You can tell PostgreSQL which queries you are interested in logging via the log_statement configuration option. This is very useful in situations where you many users executing ad hoc queries to your system via something like Crystal Reports or via psql directly.

Database Design and Layout

Sometimes the design and layout of your database affects performance. For example, if you have an employee database that looks like this:

CREATE TABLE employees (
id int4 PRIMARY KEY,
active boolean,
first_name varchar,
middle_name varchar,
last_name varchar,
ssn varchar,
address1 varchar,
address2 varchar,
city varchar,
state varchar(2),
zip varchar,
home_phone varchar,
work_phone varchar,
cell_phone varchar,
fax_phone varchar,
pager_number varchar,
business_email varchar,
personal_email varchar,
salary int4,
vacation_days int2,
sick_days int2,
employee_number int4,
office_addr_1 varchar,
office_addr_2 varchar,
office_city varchar,
office_state varchar(2),
office_zip varchar,
department varchar,
title varchar,
supervisor_id int4
);

This design is easy to understand, but isn't very good on several levels. While it will depend on your particular application, in most cases you won't need to access all of this data at one time. In portions of your application that deal with HR functions you are probably only interested in their name, salary, vacation time, and sick days. However, if the application displays an organization chart it would only be concerned with the department and supervisor_id portions of the table.

By breaking up this table into smaller tables you can get more efficient queries since PostgreSQL has less to read through, not to mention better functionality. Below is one way to make this structure better:

CREATE TABLE employees (
id int4 PRIMARY KEY,
active boolean,
employee_number int4,
first_name varchar,
middle_name varchar,
last_name varchar,
department varchar,
title varchar,
email varchar
);

CREATE TABLE employee_address (
id int4 PRIMARY KEY,
employee_id int4,
personal boolean,
address_1 varchar,
address_2 varchar,
city varchar,
state varchar(2),
zip varchar
);

CREATE TABLE employee_number_type (
id int4 PRIMARY KEY,
type varchar
);

CREATE TABLE employee_number (
id int4 PRIMARY KEY,
employee_id int4,
type_id int4,
number varchar
);

CREATE TABLE employee_hr_info (
id int4 PRIMARY KEY,
employee_id int4,
ssn varchar,
salary int4,
vacation_days int2,
sick_days int2
);

With this table structure the data associated with an employee is broken out into logical groupings. The main table contains the most frequently used information and the other tables store all of the rest of the information. The added benefit of this layout is that you can have any number of phone numbers and addresses associated with a particular employee now.

Another useful tip is to use partial indexes on columns where you typically query a certain value more often than another. Take for example the employee table above. You're probably only displaying active employees throughout the majority of the application, but creating a partial index on that column where the value is true can help speed up the query and may help the planner to choose to use the index in cases where it otherwise would not. You can create a partial index like this:

CREATE INDEX employee_idx2 ON employee(active) WHERE active='t';

Or you may have a situation where a row has a column named 'employee_id' that is null until the row is associated with an employee, maybe in some trouble ticket like system. In that type of application you would probably have a 'View Unassigned Tickets' portion of the application which would benefit from a partial index such as this:

CREATE INDEX tickets_idx1 ON tickets(employee_id) WHERE employee_id IS NULL;

Application Development

There are many different ways to build applications which use a SQL database, but there are two very common themes that I will call stateless and stateful. In the area of performance there are different issues that impact each.

Stateless is typically the access type used by web based applications. Your software connects to the database, issues a couple of queries, returns to results to the user, and disconnects. The next action the users takes restarts this process with a new connect, new set of queries, etc.

Stateful applications are typically non-web based user interfaces where an application initiates a database connection and holds it open for the duration the application is in use.

Stateless Applications

In web based applications each time something is requested by the user , the application initiates a new database connection. While PostgreSQL has a very short connection creation time and in general it is not a very expensive operation, it is best to use some sort of database connection pooling method to get maximum performance.

There are several ways to accomplish database connection pooling, here is a short list of common ones:

  • Pgpool is a small server that you run on the same server as your clients that will pool database connections to some local or remote server. The application simply points at the pgpool instance instead of the normal postmaster. From the application's perspective nothing has changed as the connection pooling is hidden from it.
  • In a mod_perl environment you can use Apache::DBI to handle database connection pooling inside of Apache itself.
  • SQLRelay is another db connection manager that is somewhat database agnostic. It works with with several databases other than PostgreSQL.
  • You can always write a small bit of code to do this for you yourself, but I would highly recommend using an already developed solution to reduce the amount of debugging you have to do.

It should be noted that in a few bizarre instances I've actually seen database connection pooling reduce the performance of web based applications. At a certain point the cost of handling the pooling is more expensive than simply creating a new connection. I suggest testing it both ways to see which is best for your environment.

Stateful Applications

When building stateful applications you should look into using database cursors via the DECLARE command. A cursor allows you to plan and execute a query, but only pull back the data as you need it, for example one row at a time. This can greatly increase the snappiness of the UI.

General Application Issues

These issues typically effect both stateful and stateless applications in the same fashion. One good technique is to use server side prepared queries for any queries you execute often. This reduces the overall query time by caching the query plan for later use.

It should be noted however if you prepare a query in advance using placeholder values ( such as 'column_name = ?' ) then the planner will not always be able to choose the best plan. For example, your query has a placeholder for the boolean column 'active' and you have a partial index on false values the planner won't use it because it cannot be sure the value passed in on execution will be true or false.

You can also obviously utilize stored procedures here to reduce the transmit, parse, and plan portions of the typical query life cycle. It is best to profile your application and find commonly used queries and data manipulations and put them into a stored procedure.

Other Useful Resources

Here is a short list of other items that may be of help.

No comments: