Tuesday, March 17, 2015

Some Exceptions coming while Create Maven project with hibernate

Below i am describing some of the exceptions and error which might come while you are trying to build project with hibernate annotation.
In maven i have used below dependency


org.hibernate hibernate-core 4.3.8.Final org.hibernate hibernate-annotations 3.5.6-Final org.hibernate hibernate-commons-annotations 3.3.0.ga junit junit 4.8.1 we have created dto class UserDetails.java (dto.user.UserDetails) below is the hibernate.cfg.xml configuration file. false com.mysql.jdbc.Driver root123 jdbc:mysql://localhost:3306/test1 root org.hibernate.dialect.MySQLDialect true create If you can see above, we have include "com.common.Stock" for mapping and we have not defined this class in the project. So this will lead to below error. Before that below is our test code ------ public class hibernate { public static void main(String[] args) { UserDetails user = new UserDetails(); user.setUserId(1); user.setUserName("First User"); // create session factory - one object- get session from SessionFactory // need configuration to build sessionFactory // new Configuration().configure() // u need to have hibernate.cfg.xml in default place resources SessionFactory sessionFactory = new Configuration().configure() .buildSessionFactory(); Session session = sessionFactory.openSession(); session.beginTransaction(); session.save(user); session.getTransaction().commit(); session.close(); } } ------------------------------- Exception in thread "main" org.hibernate.MappingException: Unable to load class [ com.common.Stock] declared in Hibernate configuration entry at org.hibernate.cfg.Configuration.parseMappingElement(Configuration.java:2281) at org.hibernate.cfg.Configuration.parseSessionFactory(Configuration.java:2229) at org.hibernate.cfg.Configuration.doConfigure(Configuration.java:2209) at org.hibernate.cfg.Configuration.doConfigure(Configuration.java:2162) at org.hibernate.cfg.Configuration.configure(Configuration.java:2077) at org.hibernate.cfg.Configuration.configure(Configuration.java:2056) at hibernate.main(hibernate.java:17) Caused by: java.lang.ClassNotFoundException: com.common.Stock at java.net.URLClassLoader$1.run(Unknown Source) at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.hibernate.internal.util.ReflectHelper.classForName(ReflectHelper.java:193) at org.hibernate.cfg.Configuration.parseMappingElement(Configuration.java:2278) ... 6 more -------------------------------------- As you have not defined you should either remove from hibernate configuration class or comment that. Best is remove. --- 2. Error number 2 : missing of driver class for Mysql If you can look at the above mentioned dependencies, I have not included a driver for Mysql. This will give below error. Exception in thread "main" org.hibernate.boot.registry.classloading.spi.ClassLoadingException: Unable to load class [com.mysql.jdbc.Driver] at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl.classForName(ClassLoaderServiceImpl.java:243) at org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl.loadDriverIfPossible(DriverManagerConnectionProviderImpl.java:200) at org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl.buildCreator(DriverManagerConnectionProviderImpl.java:156) at org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl.configure(DriverManagerConnectionProviderImpl.java:95) at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.configureService(StandardServiceRegistryImpl.java:111) at org.hibernate.service.internal.AbstractServiceRegistryImpl.initializeService(AbstractServiceRegistryImpl.java:234) at org.hibernate.service.internal.AbstractServiceRegistryImpl.getService(AbstractServiceRegistryImpl.java:206) at org.hibernate.engine.jdbc.internal.JdbcServicesImpl.buildJdbcConnectionAccess(JdbcServicesImpl.java:260) at org.hibernate.engine.jdbc.internal.JdbcServicesImpl.configure(JdbcServicesImpl.java:94) at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.configureService(StandardServiceRegistryImpl.java:111) at org.hibernate.service.internal.AbstractServiceRegistryImpl.initializeService(AbstractServiceRegistryImpl.java:234) at org.hibernate.service.internal.AbstractServiceRegistryImpl.getService(AbstractServiceRegistryImpl.java:206) at org.hibernate.cfg.Configuration.buildTypeRegistrations(Configuration.java:1887) at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1845) at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1930) at hibernate.main(hibernate.java:18) Caused by: java.lang.ClassNotFoundException: Could not load requested class : com.mysql.jdbc.Driver at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl$AggregatedClassLoader.findClass(ClassLoaderServiceImpl.java:228) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl.classForName(ClassLoaderServiceImpl.java:240) ... 15 more ------------- Solution 2: Include dependency for driver:Mysql mysql mysql-connector-java 5.1.9 -------------- It will work fine and you will have below out in the eclipse. Note : we have used "create"\ create where this will lead to drop exist table and create every time runs. ---------------------------------- Mar 17, 2015 1:38:07 PM org.hibernate.annotations.common.reflection.java.JavaReflectionManager INFO: HCANN000001: Hibernate Commons Annotations {4.0.5.Final} Mar 17, 2015 1:38:07 PM org.hibernate.Version logVersion INFO: HHH000412: Hibernate Core {4.3.8.Final} Mar 17, 2015 1:38:07 PM org.hibernate.cfg.Environment INFO: HHH000206: hibernate.properties not found Mar 17, 2015 1:38:07 PM org.hibernate.cfg.Environment buildBytecodeProvider INFO: HHH000021: Bytecode provider name : javassist Mar 17, 2015 1:38:07 PM org.hibernate.cfg.Configuration configure INFO: HHH000043: Configuring from resource: /hibernate.cfg.xml Mar 17, 2015 1:38:07 PM org.hibernate.cfg.Configuration getConfigurationInputStream INFO: HHH000040: Configuration resource: /hibernate.cfg.xml Mar 17, 2015 1:38:07 PM org.hibernate.internal.util.xml.DTDEntityResolver resolveEntity WARN: HHH000223: Recognized obsolete hibernate namespace http://hibernate.sourceforge.net/. Use namespace http://www.hibernate.org/dtd/ instead. Refer to Hibernate 3.6 Migration Guide! Mar 17, 2015 1:38:07 PM org.hibernate.cfg.Configuration doConfigure INFO: HHH000041: Configured SessionFactory: null Mar 17, 2015 1:38:08 PM org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl configure WARN: HHH000402: Using Hibernate built-in connection pool (not for production use!) Mar 17, 2015 1:38:08 PM org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl buildCreator INFO: HHH000401: using driver [com.mysql.jdbc.Driver] at URL [jdbc:mysql://localhost:3306/test1] Mar 17, 2015 1:38:08 PM org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl buildCreator INFO: HHH000046: Connection properties: {user=root, password=****} Mar 17, 2015 1:38:08 PM org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl buildCreator INFO: HHH000006: Autocommit mode: false Mar 17, 2015 1:38:08 PM org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl configure INFO: HHH000115: Hibernate connection pool size: 20 (min=1) Mar 17, 2015 1:38:13 PM org.hibernate.dialect.Dialect INFO: HHH000400: Using dialect: org.hibernate.dialect.MySQLDialect Mar 17, 2015 1:38:13 PM org.hibernate.engine.jdbc.internal.LobCreatorBuilder useContextualLobCreation INFO: HHH000423: Disabling contextual LOB creation as JDBC driver reported JDBC version [3] less than 4 Mar 17, 2015 1:38:14 PM org.hibernate.engine.transaction.internal.TransactionFactoryInitiator initiateService INFO: HHH000399: Using default transaction strategy (direct JDBC transactions) Mar 17, 2015 1:38:14 PM org.hibernate.hql.internal.ast.ASTQueryTranslatorFactory INFO: HHH000397: Using ASTQueryTranslatorFactory Mar 17, 2015 1:38:17 PM org.hibernate.tool.hbm2ddl.SchemaExport execute INFO: HHH000227: Running hbm2ddl schema export Hibernate: drop table if exists UserDetails Hibernate: create table UserDetails (userId integer not null, userName varchar(255), primary key (userId)) Mar 17, 2015 1:38:19 PM org.hibernate.tool.hbm2ddl.SchemaExport execute INFO: HHH000230: Schema export complete Hibernate: insert into UserDetails (userName, userId) values (?, ?)

Tuesday, February 10, 2009

Change data TYPE of a column in a table

Change data TYPE of a column in a table - ALTER data type in a COLUMN in a table


alter table test_table alter column table_id TYPE bigint

Error :
column "table_id" cannot be cast to type "pg_catalog.int8"

Solution
-----------


ALTER TABLE sometable
ALTER COLUMN somecolumn TYPE new_data_type
USING some_function_call_to_cast(somecolumn);


Step 1 : you need to make a function to cast existing type to new data type
Step 2 : call alter USING the casting function

Below i have worked out one example
----------------------------------------

ALTER TABLE test_table
ALTER COLUMN table_id TYPE bigint
USING test_function_call_to_cast(table_id);

Here i have alter the column table_id from VARCHAR to BIGINT in test_table


body of the function :

-----------------------------------------------------------------

CREATE OR REPLACE FUNCTION
test_function_call_to_cast(VARCHAR) RETURNS bigint
AS
$BODY$
SELECT
CASE
WHEN trim($1) SIMILAR TO '[0-9]+'
THEN CAST(trim($1) AS bigint)
ELSE NULL
END;

$BODY$
LANGUAGE 'sql' IMMUTABLE STRICT;

--------------------------------------------------------------------------

important


Even if the above did work, you've got some stuff in there you don't really care about - letters and so forth or an empty string. So you want to control how the cast is done anyway

in the above function
which takes a string value and if it looks like a number, it converts it to a number otherwise it just returns NULL:

Monday, February 2, 2009

Checking Null Values

I tried to check null values with
expression = NULL

but i did not get any results. Obviously the results are available for my query and i was wondering why always result set was zero. Here is the solution

--------------------------------------

To check whether a value is or is not null, use the constructs
expression IS NULL
expression IS NOT NULL

or the equivalent, but nonstandard, constructs

expression ISNULL
expression NOTNULL

Do not write expression = NULL because NULL is not "equal to" NULL. (The null value represents an unknown value, and it is not known whether two unknown values are equal.) This behavior conforms to the SQL standard.

Tip: Some applications may expect that expression = NULL returns true if expression evaluates to the null value. It is highly recommended that these applications be modified to comply with the SQL standard. However, if that cannot be done the transform_null_equals configuration variable is available. If it is enabled, PostgreSQL will convert x = NULL clauses to x IS NULL. This was the default behavior in PostgreSQL releases 6.5 through 7.1.

---------------------------------------------------------------------

 So always remember to use IS NULL to check the null values 



Thursday, January 8, 2009

Postgress Error : update or delete on "acc_cost" violates foreign key constraint

I got the following error in when i tried to delete a record from a table "acc_contract"

"ERROR: update or delete on "acc_cost" violates foreign key constraint "acc_room2acc_cost" on "acc_room"
DETAIL: Key (contract_id,cost_no)=(1000,2) is still referenced from table "acc_room".
CONTEXT: SQL statement "DELETE FROM ONLY "public"."acc_cost" W "

I am using EMS to access the database, but it does not show any foreign keys in the table "acc_cost"

Then i accessed directly to DB using putty (an interface which helps you to access and run linux commands and so on) and got the description of the table "acc_cost"
it gave me the follwing out put

------------------------------------------

\d acc_cost
Table "public.acc_cost"
Column | Type | Modifiers
-------------+---------+------------------------
contract_id | bigint | not null
cost_no | integer | not null
basic | boolean | not null default false

Indexes:
"pk_acc_cost" PRIMARY KEY, btree (contract_id, cost_no)

Triggers:
"RI_ConstraintTrigger_982292245" AFTER DELETE ON acc_cost FROM acc_room NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_restrict_del"('acc_room2acc_cost', 'acc_room', 'acc_cost', 'UNSPECIFIED', 'contract_id', 'contract_id', 'cost_no', 'cost_no')
"RI_ConstraintTrigger_982292246" AFTER UPDATE ON acc_cost FROM acc_room NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_upd"('acc_room2acc_cost', 'acc_room', 'acc_cost', 'UNSPECIFIED', 'contract_id', 'contract_id', 'cost_no', 'cost_no')
"RI_ConstraintTrigger_982292715" AFTER INSERT OR UPDATE ON acc_cost FROM acc_contract NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('acc_cost2acc_contract', 'acc_cost', 'acc_contract', 'UNSPECIFIED', 'contract_id', 'contract_id')


-------------------------------------------------------

So u can see the RI constraints triggers in the table.

the problem comes because

---------------------------------------------------------

Triggers:

"RI_ConstraintTrigger_982292245" AFTER DELETE ON acc_cost FROM acc_room NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_restrict_del"('acc_room2acc_cost', 'acc_room', 'acc_cost', 'UNSPECIFIED', 'contract_id', 'contract_id', 'cost_no', 'cost_no')

------------------

So i dropped this trigger

DROP TRIGGER "RI_ConstraintTrigger_982292245" ON acc_cost;

Important
RI_ConstraintTrigger_982292245
should be within the quotations cause its name is in different case and postgres is case sensitive
after that i could delete records

What is meant by RI Trigger

I think RI stand for referential integrity. Foreign keys used to be implemented using 'create constraint trigger' which automatically names triggers 'RI_ConstraintTrigger_' then some integer which I guess is an oid (object id).

Constraint triggers execute functions to implement a constraint. RI_FKey_... are the functions that implement foreign key constraints for different events like insert, update, and delete.

Thursday, December 13, 2007

Mathematical Functions and Operators - PostgreSQL

Mathematical operators are provided for many PostgreSQL types. For types without common mathematical conventions for all possible permutations (e.g., date/time types) we describe the actual behavior in subsequent sections.

Table 7-2 shows the available mathematical operators.

Table 7-2: Mathematical Operators
Operator Description
+ addition
e.g. 2 + 3 => 5
- subtraction
e.g. 2 - 3 => -1
* multiplication
e.g. 2 * 3 => 6
/ division (integer division truncates results)
e.g. 4 / 2 => 2
% modulo (remainder)
e.g. 5 % 4 => 1
^ exponentiation
e.g. 2.0 ^ 3.0 => 8
|/ square root
e.g. |/ 25.0 => 5
||/ cube root
e.g. ||/ 27.0 => 3
! factorial
e.g. 5 ! => 120
!! factorial (prefix operator)
e.g. !! 5 => 120
@ absolute value
e.g. @ -5.0 => 5
& bitwise AND
e.g. 91 & 15 => 11
| bitwise OR
e.g. 32 | 3 => 35
# bitwise XOR
e.g. 17 # 5 => 20
~ bitwise NOT
e.g. ~1 => -2
<< bitwise shift left
e.g. 1 <<> => 16
>> bitwise shift right
e.g. 8 >> 2 => 2

The bitwise operators work only on integral data types, whereas the others are available for all numeric data types. The bitwise operators are also available for the bit string types bit and bit varying, as shown in Table 7-10.

Table 7-3 shows the available mathematical functions. In the table, dp indicates double precision. Many of these functions are provided in multiple forms with different argument types. Except where noted, any given form of a function returns the same data type as its argument. The functions working with double precision data are mostly implemented on top of the host system's C library; accuracy and behavior in boundary cases may therefore vary depending on the host system.

Table 7-3: Mathematical Functions
Function Return Type Description
abs(x) (same as x) absolute value
e.g. abs(-17.4) => 17.4
cbrt(dp) dp cube root
e.g. cbrt(27.0) => 3
ceil(dp or numeric) (same as input) smallest integer not less than argument
e.g. ceil(-42.8) => -42
ceiling(dp or numeric) (same as input) smallest integer not less than argument (alias for ceil)
e.g. ceiling(-95.3) => -95
degrees(dp) dp radians to degrees
e.g. degrees(0.5) => 28.6478897565412
exp(dp or numeric) (same as input) exponential
e.g. exp(1.0) => 2.71828182845905
floor(dp or numeric) (same as input) largest integer not greater than argument
e.g. floor(-42.8) => -43
ln(dp or numeric) (same as input) natural logarithm
e.g. ln(2.0) => 0.693147180559945
log(dp or numeric) (same as input) base 10 logarithm
e.g. log(100.0) => 2
log(b numeric, x numeric) numeric logarithm to base b
e.g. log(2.0, 64.0) => 6.0000000000
mod(y, x) (same as argument types) remainder of y/x
e.g. mod(9,4) => 1
pi() dp “\pi” constant
e.g. pi() => 3.14159265358979
power(a dp, b dp) dp a raised to the power of b
e.g. power(9.0, 3.0) => 729
power(a numeric, b numeric) numeric a raised to the power of b
e.g. power(9.0, 3.0) => 729
radians(dp) dp degrees to radians
e.g. radians(45.0) => 0.785398163397448
random() dp random value between 0.0 and 1.0
e.g. random()
round(dp or numeric) (same as input) round to nearest integer
e.g. round(42.4) => 42
round(v numeric, s int) numeric round to s decimal places
e.g. round(42.4382, 2) => 42.44
setseed(dp) int set seed for subsequent random() calls (value between 0 and 1.0)
e.g. setseed(0.54823) => 1177314959
sign(dp or numeric) (same as input) sign of the argument (-1, 0, +1)
e.g. sign(-8.4) => -1
sqrt(dp or numeric) (same as input) square root
e.g. sqrt(2.0) => 1.4142135623731
trunc(dp or numeric) (same as input) truncate toward zero
e.g. trunc(42.8) => 42
trunc(v numeric, s int) numeric truncate to s decimal places
e.g. trunc(42.4382, 2) => 42.43
width_bucket(op numeric, b1 numeric, b2 numeric, count int) int return the bucket to which operand would be assigned in an equidepth histogram with count buckets, in the range b1 to b2
e.g. width_bucket(5.35, 0.024, 10.06, 5) => 3

Finally, Table 7-4 shows the available trigonometric functions. All trigonometric functions take arguments and return values of type double precision.

Table 7-4: Trigonometric Functions
Function Description
acos(x) inverse cosine
asin(x) inverse sine
atan(x) inverse tangent
atan2(x, y) inverse tangent of x/y
cos(x) cosine
cot(x) cotangent
sin(x) sine
tan(x) tangent

Thursday, November 22, 2007

Tuning PostgreSQL for performance

Table of Contents

1 Introduction
2 Some basic parameters
2.1 Shared buffers
2.2 Sort memory
2.3 Effective Cache Size
2.4 Fsync and the WAL files
3 Some less known parameters
3.1 random_ page_cost
3.2 Vacuum_ mem
3.3 max_fsm_pages
3.4 max fsm_ relations
3.5 wal_buffers
4 Other tips
4.1 Check your file system
4.2 Try the Auto Vacuum daemon
4.3 Try FreeBSD
5 The CONF Setting Guide

1 Introduction

This is a quick start guide for tuning PostgreSQL's settings for performance. This assumes minimal familiarity with PostgreSQL administration. In particular, one should know,

It also assumes that you have gone through the PostgreSQL administration manual before starting, and to have set up your PostgreSQL server with at least the default configuration.

There are two important things for any performance optimization:

Please also note that the tuning advices described here are hints. You should not implement them all blindly. Tune one parameter at a time and test its impact and decide whether or not you need more tuning. Testing and benchmarking is an integral part of database tuning.

Tuning the software settings explored in this article is only about one-third of database performance tuning, but it's a good start since you can experiment with some basic setting changes in an afternoon, whereas some other aspects of tuning can be very time-consuming. The other two-thirds of database application tuning are:

2 Some basic parameters

2.1 Shared buffers

Shared buffers defines a block of memory that PostgreSQL will use to hold requests that are awaiting attention from the kernel buffer and CPU. The default value is quite low for any real world workload and need to be beefed up. However, unlike databases like Oracle, more is not always better. There is a threshold above which increasing this value can hurt performance.

This is the area of memory PostgreSQL actually uses to perform work. It should be sufficient enough to handle load on database server. Otherwise PostgreSQL will start pushing data to file and it will hurt the performance overall. Hence this is the most important setting one needs to tune up.

This value should be set based on the dataset size which the database server is supposed to handle at peak loads and on your available RAM (keep in mind that RAM used by other applications on the server is not available). We recommend following rule of thumb for this parameter:

PLEASE NOTE. PostgreSQL counts a lot on the OS to cache data files and hence does not bother with duplicating its file caching effort. The shared buffers parameter assumes that OS is going to cache a lot of files and hence it is generally very low compared with system RAM. Even for a dataset in excess of 20GB, a setting of 128MB may be too much, if you have only 1GB RAM and an aggressive-at-caching OS like Linux.

There is one way to decide what is best for you. Set a high value of this parameter and run the database for typical usage. Watch usage of shared memory using ipcs or similar tools. A recommended figure would be between 1.2 to 2 times peak shared memory usage.

2.2 Sort memory

This parameter sets maximum limit on memory that a database connection can use to perform sorts. If your queries have order-by or group-by clauses that require sorting large data set, increasing this parameter would help. But beware: this parameter is per sort, per connection. Think twice before setting this parameter too high on any database with many users. A recommended approach is to set this parameter per connection as and when required; that is, low for most simple queries and higher for large, complex queries and data dumps.

2.3 Effective Cache Size

This parameter allows PostgreSQL to make best possible use of RAM available on your server. It tells PostgreSQL the size of OS data cache. So that PostgreSQL can draw different execution plan based on that data.

Say there is 1.5GB RAM in your machine, shared buffers are set to 32MB and effective cache size is set to 800MB. So if a query needs 700MB of data set, PostgreSQL would estimate that all the data required should be available in memory and would opt for more aggressive plan in terms of optimization, involving heavier index usage and merge joins. But if effective cache is set to only 200MB, the query planner is liable to opt for the more I/O efficient sequential scan.

While setting this parameter size, leave room for other applications running on the server machine. The objective is to set this value at the highest amount of RAM which will be available to PostgreSQL all the time.

2.4 Fsync and the WAL files

This parameters sets whether or not write data to disk as soon as it is committed, which is done through Write Ahead Logging (WAL). If you trust your hardware, your power company, and your battery power supply enough, you set this to No for an immediate boost to data write speed. But be very aware that any unexpected database shutdown will force you to restore the database from your last backup.

If that's not an option for you, you can still have the protection of WAL and better performance. Simply move your WAL files, using either a mount or a symlink to the pg_xlog directory, to a separate disk or array from your main database files. In high-write-activity databases, WAL should have its own disk or array to ensure continuous high-speed access. Very large RAID arrays and SAN/NAS devices frequently handle this for you through their internal management systems.

3 Some less known parameters

3.1 random_page_cost

This parameter sets the cost to fetch a random tuple from the database, which influences the planner's choice of index vs. table scan. This is set to a high value as the default default based on the expectation of slow disk access. If you have reasonably fast disks like SCSI or RAID, you can lower the cost to 2. You need to experiment to find out what works best for your setup by running a variety of queries and comparing execution times.

3.2 Vacuum_mem

This parameter sets the memory allocated to Vacuum. Normally, vacuum is a disk intensive process, but raising this parameter will speed it up by allowing PostgreSQL to copy larger blocks into memory. Just don't set it so high it takes significant memory away from normal database operation. Things between 16-32MB should be good enough for most setups.

3.3 max_fsm_pages

PostgreSQL records free space in each of its data pages. This information is useful for vacuum to find out how many and which pages to look for when it frees up the space.

If you have a database that does lots of updates and deletes, that is going to generate dead tuples, due to PostgreSQL's MVCC system. The space occupied by dead tuples can be freed with vacuum, unless there is more wasted space than is covered by the Free Space Map, in which case the much less convenient "vacuum full" is required. By expanding the FSM to cover all of those dead tuples, you might never again need to run vacuum full except on holidays.

The best way to set max _ fsm _ pages is interactive; First, figure out the vacuum (regular) frequency of your database based on write activity; next, run the database under normal production load, and run "vacuum verbose analyze" instead of vacuum, saving the output to a file; finally, calculate the maximum total number of pages reclaimed between vacuums based on the output, and use that.

Remember, this is a database cluster wide setting. So bump it up enough to cover all databases in your database cluster. Also, each FSM page uses 6 bytes of RAM for administrative overhead, so increasing FSM substantially on systems low on RAM may be counter-productive.

3.4 max _ fsm _ relations

This setting dictates how many number of relations (tables) will be tracked in free space map. Again this is a database cluster-wide setting, so set it accordingly. In version 7.3.3 and later, this parameter should be set correctly as a default. In older versions, bump it up to 300-1000.

3.5 wal_buffers

This setting decides the number of buffers WAL(Write ahead Log) can have. If your database has many write transactions, setting this value bit higher than default could result better usage of disk space. Experiment and decide. A good start would be around 32-64 corresponding to 256-512K memory.

4 Other tips

4.1 Check your file system

On OS like Linux, which offers multiple file systems, one should be careful about choosing the right one from a performance point of view. There is no agreement between PostgreSQL users about which one is best.

Contrary to popular belief, today's journaling file systems are not necessarily slower compared to non-journaling ones. Ext2 can be faster on some setups but the recovery issues generally make its use prohibitive. Different people have reported widely different experiences with the speed of Ext3, ReiserFS, and XFS; quite possibly this kind of benchmark depends on a combination of file system, disk/array configuration, OS version, and database table size and distribution. As such, you may be better off sticking with the file system best supported by your distribution, such as ReiserFS for SuSE Linux or Ext3 for Red Hat Linux, not to forget XFS known for it's large file support . Of course, if you have time to run comprehensive benchmarks, we would be interested in seeing the results!

As an easy performance boost with no downside, make sure the file system on which your database is kept is mounted "noatime", which turns off the access time bookkeeping.

4.2 Try the Auto Vacuum daemon

There is a little known module in PostgreSQL contrib directory called as pgavd. It works in conjunction with statistics collector. It periodically connects to a database and checks if it has done enough operations since the last check. If yes, it will vacuum the database.

Essentially it will vacuum the database when it needs it. It would get rid of playing with cron settings for vacuum frequency. It should result in better database performance by eliminating overdue vacuum issues.

4.3 Try FreeBSD

Large updates, deletes, and vacuum in PostgreSQL are very disk intensive processes. In particular, since vacuum gobbles up IO bandwidth, the rest of the database activities could be affected adversely when vacuuming very large tables.

OS's from the BSD family, such as FreeBSD, dynamically alter the IO priority of a process. So if you lower the priority of a vacuum process, it should not chew as much bandwidth and will better allow the database to perform normally. Of course this means that vacuum could take longer, which would be problematic for a "vacuum full."

If you are not done with your choice of OS for your server platform, consider BSD for this reason.

5 The CONF Setting Guide

Available here is an Annotated Guide to the PostgreSQL configuration file settings, in both OpenOffice.org and PDF format. This guide expands on the official documentation and may eventually be incorporated into it.
  • The first column of the chart is the GUC setting in the postgresql.conf file.
  • The second is the maximum range of the variable; note that the maximum range is often much larger than the practical range. For example, random_page_cost will accept any number between 0 and several billion, but all practical numbers are between 1 and 5.
  • The third column contains an enumeration of RAM or disk space used by each unit of the parameter.
  • The fourth column indicates whether or not the variable may be SET from the PSQL terminal during an interactive setting. Most settings marked as "no" may only be changed by restarting PostgreSQL.
  • The fifth column quotes the official documentation available from the PostgreSQL web site.
  • The last column is our notes on the setting, how to set it, resources it uses, etc. You'll notice some blank spaces, and should be warned as well that there is still strong disagreement on the value of many settings.
Users of PostgreSQL 7.3 and earlier will notice that the order of the parameters in this guide do not match the order of the parameters in your postgresql.conf file. This is because this document was generated as part of an effort to re-organize the conf parameters and documentation; starting with 7.4, this document, the official documentation, and the postgresql.conf file are all in the same logical order.

As noted in the worksheet, it covers PostgreSQL versions 7.3 and 7.4. If you are using an earlier version, you will not have access to all of these settings, and defaults and effects of some settings will be different.