Sql-server – PostgreSQL 9.3 on Ubuntu Server 12.04 vs. MS SQL Server 2008 R2 on Windows 7 Ultimate

linux, postgresql, sql-server, ubuntu, windows

I have read many articles found via Google comparing the performance of PostgreSQL and MS SQL Server 2008 R2, and many of them claim that PostgreSQL outperforms MSSQL 2008 R2. So I ran a real test with a real case from my company.

Hardware:
I built two VMware guests in ESXi 5.5. Each guest has the same virtual hardware specification; only the key items are listed:

  • Memory: 4GB
  • CPUs: one virtual socket with two cores
  • Virtual Disk: 50GB for the Ubuntu server and 200GB for Windows 7 (well, as you know, Microsoft always consumes more storage space)

VM host (a single box):

  • RAM: 16GB
  • CPU: Intel Core i5 (2 cores with Hyper-Threading)
  • HD: 1TB, 2.5"

Software:

  • Windows 7 Ultimate + MS SQL Server 2008 R2, all with default parameters
  • Ubuntu Linux Server 12.04 + PostgreSQL 9.3, all with default parameters
  • DB tools: DbVisualizer 9.0.9 (on another laptop)
  • JDBC:
    • PostgreSQL: postgresql-9.3-1101.jdbc4
    • MSSQL 2008R2: sqljdbc_4.0.2206.100_cht.tar

Table schemas (the same on PostgreSQL and MSSQL 2008 R2, except for the naming convention):

I have three tables as below:

CREATE TABLE gourmet (
  vendor_url CHARACTER VARYING(200),
  ticket_price CHARACTER VARYING(8000),
  tour_advice CHARACTER VARYING(8000),
  gourmet_name_c CHARACTER VARYING(180),
  tel_ext CHARACTER VARYING(5),
  gourmet_name_e CHARACTER VARYING(60),
  arrangement CHARACTER VARYING(8000),
  data_source CHARACTER VARYING(200),
  open_time CHARACTER VARYING(8000),
  vendor_name CHARACTER VARYING(60),
  gps_x CHARACTER VARYING(20),
  gourmet_id CHARACTER VARYING(36) NOT NULL,
  gps_y CHARACTER VARYING(20),
  updated_by CHARACTER VARYING(20),
  recom_level DOUBLE PRECISION,
  fax_num CHARACTER VARYING(10),
  address_c CHARACTER VARYING(240),
  fax_area_code CHARACTER VARYING(4),
  long_desc CHARACTER VARYING(8000),
  for_search_only BOOLEAN,
  status CHARACTER VARYING(1),
  category CHARACTER VARYING(3),
  area_code CHARACTER VARYING(4),
  county CHARACTER VARYING(3),
  email CHARACTER VARYING(40),
  updated_date TIMESTAMP(6) WITHOUT TIME ZONE,
  data_source_url CHARACTER VARYING(200),
  tour_duration CHARACTER VARYING(400),
  short_name CHARACTER VARYING(180),
  created_by CHARACTER VARYING(20),
  created_date TIMESTAMP(6) WITHOUT TIME ZONE,
  town CHARACTER VARYING(3),
  address_e CHARACTER VARYING(80),
  parking_info CHARACTER VARYING(8000),
  remark CHARACTER VARYING(8000),
  location_info CHARACTER VARYING(8000),
  tel_no CHARACTER VARYING(10),
  CONSTRAINT gourmet_pk PRIMARY KEY (gourmet_id)
);

CREATE INDEX _dta_index_gourmet_5_574625090__k9_k4_k1_k10_k31_k2_11_14_15_33 ON gourmet (
  county,
  status,
  gourmet_id,
  town,
  vendor_url,
  gourmet_name_c
);

CREATE INDEX _dta_index_gourmet_7_1796201449__k9_k1_k2 ON gourmet (
  county,
  gourmet_id,
  gourmet_name_c
);

CREATE INDEX idx_grourmet_status ON gourmet (
  status
);
=======================================================

CREATE TABLE photos (
  source CHARACTER VARYING(100),
  caption CHARACTER VARYING(100),
  display_order INTEGER,
  main_photo BOOLEAN,
  bytes_original BYTEA,
  folder_id CHARACTER VARYING(36),
  photo_id INTEGER DEFAULT nextval('photos_photo_id_seq'::regclass) NOT NULL,
  bytes_thumb BYTEA,
  CONSTRAINT photos_pk PRIMARY KEY (photo_id)
);

CREATE INDEX _dta_index_photos_5_1925581898__k2_k5_1 ON photos (
  folder_id,
  main_photo
);

CREATE INDEX idx_photos_folder_id ON photos (
  folder_id
);

CREATE INDEX _dta_index_photos_5_1925581898__k5_k2_k1 ON photos (
  main_photo,
  folder_id,
  photo_id
);

====================================================

CREATE TABLE counters (
  item_id CHARACTER VARYING(36) NOT NULL,
  counter_name CHARACTER VARYING(30) NOT NULL,
  target_date CHARACTER(10) NOT NULL,
  total_hit BIGINT,
  CONSTRAINT counters_pk PRIMARY KEY (item_id, counter_name, target_date)
);

There are 4108 records in table gourmet, 37451 records in table photos, and 11659353 records in table counters.
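
The counts themselves come from plain count queries on each table, e.g. on the PostgreSQL side:

SELECT count(*) FROM gourmet;   -- 4108
SELECT count(*) FROM photos;    -- 37451
SELECT count(*) FROM counters;  -- 11659353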

To keep the result set's transmission from skewing the timing, I only count the number of records.

SQL statement used in PostgreSQL:

SELECT count(*) FROM (
  SELECT    a.gourmet_id item_id, a.gourmet_name_c item_name, a.vendor_url external_url,
            'fnf-item.aspx?pid=' || a.gourmet_id fun_taiwan_url,
            b.photo_id, coalesce(SUM(c.total_hit), 0) hit_count,
            to_char(a.created_date, 'YYYY/MM/DD HH24:MI:SS') created_date,
            '1' is_gourmet, a.for_search_only, a.area_code || '-' || a.tel_no tel_no,
            a.tel_ext, a.county, a.town, a.address_c, 'gourmet.png' icon_name,
            now() as updated_date, 'NonOrderScheduler' as updated_by
  FROM      gourmet a
  LEFT JOIN photos b ON b.folder_id = a.gourmet_id AND b.main_photo = '1'
  LEFT JOIN counters c ON c.item_id = a.gourmet_id AND c.counter_name = 'FnfHit'
  WHERE     a.status = 'A'
  GROUP BY  a.gourmet_id, a.gourmet_name_c, length(a.vendor_url), a.vendor_url,
            'fnf-item.aspx?pid=' || a.gourmet_id, b.photo_id,
            to_char(a.created_date, 'YYYY/MM/DD HH24:MI:SS'),
            a.for_search_only, a.area_code || '-' || a.tel_no, a.tel_ext,
            a.county, a.town, a.address_c
) t

SQL used in MSSQL 2008 R2:

CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS; -- Clear the buffer cache (after CHECKPOINT flushes dirty pages)
GO
SELECT count(*) FROM (
  SELECT    a.GourmetId ItemId, a.GourmetNameC ItemName, a.VendorUrl ExternalUrl,
            'fnf-item.aspx?pid=' + a.GourmetId FunTaiwanUrl,
            b.PhotoId, ISNULL(SUM(c.TotalHit), 0) HitCount,
            CONVERT(VARCHAR, a.CreatedDate, 111) CreatedDate,
            '1' IsGourmet, a.ForSearchOnly, a.AreaCode + '-' + a.TelNo TelNo,
            a.TelExt, a.County, a.Town, a.AddressC, 'gourmet.png' IconName,
            GetDate() as UpdatedDate, 'NonOrderScheduler' as UpdatedBy
  FROM      Gourmet a
  LEFT JOIN Photos b ON b.FolderId = a.GourmetId AND b.MainPhoto = 1
  LEFT JOIN Counters c ON c.ItemId = a.GourmetId AND c.CounterName = 'FnfHit'
  WHERE     a.[Status] = 'A'
  GROUP BY  a.GourmetId, a.GourmetNameC, len(a.VendorUrl), a.VendorUrl,
            'fnf-item.aspx?pid=' + a.GourmetId, b.PhotoId,
            CONVERT(VARCHAR, a.CreatedDate, 111),
            a.ForSearchOnly, a.AreaCode + '-' + a.TelNo, a.TelExt,
            a.County, a.Town, a.AddressC
) t
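
For completeness, server-side timings on the MSSQL side can be surfaced with the standard SET STATISTICS switches (a sketch; the numbers appear in the Messages tab, and the statement shown is just any statement you want timed):

SET STATISTICS TIME ON;  -- report server-side parse/compile and execution times
SELECT count(*) FROM Gourmet WHERE [Status] = 'A';  -- statement under test
SET STATISTICS TIME OFF; -- keeps client rendering out of the measurement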

Apart from dialect syntax and function names, the two SQL statements are logically identical.

Test method:

I execute the SQL statements in DbVisualizer from another laptop. Only the two VMware guests run on the VMware host.
While testing PostgreSQL, the Windows 7 Ultimate guest has no foreground software running, and vice versa while testing on the Ubuntu server.

With the same table schemas, the same records, and the same SQL statement run against MS SQL Server 2008 R2 on Windows 7 Ultimate, I found the result is far different from those articles claiming PostgreSQL is faster than MSSQL 2008 R2.

Counting the result set, MSSQL 2008 R2 takes 0.016 seconds while PostgreSQL takes 132 seconds. Note that before executing the SQL on MSSQL 2008 R2 I cleared its buffer cache first, but I did not clear any cache on PostgreSQL.

My question is: how should I tune PostgreSQL so that it queries faster than MSSQL 2008 R2 in this case?

I know some of you may ask me to tune the SQL statement with the EXPLAIN ANALYZE command. However, I don't want to tune the SQL statement until the server (OS + PostgreSQL) parameters have been tuned as well as possible, because I am planning to move all of my company's data from MSSQL 2008 R2 to PostgreSQL, and it is impractical to tune every SQL statement in every system.

If tuning the OS and PostgreSQL parameters can solve my problem, I would prefer not to touch the SQL statements in the short term; you know that would be a huge job.

Thank you for your help.

Attached is the EXPLAIN ANALYZE result from PostgreSQL:

    HashAggregate  (cost=305063.90..328086.20 rows=920892 width=126) (actual time=4595.501..4598.155 rows=4019 loops=1)
      ->  Nested Loop Left Join  (cost=450.60..272832.68 rows=920892 width=126) (actual time=4.304..2567.521 rows=1690393 loops=1)
            ->  Hash Right Join  (cost=450.04..1230.97 rows=4132 width=118) (actual time=4.223..20.423 rows=4019 loops=1)
                  Hash Cond: ((b.folder_id)::text = (a.gourmet_id)::text)
                  ->  Bitmap Heap Scan on photos b  (cost=200.44..848.93 rows=12149 width=13) (actual time=0.896..5.782 rows=12166 loops=1)
                        Filter: main_photo
                        ->  Bitmap Index Scan on _dta_index_photos_5_1925581898__k5_k2_k1  (cost=0.00..197.41 rows=12149 width=0) (actual time=0.840..0.840 rows=12166 loops=1)
                              Index Cond: (main_photo = true)
                  ->  Hash  (cost=199.36..199.36 rows=4019 width=114) (actual time=3.305..3.305 rows=4019 loops=1)
                        Buckets: 1024  Batches: 1  Memory Usage: 603kB
                        ->  Seq Scan on gourmet a  (cost=0.00..199.36 rows=4019 width=114) (actual time=0.012..1.768 rows=4019 loops=1)
                              Filter: ((status)::text = 'A'::text)
                              Rows Removed by Filter: 90
            ->  Index Scan using counters_pk on counters c  (cost=0.56..60.72 rows=223 width=16) (actual time=0.029..0.132 rows=421 loops=4019)
                  Index Cond: (((item_id)::text = (a.gourmet_id)::text) AND ((counter_name)::text = 'FnfHit'::text))
    Total runtime: 4600.750 ms

MSSQL 2008 R2 execution plan:

  |--Compute Scalar(DEFINE:([Expr1013]='fnf-item.aspx?pid='+[SlowTravel].[dbo].[Gourmet].[GourmetId] as [a].[GourmetId], [Expr1014]=isnull([Expr1012],(0)), [Expr1015]='1', [Expr1016]='gourmet.png', [Expr1017]=getdate(), [Expr1018]='NonOrderScheduler'))
       |--Compute Scalar(DEFINE:([Expr1012]=CASE WHEN [Expr1027]=(0) THEN NULL ELSE [Expr1028] END))
            |--Stream Aggregate(GROUP BY:([a].[GourmetId], [Expr1008], [Expr1009], [b].[PhotoId], [Expr1010], [Expr1011]) DEFINE:([Expr1027]=COUNT_BIG([SlowTravel].[dbo].[Counters].[TotalHit] as [c].[TotalHit]), [Expr1028]=SUM([SlowTravel].[dbo].[Counters].[TotalHit] as [c].[TotalHit]), [a].[GourmetNameC]=ANY([SlowTravel].[dbo].[Gourmet].[GourmetNameC] as [a].[GourmetNameC]), [a].[VendorUrl]=ANY([SlowTravel].[dbo].[Gourmet].[VendorUrl] as [a].[VendorUrl]), [a].[ForSearchOnly]=ANY([SlowTravel].[dbo].[Gourmet].[ForSearchOnly] as [a].[ForSearchOnly]), [a].[TelExt]=ANY([SlowTravel].[dbo].[Gourmet].[TelExt] as [a].[TelExt]), [a].[County]=ANY([SlowTravel].[dbo].[Gourmet].[County] as [a].[County]), [a].[Town]=ANY([SlowTravel].[dbo].[Gourmet].[Town] as [a].[Town]), [a].[AddressC]=ANY([SlowTravel].[dbo].[Gourmet].[AddressC] as [a].[AddressC])))
                 |--Nested Loops(Left Outer Join, OUTER REFERENCES:([a].[GourmetId], [Expr1026]) WITH UNORDERED PREFETCH)
                      |--Merge Join(Left Outer Join, MERGE:([a].[GourmetId])=([b].[FolderId]), RESIDUAL:([SlowTravel].[dbo].[Photos].[FolderId] as [b].[FolderId]=[SlowTravel].[dbo].[Gourmet].[GourmetId] as [a].[GourmetId]))
                      |    |--Sort(ORDER BY:([a].[GourmetId] ASC))
                      |    |    |--Compute Scalar(DEFINE:([Expr1008]=len([SlowTravel].[dbo].[Gourmet].[VendorUrl] as [a].[VendorUrl]), [Expr1009]='fnf-item.aspx?pid='+[SlowTravel].[dbo].[Gourmet].[GourmetId] as [a].[GourmetId], [Expr1010]=CONVERT(varchar(30),[SlowTravel].[dbo].[Gourmet].[CreatedDate] as [a].[CreatedDate],111), [Expr1011]=([SlowTravel].[dbo].[Gourmet].[AreaCode] as [a].[AreaCode]+'-')+[SlowTravel].[dbo].[Gourmet].[TelNo] as [a].[TelNo]))
                      |    |         |--Table Scan(OBJECT:([SlowTravel].[dbo].[Gourmet] AS [a]), WHERE:([SlowTravel].[dbo].[Gourmet].[Status] as [a].[Status]='A'))
                      |    |--Index Seek(OBJECT:([SlowTravel].[dbo].[Photos].[_dta_index_Photos_5_1925581898__K5_K2_K1] AS [b]), SEEK:([b].[MainPhoto]=(1)) ORDERED FORWARD)
                      |--Clustered Index Seek(OBJECT:([SlowTravel].[dbo].[Counters].[Counters_PK] AS [c]), SEEK:([c].[CounterName]='FnfHit' AND [c].[ItemId]=[SlowTravel].[dbo].[Gourmet].[GourmetId] as [a].[GourmetId]) ORDERED FORWARD)

How do I avoid the Sort in the execution plan? I didn't actually ask for any sort in the SQL statement.
Thank you for your help.

Best Answer

How do I avoid the Sort in the execution plan? I didn't actually ask for any sort in the SQL statement.

You've asked for rows to be aggregated. One way to do this is to sort the data set and then scan it to collapse out duplicates. This can be faster than hash aggregation, which is the other way PostgreSQL knows how to do grouping.

So while you didn't explicitly say "sort the rows", it's still sorting them because of something you asked for.
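
If you want to see the two grouping strategies side by side, you can temporarily disable hash aggregation in a session and compare the plans (a sketch for experimentation only; enable_hashagg is a standard planner toggle):

-- For comparison only, never in production: disable hash aggregation
-- so the planner must use the Sort + GroupAggregate strategy instead
-- of the default HashAggregate.
SET enable_hashagg = off;
EXPLAIN SELECT county, count(*) FROM gourmet GROUP BY county;
RESET enable_hashagg;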

The immediate problem is that PostgreSQL is being very conservative with how much memory it's using for sorts:

Sort Method: external merge Disk: 317656kB

and is doing a roughly 300MB on-disk sort. You can see that pretty clearly if you take a look at the plan on explain.depesz.com.

If you:

SET work_mem = '400MB';

before running the query, it should be a whole lot different.
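
A minimal sketch of the test, using a cut-down version of your query (SET only affects the current session, and EXPLAIN (ANALYZE, BUFFERS) is available in 9.3):

SET work_mem = '400MB';     -- session-local; sized to hold the ~310MB sort
EXPLAIN (ANALYZE, BUFFERS)  -- shows whether the Sort still spills to disk
SELECT    a.gourmet_id, sum(c.total_hit)
FROM      gourmet a
LEFT JOIN counters c ON c.item_id = a.gourmet_id AND c.counter_name = 'FnfHit'
WHERE     a.status = 'A'
GROUP BY  a.gourmet_id;
RESET work_mem;             -- back to the configured default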

Unfortunately it's not as simple as just changing your configuration, because PostgreSQL isn't too clever about resource management. Per the documentation on work_mem, each sort or hash operation may use up to work_mem bytes, a single query can run several such operations, and every session can do this at once. So if you have max_connections = 50 and you're running complicated queries, you might find yourself using many gigabytes of working memory, exceeding free memory, and hitting swap. Which you really don't want.
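
The arithmetic is worth spelling out: 50 connections × 400MB is 20GB of potential sort memory on a 4GB guest. That's why it's safer to scope a large work_mem narrowly; one option (a sketch, where report_role is a hypothetical role name) is a per-role override:

-- Hypothetical: grant a large work_mem only to the reporting role,
-- keeping the global default conservative for all other connections.
ALTER ROLE report_role SET work_mem = '400MB';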

It also seems to be doing a seqscan on counters, but since about 1/4 of the rows match the condition, that's probably the right thing to do; an index scan would probably be slower.

I find the default work_mem way too conservative and tend to set it to at least 100MB on any reasonably large system. I also prefer to run PostgreSQL with PgBouncer in front, and a low max_connections, allowing me to throw more resources at each individual connection.
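
Either way, it's worth confirming what the server is actually using; both settings are visible with plain SHOW commands:

SHOW work_mem;         -- per-operation sort/hash memory budget
SHOW max_connections;  -- how many sessions can each claim that budget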

Frankly, I'm curious to see how MS SQL Server executes this, because the numbers you report are astonishing for a query like this.