Basic query optimization in Postgres
Let's load the dataset into Postgres. First, run a Postgres Docker container:
```shell
docker run -d --name postgres-shared-buff-default -e POSTGRES_PASSWORD=ungabunga postgres
```
If you use a tool like pgAdmin, please follow this link to import the CSV data into the database.
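If you prefer the command line, psql's `\copy` can load the CSV directly. This is a sketch, not the exact import used here: the file path and CSV options are assumptions, and the column layout of your export may differ.

```shell
# Hypothetical import via psql's \copy; adjust the path and options
# to match your CSV export of the worldcities dataset.
psql -h localhost -U postgres -c \
  "\copy worldcities from '/tmp/worldcities.csv' with (format csv, header true)"
```

`\copy` runs client-side, so the file only needs to be readable by psql, not by the server process.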
Let's see how many rows we have using a count query:
```sql
select count(id) from worldcities;

 count
-------
 25596
(1 row)
```
We know that creating indexes helps with searching, and by default Postgres creates indexes using the B-tree strategy. Analysing the following statement yields this result:
```sql
explain analyse select * from worldcities where iso2='IN';

QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on worldcities  (cost=0.00..650.95 rows=433 width=70) (actual time=0.018..5.340 rows=433 loops=1)
   Filter: (iso2 = 'IN'::text)
   Rows Removed by Filter: 25163
 Planning Time: 0.743 ms
 Execution Time: 5.424 ms
(5 rows)
```
We are doing a sequential scan here and filtering the results.
Now let's create an index on the iso2 field, since our application will run many queries against it.
```sql
drop index if exists idx_world_country_iso2; -- drop the index if it exists
create index idx_world_country_iso2 on worldcities using hash(iso2); -- create the index using hash
```
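Worth noting: a hash index only supports equality comparisons (`=`). If the application ever needs range or ordering predicates on iso2, the default B-tree index covers both. A sketch, with a hypothetical index name:

```shell
# Hypothetical alternative: a B-tree index handles both equality and
# range predicates (e.g. iso2 >= 'A' and iso2 < 'C'); hash handles only equality.
psql -U postgres -c \
  "create index idx_world_country_iso2_btree on worldcities (iso2);"
```

For a pure equality workload like `iso2='IN'`, hash is a reasonable choice; for mixed workloads, B-tree is the safer default.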
Let's re-run our explain analyse query:
```sql
explain analyse select * from worldcities where iso2='IN';

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on worldcities  (cost=15.36..369.48 rows=433 width=70) (actual time=0.162..0.747 rows=433 loops=1)
   Recheck Cond: (iso2 = 'IN'::text)
   Heap Blocks: exact=142
   ->  Bitmap Index Scan on idx_world_country_iso2  (cost=0.00..15.25 rows=433 width=0) (actual time=0.099..0.100 rows=433 loops=1)
         Index Cond: (iso2 = 'IN'::text)
 Planning Time: 0.267 ms
 Execution Time: 0.851 ms
(7 rows)
```
You can clearly see the difference in Execution Time before and after creating the index on the iso2 field.
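To put a number on it, the two Execution Time values reported in the plans above work out to roughly a 6x speedup. A quick check with awk:

```shell
# Compare the Execution Time values from the two query plans:
# 5.424 ms (seq scan) vs 0.851 ms (bitmap scan via the hash index).
awk -v before=5.424 -v after=0.851 'BEGIN { printf "%.1fx faster\n", before/after }'
# prints: 6.4x faster
```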
If you do not see any improvement after creating the index, it is most probably because the database's shared_buffers value is very low; it defaults to 128MB:
```sql
 shared_buffers
----------------
 128MB
(1 row)
```
To test this scenario, run your Docker container with a custom shared_buffers value like 1GB.
Docker command to use a custom shared_buffers value:
```shell
docker run -d --name postgres-shared-buff-1024MB \
  -e POSTGRES_PASSWORD=ungabunga postgres \
  -c shared_buffers=1024MB
```
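To confirm the new container actually picked up the setting, you can query it from the host; a sketch using the container name from the command above:

```shell
# Ask the running container for its effective shared_buffers value.
docker exec postgres-shared-buff-1024MB \
  psql -U postgres -c 'show shared_buffers;'
```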
Now `show shared_buffers;` should report the current value, and hopefully you will see some improvement:
```sql
show shared_buffers;

 shared_buffers
----------------
 1GB
(1 row)
```