Basic query optimization in Postgres
Let's load the dataset into Postgres. First, run a Postgres Docker container:
```shell
docker run -d --name postgres-shared-buff-default -e POSTGRES_PASSWORD=ungabunga postgres
```
If you use a tool like pgAdmin, please follow this link to import the CSV data into the database.
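If you prefer the command line, psql's `\copy` can load the CSV directly. This is a sketch, not the exact import used here: the file path and CSV options are assumptions, and the column layout of your export may differ.

```shell
# Hypothetical import via psql's \copy; adjust the path and options
# to match your CSV export of the worldcities dataset.
psql -h localhost -U postgres -c \
  "\copy worldcities from '/tmp/worldcities.csv' with (format csv, header true)"
```

`\copy` runs client-side, so the file only needs to be readable by psql, not by the server process.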
Let's see how many rows we have using a count query:
```sql
select count(id) from worldcities;

 count
-------
 25596
(1 row)
```
We know that creating indexes helps with searching, and by default Postgres creates indexes using the B-tree strategy. Analysing the following statement yields this result:
```sql
explain analyse select * from worldcities where iso2='IN';

QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on worldcities  (cost=0.00..650.95 rows=433 width=70) (actual time=0.018..5.340 rows=433 loops=1)
   Filter: (iso2 = 'IN'::text)
   Rows Removed by Filter: 25163
 Planning Time: 0.743 ms
 Execution Time: 5.424 ms
(5 rows)
```
We are doing a sequential scan here and filtering the results.
Now let's create an index on the iso2 field, since our application will run many queries against it.
```sql
drop index if exists idx_world_country_iso2; -- drop the index if it exists
create index idx_world_country_iso2 on worldcities using hash(iso2); -- create the index using hash
```
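Worth noting: a hash index only supports equality comparisons (`=`). If the application ever needs range or ordering predicates on iso2, the default B-tree index covers both. A sketch, with a hypothetical index name:

```shell
# Hypothetical alternative: a B-tree index handles both equality and
# range predicates (e.g. iso2 >= 'A' and iso2 < 'C'); hash handles only equality.
psql -U postgres -c \
  "create index idx_world_country_iso2_btree on worldcities (iso2);"
```

For a pure equality workload like `iso2='IN'`, hash is a reasonable choice; for mixed workloads, B-tree is the safer default.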
Let's re-run our explain analyse query:
```sql
explain analyse select * from worldcities where iso2='IN';

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on worldcities  (cost=15.36..369.48 rows=433 width=70) (actual time=0.162..0.747 rows=433 loops=1)
   Recheck Cond: (iso2 = 'IN'::text)
   Heap Blocks: exact=142
   ->  Bitmap Index Scan on idx_world_country_iso2  (cost=0.00..15.25 rows=433 width=0) (actual time=0.099..0.100 rows=433 loops=1)
         Index Cond: (iso2 = 'IN'::text)
 Planning Time: 0.267 ms
 Execution Time: 0.851 ms
(7 rows)
```
You can clearly see the difference in Execution Time before and after creating the index on the iso2 field.
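To put a number on it, the two Execution Time values reported in the plans above work out to roughly a 6x speedup. A quick check with awk:

```shell
# Compare the Execution Time values from the two query plans:
# 5.424 ms (seq scan) vs 0.851 ms (bitmap scan via the hash index).
awk -v before=5.424 -v after=0.851 'BEGIN { printf "%.1fx faster\n", before/after }'
# prints: 6.4x faster
```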
If you do not see any improvement after creating the index, it is most probably because the database's shared_buffers value is very low; it defaults to 128MB:
```sql
 shared_buffers
----------------
 128MB
(1 row)
```
To test this scenario, run your Docker container with a custom shared_buffers value like 1GB.
Docker command to use a custom shared_buffers value:
```shell
docker run -d --name postgres-shared-buff-1024MB \
  -e POSTGRES_PASSWORD=ungabunga postgres \
  -c shared_buffers=1024MB
```
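To confirm the new container actually picked up the setting, you can query it from the host; a sketch using the container name from the command above:

```shell
# Ask the running container for its effective shared_buffers value.
docker exec postgres-shared-buff-1024MB \
  psql -U postgres -c 'show shared_buffers;'
```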
Now `show shared_buffers;` should report the current value, and hopefully you will see some improvement:
```sql
show shared_buffers;

 shared_buffers
----------------
 1GB
(1 row)
```