{"id":451,"date":"2022-02-11T21:39:35","date_gmt":"2022-02-11T21:39:35","guid":{"rendered":"https:\/\/maboc.nl\/?p=451"},"modified":"2022-02-22T18:58:07","modified_gmt":"2022-02-22T18:58:07","slug":"db-2-cost-calculations-1-table-scan","status":"publish","type":"post","link":"https:\/\/maboc.nl\/?p=451","title":{"rendered":"DB 2 : Cost calculations 1 : Table Scan"},"content":{"rendered":"<p><a href=\"https:\/\/maboc.nl\/?p=487\" rel=\"noopener\" target=\"_blank\">Index<\/a><\/p>\n<p><strong>Why?<\/strong><br \/>\nWhen you supply DB2 with a query DB2 has to make some serious decisions:<br \/>\n&#8211; Should indexes be utilized?<br \/>\n&#8211; Which indexes are most suitable?<br \/>\n&#8211; Which join operation should be used? (hash join\/nested loop)<br \/>\n&#8211; Is all data needed at once or are the first 10 rows as quick as possible (the remainder comes later) okay?<\/p>\n<p>The optimizer is the part of the DB2 engine which gives answers to all these questions. All possibilities are taken into account to determine an excution-plan for the specific query. To do this the optimizer considers a lot of possibilitiesand give them a weigth. The excution plan with the lowest weight (wich we will call cost) will be chosen as _the_ execution-plan for the query.<\/p>\n<p><strong>What is cost?<\/strong><br \/>\nThe optimizer needs a way to measure how long an operation takes. That can be I\/O, that can be a calculation, it all costs time. The longer we have to wait, the more costly an operation is. We want to keep the cost as low as possible. The optimizer will try to find out an execution plan with lowest possible cost.<\/p>\n<p>The total Cost is composed of I\/O Cost and CPU Cost.<br \/>\nIn the following examples I focus on the IO component of the cost. This is not to bad since I _only_ do Full Table Scan (TBScan)<\/p>\n<p>As soon as we start to hit joins (hash join \/ nested loop \/ &#8230;) or sorts, the CPU Cost will increase, and I will also look at the CPU component of the cost.<\/p>\n<p><strong>Pages<\/strong><br \/>\nDB2 is fetching pages from the datafiles!!! It is not fetching single rows, it fetches pages. If DB2 needs one single row it still fetches a whole page from the datafile. In-memory the exact row will be retrieved from the page.<br \/>\nSo we should have a focus on pages (not rows) when discussing cost of I\/O in a DB2 system.<br \/>\nI can easily imagine tables which are very narrow (a few columns of char or integer), or very width (a few varchar(255) columns).<br \/>\nThe number of narrow rows in a page is much bigger compared to the number of width rows on a page. If rou need 1000 rows of a narrow table they may come from only 1 or 2 pages. For a very width table on the other hand 1000 rows may well be stored in 10000 pages. For DB2 the number of pages in respect to I\/O is important, not the number of rows.<\/p>\n<p><strong>Demo Table<\/strong><br \/>\nTo play around we need to create some demonstration tables. I&#8217;ll just show one here.<\/p>\n<pre>drop table cost1;\r\n\r\ncreate table cost1 (n1 int, n2 int, n3 int, t varchar(100));\r\n\r\ninsert into cost1 \r\n  WITH engine (rownum) AS (\r\n                            SELECT 1 AS rownum \r\n                            from   sysibm.sysdummy1\r\n                            UNION ALL\r\n                            SELECT rownum + 1 AS rownum\r\n                            FROM   engine\r\n                            where  rownum&lt;10\r\n                          )\r\n  SELECT rownum, \r\n         round(rownum\/1000,0), rownum%1000, 't'\r\n  FROM   engine;\r\n\r\ncommit;\r\n\r\nrunstats on table cost1 and indexes all;\r\n<\/pre>\n<p>Thanks to recursive CTE (Common Table Expressions) I can create a table with an arbitrary number of rows. In the above example I\u00b4ll create a table with 10 rows.<\/p>\n<p>For the purpose of this demonstration I actually created a few more tables (see below). After creating and filling the tables I collected statistics on them:<\/p>\n<p>These table do not allready have indexes but that will come in future articles, and it&#8217;s my habit to collect basic statistics like this.<\/p>\n<p>Look at some statistics<\/p>\n<pre>select cast(tabname as varchar(10)) tabname,\r\n       npages,\r\n       mpages,\r\n       fpages,\r\n       card,\r\n       card\/npages rows_per_page\r\nfrom   syscat.tables\r\nwhere  lower(tabschema)=lower('db2inst1')\r\n       and lower(tabname) like lower('cost%');\r\n\r\n\r\nTABNAME    NPAGES    MPAGES    FPAGES    CARD      ROWS_PER_PAGE       \r\n---------- --------- --------- --------- --------- -------------\r\nCOST0              0         0         1         0            -1\r\nCOST1              1         0         1        10            10\r\nCOST2              2         0         2       100            50\r\nCOST3              9         0         9      1000           111\r\nCOST4             78         0        78     10000           128\r\nCOST5            776         0       776    100000           128\r\nCOST6           7756         0      7756   1000000           128\r\n<\/pre>\n<p>Now move on. Let&#8217;s see how the optimizer thinks we can get _all_ data from a table. There is more then 1 way to look at explain plans, but for the (quite simple) explain we perform now, the following will suffice:<\/p>\n<pre>db2expln -t -g -d tinus -q \"select * from cost1\"\r\n\r\nOptimizer Plan:\r\n\r\n    Rows   \r\n  Operator \r\n    (ID)   \r\n    Cost   \r\n          \r\n    10    \r\n  RETURN  \r\n   ( 1)   \r\n  6.76828 \r\n    |     \r\n    10    \r\n  TBSCAN  \r\n   ( 2)   \r\n  6.76828 \r\n    |     \r\n    10    \r\n Table:   \r\n DB2INST1 \r\n COST1<\/pre>\n<p>The above is just the graphical representation of the explain plan. db2expln actually gives you must more information.<\/p>\n<p>-t : The statement is on one line (no separator)<br \/>\n-g : Also give a graphical representation of the execution plan<br \/>\n-d : The database on wich the explain operates<br \/>\n-q : The actual statement we want explained<\/p>\n<p>&#8211; As you can see at the very bottom of the graph, the source of the rows is table cost1 which has 10 rows.<br \/>\n&#8211; The first operation is a TBSCAN and has a cost of 6.76828<br \/>\n&#8211; The first operation (the TBSCAN) has ID 2. In the other parts of the explain plan (which are not show here) this operation is labeled 2.<br \/>\n&#8211; The query is expected to return 10 rows and has a cost of 6,76828. The total cost consist of only the cost of the TBSCAN (which is almost completly I\/O).<\/p>\n<table style=\"border-collapse: collapse; width: 100%; height: 135px;\">\n<tbody>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Table<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\"><strong>Cost0<\/strong><\/td>\n<td style=\"width: 10.452%; height: 27px;\"><strong>Cost1<\/strong><\/td>\n<td style=\"width: 11.6527%; height: 27px;\"><strong>Cost2<\/strong><\/td>\n<td style=\"width: 8.93356%; height: 27px;\"><strong>Cost3<\/strong><\/td>\n<td style=\"width: 8.77471%; height: 27px;\"><strong>Cost4<\/strong><\/td>\n<td style=\"width: 10.7433%; height: 27px;\"><strong>Cost5<\/strong><\/td>\n<td style=\"width: 16.6755%; height: 27px;\"><strong>Cost6<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Rows<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">0<\/td>\n<td style=\"width: 10.452%; height: 27px;\">10<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">100<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">1000<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">10000<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">100000<\/td>\n<td style=\"width: 16.6755%; height: 27px;\">1000000<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Pages<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">1<\/td>\n<td style=\"width: 10.452%; height: 27px;\">1<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">2<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">9<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">78<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">776<\/td>\n<td style=\"width: 16.6755%; height: 27px;\">7756<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Cost Table Scan<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">6.76747<\/td>\n<td style=\"width: 10.452%; height: 27px;\">6.76828<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">13.5412<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">60.9737<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">71.2083<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">691.727<\/td>\n<td style=\"width: 16.6755%; height: 27px;\">6917.09<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Cost\/Page<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">6,76747<\/td>\n<td style=\"width: 10.452%; height: 27px;\">6,76828<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">6,7706<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">6,7749<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">0,9130<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">0,8914<\/td>\n<td style=\"width: 16.6755%; height: 27px;\">0,8918<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>What is shown here that the cost increases if the number of pages increases. However if we look at the cost per page we see a ceratin drop in cost\/page.<\/p>\n<p>The cost is almost the same for the table with 1, 2 and 9 pages (around 6,77\/page).\u00a0 Then there is a decrease in the cost to 0,89\/page.<\/p>\n<p>At a certain point I need to start to tell about prefetching of pages, but that point is not yet. I&#8217; ll suffice by telling that the prefetch page count is 32. Let&#8217;s close in to that 32 pages mark. Create a table with 3900 rows (31 pages) and 4000 rows (32 pages)<\/p>\n<table style=\"border-collapse: collapse; width: 100%;\">\n<tbody>\n<tr>\n<td style=\"width: 33.3333%;\"><strong>Table<\/strong><\/td>\n<td style=\"width: 33.3333%;\"><strong>Cost7<\/strong><\/td>\n<td style=\"width: 33.3333%;\"><strong>Cost8<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%;\"><strong>Rows<\/strong><\/td>\n<td style=\"width: 33.3333%;\">3900<\/td>\n<td style=\"width: 33.3333%;\">4000<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%;\"><strong>Pages<\/strong><\/td>\n<td style=\"width: 33.3333%;\">31<\/td>\n<td style=\"width: 33.3333%;\">32<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%;\"><strong>Cost Table Scan<\/strong><\/td>\n<td style=\"width: 33.3333%;\">210,053<\/td>\n<td style=\"width: 33.3333%;\">28,5167<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%;\"><strong>Cost\/Page<\/strong><\/td>\n<td style=\"width: 33.3333%;\">6,7759<\/td>\n<td style=\"width: 33.3333%;\">0,8911<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The cut-off is 32 pages (the prefetch page count)<\/p>\n<p>Just for good measure, also look at a table with a different width. We&#8217;ll make the varchar column a 100 characters width.<\/p>\n<pre>insert into cost2_1 \r\n  WITH engine (rownum) AS (\r\n                            SELECT 1 AS rownum \r\n                            from   sysibm.sysdummy1\r\n                            UNION ALL\r\n                            SELECT rownum + 1 AS rownum\r\n                            FROM   engine\r\n                            where  rownum&lt;10\r\n                          )\r\n  SELECT rownum, \r\n         round(rownum\/1000,0), rownum%1000, rpad('t',100, 'x') \r\n  FROM   engine;\r\n<\/pre>\n<p>The text column isn&#8217;t 1 character width anymore but 100. The row is much longer, so less rows fit on a page, and thus we need more pages for the same number of rows.<\/p>\n<p>See what the tables look like now:<\/p>\n<pre>TABNAME     NPAGES       MPAGES       FPAGES       CARD         ROWS_PER_PAGE       \r\n----------- ------------ ------------ ------------ ------------ -------------\r\nCOST2_0                0            0            1            0            -1\r\nCOST2_1                1            0            1           10            10\r\nCOST2_2                4            0            4          100            25\r\nCOST2_3               34            0           34         1000            29\r\nCOST2_7              131            0          131         3900            29\r\nCOST2_8              134            0          134         4000            29\r\nCOST2_4              334            0          334        10000            29\r\nCOST2_5             3335            0         3335       100000            29\r\nCOST2_6            33349            0        33349      1000000            29\r\n<\/pre>\n<table style=\"border-collapse: collapse; width: 100%; height: 135px;\">\n<tbody>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Table<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\"><strong>Cost2_0<\/strong><\/td>\n<td style=\"width: 10.452%; height: 27px;\"><strong>Cost2_1<\/strong><\/td>\n<td style=\"width: 11.6527%; height: 27px;\"><strong>Cost2_2<\/strong><\/td>\n<td style=\"width: 8.93356%; height: 27px;\"><strong>Cost2_3<\/strong><\/td>\n<td style=\"width: 8.77471%; height: 27px;\"><strong>Cost2_4<\/strong><\/td>\n<td style=\"width: 10.7433%; height: 27px;\"><strong>Cost2_5<\/strong><\/td>\n<td style=\"width: 8.33775%; height: 27px;\"><strong>Cost2_6<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Rows<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">0<\/td>\n<td style=\"width: 10.452%; height: 27px;\">10<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">100<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">1000<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">10000<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">100000<\/td>\n<td style=\"width: 8.33775%; height: 27px;\">1000000<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Pages<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">1<\/td>\n<td style=\"width: 10.452%; height: 27px;\">1<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">4<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">34<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">334<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">3335<\/td>\n<td style=\"width: 8.33775%; height: 27px;\">33349<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Cost Table Scan<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">6,76747<\/td>\n<td style=\"width: 10.452%; height: 27px;\">6,76828<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">27,0724<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">35,0786<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">296,723<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">2946,83<\/td>\n<td style=\"width: 8.33775%; height: 27px;\">29461,6<\/td>\n<\/tr>\n<tr style=\"height: 27px;\">\n<td style=\"width: 23.5875%; height: 27px;\"><strong>Cost\/Page<\/strong><\/td>\n<td style=\"width: 9.18083%; height: 27px;\">6,76747<\/td>\n<td style=\"width: 10.452%; height: 27px;\">6,76828<\/td>\n<td style=\"width: 11.6527%; height: 27px;\">6,7681<\/td>\n<td style=\"width: 8.93356%; height: 27px;\">1,032<\/td>\n<td style=\"width: 8.77471%; height: 27px;\">0,8884<\/td>\n<td style=\"width: 10.7433%; height: 27px;\">0,8836<\/td>\n<td style=\"width: 8.33775%; height: 27px;\">0,8834<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Again, around the 32 pages mark, the cost\/page drops dramaticly<br \/>\nAnd again for the small tablesthe cost is around 6,76 and for the tables with more then 32 pages the cost is around 0,88.<\/p>\n<p>In a next article I would like to see how indices behave with respect to cost.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Index Why? When you supply DB2 with a query DB2 has to make some serious decisions: &#8211; Should indexes be utilized? &#8211; Which indexes are most suitable? &#8211; Which join operation should be used? (hash join\/nested loop) &#8211; Is all data needed at once or are the first 10 rows as quick as possible (the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[81,64],"tags":[82,65,14],"class_list":["post-451","post","type-post","status-publish","format-standard","hentry","category-cost","category-db2","tag-cost","tag-db2","tag-sql"],"_links":{"self":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/451","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=451"}],"version-history":[{"count":24,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/451\/revisions"}],"predecessor-version":[{"id":497,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/451\/revisions\/497"}],"wp:attachment":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=451"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=451"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=451"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}