{"id":476,"date":"2022-02-20T17:17:08","date_gmt":"2022-02-20T17:17:08","guid":{"rendered":"https:\/\/maboc.nl\/?p=476"},"modified":"2022-02-22T18:58:34","modified_gmt":"2022-02-22T18:58:34","slug":"db2-cost-calculation-2-index-only","status":"publish","type":"post","link":"https:\/\/maboc.nl\/?p=476","title":{"rendered":"DB2 : Cost Calculation 2 : index only"},"content":{"rendered":"<p><a href=\"https:\/\/maboc.nl\/?p=487\" rel=\"noopener\" target=\"_blank\">Index<\/a><\/p>\n<p><strong>Why indexes<\/strong><br \/>\nSearching through millions of rows can be the right thing to do (<a href=\"https:\/\/maboc.nl\/?p=451\">TBSCAN<\/a>). It might be the only option available to get the data you need.But in a lot of cases we need smarter ways to get to our data. If you need only one row,it seems silly to process thousands of pages to look for that one row. Then indexes came along: a bit of the data in a table but then already ordered for fast lookup. <\/p>\n<p>What an index looks like, and how they work, is not the scope of this article, but I think that an article will come.<\/p>\n<p>Suffice it to say that we are talking about <a href=\"https:\/\/www.ibm.com\/docs\/en\/db2\/11.1?topic=indexes-index-structure\" rel=\"noopener\" target=\"_blank\">BTree indexes<\/a>.<\/p>\n<p><strong>Setup<\/strong><br \/>\nFollowing table is created for testing:<\/p>\n<pre>\r\ncreate table ind1 (n1 int not null primary key, \r\n                   n2 int, \r\n                   n3 int, \r\n                   t varchar(100));\r\n\r\ninsert into ind1 \r\n  WITH engine (rownum) AS (\r\n                            SELECT 1 AS rownum \r\n                            from   sysibm.sysdummy1\r\n                            UNION ALL\r\n                            SELECT rownum + 1 AS rownum\r\n                            FROM   engine\r\n                            where  rownum<100000\r\n                          )\r\n  SELECT rownum, \r\n         case when rownum<50000\r\n              then 1\r\n              else 2\r\n         end,\r\n         round(random()*100000),\r\n         rpad('t', 100, 'x')\r\n  FROM   engine;\r\n\r\ncommit;\r\n\r\nrunstats on table ind1 and detailed indexes all;\r\n<\/pre>\n<p>You can see that the column n1 is a primary key. This means:<br \/>\n- not null<br \/>\n- unique<\/p>\n<p>There are no NULL values for n1, and every value in column n1 is unique. This uniqueness is supported by an index. Since I did not create this index myself (DB2 created it for me) it has a DB2 generated name.<\/p>\n<pre>\r\ndb2 \"select cast(indschema as varchar(20)) schema, \r\n            cast(indname as varchar(20)) name,\r\n            decode(uniquerule,\r\n                   'D', 'Duplicates',\r\n                   'U', 'Unique',\r\n                   'P', 'Primary') uniquerule,\r\n            nleaf,\r\n            nlevels\r\n     from   syscat.indexes \r\n     where tabname='IND1'\"\r\n\r\nSCHEMA          NAME                 UNIQUERULE NLEAF        NLEVELS\r\n--------------- -------------------- ---------- ------------ -------\r\nSYSIBM          SQL220220161847690   Primary             417       3\r\n\r\n  1 record(s) selected.\r\n<\/pre>\n<p>OKay....we're all set up. Let's do some explaining.<\/p>\n<p><strong>Unique Scan (index acces only)<\/strong><br \/>\nLet's select one n1 : say for n=10000.<br \/>\nThis means that all information can be retrieved from the index.<\/p>\n<p>For good measure....I first look at the cost whitout the index. I quickly created the same table whithout the index as follows:<\/p>\n<pre>\r\ncreate table ind2 (n1 int not null, \r\n                   n2 int, \r\n                   n3 int, \r\n                   t varchar(100));\r\n\r\ninsert into ind2 select * from ind1;\r\n\r\ncommit;\r\n<\/pre>\n<p>(No primary key, so no DB2 generated index)<\/p>\n<p>The query:<\/p>\n<pre>\r\n[db2inst1@broccoli indscn]$ db2 \"explain plan for select n1 from ind2 where n1=10000\" \r\nDB20000I  The SQL command completed successfully.<\/pre>\n<p>The explain:<\/p>\n<pre>[db2inst1@broccoli indscn]$ db2exfmt -1 -d tinus -o expl1.txt\r\nDB2 Universal Database Version 11.5, 5622-044 (c) Copyright IBM Corp. 1991, 2017\r\nLicensed Material - Program Property of IBM\r\nIBM DATABASE 2 Explain Table Format Tool\r\n\r\nConnecting to the Database.\r\nConnect to Database Successful.\r\nOutput is in expl1.txt.\r\nExecuting Connect Reset -- Connect Reset was Successful.\r\n[db2inst1@broccoli indscn]$<\/pre>\n<p>In the outputfile we find the graph of the execution path:<\/p>\n<pre>\r\nAccess Plan:\r\n-----------\r\n        Total Cost:             2857.2\r\n        Query Degree:           1\r\n\r\n\r\n      Rows\r\n     RETURN\r\n     (   1)\r\n      Cost\r\n       I\/O\r\n       |\r\n        1\r\n     TBSCAN\r\n     (   2)\r\n     2857.2\r\n      3228\r\n       |\r\n     100000\r\n TABLE: DB2INST1\r\n      IND2\r\n       Q1\r\n<\/pre>\n<p>In <a href=\"https:\/\/maboc.nl\/?p=451\" rel=\"noopener\" target=\"_blank\">DB 2 : Cost calculations 1 : Table Scan<\/a> we saw that the cost of retrieving one page is about 6,7706 when DB2 will not use prefetch and around 0,8914 if DB2 is using prefetch. The cut is at 32 pages. Since we need to read a lot more then 32 pages we can safely assume that prefetch is used. So I would say the cost for this operation is:3228 * 0,8914=2877 (well not exactly what DB2 suggests (2857.2), but quite close!<\/p>\n<p>So when the cost of using an index to get this one value is lower then 2877 I quite well understand why we are using the index!<\/p>\n<p>Let's see (query and explain):<\/p>\n<pre>\r\n[db2inst1@broccoli indscn]$ db2 \"explain plan for select n1 from ind1 where n1=10000\" \r\nDB20000I  The SQL command completed successfully.\r\n\r\n[db2inst1@broccoli indscn]$ db2exfmt -1 -d tinus -o expl2.txt\r\nDB2 Universal Database Version 11.5, 5622-044 (c) Copyright IBM Corp. 1991, 2017\r\nLicensed Material - Program Property of IBM\r\nIBM DATABASE 2 Explain Table Format Tool\r\n\r\nConnecting to the Database.\r\nConnect to Database Successful.\r\nOutput is in expl2.txt.\r\nExecuting Connect Reset -- Connect Reset was Successful.\r\n[db2inst1@broccoli indscn]$\r\n<\/pre>\n<p>What does the graph tell us:<\/p>\n<pre>\r\nAccess Plan:\r\n-----------\r\n        Total Cost:             13.5334\r\n        Query Degree:           1\r\n\r\n\r\n        Rows\r\n       RETURN\r\n       (   1)\r\n        Cost\r\n         I\/O\r\n         |\r\n          1\r\n       IXSCAN\r\n       (   2)\r\n       13.5334\r\n          2\r\n         |\r\n       100000\r\n   INDEX: SYSIBM\r\n SQL220220161847690\r\n         Q1\r\n<\/pre>\n<p>We see that the execution only hits the index. The table itself is not involved.<br \/>\nAnd...we only do 2 IO's. Two IO's is to less to prefetch so I imagine the cost for 1 IO is about 6,7706 (see : <a href=\"https:\/\/maboc.nl\/?p=451\" rel=\"noopener\" target=\"_blank\">DB2 : Cost calculations 1 : Table Scan<\/a>). So 2 * 6,7706 = 13,5412....quite close : the DB2 estimates the cost to be 13.5334. That's close enough \ud83d\ude42<\/p>\n<p>Cost of getting 1 value out of the table (which actually is in the index) is 13,5334, compared to the cost of getting that same value whithout an index being 2857,2. I say that I do understand (for this query) why DB2 chooses a path using the index.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Index Why indexes Searching through millions of rows can be the right thing to do (TBSCAN). It might be the only option available to get the data you need.But in a lot of cases we need smarter ways to get to our data. If you need only one row,it seems silly to process thousands of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[81,64,32],"tags":[82,65,8],"class_list":["post-476","post","type-post","status-publish","format-standard","hentry","category-cost","category-db2","category-index","tag-cost","tag-db2","tag-index"],"_links":{"self":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/476","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=476"}],"version-history":[{"count":11,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/476\/revisions"}],"predecessor-version":[{"id":498,"href":"https:\/\/maboc.nl\/index.php?rest_route=\/wp\/v2\/posts\/476\/revisions\/498"}],"wp:attachment":[{"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=476"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/maboc.nl\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}