Joined: 20 Oct 2006 Posts: 1411 Topics: 26 Location: germany
Posted: Wed Dec 19, 2007 5:44 am Post subject: Ranomizing Prim Keys to insure equtable insert population
I have posted this in other forums (yesterday). There are some clever and informed db2 types here and I would like to have their input.
vsn6 and early vsn7 db2, dba's told us to contort the timestamp (used as primary key) in order to cause inserts to be placed in different pages, instead of 1,2,3,4 which meant fill a page, then go to the next.
as of vsn7, vsn8 same dba's are telling me, don't bother.
anyone have any documents/articles that address this issue?
as I receive feedback in the other forums, I will update this thread.
thx, dbz _________________ Dick Brenholtz
American living in Varel, Germany
Joined: 08 Oct 2004 Posts: 274 Topics: 52 Location: California
Posted: Wed Dec 19, 2007 10:47 am Post subject:
dbzTHEdinosauer,
This is my experience with V7 & V8. We have many tables with the primary key on an Identity column. This guarantees uniqueness. The Clustering Index is 99% of the time on other column/columns. This will spread the data across the data pages.
Joined: 26 Nov 2002 Posts: 12378 Topics: 75 Location: San Jose
Posted: Wed Dec 19, 2007 12:40 pm Post subject:
dbzTHEdinosauer,
DB2 uses one of three algorithms for deciding where to insert a row.
The default mode is for DB2 to search the Free Space Control Records (FSCR's) interspersed in a table's pages to find a page that has enough free space to hold the new row. Obviously, if there is relatively little free space on each page, much of the search time will be wasted. To counter this, DB2 provides the DB2MAXFSCRSEARCH registry variable to allow you to limit the search to fewer than the default of 5 pages.
The second algorithm is used when the table is placed in APPEND mode through ALTER TABLE. This avoids the FSCR search entirely, as rows are simply placed immediately at the end of the table.
The final algorithm is used when the table has a clustering index; in which case DB2 tries to insert each row on a page with similar key values (according to the clustering index). If there's no room there, it tries nearby pages, and if there's no room there either, it does an FSCR search.
If optimizing insert time is the only consideration, using APPEND mode is the fastest approach for bulk inserts. The next best approach should be the default algorithm, but changing the value of DB2MAXFSCRSEARCH had negligible impact in the test environment; perhaps in a less I/O constrained environment that change would have been significant.
Having a clustered index has a significantly negative impact on insert performance which is not surprising given that the purpose of such an index is to improve query (i.e., Select) performance by doing extra work at insert time. If having a clustered index is needed, its effect on inserts can be minimized by ensuring that sufficient free space exists: use ALTER TABLE to increase PCTFREE, then REORG to reserve the free space. Allowing too much free space, however, could cause extra pages to be read for queries and largely defeat the purpose of using a clustering index. Alternatively, dropping a clustered index before bulk inserts, and recreating afterwards, may be the optimal approach (there is no more overhead for creating a clustered index than a regular index, just extra overhead at insert time). _________________ Kolusu
www.linkedin.com/in/kolusu
Joined: 08 Oct 2004 Posts: 274 Topics: 52 Location: California
Posted: Mon Dec 24, 2007 10:41 am Post subject:
kolusu, I did a little more research on clustering. If an index is not specified as 'CLUSTER' the default Clustering is the oldest index created (read that as the First). This is for V8. I will try and cut and paste the document. _________________ Thanks,
NASCAR9
Joined: 08 Oct 2004 Posts: 274 Topics: 52 Location: California
Posted: Thu Jan 24, 2008 5:07 pm Post subject:
Here is the paragraph I was refering to above.
It is very important to specify a clustering index based on your analysis of how the data is processed. If you do not create an index using the parameter CLUSTER, then by default the first index created (or more precisely the first index on the chain of indexes in the database descriptor (DBD)) is used to determine where to insert rows _________________ Thanks,
NASCAR9
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum