Friday, December 21, 2007

OOW2K7 vs UKOUG2007

Attending the two conferences so close together makes it easy for me to compare them. There are obvious differences. San Francisco in November is warmer, sunnier and altogether more attractive than Birmingham in December. Open World is much, much larger. But the UKOUG felt more tiring, at least to me. This is because the UKOUG packs a lot more into each day.

Open World is distributed across several sites. In order to give delegates sufficient time to get from a session in the Westin St Francis to Moscone South there is a half-hour gap between sessions. There are no sessions whilst the keynotes are on. There are no sessions during the lunch hour. The UKOUG schedules sessions with only ten minutes between them. The streams are staggered, so that there were always sessions during lunch. This obviously provides value for the delegates by offering a vast number of sessions to attend. But it also makes us feel guilty about skipping a session in order to go round the exhibition hall.

Another difference is the networking opportunities. Oracle have started putting a lot more effort into providing portals for delegates to get in contact with each other and places where people can meet and converse. This is easier because Moscone Center has some big spaces which can be used for such purposes.

Both conferences are valuable. I learned a lot from attending both. But Open World was much more relaxed. I think the fact that I have just blogged about UKOUG Wednesday more than two weeks after the end of the conferences says it all.

Thursday, December 20, 2007

UKOUG 2007: Wednesday Morning, 3 AM

Well not quite, but I was awake at 4:30am. So I ran through my presentation a couple of times and made some final tweaks to the slides. I didn't have any code demos this year, so there was no chance of me breaking anything.

Every performance problem is one of two things


I almost skipped James Morle's presentation, because I thought it would cover the same ground as Jonathan Lewis's talk on Statspack. I'm glad I didn't though, because it was completely different. But I'm even more frustrated that this talk was scheduled at the same time as Daniel Fink's talk on Why tune SQL?.

James's talk was a reassessment of the YAPP tuning method:
R = S + W

That is, response time equals service time plus wait time. Most tuning methods focus on the waits. This talk's assertion was that most fixes actually deal with skew or latency, that is, the service time. In a nutshell, Morle's Grand Unified Theory of Performance Tuning is: All problems are caused by skew, latency or both.

Bandwidth is not the same as latency: bandwidth is how much transport you have, latency is the wasted time associated with each operation. James demonstrated the difference between latency and bandwidth with some live action in which two volunteers transported bottles of beer across the auditorium, in a suitcase or by hand. The impact of Connor McDonald's infamous "squirrel sketch" is still rippling across the presentation pond. However, this example is easier to blog. Given two networks:

  LAN: 100MB/s, ping = 3ms
  WAN: 250KB/s, ping = 2ms

which one can transmit the most data? The answer is the WAN; the LAN's enormous bandwidth is wasted because of the additional latency in its ping speeds. Of course, the WAN is operating more-or-less at capacity and the LAN isn't.

Problems with bandwidth are easy to fix - they just require money and bandwidth is getting cheaper all the time. They are also easy to spot. Latency, on the other hand, is bounded by the laws of physics, which means it is not necessarily fixable by chucking money at it. Also, it can be hard to spot. Latency multiplication is a very common source of problems - 1ms per operation is a very short time until you do that operation a thousand times.
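As a back-of-the-envelope sketch of latency multiplication (the table name and row count are invented): committing once per row pays the commit latency thousands of times over, where a set-based approach pays it once.

-- Hypothetical: 10,000 rows to update, with roughly 1ms of commit latency each.
BEGIN
  FOR r IN (SELECT id FROM orders WHERE processed = 'N') LOOP
    UPDATE orders SET processed = 'Y' WHERE id = r.id;
    COMMIT;   -- ~1ms paid 10,000 times: around ten seconds of pure waiting
  END LOOP;
END;
/

-- The set-based version pays the commit latency once:
UPDATE orders SET processed = 'Y' WHERE processed = 'N';
COMMIT;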

Common sources of latency:
  • nested loop joins
  • SQL*Net message to/from client
  • too many commits
  • "slow" I/O


Skew is also hard to spot because humans (and computers) tend to think linearly, and unconsciously split things into evenly-sized chunks.

Common sources of skew:
  • distribution of column values
  • high demand for small set of resources (contention)
  • response times for component SQL statements in a transaction
So the revised version of the YAPP formula is
R = S + W

where S = (skew * service time) and W = (skew * latency + passive wait time).

Modelling on the cheap


This presentation originally started out as some jokey test data, when the UKOUG asked some of us SIG chairs to test the conference site. But the title was sufficiently compelling for me to submit it as a real paper and, more surprisingly, for it to be accepted. I think many people found the title compelling because it focuses on a real problem: how to undertake database and application modelling when the boss won't spring for an expensive CASE tool. I work for a consultancy; consultancies tend to put software licences in one of two categories - those for which the client pays and overhead. So I have an interest in tools and techniques which don't require me to get a business case approved.

I talked about low tech solutions (whiteboards, Post-It notes and index cards), desktop tools (PowerPoint and Visio) and then free modelling tools (basically Schemester). There is JDeveloper but I don't think that is really suitable for people who are developing database applications. However the rumours suggest that JDev 11g will have much more to offer on that side of things, and that means that we can hope those features will eventually be ported to SQL Developer. I feel the session went well but I shall have to wait until UKOUG publishes the evaluations.

Afterwards, somebody who introduced themselves as one of the JDev development team who worked on the original Use Case modeller said they started out modelling that tool with index cards and Post-Its. And Duncan Mills said that he didn't think the JDev feature set was any worse than Schemester's. It's a fair point: Schemester's big advantage is that it is a mere 295KB zip to download and has a memory footprint of ~12MB of RAM. So, unlike JDev, we can run Schemester without nadgering our desktop.

Monday, December 17, 2007

UKOUG 2007: Ruby Tuesday

Tuesday was a long day for me. I was awake at 5:00am. On Monday I was awake at 5:30am. If it wasn't for the fact that I'm still in the same timezone I would swear I was jet-lagged. It isn't "jet-lag" either - I mainly drank water at the bloggers' meet-up last night. Anyway, it gave me the chance to run through my presentation before breakfast.

The first session of Tuesday was Sue Harper's overview of SQL Developer's new features. These turned out to be the new features in the coming 1.5 release rather than the latest production release. The upcoming release is special because its features are ones suggested and voted on by SQL Developer users on the SQL Developer Exchange site. The team deferred the planned 2.0 release because its manifest contained no features which users were clamouring for, at least not according to the SQL Dev Exchange rankings. So if there's some functionality you want to see added to the product, suggest it. If enough other users agree with you there is a reasonable chance you will get it. Power to the people!

Most of these new features are enhancements to existing functionality - a new SQL formatter, better code insight. This is because the SQL Dev team is quite small and has only one real developer. The team piggy-backs on the work of the JDeveloper team, so to a large extent the feature-set in SQL Dev depends on what is in the other tool. That is why the next release will feature the ability to see queue tables but not to create or alter them. In the long run we can hope that the JDev modelling features will be included in SQL Dev.

Database vs middle tier


My chairing of Toon Koppelaars' session got off on the wrong foot. Of course I should have checked how to pronounce his name before I introduced him - Dutch names tend to sound like a cough - but as it turned out it was the name I thought I knew which I had in fact got wrong. Toon (pronounced tone) was generous. Apparently he gets called "Tune" quite a lot.

Toon's talk was another waltz around the question of where business logic properly belongs. Toon distinguishes three types of logic:
  • data logic - directly manipulating the data;
  • UI logic - directly affecting the user's interaction with the system;
  • business logic - everything else.
Toon rehearsed the familiar arguments. We have moved almost full-circle from text-based stateless apps in the seventies and early eighties through client server apps to the current fad for graphical stateless apps. The main difference between the green screen dumb terminals of yore and today's browser-based apps is responsiveness: the new apps do try to respond to the user in the client layer. This comes at a price: the presentation layers have a large number of moving parts, drawn from an exploding range of widget toolkits and frameworks. Keeping up with all these different bits requires a significant knowledge investment, which in turn leads to teams of specialists overseen by architects (you have to hear Toon say "architect" to realise that he means it as an insult).

But the real problems arise when the business logic is written in Java or .Net or one of these other fancy-dan languages. These applications tend to be very chatty - Toon quoted an example where a single search spawned 7000 separate calls. They also tend to date very quickly, which makes them hard to maintain and hard to migrate. Most applications are "Window on Data" - the user retrieves data and updates data. This is precisely what databases are meant to do. Consequently the underlying database design is paramount. The DBMS and PL/SQL have been the single stable constant in system architectures for more than fifteen years now. So put your business logic in the database (PL/SQL and views) and just change the front-end layer to keep up with the Next Big Thing.

The interesting thing is that these talks - due to the nature of the conference - are usually singing to the converted. However, I noticed an ex-colleague, who is a Java head, in the audience. I chatted to him afterwards. He was spitting feathers. Strawman arguments... business logic is object-oriented....PL/SQL is too hard to maintain. In the end I had to calm him down by proposing that he present his opinions at a SIG. He said he'd think about it. I have heard this a number of times now. Both sides try to bludgeon the other side into admitting defeat. The trouble is finding people who are equally experienced in both database and middle-tier architectures. Almost everybody is a specialist on one side of the fence or the other. There are good reasons why this tends to be the case but it does make it difficult to hold a discussion in which everybody understands the issues involved.

So I think it comes down to numbers. Front-end developers generally outnumber database engineers in any organisation, so most of the application will be written in the programming paradigm which suits them. This will lead to the business logic in the middle tier, because that's what they understand. Whether this is the correct solution is a different matter.

11g for developers


Well at least I managed to pronounce Tom Kyte's name correctly. In this session Tom covered the key new features for developers. As with the other session, I'll list the features he mentioned, and just expand on a couple of points which were news to me:
  • Client result cache
  • server result cache
  • PL/SQL result cache
  • virtual columns
  • the long awaited PIVOT clause
  • finer grained dependency analysis
  • Advanced compression
  • PL/Scope
  • Method 4 dynamic SQL
  • compound triggers
  • change to WHEN OTHERS exceptions (see below).
The more I hear about the result caching features the more I think my project ought to move to 11g. Currently we are on 9i with only the vaguest plans to migrate to 10g. For the last few months I have been, in effect, implementing something like result caching, only not as efficient. A crusade for the new year. Something I had not picked up on before is that the PL/SQL result cache only applies to functions, not procedures, and also doesn't work with functions which have OUT or IN OUT parameters.
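The shape of the thing is roughly this (a sketch with invented table and function names; in 11gR1 the RELIES_ON clause names the tables whose changes invalidate the cached results):

CREATE OR REPLACE FUNCTION get_region_name (p_region_id IN NUMBER)
  RETURN VARCHAR2
  RESULT_CACHE RELIES_ON (regions)
IS
  l_name regions.region_name%TYPE;
BEGIN
  SELECT region_name
  INTO   l_name
  FROM   regions
  WHERE  region_id = p_region_id;
  RETURN l_name;
END get_region_name;
/

The first call with a given region id does the work; subsequent calls with the same argument are served from the cache until REGIONS changes.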

Bryn Llewellyn, the PL/SQL product manager, offered Tom the choice of three new features. Tom asked to have the WHEN OTHERS exception banned; regular readers of AskTom will know that the use of this exception without an accompanying RAISE or RAISE_APPLICATION_ERROR is one of his bugbears. In the end they compromised: WHEN OTHERS without a RAISE or RAISE_APPLICATION_ERROR will throw a compilation warning. But only if the compilation is set to the correct level. I rather suspect the sort of developers who code such silent bugs in their exception handlers probably don't use the compiler flags.
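For completeness, a minimal sketch of what the compromise looks like (the exact warning code is from memory, so treat it as an assumption):

ALTER SESSION SET plsql_warnings = 'ENABLE:ALL';

CREATE OR REPLACE PROCEDURE swallow_errors (p_divisor IN NUMBER) IS
  l_result NUMBER;
BEGIN
  l_result := 100 / p_divisor;   -- a ZERO_DIVIDE waiting to happen
EXCEPTION
  WHEN OTHERS THEN
    NULL;                        -- swallowed silently: no RAISE, no RAISE_APPLICATION_ERROR
END swallow_errors;
/

SHOW ERRORS

In 11g the procedure still compiles, but SHOW ERRORS should now report a warning (PLW-06009, I believe) about the OTHERS handler not ending in RAISE or RAISE_APPLICATION_ERROR - provided, of course, that the warnings were enabled in the first place.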

The duplicity of duplicate rows


Hugh Darwen started the third talk in his series on SQL's biggest mistakes with a discussion of the Bellman's line from The Hunting of the Snark, "What I tell you three times is true." He followed this up with Ted Codd's observation that "If something is true, saying it twice doesn't make it any more true."

Duplicate rows in a resultset are duplicitous because they can represent two different problems. One is that the table contains duplicated data. The other is that the SQL has generated duplicate rows, for instance through an imperfect join. Either way it is an offence against relational theory, which requires that a relation consist of unique records. The possibility of valid SQL generating duplicate rows is a fundamental flaw in the language, one Ted Codd didn't anticipate. There are workarounds. We can introduce artificial uniqueness through sequences or on-the-fly row number generation. We can suppress duplication through the use of DISTINCT or UNION, or we can ensure we include all the key columns in our select statements and include differentiating tags in UNION ALL statements.
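A trivial example of the second problem, using invented tables: join customers to their orders, select only the customer's name, and each name comes back once per order.

SELECT c.customer_name
FROM   customers c
       JOIN orders o ON o.customer_id = c.customer_id;

-- Suppressing the duplicates works, but at the price of a sort or hash:
SELECT DISTINCT c.customer_name
FROM   customers c
       JOIN orders o ON o.customer_id = c.customer_id;

-- Or include the key columns, so that every row really is unique:
SELECT c.customer_id, c.customer_name, o.order_id
FROM   customers c
       JOIN orders o ON o.customer_id = c.customer_id;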

No solution is entirely satisfactory. Synthetic keys do not address the underlying problem of duplicate data in the table. That really needs to be tackled by a decently normalised data model and building unique constraints on the natural keys. Removing duplication from the resultset has performance implications. We wouldn't want to pay the overhead of applying DISTINCT to every query, but how else can we guarantee uniqueness in the resultset? Hugh suggested this problem was down to the database vendors but I think that is a bit cheeky. DISTINCT inevitably requires additional work on the part of the database and that has to cost something. It can be optimised (with the kind of hashing algorithm Oracle introduced for GROUP BY in 10gR2) but as Scotty was wont to observe, "Ye cannae break the laws of physics".

Incidentally, Hugh owned up to introducing the NATURAL JOIN syntax into the ANSI standard. He acknowledged that it was a controversial addition. Indeed. Some people I know regard it as a mistake; perhaps not on a par with NULLs but a mistake none the less. Personally I think it's a heffalump trap. It may be a boon for the experienced SQL practitioner but it's just a bug waiting to happen for the unwary. And if my experience on the OTN forum is at all representative the clueless outnumber the clueful, and they appear to breed faster. It's not the way we'd like it to be but it's the way it is. Of course this is just the sort of pragmatic approach which often enrages the theoreticians (or at least Fabian Pascal).
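To show why I call it a heffalump trap, a quick sketch using the classic EMP and DEPT demo tables. NATURAL JOIN joins on every column name the two tables happen to share:

-- Today this joins on DEPTNO, the only column name EMP and DEPT have in common:
SELECT e.ename, d.dname
FROM   emp e NATURAL JOIN dept d;

-- Add a LAST_UPDATED audit column to both tables and the very same query
-- silently starts joining on (DEPTNO, LAST_UPDATED) - and quietly loses rows.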

Hugh rounded off his presentation with the observation that fundamental mistakes are often irrevocable. Furthermore they don't often reveal themselves until it's too late. So maybe it's time for a Truly Relational Database. This call to arms was delivered with a knowing smile - most of us in the audience are Oracle wranglers first and foremost. So he asked the question of himself: why does he keep doing this? Because somebody has to, otherwise the importance of theory will disappear off the practitioners' radar altogether.

Understanding statspack


Jonathan Lewis also started his presentation with a quote:
"------ is the art of looking for trouble, finding it everywhere, diagnosing it wrongly and applying unsuitable remedies."
This turns out to be Sir Ernest Benn on politics, but as Jonathan rightly says it acts as a neat summation of the state of Oracle tuning.

All database performance problems mean one of two things:
  • the database is working too hard; or
  • it is waiting for something.
Waits are due either to the inherent slowness of a particular step or because there is a queue for a particular resource. There are only five issues:
  1. disk I/O;
  2. CPU consumption;
  3. network delays;
  4. application contention (locks);
  5. internal contention (latches)

It's almost always the SQL at fault:

  Statspack section                  Resource implicated
  SQL ordered by gets                CPU
  SQL ordered by reads               Disk (CPU)
  SQL ordered by executions          CPU (network)
  SQL ordered by parse calls         CPU (internal contention)
  SQL ordered by version counts      CPU, internal contention
  SQL ordered by CPU time            CPU
  SQL ordered by elapsed time        application contention

The problem with Statspack is that it just gives us a list of symptoms. It can be hard to discern the flow of events which underlies the bald statistics, and even harder to determine possible causes. What sort of database activity might generate lots of log file sync waits? So Jonathan advises us to play with Statspack. Run a query many times; check the Statspack output. Change the statement and repeat the operation. Stress Oracle and see what happens.
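The mechanics of playing are simple enough; something along these lines, assuming Statspack is installed and you are connected as PERFSTAT (the test script is a made-up name):

SQL> exec statspack.snap
SQL> @run_my_test_query
SQL> exec statspack.snap
SQL> @?/rdbms/admin/spreport

spreport prompts for the pair of snapshot ids to compare; change one thing, take another pair of snapshots and compare the reports.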

The other important thing is to keep an archive of Statspack results. Quite often performance problems arise because something has changed. The current output may flag up a problem but it might not be enough to diagnose it. Having a history of Statspack output will allow us to spot trends, repeating patterns or sudden changes in behaviour.

Jonathan also characterised problems by their impact and how often they occur:

                          occurrence
                     rare             frequent
  impact   huge      easy to see      easy to see
           tiny      irrelevant       statistically visible

This tied in nicely with Robyn Sands's observation that performance tuning is about reducing variability.

Tuesday evening


In the evening we had the SIG Focus Pubs. This is supposed to be an opportunity for delegates to meet others with similar interests. I'm afraid I took it as an opportunity to find presenters for my next SIG (27th February 2008 - save the date!). I did chat with somebody who is still happily using client/server Forms 6i. Well, it's stable, quite richly featured, highly productive and it works - at least for the time being. This person expressed the hope that Forms 6i would run on Vista. Somehow I doubt it (and so did a former Forms guru of my acquaintance). Apart from anything else, archaic software like Forms is exactly the sort of thing which will display very slowly on Vista's new rendering architecture.

Afterwards a few of us went for a meal at a rather upmarket curry house called Lazeez (hence "Ruby Tuesday"). The food was very nice but there were no side dishes. An innovation too far. Anyway, my thanks to Joraph, who picked up the tab.

Thursday, December 13, 2007

In praise of the Checklist

I love reading The New Yorker magazine. Partly it is the sheer expanse of the articles, which are measured in pages rather than paragraphs. But also it's the breadth of the coverage. Okay, so I could do without the profiles of baseball coaches, but pretty much every article is worth reading. Unfortunately I lack the time to read each issue, so these days I buy it when I want to pretend I am still a leisured (and cultured) person.

I have just got around to reading last week's issue. It contained a fascinating piece by Atul Gawande on the use of checklists in intensive care units. ICU staff deal with a very complicated piece of machinery (i.e. us) when it's in an extremely precarious state (hence the need for intensive care). There are thousands of different ICU procedures. Each procedure consists of multiple steps; if somebody misses or botches a step there are often terminal consequences for the patient. Furthermore each condition requires a unique combination of ICU procedures, staff and equipment. Patients in intensive care frequently die there.

In his piece Gawande talks about the efforts of a critical-care specialist named Peter Pronovost to improve survival rates by the simple expedient of checklists for a handful of common yet critical procedures. It is astonishing how such a simple thing can make such a profound difference:
"Pronovost and his colleagues monitored what happened for a year afterward. The results were so dramatic that they weren’t sure whether to believe them: the ten-day line-infection rate went from eleven per cent to zero. So they followed patients for fifteen more months. Only two line infections occurred during the entire period. They calculated that, in this one hospital, the checklist had prevented forty-three infections and eight deaths, and saved two million dollars in costs."
Less surprising but more depressing is the difficulty Pronovost experienced in persuading highly-qualified doctors to bother themselves with yet more form-filling.

Most of us in IT have similarly mundane-yet-complicated procedures. Of course, hardly any of our procedures are literally life-or-death, but there are usually penalties for getting them wrong (even if it's only digging out the manuals to refresh our memories on Flashback Database). Checklists are good because they prompt us to go through each step of a procedure. And because the machinery we deal with is a lot more tractable than the human body we can often automate our checklists into stored procedures, shell scripts or workflow processes.
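As a trivial sketch of what I mean (all names invented): a checklist table plus a procedure which each step of a job calls to tick itself off, leaving an audit trail of what actually got done.

CREATE TABLE checklist_log (
    procedure_name  VARCHAR2(60),
    step_no         NUMBER,
    step_desc       VARCHAR2(200),
    completed_by    VARCHAR2(30) DEFAULT USER,
    completed_at    DATE         DEFAULT SYSDATE
);

CREATE OR REPLACE PROCEDURE tick_step (
    p_procedure IN VARCHAR2,
    p_step_no   IN NUMBER,
    p_step_desc IN VARCHAR2 )
IS
BEGIN
  INSERT INTO checklist_log (procedure_name, step_no, step_desc)
  VALUES (p_procedure, p_step_no, p_step_desc);
END tick_step;
/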

Gawande's article reminded me of a couple of things I do on an infrequent but regular basis which would benefit from being documented in a checklist. But it's also a fine and moving piece of writing and worth reading in its own right.

NOT IN, NOT EXISTS and MINUS: an aide-memoire

A colleague asked me whether NOT IN would return the same as MINUS. I said it would depend on whether the results contained nulls. I confess to not being clear as to how the results would be affected by the presence of nulls, but it's easy enough to knock up a test case.
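For reference, the setup behind the output below was along these lines: two single-column tables, each seeded with a NULL.

SQL> create table a10 (id number);
SQL> create table a20 (id number);
SQL> insert into a10 values (1);
SQL> insert into a10 values (null);
SQL> insert into a10 values (3);
SQL> insert into a20 values (3);
SQL> insert into a20 values (2);
SQL> insert into a20 values (null);
SQL> insert into a20 values (4);
SQL> commit;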

We start with both tables containing nulls:

SQL> select id from a10
2 /
ID
----------
1

3

3 rows selected.

SQL> select id from a20
2 /
ID
----------
3
2

4

4 rows selected.

SQL> select * from a10
2 where id not in ( select id from a20)
3 /

no rows selected

SQL> select * from a10
2 where not exists
3 (select id from a20 where a20.id = a10.id)
4 /
ID
----------
1


2 rows selected.

SQL> select * from a10
2 minus
3 select * from a20
4 /
ID
----------
1

1 row selected.

With a null in the top table but not in the bottom table:

SQL> delete from a20 where id is null
2 /

1 row deleted.

SQL> select * from a10
2 where id not in ( select id from a20)
3 /
ID
----------
1

1 row selected.

SQL> select * from a10
2 where not exists
3 (select id from a20 where a20.id = a10.id)
4 /
ID
----------
1


2 rows selected.

SQL> select * from a10
2 minus
3 select * from a20
4 /
ID
----------
1


2 rows selected.

SQL>

With a null in the bottom table but not in the top table:


SQL> delete from a10 where id is null
2 /

1 row deleted.

SQL> insert into a20 values (null)
2 /

1 row created.

SQL> select * from a10
2 where id not in ( select id from a20)
3 /

no rows selected

SQL> select * from a10
2 where not exists
3 (select id from a20 where a20.id = a10.id)
4 /
ID
----------
1

1 row selected.

SQL> select * from a10
2 minus
3 select * from a20
4 /
ID
----------
1

1 row selected.

SQL>

With no nulls in either table:

SQL> delete from a20 where id is null
2 /

1 row deleted.

SQL> select * from a10
2 where id not in ( select id from a20)
3 /
ID
----------
1

1 row selected.

SQL> select * from a10
2 where not exists
3 (select id from a20 where a20.id = a10.id)
4 /
ID
----------
1

1 row selected.

SQL> select * from a10
2 minus
3 select * from a20
4 /
ID
----------
1

1 row selected.

SQL>

Monday, December 10, 2007

UKOUG 2007: Presentations available.

I haven't finished writing up the conference and already the presentations are downloadable from the conference agenda page. Sigh.

The password is the usual one. If you don't know it, your membership's regular contact should be able to tell you. Otherwise e-mail the UKOUG office, which is quicker than attempting to crack it yourself (although as it's a regular dictionary word with no baroque capitalisation or numbers your rainbow tables won't break a sweat).

Not all the presentations are available. As speakers we had until Friday 7th December to upload the really final version of the slides so perhaps some people didn't make that deadline. Also, putting our slides online was not compulsory. It's a shame, because there are a couple of talks which I missed which are not there yet.

Tuesday, December 04, 2007

UKOUG 2007: Monday, Monday

The first session I chaired was Martin Widlake talking on "Avoiding Avoidable Disasters and Surviving Survivable Ones". He took as his sermon the text What could possibly go wrong? Well, for starters, there could be a problem with the USB drive so that the wireless slide controller doesn't work. Martin disappeared to try and find a replacement, which left me with the possibility of having to vamp for an hour if he didn't come back. Fortunately he did return, albeit without a replacement gadget, so his presentation was more static than usual [see Comments - Ed]. Fortunately the ideas were lively. I was particularly taken with the Formica Table. This would be a forum where "not bad" DBAs answered questions which fitted 95% of all scenarios; sort of an Oak Table For The Rest Of Us.

His main theme was that projects tend to fail because decision-makers aren't realistic in their risk assessments. So projects are planned on the basis of everything going right. Staff are told to work extra hours and weekends without any recognition that tired people make mistakes, and fixing mistakes costs time. Or additional people are recruited which just increases the number of communication channels, to the point that the overhead of keeping everybody in the loop becomes excessive.

Martin advises us to install disaster tolerant hardware, because it gives us more interesting problems to solve. Of course we shouldn't really switch to clustering technology just for the sake of it. But if we think we are eventually going to need RAC we should move to it now. That way we'll have learned to live with it and smoothed all the wrinkles before the technology has become critical to the system.

There were some entertaining war stories. One concerned a failed power pack in a cluster. A sysadmin noticed and helpfully substituted the power pack from another machine. He connected the first node without a hitch, but the second node promptly fried that power pack too. So he called an electrician. In order to get at the problem the electrician had to climb a ladder. The ladder slipped and the flailing electrician grabbed at the nearest thing to hand, the power rail, which collapsed and took out the leccy for the entire server room. We can't plan for such things, we can merely acknowledge that such stuff will happen.

The solutions are the obvious ones: realistic planning, smaller units of delivery, delivering something on a regular basis. One neat idea for solving the communication problem came from somebody who works for Unilever. They use Jabber to post small messages to a central message board, so everybody can see what everybody else is doing in real time. At last a business use for Twitter.

An Industrial Engineer's Approach to DBMS


Problems with the AV set-up seem to have become a theme in the sessions I've chaired. Cisco's Robyn Sands turned up with a Mac but without the requisite dongle which would allow her to plug it into the hall's projector. So she ended up having to drive her presentation from a PDF on a loaned Windows machine. She handled the transition to an unfamiliar OS but it was an unlucky start to her session.

Industrial engineering concerns itself with "design, improvement and installation of integrated systems of people, material, equipment and energy", which is a pretty good definition of the role of a DBA too. Industrial engineers focus on efficiency, value and methodology; they are the accountants of the engineering world. The application of IE methods to DBMS has the aim of producing a consistent application. For instance, every database in an organisation should be consistent: same installed components, same init parameters, same file locations, with transportable tablespaces and reusable code/scripts/methods. This results in safer and more flexible systems. The installation process is a flowchart; in effect the database instance is deployed as an appliance.

Another IE practice is value analysis. This says that a cost reduction adds value to a system just as much as adding a new feature does. Which brings us to Statistical Process Control. Every process displays variability: there is controlled variability and uncontrolled variability. We need to use logging to track the elapsed time of our processes, and measure the degree of variability. Benchmarking is crucial because we need to define the normal system before we can spot abnormality. Abnormal variability falls into three categories:
  • special case;
  • trend;
  • excess variation.
Once we have explained the special cases we can file and forget them. Trends and excess variation both have to be investigated and fixed. The aim is achieving a consistent level of service rather than extreme performance. If you can accurately predict the Mean Response Time then you understand your system well.
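A minimal sketch of what that logging can feed (the logging table is invented): record an elapsed time per run and the mean and the spread drop out of a single query.

-- Assumes each run writes a row into process_log (process_name, run_date, elapsed_secs).
SELECT process_name,
       COUNT(*)                       AS runs,
       ROUND(AVG(elapsed_secs), 1)    AS mean_secs,
       ROUND(STDDEV(elapsed_secs), 1) AS stddev_secs
FROM   process_log
GROUP  BY process_name
ORDER  BY stddev_secs DESC;

The processes at the top of that list are the ones with uncontrolled variability, and so the first candidates for investigation.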

Robyn described a project she had worked on which focused on systems reliability. The goal was to reduce or eliminate recurring issues with a view to reducing outages - and hence out-of-hours calls - and so increase the amount of uninterrupted sleep for DBAs and developers. A worthy end. The problem is simply that a DBA or developer woken at three in the morning will apply the quickest possible fix to resolve the outage, and there is no budget to fix the underlying problem once they are back in the office. Usually old code is the culprit. There's lots of cruft and multiple dependencies, which make the programs brittle. The project worked to identify the underlying causes of outages and fix them. The metric they used to monitor the project's success was the number of out-of-hours calls: over the course of the year these fell by orders of magnitude.

Robyn finished her presentation with some maxims:
  • Rules of thumb are not heuristics.
  • Discipline and consistency lead to agility.
  • Reduce variation to improve performance.
  • No general model applies to all systems.
  • Understand what the business wants.
  • Model and benchmark your system accurately.
  • Understand the capabilities of your system.

The licensing round table


This event was billed as "Oracle's Right To Reply". Unfortunately there wasn't an Oracle representative present, and even when one was rustled up they could only take away our observations to pass them on. This confirmed an observation from Rocela's Jason Pepper that Oracle employees are expressly forbidden from discussing licensing unless they are an account manager. This can lead to situations where advice from Support or Consulting leaves customers exposed to additional licence fees.

The issues aired were the usual suspects. Why isn't partitioning part of the Enterprise Edition licence? Why aren't the management packs available for Standard Edition? Why isn't there a single, easily locatable document explaining pricing policy? How can we have sensible negotiations when the account managers keep changing? There was one area which was new to me. There is a recent initiative, the Customer Optimization Team, whose task is to rationalise a customer's licences. Somebody asked the pertinent question: what is the team's motivation - to get the best value for customer or to sell additional licences for all the things which the customer is using without adequate licences? Perhaps we'll get answers. I shall watch my inbox with bated breath.

Index compression


This was a bonus session I hadn't been meaning to attend but it was worthwhile. Philip Marshall from Joraph presented his research into the effects of compression, because these are not well documented in the manual. Compression works by storing the distinct values of the compressed columns and then linking to each instance of that value, which obviously imposes a small overhead per row. So the space saved on storing the compressed column is dependent on both the length of the column and the number of instances of those values. The overhead means that compressing an index with small columns which have high variability could result in very small savings or even a larger index.

Also we need to remember that the apparent space saving could be due to the act of rebuilding the index rather than compressing it. This matters because (as we all should know) the space savings from rebuilding can be quickly lost once the index is subjected to DML. Furthermore there is a cost associated with uncompressing an index when we query its table. This can be quite expensive. The good news is that the CPU cost of uncompressing the columns is incurred by the index read only, so it is usually only a small slice of the whole query. Still, it's a cost we should avoid paying if we aren't actually getting a compensating saving on space. Also, compression does not result in more index blocks being cached. More blocks will be read in a single sweep, but the unused blocks will be quickly discarded.
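Incidentally, if you want an estimate of what compression would buy before taking the plunge, Oracle will work it out for you (the index name here is invented; bear in mind that VALIDATE STRUCTURE locks the underlying table while it runs):

analyze index my_big_index validate structure;

-- INDEX_STATS now holds the optimal prefix length and the estimated saving:
select opt_cmpr_count, opt_cmpr_pctsave from index_stats;

-- Rebuild with whatever prefix length OPT_CMPR_COUNT suggested, e.g. 1:
alter index my_big_index rebuild compress 1;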

I thought this presentation was a classic example of the Formica Table approach. A focused - dare I say compressed? - look at a feature which probably most of us have contemplated using at some time without really understanding the implications. It was the kind of presentation which might just as easily have been a white paper (I will certainly be downloading the presentation to get the two matrices Philip included) but there is already so much to read on the net that a paper would have just got lost.

11g for DBAs


This was the first of a tie-less Tom Kyte's two selections from the 11g chocolate box. The 11g features have been sufficiently rehearsed over the last few months that I have decided to skip the details. So here is just a list of the new features Tom thinks DBAs most need to pay attention to.
  • Encrypted tablespaces
  • Active Dataguard
  • Real Application Testing
  • Enhancements to Data Pump (EXP and IMP are now deprecated)
  • Virtual columns
  • Enhancements to partitioning
  • Finer grained dependency tracking
  • the xml version of the alert log
  • invisible indexes
Of that list encrypted tablespaces, Active Dataguard, Real Application Testing and partitioning are (or require) chargeable extras. In the earlier round table Ronan kept reminding us that we must distinguish between licensing and pricing: we have to accept that Oracle has 47% of the database market so lots of CTOs and CFOs must think it offers value for money. Which is a fair point, but it is hard to see a compelling reason why a Standard Edition DBA would choose to upgrade. Actually the enhancements for developers are very attractive, but alas we don't carry as much sway.

One notable thing in Tom's presentation occurred when he was demonstrating the new INTERVAL operation for partitioning. The new partitions had a year of 2020, but the dates were supposed to be from this year. It turns out Tom had been tinkering with his demo code and had removed an explicit date conversion without checking the default date format. It's nice to know even the demi-gods fall prone to such things ;)

Monday, December 03, 2007

UKOUG2007: Make It Great

Most great cities have a quarter. Paris has its Latin Quarter. Prague has the Jewish Quarter. Lankhmar has a Thieves Quarter. Birmingham has two quarters, the sparkling Jewellery Quarter and the slightly less sparkling Conference Quarter. No, really. It's an area dedicated to conferences, which means it has lots of hotels and lots of bars. No, I don't know what bars have to do with conferences either ;)

Outside of the Conference Quarter, Birmingham is gearing up for Christmas in a big way. The German market is doing a thriving trade in Gluhwein and Stollen. The central shopping area is thronged with people. Every structure is festooned with lights and tinsel. And the weather is sufficiently chilly to make mulled wine and hot mince pies seem like very good ideas.

Last night we had the UKOUG volunteers' meeting. Apparently this year is the biggest UKOUG conference yet: 2900 delegates registered, 600 speakers and 450 presentations (up from 350 last year). The conference is scaling out from the ICC and running some sessions in the Birmingham Rep theatre next door. But it's still relatively compact compared to OOW2K7 - I have already met several colleagues and ex-colleagues without trying, because there are fewer people - and obviously there is a greater preponderance of people from the UK than there was in San Francisco.

I have already attended the keynotes from Ian Smith and Tom Kyte. Ian was keen to stress the point that Oracle UK are responding to customer feedback, and in particular the responses to the UKOUG customer survey. He cited as evidence their initiatives for the JD Edwards and PeopleSoft communities, who felt bewildered by the sheer size of the Oracle range. Oracle is now "the world's leading enterprise software vendor", with the database just being one of its offerings (although it has 47% of the database market, more than #2 and #3 combined). Tom made a similar point. He skated through a list of twenty new or enhanced features out of the 400 in the 11g database release in thirty minutes. He said that when he joined Oracle in 1992 he could have covered the features of their entire product range in the same time. That's the scale of innovation in Oracle.

So the UKOUG conference is going to have to keep growing just to be able to keep up.