« My take: should a web developer know SQL? | Main | UI: Double-click to search keywords on NYTimes.com »

When to create indexes in your database

The discussion over on Ben Nadel's posting on tips for writing better, more-optimized SQL is pretty interesting, with a number of users writing in with good tips on SQL optimization and even db optimization. I suggested that any column used in a WHERE clause should be indexed, which in truth is rather a strong statement; I suppose it's more accurate to say that any column in a WHERE clause should be considered for indexing. The possible drawback to the over-use of indexes, as some people pointed out, is that indexes can possibly slow down your database by taking up too much disk space or by being so large that they take too much overhead for the database to maintain.

While I suppose that's technically possible, and those folks say they've observed it, I have to say that I've never experienced that kind of problem, even on high-traffic tables with several millions of records. In the end, you just have to experiment to see whether an index makes your query run faster. But do consider indexes for more than your foreign keys.

People have been asking for some solid technical rules or advice on when to implement indexes, so I thought I'd post links to the best guides on indexes:

Comments (1)

the best resource i have found for indexing is:

relational database index design and the optimizers (db2, oracle, sql server et al.) by tapio lahdenmaki and michael leach
its not a thick book, but very good. i've had huge improvements in my own development.

for any oracle tuning i have researched, i am almost always finding an article from donald burleson (published by rampant books) to be the most useful.

all db authors i have read state that if you are going to have a field in the where clause that it needs to be in an index. also, you should avoid indexing single columns unless they are your foreign key. its better to have 2 indexes with 2 columns than 6 indexes with 1 column. single column indexes are what take more space and cause higher i/o (which is where your perf hit comes from).

Post a comment


Type the characters you see in the picture above.