Aug 22 2008
Are foreign keys really necessary in a database design?
Recently I asked a question in StackOverFlow:
Are foreign keys really necessary in a database design?
As far as I know, foreign keys are used to aid the programmer to manipulate data in the correct way. Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?
Are there any other uses for foreign keys? Am I missing something here?
Many people responded and there were all kinds of opinions. The strange thing I noticed was that many people including Joel Spolsky do not use foreign key constraints in their databases.
You can build a complete application without even using a single foreign key constraint. Foreign keys, as a matter of fact are not essential in your databases, but they help a lot in programming and making robust systems. Hence the use of foreign key integrity constraints is strongly recommended.
These are the obvious advantages of using foreign key constraints:
- They help in keeping data integrity by removing the possibility of unwanted data piling up and cluttering the database. Mostly this data will not affect your business logic, but eventually you will have to take up the task of cleaning up the mess manually.
- They help in visualizing and preparing database diagrams. Many tools are available which will automate this task.
- They may give performance benefits. In systems which auto index fields, foreign key references can give a performance boost.
- Foreign keys can also help the programmer write less code using things like ON DELETE CASCADE and other referential actions.
If you know of any other advantages please add them below.
Curtis Poe has written a very good article about this mentioning the fundamental flaw in the thought against foreign keys:
If you have an SQL statement inserting data, you want the receiving table to have the data validation in the form of foreign keys, custom data types, enums, triggers and so on. If you’re writing code for personal use or you control an open-source project, skip this if you want. It’s your code; it’s your choice. However, if you’re writing code for a business, you don’t know who will be maintaining it tomorrow. Maybe they don’t know about your clever data validation routines which allow you to forego foreign keys. Maybe they’re finding your code too slow so they decide to write directly to the database themselves. Whatever the reason, by putting the data validation as close as possible to the data munging, you make it harder to circumvent.
In effect, foreign key constraints are a way of implementing a set of guidelines for proper querying and manipulation of data in a database. You can do away with them, but you are bound to make mistakes eventually without them.




Subscribe
on 22 Aug 2008 at 6:05 pm
niyaz,
The use of foreign key include
- It will show the relation between two tables.
- Prevent the deletion of a table if there are dependencies.
My understanding is that foreign keys alone will not improve the performance. Actually they will reduce the performance. Only if we are creating index on that particular columns FKs will improve performance. In case of loading data its better to disable the foreign key and enable it after loading data.
on 22 Aug 2008 at 6:30 pm
Akhi,
Thanks for giving the valuable points.
on 22 Aug 2008 at 8:28 pm
I’m a bit confused by the notion of eliminating foreign keys and I work in application development utilizing databases.
Say you have a table of US states and a table of US cities. Without foreign keys, how do you know which cities are in which states?
states.statePrimKey
states.statename
states.stateabreviation (could be the same as PrimKey)
cities.cityPrimKey
cities.cityname
cities.statePrimKey (as a foreign key to find which state this city is in)
on 22 Aug 2008 at 8:45 pm
Huh,
I am talking about integrity constraints here. Not just foreign keys.
What you said is right. We cannot eliminate foreign keys as such.
on 22 Aug 2008 at 8:46 pm
Yeah we need ‘em. We need them for cascading deletes, and more importantly for object-oriented data access. I’m talking about technologies like Microsoft’s LINQ, that will “generate” objects that can be used in code based off of a database schema. Foreign keys tell LINQ what type of relationships the table have and can build objects that match that relationship.
on 22 Aug 2008 at 8:54 pm
Brent,
Thanks for the info. Tools used for manipulating database data heavily rely on foreign key constraints.
on 22 Aug 2008 at 9:29 pm
Obviously you are confusing foreign keys with referential integrity constraints, which helps to make your article more controversial than it really is.
on 22 Aug 2008 at 11:16 pm
I am not sure how its avoidable – foreign keys are the basis for Relational DBMS.
on 23 Aug 2008 at 12:10 am
I understand you meant to ask, “Are foreign key CONSTRAINTS really necessary…” and obviously the answer is no, they are not. It’s more useful to ask if foreign key constraints are a good thing, or if they should be ignore as Joel Spolsky and lots of other people seem to think.
Asking a question like this when there is 30 years of relational database theory, literature, and real-life experience to draw on is like asking if bicycles really need brakes. No, you can stop by dragging your feet or running into something, but brakes do make things a lot safer and more stable.
Questions like this released into the reddit/digg community add to the widespread ignorance of relational databases and give the impression that foreign key constraints are somehow unsettled or controversial. Foreign key and other constraints are best practices for serious enterprise-quality databases. They are best practices for any relational database if you care about data integrity and not introducing what are called anomalies. If you have to write application code to work around missing constraints in the database, you are doing it wrong. That should be apparent when you or someone else has to write another application on the same database.
The only reason this kind of thing is perpetually brought up and debated is way too many programmers are using relational databases with no understanding of relational theory, and no experience with large-scale databases that support multiple applications. If all you ever do is a to-do list using a trivial database foreign key constraints are not needed. For a lot of trivial applications a relational database is not needed.
I have a pretty good list of books about relational databases, both theory and practice, on my web site here:
http://typicalprogrammer.com/?p=14
on 23 Aug 2008 at 12:21 am
The analogy with brakes is a faulty one because every bike needs a brake. Do you drive with a bike without brakes in a town? Does anyone? Is this even allowed? After all if you cannot properly stop you need to drive in circles permanently and lose speed this way.
But you obviously can design databases that last a long time WITHOUT the foreign keys. And you dont have a speed penalty either.
So the analogy is really really really silly.
The brake is an essential part of a bike. Less important than the wheel and the connection towards the wheel, but essential in its essence – an extension of the wheel. I fail to see how this is a part of the foreign key with databases.
And no matter how many books someone reads about database design – they for sure wont have read that a foreign key is so essentialy that everyone needs to use it.
on 23 Aug 2008 at 12:51 am
Note that MySQL by default uses MyISAM but you need InnoDB to use foreign keys
on 23 Aug 2008 at 1:02 am
the analogy should have been:
Do bicycles need training wheels when you drive them?
The obvious answer is, yes, if you’re gonna be loaning your bike out to kids.
Just like almost all programming/database concepts, foreign keys constraints are a tool that is available for use. They aren’t for every situation, and you should use them if you need them, or think that you will need them in the future.
If you’re supporting a large database with multiple applications and multiple developers, you better have your training wheels on. If you’re an expert driver and you don’t loan out your bike, forget the training wheels, get some of those tight blue shorts, and pedal to your heart’s content. (But wear a helmet)
on 23 Aug 2008 at 1:08 am
>> Note that MySQL by default uses MyISAM but you need InnoDB to use foreign keys
This is why this debate even exists. The shoddy “database” MySQL has been so widely adopted that all of its flaws have morphed from being obvious flaws to being things which are ’subjective’ or ‘debatable’. See also: transactions, ACID compliance, the need for ’storage engines’, and up until relatively recently, subqueries.
Put foreign keys constraints on your tables. Introduce cascades that make sense.
It’s basically a free way to increase the robustness of your application. Do not tell me that you’re worried about the performance impact (the only theoretical ‘downside’), you don’t know what you’re talking about if you are concerned about the performance impact of performing a few index lookups during an INSERT or UPDATE.
on 23 Aug 2008 at 2:04 am
[...] Agreed. Especially this. Whatever the reason, by putting the data validation as close as possible to the data munging, you make it harder to circumvent. [...]
on 23 Aug 2008 at 4:56 am
What has not yet been said is that understanding why foreign keys are so beneficial is just the first step in fully realizing the power of a relational database. If you are not sure yet if you need them, be sure that you need an education.
Foreign keys are the basis of all (or nearly all) advanced table design patterns. Live them, learn them, love them, they are your friend many times over.
on 23 Aug 2008 at 6:45 am
See this interview. Apparently, Ebay doesn’t use foreign key constraint.
http://www.infoq.com/interviews/dan-pritchett-ebay-architecture
on 23 Aug 2008 at 8:16 am
Hmmmm. Well Batman, you can have an associations table and you can search that table for various relationships. You don’t precisely need foreign keys. You ESPECIALLY dont need foreign key constraints.
Anyone who has worked in a large, large DBMS centric system has seen the folly of this.
But your not asking if you need it your asking if “its a good idea”. The answer is up to you. If you want to build a numb nuts RDBMS system to say you did it (wow IM A PROGRAMMER!), the RDBMS vendors are ECSTATIC to give you these “features” to make it harder for you to change out the database. Why did you think these “features” existed?..duh
When your done you can pat yourself on the back for building a 20 year old unmaintainable dinosaur on top of your favorite flavor of CRAP-dbms. Voila!
Problem Solved
on 23 Aug 2008 at 8:20 am
No matter how small or personal is the code you are writing, it makes little sense to eliminate an accepted and proved approach to ensure data integrity, robustness and maintainability.
on 23 Aug 2008 at 8:39 am
There aren’t hard rules on this. FKs are bad for performance, for example, for some purposes. I mean, normalisation is bad for performance.
on 23 Aug 2008 at 9:24 am
All,
Thanks for adding the great comments to the discussion.
Silky,
True. We cannot eliminate a stuff simply because it causes slight performance overhead. It is all about trade-offs. Isn’t it?
on 23 Aug 2008 at 10:39 am
Hey,
There are new generation databases like Teradata, Netezza etc which do not even implement the conventional concepts of primary keys…. And these databases work much faster than the conventional ones. But i dont knw y….
on 23 Aug 2008 at 10:46 am
How can they work without primary keys? I mean the database system should support the concepts. Right?
So what should a developer do to get the functionality of a primary key (like non-duplication)? Should he program it himself?
on 23 Aug 2008 at 11:10 am
@Markus
(used for example in Amsterdam by bike deliveries)
There are certainly bikes without brakes
http://www.oldskooltrack.com/files/nobrakesok.frame.html
on 23 Aug 2008 at 11:57 am
Martijn,
Amazing !!!
on 24 Aug 2008 at 2:17 am
Your database is required to
(a) store data
(b) be circumspect of the data passed to it and run its own validations
(c) implement application / business logic
In case of (b) or (c) – not having fkeys is not an option. In case of (a) only, it is an option
Your database will be used by
(a) One application
(b) More than one applications
Since it is much harder to ensure that all validations are conducted equally effectively across applications, it is preferable to do as many validations in the db layer. Why just restrict yourself to fkeys, you might want to implement many more validations using stored procedures too in case of (b) !
You have faith in your current and future development teams to rigorously architect and build application tiers which can conduct all relationship validations roughly as effectively as it could get done in the db layer
(a) true
(b) false
If it is (b) use foreign keys.
You use OOP / class hierarchies, and probably ORM as well and your database implements class inheritances using
(a) one table per class hierarchy approach
(b) one table per leaf class approach
(c) one table per class approach
If you are using option (b) above it is extremely difficult to use foreign keys, since the foreign keys will refer to different tables based on the leaf class (or the discriminator value) .. stay away from fkeys in this case.
on 24 Aug 2008 at 6:31 am
There is a huge example of a database design that doesn’t make use of foreign keys: SAP.
All of SAP Solutions uses SAP NW as foundation of all Business Solutions.
SAP NW (SAP Netweaver) implements a database abstraction layer to enable Database Vendor independent platform, enabling install SAP Servers in many kinds of database server: Oracle, DB2, MSSQL, MaxDB.
In this implementation SAP designs his own database layer, with your rules and features.
SAP enables Primary Key, Index, Unique Index Keys as transparent use of Primary Key, Index, Unique Index Keys features of the Vendor Databases. But doesn’t make use of Foreign Key constraints in your Database Abstraction Layer and the Vendor Database.
With the use of database without Foreign Keys, enables SAP a more flexible use of database tables and obligates SAP Systems to control his own the relation and integrity related to the relation of database fields, is a great responsibility but the inconsistencies are not so prejudice to the system.
The results of this database inconsistencies are waste of database storage and obligation of managing records without his “father” registry.
SAP implementation is a result of a view of pragmatism, enabling a huge software requirement (Platform Independency) over strict rules of database design.
[]’s.
on 24 Aug 2008 at 4:39 pm
I still remember the days when I learned database design in school, we first did it theoretically and later on did practical examples with Access and MySQL. We didn’t set up a InnoDB database for MySQL and always used the default MyISAM engine, so that the default installation didn’t have foreign key checks.
There’s basically one thing you need to make sure of when not using foreign key’s: the fk id must be a valid id e.g. it must exist in the database and not point to a non-existing record. It’s also very easy to delete an record where a foreign key points to, so you need to check beforehand if there are any foreign keys that point to the record you want to delete, otherwise there will be a lot of inconsistencies.
I now use InnoDB (when using MySQL) or PostgreSQL most of the time and it definitely helps to ease the development.
Cascading updates/deletes are also a nice thing, but you need make sure you understand the whole impact. Deleting a user record might also delete some entries, which might delete some comments, which in turn deletes some other stuff.
One thing I noticed with MyISAM is that it’s a really fast database engine because it’s pretty feature-less.
on 24 Aug 2008 at 7:46 pm
If you want the RDBMS to handle cascade delete and other referential integrity, yes.
on 25 Aug 2008 at 3:37 pm
Dhananjay, Antelio, Arthur,
Thanks for reading and taking time to provide detailed opinions.
on 25 Aug 2008 at 7:14 pm
I design databases on small projects. I am typically the sole designer for a team of about 15-20 developers and systems typically have 150-300 tables.
Unless I am the only person who is going to write code against a database, I will not design a database without foreign keys (or some equivalent enforcement).
The reason:
When you design the database (or sections of an existing database) in advance of the application developers use of it, foreign keys give guidance to the developers. Just in case they don’t completely understand the structure (common), foreign keys kick out errors as they do things incorrectly. Then they change what they are doing or they come and ask me questions and I explain it to them.
To develop a system without FKs (or some enforcement), every person who develops against the system has to have a perfect understanding of the relationships – way too much to ask.
Two other notes:
1. ORMs often use the database FK metadata them to create the correct XML that includes relationships.
2. Even if your developers have perfect understanding of the system – what about the next contract that will attach to the database? I see problems without FK constraints.
on 25 Aug 2008 at 7:29 pm
A different Bob,
True. Thanks for writing.
on 25 Jan 2009 at 11:33 pm
These are all great points. Here’s what I wonder; why was the latest and greatest, high-performance database engine (Falcon) created without fk support?
I don’t understand it at all!
Thoughts?
on 31 Jan 2009 at 6:05 pm
Hi there,
I saw your post at StackOverFlow and replied there but no response yet. Anyway, I understand foreign keys may result in lockings and my question is, will this be a “performance” impact for an online game with high reads/writes?
Cheers,
darnpunk
on 12 Feb 2009 at 10:52 pm
Two things are being confused here:
1. Relationship between two tables
2. How to ensure the relationship
#2 is where the choice of Foreign Key comes in (#1 is an aspect of your data model). You can choose to define/enforce the relationship through your own code, OR you can use Foreign Key.
Doing it yourself gives you some benefits like optimization/customization and non-benefits like someone less knowledgeable overriding your code.
Using Foreign Key has the benefit of being automatically enforced/managed by the database. Disadvantage is that it might not be optimized, and can slow down database updates (since each update will be verified by the database).
Conclusion – If your database is constantly updated, maybe it is better to use custom code instead of foreign keys. Otherwise, maybe best to use foreign keys.
on 03 Jun 2009 at 12:13 pm
[...] Are foreign keys really necessary in a database design? [...]
on 12 Jun 2009 at 5:26 pm
This is ossem sight
Please sent me difference between Primery & Secondry keys.