When it comes to structuring your MySQL database you have two choices: you can normalise your schema or you can denormalise it. Both have their advantages and trade-offs, but when should you use one over the other?
Essentially, normalising your schema means breaking it into multiple tables. In a traditional application you might have the following tables: users, categories, pages, profiles, profilemeta, tags, media, pagescategories and so on. Normalisation prevents redundant data and keeps things lean.
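As a rough sketch of what that looks like, here is a normalised version of the profile-related tables. The example uses Python's sqlite3 module so it is self-contained (MySQL DDL is near-identical), and the table and column names are illustrative, not taken from any particular application:

```python
import sqlite3

# In-memory SQLite database for illustration; swap for a MySQL connection in practice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE profiles (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    bio     TEXT
);
CREATE TABLE profilemeta (
    id         INTEGER PRIMARY KEY,
    profile_id INTEGER NOT NULL REFERENCES profiles(id),
    meta_key   TEXT NOT NULL,   -- e.g. 'location' or 'website'
    meta_value TEXT
);
""")
```

Each piece of data lives in exactly one place: account details in users, profile text in profiles, and any number of optional key/value pairs in profilemeta, so nothing is duplicated.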
You might be familiar with normalisation if you’ve ever had to JOIN multiple tables together to generate a result (a social network profile page is a good example of this). You would have a profiles table, a users table, a relationships table and possibly a few other tables to join. With proper indexes, speed isn’t really an issue, as indexed queries stay fast even across a large number of rows.
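A profile-page query over a normalised schema might look something like the sketch below (again via sqlite3 for a runnable example; the schema and data are made up for illustration). Note the index on the join column, which is what keeps this fast as the tables grow:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE profiles (user_id INTEGER, bio TEXT);
CREATE INDEX idx_profiles_user ON profiles(user_id);  -- the index that makes the JOIN cheap
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO profiles VALUES (1, 'Hello!')")

# Stitch the user's page back together from the separate tables.
row = conn.execute("""
    SELECT u.username, p.bio
    FROM users u
    JOIN profiles p ON p.user_id = u.id
    WHERE u.id = ?
""", (1,)).fetchone()
print(row)  # ('alice', 'Hello!')
```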
With normalisation you also prevent content duplication, which some would argue is a non-issue given how cheap storage is, but that is beside the point. Every single entity is only stored once, so there is no inconsistency when it comes to data.
When you denormalise your data you do the exact opposite of normalisation. Instead of splitting data into smaller tables which you join, you cram as much data as you can into one table.
Using the social networking example, you would replace multiple tables with a single users table holding every field relating to a user. Instead of having a users table, a profiles table and a profilemeta table, every single possible field is now in the users table.
Your SELECT queries are going to be fast now, but you’ve just opened up a Pandora’s box of pain: nice one. Why? What about optional bits of data the user hasn’t supplied? You’ve now got gaping holes (NULLs) in your data and your result sets are fat.
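Here is a minimal sketch of that wide-table approach, with made-up optional fields, showing the NULL holes you get back for anything the user never filled in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One wide table: every profile field lives alongside the account fields.
conn.execute("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    bio      TEXT,   -- optional: NULL if the user never supplied it
    location TEXT,   -- optional
    website  TEXT    -- optional
)""")

# A user who only filled in the required field.
conn.execute("INSERT INTO users (id, username) VALUES (1, 'bob')")
row = conn.execute("SELECT * FROM users WHERE id = 1").fetchone()
print(row)  # (1, 'bob', None, None, None)
```

One SELECT, no JOINs, but every row drags all the empty optional columns along with it.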
So which one makes more sense?
Academically, most people are taught to choose normalisation, and in most cases it makes sense. You shouldn’t worry too much about scaling your database structure in the early days of a project. You’re unlikely to run into any scaling problems a little more hardware can’t fix, and even then, we are talking potentially hundreds of millions of rows.
The general consensus is to normalise your tables first, then go through and denormalise the read-heavy parts while leaving the write-heavy parts normalised. In the beginning, choose one or the other; normalisation has the benefit of playing nicely with an MVC application structure, as opposed to a denormalised structure, which can get messy and chaotic if not managed correctly.
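One common shape that hybrid takes is a denormalised counter cached next to the normalised data: writes touch both, but the read-heavy page needs no JOIN. A sketch, assuming a hypothetical friend_count column (via sqlite3; in production you would wrap the two statements in a transaction):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT,
                    friend_count INTEGER NOT NULL DEFAULT 0);  -- denormalised cache
CREATE TABLE relationships (user_id INTEGER, friend_id INTEGER);
""")

def add_friend(user_id, friend_id):
    # Normalised write into relationships, plus the cached counter kept in step.
    conn.execute("INSERT INTO relationships VALUES (?, ?)", (user_id, friend_id))
    conn.execute("UPDATE users SET friend_count = friend_count + 1 WHERE id = ?",
                 (user_id,))

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
add_friend(1, 2)

# The profile page reads one row instead of COUNT(*)-ing over relationships.
count = conn.execute("SELECT friend_count FROM users WHERE id = 1").fetchone()[0]
print(count)  # 1
```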