February 22, 2008
by Kevin Hunter
0 comments
Categories:
Technical

Database Explained for the Business Professional

One problem with marketing is that it introduces pseudo-false concepts, arbitrarily divorces necessarily wed ones, and leaves out all the gory details. We recently had a client ask us,

Why would we do our data model/data base outside of MySQL and not have our software make calls upon it? I thought one used MySQL to manage the data and the database, and used software to call, update and display it …

The answer, of course, is "it depends." However, given the context of the situation with our client, whose project absolutely mandated a database, this highlighted a couple of large misconceptions. The first and foremost is that a data model and database are the same thing. They are not. A data model is a description of how a system will handle data. A database is how a system enforces that description. The difference is subtle, but significant.

For example, if I put 500 USD into my bank account, I expect that when I want it later, the funds are still there. But what if I made a mistake and put a 1 instead of a 5 for the account number? Or what if the teller can't read my handwriting and puts down a 5 instead of 6? What's the guard to make sure that I won't lose my money to some lucky schmo? In this hypothetical example, the bank database would see my name on the transaction and raise a flag that it didn't match the number the teller typed. That's the job of the database, to make sure that the constituent (me) is correctly tied to related items (500 USD, account number, name). If I had just put money in and there was no database, there would just be one record among millions that showed that I gave some random account a gift. Good luck proving that I did not mean to do that.

But let's return to "it depends." It depends on what one needs to do with the data. If one merely needs a log or list of data produced, one will not need the complexity of a Relational Database Management System (RDBMS). On the other hand, the minute one wants to do something fun, like see who accessed two different pages of a site on the 1st and 5th of each month, between the hours of 3p and 8p from certain departments of 4 different companies, one wants an RDBMS. As the name "Relational" implies, a well designed data model in the hands of an RDBMS makes extremely difficult and perhaps random questions easy, or at least possible.

The second misconception lay in his confusion about keeping his project outside of the database. In his mind at the time, MySQL was it. Similar to how most people don't know that Microsoft Windows is a choice, he thought MySQL was the only database. When he heard us drop words like "SQLite," "Postgres," and "datastore" around, he was confused. There exist today a plethora of databases, each targeting a segment of the market and each rife with their own strengths and weaknesses. To name drop a few: DB2, Derby, Firebird, MySQL, Oracle, Postgres, Sqlite. There are others, too (hint: google for 'database comparison wikipedia'). Each of these has a specific market segment they target. I won't get into it here, but knowing the right one to pick for your needs is difficult at best.

Overall, I absolutely love databases. They are supremely excellent beasts. Properly designed, databases help your company enforce all the nitty-gritty details, and pull all kinds of interesting information from your data. However, like any tool, use it for the jobs at which it excels and not for the jobs at which fails. If all you need is a storage device, use a simple text file or spreadsheet. If you need long-term data integrity and powerful information scraping capabilities, absolutely use a database.

blog comments powered by Disqus