This article is a follow up one I wrote last week entitled
“The NoSQL Movement, LINQ, and MongoDB - Oh My!”. In that article I introduced
the NoSQL movement, MongoDB, and showed you how to program against it in .NET using
LINQ and NoRM.
I highlighted two cornerstone reasons why you might ditch your SQL Server for the
NoSQL world of MongoDB. Those were
1. Ease-of-use and deployment
2. Performance
For ease-of-use, you’ll want to
read the original article.
This article is about the performance argument for MongoDB over SQL Server (or MySql
or Oracle). In the first article, I threw out a potentially controversial graph
showing MongoDB performing 100 *times* better than SQL Server for inserts.
“A potentially controversial graph showing MongoDB performing 100 times better than
SQL Server”

We’ll see source code, downloadable and executable examples and you can verify all
of this for yourselves. But first, here’s a new twist on an old proverb:
“Data is money”
If your application is data intensive and stores lots of data, queries lots of data,
and generally lives and breathes by its data, then you’d better do that efficiently
or have resources (i.e. money) to burn.
Let’s imagine you’re creating a website that is for-pay and data intensive. If you
were to attempt to plan out your operating costs per user to help guide the pricing
of your product then the cost of storing, querying, and managing your data will
likely be a significant part of that calculation.
If there is a database that is 100 times faster than SQL Server, free, easy to administer
and you program it with LINQ just as you would with SQL Server then that is a very
compelling choice.
When you have such a database, it means you can run your system on commodity hardware
rather than high-end servers. It means you can have fewer servers to maintain and
purchase or lease. It means you can charge a lot less per user of your application
and get the same revenue. Think about it.
“It means you can charge a lot less per user of your application and get the same
revenue. Think about it.”
One more story before we see the statistics. Kristina Chodorow from 10Gen gave a
talk a few weeks ago at San Francisco’s MySQL Meetup entitled “Dropping ACID with
MongoDB”. You can watch the recording here:
http://www.ustream.tv/recorded/6146875
[The audio and video isn’t too hot, but the content is. Skip the first minute without
audio.]
During this talk, Kristina describes SourceForge’s experience moving from MySql
to MongoDB. On MySql, SourceForge was reaching its limits of performance at its
current user load. Using some of the easy scale-out options in MongoDB, they fully
replaced MySQL and found MongoDB could handle the current user load easily. In fact,
after some testing, they found their site can now handle 100 times the number of
users it currently supports.
Not convinced of this NoSQL thing yet? Fair enough. Here are some graphs, some stats,
and some code.
The scenario:
Model a data intensive web application aiming to support as many concurrent users
as possible. There will be users from the web application itself. But there will
also be users from an API and external applications. Users will interact with the
data by having nearly as many inserts as they do queries. Their inserts are all
small pieces of data and are all independent of each other.
Let me just get this out of the way and I mean the following in the nicest of ways:
I don’t care about your scenario or use-case. The scenario above is what I’m trying
to model. I’m not trying to do bulk-inserts or loading large files into databases
or anything like that. MongoDB may be great for these. SQL Server may have specialized
features around your use-case, etc. They don’t apply in my scenario. So please don’t
wonder why I’m not using bulk inserts or anything like that in the examples below.
Insert Speed Comparison
It’s the inserts where the differences are most obvious between MongoDB and SQL
Server.


These inserts were performed by inserting 50,000 independent objects using NoRM
for MongoDB and LINQ to SQL for SQL Server 2008. Here are the data models:

MongoDB basic class

SQL Server basic class
I ran five concurrent clients hammering the databases with inserts. Here’s the screenshots
for
running against MongoDB and
against SQL Server. Let’s zoom into the most important result with the
output from one of five concurrent clients:
MongoDB:

SQL Server:

That’s right. It’s 2 seconds verses 3 1/2 minutes!
Now to be fair, this was using LINQ to SQL on the SQL side which is slow on the
inserts. After discussing these results with some friends, I re-ran the tests using
raw ADO.NET style programming and saw a 1.5x-3x performance improvement for SQL.
That still leaves MongoDB 30x-50x faster than SQL.
Query Speed Comparison
Now let’s see about getting the data out using the same objects above on the indexed
Id field for each database.


Here MongoDB still kicks some SQL butt with almost 3x performance. If we were to
leverage the mad scale-out options that MongoDB affords then we could kick that
up to many times more.
“If we were to leverage the mad scale-out options that MongoDB affords then we could
kick that up to many times more.”
Complex Data and the Real World
Feel like that was an overly simplified example? Here’s some real world data with
foreign keys and joins. Below is the complex data model.
MongoDB:

SQL Server:

It shouldn’t surprise you that MongoDB does even better here without its joins.


The Hardware
All of these tests were run on a Lenovo T61 on Windows 7 64-bit with a dual-core
2.8 GHz processor using the 64-bit versions of both SQL Server 2008 Standard and
MongoDB 1.4.1. You can even see a picture of the computer here: http://twitpic.com/hywa8
Your Turn
If you want to see the entire set of data above as an Excel spreadsheet, you can
download that here:
http://www.michaelckennedy.com/Downloads/sql-vs-mongo.xlsx
You can also download the sample code. Before you do, realize I haven’t done a bunch
of work to make it super easy to run. But you should be able to figure it out. Just
turn the knobs on the PerfConstants class for the number of inserts and queries.
Then comment or uncomment sections of the code in the clients for your scenarios.
The expected use is that you’ll start the launcher application then use it to launch
five concurrent clients at exactly the same time.
Download Sample:
http://www.michaelckennedy.com/Samples/SpeedOfSqlVsMongoDBAnddotNetSample.zip
Got feedback? Write a comment or contact me on Twitter:
@mkennedy or find me in
any of these other ways.
Thanks!
Some thanks are in order for all the help I got bouncing around ideas as well as trying different scenarios.
Thanks to
Eric Cain @arcain
Jim Lehmer@dullroar
Karl Seguin @karlseguin