Also take a look at Gian Maria’s posts on the topic.

As I said during my talk about RavenDB at the UgiAlt.Net conference, one of the most important concepts of a document database, and in this case of RavenDB, is indexing. An index is the only way (well, not really) to read your data from the database; without an index the database is fairly useless.

Like all the databases out there, RavenDB offers two main ways to access your data:

  • Load data: a load looks data up by the identifier of the requested document. A load operation does not require an index, nor is it affected by the indexing process; once the data is written, it is immediately accessible by its identifier through a load operation;
  • Query data: querying is the art of searching data given a set of criteria. In a document database a query requires an index to be executed. In RavenDB the concept of a full table scan does not exist, basically because tables do not exist :-) and because it makes no sense to scan the entire database, and for each document the entire document, searching for something that matches the given criteria;
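To make the difference between the two access paths concrete, here is a minimal in-memory sketch. It is not the RavenDB client API: `ToyDocumentStore`, `put`, `load`, `define_index`, `run_indexing` and `query` are all hypothetical names invented for illustration, and the explicit `run_indexing` step is only an assumption used to show that a load does not depend on the indexing process while a query does.

```python
class ToyDocumentStore:
    """In-memory toy used only for illustration; NOT the RavenDB API."""

    def __init__(self):
        self._documents = {}      # identifier -> document
        self._index_fields = {}   # index name -> indexed field
        self._index_entries = {}  # index name -> {field value -> set of identifiers}
        self._pending = []        # identifiers written but not yet indexed

    def put(self, doc_id, document):
        # Writing stores the document and queues it for indexing.
        # (This sketch assumes each document is written only once.)
        self._documents[doc_id] = document
        self._pending.append(doc_id)

    def load(self, doc_id):
        # Load: direct lookup by identifier, no index involved.
        return self._documents.get(doc_id)

    def define_index(self, name, field):
        self._index_fields[name] = field
        self._index_entries[name] = {}
        # Everything already stored has to be (re)indexed.
        self._pending = list(self._documents)

    def run_indexing(self):
        # Hypothetical explicit step that lets the indexes catch up.
        for doc_id in self._pending:
            document = self._documents[doc_id]
            for name, field in self._index_fields.items():
                if field in document:
                    entries = self._index_entries[name]
                    entries.setdefault(document[field], set()).add(doc_id)
        self._pending.clear()

    def query(self, index_name, value):
        # Query: answered from an index only, never by scanning documents.
        if index_name not in self._index_entries:
            raise LookupError(f"no index named {index_name!r}")
        ids = self._index_entries[index_name].get(value, set())
        return [self._documents[doc_id] for doc_id in sorted(ids)]


store = ToyDocumentStore()
store.define_index("people_by_city", "city")
store.put("people/1", {"name": "Ada", "city": "Milan"})

loaded = store.load("people/1")                   # available right after the write
before = store.query("people_by_city", "Milan")   # the index has not caught up yet
store.run_indexing()
after = store.query("people_by_city", "Milan")    # now the index knows the document
```

In this toy, `load` works immediately after `put`, while `query` only sees the document once indexing has run, and querying a name that was never defined as an index fails instead of falling back to a scan.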

Personal Statement

The fact that the concept of a full table scan does not exist is, from my point of view, one of the coolest features a database (note that I have not specified “document” here…) can provide…

What the hell are you saying?

:-) As I have already said, document databases suit the domain-driven design world perfectly (once again, I’m not saying that relational databases do not; it is not a fight) or, better, they tend to drive the developer into the DDD world. On this journey the developer is forced to think about the shape of the data and about how the data is used, because the database requires the developer to define the indexes used to query the stored data, and indexes are directly related to data usage.

Unlike relational databases, where the concept of a “normal form” of data exists, in the document world there is no rule on how to design the shape of the data. This lack can be considered a feature (once you have gained a bit of experience) because you are forced to think about the shape in terms of scenarios and use cases, and not only in terms of good data design. Relational databases, on the other hand, tend to go along with the developer, always returning data even if the shape is the worst possible one… obviously the developer will be punished at runtime once the application is in production… too late, my dear.

Let’s go back to the topic of the post…

Indexes

I suppose we have all understood that indexes are a requirement in a document database, but what are indexes? The most trivial comparison is with a hash table where the indexed data is the key and the value is the document that matches that data: too simplistic, but it gives the idea.
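The hash table comparison can be sketched in a few lines of Python. This is only the simplistic analogy, not how RavenDB builds or stores its indexes; the sample documents, the `build_index` helper and the `orders_by_customer` name are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical documents, keyed by identifier.
documents = {
    "orders/1": {"customer": "alice", "total": 40},
    "orders/2": {"customer": "bob", "total": 15},
    "orders/3": {"customer": "alice", "total": 22},
}


def build_index(docs, field):
    """Map each value of `field` to the identifiers of the documents holding it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        index[doc[field]].append(doc_id)
    return dict(index)


# The indexed data (the customer name) is the key,
# the matching documents are the value.
orders_by_customer = build_index(documents, "customer")

# Answering "orders for alice" is a key lookup, not a scan of every document.
alice_orders = [documents[doc_id] for doc_id in orders_by_customer.get("alice", [])]
```

The index is shaped by the question it has to answer: an index keyed by customer is useless for finding orders by total, which is why defining indexes forces you to think about how the data will be used.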

In the next post we’ll start diving into the indexing concepts of RavenDB, looking at:

  • Simple Indexes (or Map);
  • Index Projection;
  • Map/Reduce;

.m