
MongoDB Haskell Mini Tutorial
-----------------------------

  __Author:__ Brian Gianforcaro (b.gianfo@gmail.com)

  __Updated:__ 2/28/2010

This is a mini tutorial to get you up and going with the basics
of the Haskell mongoDB drivers. It is modeled after the
[pymongo tutorial](http://api.mongodb.org/python/1.4%2B/tutorial.html).

You will need both the mongoDB bindings and MongoDB itself installed.

    $ = command line prompt
    > = ghci repl prompt


Installing Haskell Bindings
---------------------------

From Source:

    $ git clone git://github.com/srp/mongoDB.git
    $ cd mongoDB
    $ runhaskell Setup.hs configure
    $ runhaskell Setup.hs build
    $ runhaskell Setup.hs install

From Hackage using cabal:

    $ cabal install mongoDB

Getting Ready
-------------

Start a MongoDB instance for us to play with:

    $ mongod

Start up a Haskell REPL:

    $ ghci

Now we'll need to bring in the MongoDB/BSON bindings and set
OverloadedStrings so literal strings are converted to UTF-8 automatically.

    > import Database.MongoDB
    > :set -XOverloadedStrings

Making A Connection
-------------------
Open up a connection to your DB instance, using the standard port:

    > Right conn <- runNet $ connect $ host "127.0.0.1"

or for a non-standard port

    > Right conn <- runNet $ connect $ Host "127.0.0.1" (PortNumber 30000)

*connect* throws an IOError if the connection fails, and *runNet* catches the
IOError and returns it as Left. The examples above assume the connection
succeeds; if it fails, you will get a pattern match error.
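
The pattern match error can be avoided by inspecting the result with explicit
cases for Left and Right. The sketch below is a self-contained illustration that
does not use the mongoDB package: *fakeConnect* and *describeResult* are
hypothetical names, and *try* from Control.Exception plays the role described
for *runNet* above (turning a thrown IOError into a Left).

```haskell
import Control.Exception (try)

-- Hypothetical stand-in for `runNet $ connect ...`: succeeds or fails on demand.
-- `try` converts a thrown IOError into a Left value.
fakeConnect :: Bool -> IO (Either IOError String)
fakeConnect ok = try $
  if ok
    then return "connection"
    else ioError (userError "connection refused")

-- Handle both outcomes explicitly instead of matching only on Right.
describeResult :: Either IOError String -> String
describeResult (Left err)   = "could not connect: " ++ show err
describeResult (Right conn) = "connected: " ++ conn
```

The same shape of case analysis should apply to the real result of
*runNet*, with the connection value in place of the String used here.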

Connected monad
-------------------

The current connection is held in a Connected monad, and the current database
is held in a Reader monad on top of that. To run a connected monad, supply
it and a connection to *runConn*. To access a database within a connected
monad, call *useDb*.

Since we are working in ghci, which requires us to start from the
IO monad every time, we'll define a convenient *run* function that takes a
db-action and executes it against our "test" database on the server we
just connected to:

    > let run action = runNet $ runConn (useDb "test" action) conn

*runConn* returns either Left Failure or Right result. A Failure means there
was a read or write exception, such as an expired cursor or a duplicate key insert.
Combined with *runNet*, this means our *run* returns *(Either IOError (Either Failure a))*.
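
Working with a nested Either can get tedious, so it may help to collapse the
two error layers into one. The helper below is only a sketch: *flattenResult*
is a hypothetical name, and the error types are left polymorphic so the code
does not depend on the library's IOError or Failure definitions (it assumes
only that both have Show instances).

```haskell
-- Collapse the two error layers into a single Either with a readable message.
-- The outer Left is the network-level error, the inner Left the database-level one.
flattenResult :: (Show e1, Show e2) => Either e1 (Either e2 a) -> Either String a
flattenResult (Left netErr)         = Left ("network error: " ++ show netErr)
flattenResult (Right (Left dbErr))  = Left ("database failure: " ++ show dbErr)
flattenResult (Right (Right value)) = Right value
```

In ghci this could be applied to any action passed through *run*, for example
with `fmap flattenResult (run allCollections)`.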

Databases and Collections
-----------------------------

A MongoDB server can store multiple databases, which are separate namespaces
under which collections reside.

You can obtain the list of databases available on a connection:

    > run allDatabases

The "test" database is ignored in this case because *allDatabases*
is not a query on a specific database but on the server as a whole.

Databases and collections do not need to be created; just start using
them and MongoDB will automatically create them for you.

In the examples below we'll be using the database "test" (captured in *run*
above) and the collection "posts".

You can obtain a list of collections available in the "test" database:

    > run allCollections

Documents
---------

Data in MongoDB is represented (and stored) using JSON-style
documents. In mongoDB we use the BSON *Document* type to represent
these documents. A document is simply a list of *Field*s, where each field is
a named value. A value is a basic type like Bool, Int, Float, String, Time;
a special BSON value like Binary, Javascript, ObjectId; a (embedded)
Document; or a list of values. Here's an example document which could
represent a blog post:

    > import Data.Time
    > now <- getCurrentTime
    > :{
      let post = ["author" =: "Mike",
                  "text" =: "My first blog post!",
                  "tags" =: ["mongoDB", "Haskell"],
                  "date" =: now]
      :}
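
To make the "list of named values" idea concrete, here is a deliberately
simplified model of a document. All of the names below (*Value*, *Field*,
*Document*, *lookupField*, *examplePost*) are stand-ins written for this
sketch; they are not the library's actual definitions, which cover more value
types and use the *=:* operator shown above.

```haskell
-- Simplified stand-ins for illustration only; the real BSON types differ.
data Value
  = VString String
  | VBool Bool
  | VInt Int
  | VDoc Document   -- an embedded document
  | VList [Value]   -- a list of values
  deriving (Show, Eq)

-- A field pairs a label with a value.
data Field = Field { label :: String, value :: Value }
  deriving (Show, Eq)

-- A document is an ordered list of fields.
type Document = [Field]

-- Look up the value of the first field with the given label.
lookupField :: String -> Document -> Maybe Value
lookupField name doc = lookup name [(label f, value f) | f <- doc]

-- A cut-down version of the blog post above, built from the toy types.
examplePost :: Document
examplePost =
  [ Field "author" (VString "Mike")
  , Field "tags"   (VList [VString "mongoDB", VString "Haskell"])
  ]
```

Because a document is just a list, ordinary list functions work on it, which
is why the tutorial can write documents as list literals.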

Inserting a Document
-------------------

To insert a document into a collection we can use the *insert* function:

    > run $ insert "posts" post

When a document is inserted a special field, *_id*, is automatically
added if the document doesn't already contain that field. The value
of *_id* must be unique across the collection. *insert* returns the
value of *_id* for the inserted document. For more information, see
the [documentation on _id](http://www.mongodb.org/display/DOCS/Object+IDs).

After inserting the first document, the posts collection has actually
been created on the server. We can verify this by listing all of the
collections in our database:

    > run allCollections

* Note: The system.indexes collection is a special internal collection
that was created automatically.

Getting a single document with findOne
-------------------------------------

The most basic type of query that can be performed in MongoDB is
*findOne*. This method returns a single document matching a query (or
*Nothing* if there are no matches). It is useful when you know there is
only one matching document, or are only interested in the first
match. Here we use *findOne* to get the first document from the posts
collection:

    > run $ findOne (select [] "posts")

The result is a document matching the one that we inserted previously.

* Note: The returned document contains an *_id*, which was automatically
added on insert.

*findOne* also supports querying on specific elements that the
resulting document must match. To limit our results to a document with
author "Mike" we do:

    > run $ findOne (select ["author" =: "Mike"] "posts")

If we try with a different author, like "Eliot", we'll get no result:

    > run $ findOne (select ["author" =: "Eliot"] "posts")

Bulk Inserts
------------

In order to make querying a little more interesting, let's insert a
few more documents. In addition to inserting a single document, we can
also perform bulk insert operations by using the *insertMany* function,
which accepts a list of documents to be inserted. It sends only a single
command to the server:

    > now <- getCurrentTime
    > :{
      let post1 = ["author" =: "Mike",
                   "text" =: "Another post!",
                   "tags" =: ["bulk", "insert"],
                   "date" =: now]
      :}
    > :{
      let post2 = ["author" =: "Eliot",
                   "title" =: "MongoDB is fun",
                   "text" =: "and pretty easy too!",
                   "date" =: now]
      :}
    > run $ insertMany "posts" [post1, post2]

* Note that post2 has a different shape than the other posts - there
is no "tags" field and we've added a new field, "title". This is what we
mean when we say that MongoDB is schema-free.

Querying for More Than One Document
------------------------------------

To get more than a single document as the result of a query we use the
*find* method. *find* returns a cursor instance, which allows us to
iterate over all matching documents. There are several ways in which
we can iterate: we can call *next* to get documents one at a time
or we can get all the results by applying the cursor to *rest*:

    > Right cursor <- run $ find (select ["author" =: "Mike"] "posts")
    > run $ rest cursor

Of course you can use bind (*>>=*) to combine these into one line:

    > run $ find (select ["author" =: "Mike"] "posts") >>= rest

* Note: *next* automatically closes the cursor when the last
document has been read out of it. Similarly, *rest* automatically
closes the cursor after returning all the results.
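
The difference between reading one document at a time and draining the cursor
can be pictured with a pure toy model. This is only an illustration: the
library's cursor is stateful and talks to the server, whereas *ToyCursor*,
*nextDoc* and *restDocs* below are hypothetical names that simply track the
documents not yet read.

```haskell
-- Toy cursor: the documents not yet read. Not the library's Cursor type.
newtype ToyCursor a = ToyCursor [a]

-- Read one document, returning it together with the advanced cursor.
-- An exhausted cursor yields Nothing.
nextDoc :: ToyCursor a -> (Maybe a, ToyCursor a)
nextDoc (ToyCursor [])       = (Nothing, ToyCursor [])
nextDoc (ToyCursor (x : xs)) = (Just x, ToyCursor xs)

-- Read everything that is left, leaving an exhausted cursor behind.
restDocs :: ToyCursor a -> ([a], ToyCursor a)
restDocs (ToyCursor xs) = (xs, ToyCursor [])
```

In this model, calling *restDocs* after a few calls to *nextDoc* returns only
the documents that have not been read yet.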

Counting
--------

We can count how many documents are in an entire collection:

    > run $ count (select [] "posts")

Or count how many documents match a query:

    > run $ count (select ["author" =: "Mike"] "posts")

Range Queries
-------------

To do

Indexing
--------

To do