Lost? Go back to the beginning.
A lightning quick recap haiku.
The last part sounds ominous and heroic. I like it.
Go into the game, generate a world with a random seed, walk around for a bit, and then quit. Here’s what you might see produced in your Minecraft saved games directory (I called my world ‘testworld’).
A Minecraft saved game is a complete snapshot of the world at a particular time.
level.dat stores information about the weather, time of day, positions and velocities of currently moving pigs, sheep, wolves, the player, arrows, snowballs etc.
Files in the region directory store static block information; they’re the ones of the form “r.x.z.mcr”. These files define the locations and types of the blocks that comprise the entire landscape, and are therefore the only files we’re concerned with. Each represents a fixed-size square subdivision of the world.
(NB: The number of actual region files may vary from one world to the next, because not all regions exist when the world is created. Regions are in fact procedurally generated lazily as the player explores more of the world. This is a key feature that makes each and every Minecraft world have unique terrains and land features.)
We can write good tests up front while knowing surprisingly little about the data. If you can’t wait until the next post to find out about the binary format innards of Region files, check out the Minecraft Wiki page.
By treating these ‘.mcr’ files as black boxes that represent Regions, we can take a top-down approach to coding, refining details as they appear. Enter the Region algebraic data type.
First line of code in the bag. Similar to declaring an empty class in OO.
The Show and Eq instances will come in handy when debugging; Show lets you convert Region to a string, and Eq lets you compare two Regions using (==).
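As a minimal sketch of what such a declaration might look like (a reconstruction for illustration, assuming a single nullary constructor and derived instances as described above):

```haskell
-- A Region with a single nullary constructor: it carries no data yet.
-- Show gives us a string rendering; Eq gives us (==) and (/=).
data Region = Region
  deriving (Show, Eq)

-- Small helpers (hypothetical names) exercising the derived instances.
regionAsString :: String
regionAsString = show Region

regionsAreEqual :: Bool
regionsAreEqual = Region == Region
```

In GHCi, evaluating regionAsString and regionsAreEqual is a quick way to confirm the derived instances behave as expected.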
Note that this is isomorphic to the unit type, (), a type inhabited by only one value. There’s absolutely no useful information carried by this type other than its presence.
The idea is that the Region type won’t stay like this for long; as we find out more about Minecraft’s Region files, we will enrich its type with more data.
Reading Region files can be treated as deserialisation from a file’s contents into a value of our Region type. I looked at a number of (de)serialisation options to express this, and it didn’t take very long to decide on Data.Binary (also known simply by its package name, binary), mainly for its simplicity.
Install the necessary packages…
Binary uses what are called lazy bytestrings. The lazy ByteString is Haskell’s efficient data type for dealing with binary data, thanks to its lazily loaded, chunked structure. This translates to something like buffered reads and writes.
The main contribution of Data.Binary is the Binary type class, which is defined like so:
i.e. for your type to be regarded as having a Binary representation, the ‘put’ and ‘get’ (de)serialisation methods must be provided.
encode(File) and decode(File) functions are provided at the programmer’s convenience for writing or reading Region data to and from either bytestrings or files (lazily). They rely on the implementations of ‘put’ and ‘get’.
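To make the ‘put’/‘get’ contract concrete, here is a small sketch of a Binary instance. The Colour type is a made-up example and has nothing to do with the Minecraft format; only the Binary class, put, get, encode and decode come from the binary package.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Word (Word8)

-- For reference, the core of the class looks roughly like:
--   class Binary t where
--     put :: t -> Put
--     get :: Get t

-- Hypothetical example type, used only to illustrate an instance.
data Colour = Colour Word8 Word8 Word8
  deriving (Show, Eq)

instance Binary Colour where
  -- Serialise the three components in a fixed order.
  put (Colour r g b) = put r >> put g >> put b
  -- Deserialise them in exactly the same order.
  get = Colour <$> get <*> get <*> get

-- encode and decode are defined in terms of put and get.
roundTrippedColour :: Colour
roundTrippedColour = decode (encode (Colour 10 20 30))
```

The only design point worth noting is that ‘get’ must read fields in the order ‘put’ wrote them; nothing in the types enforces that, which is exactly why a round-trip test is useful.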
To make Region serialisable, we must provide a Binary instance for Region. You can provide such an implementation in one of two ways: derive it automatically with Data.Derive.Binary, or write the instance by hand.
The derived approach makes use of Template Haskell, which is a form of meta-programming based on compile-time reflection (it requires no runtime type information).
Sadly, we can’t capitalise on the former one-line wonder because our Region wouldn’t have been serialised with Data.Derive.Binary. Since Minecraft uses a custom file format, the work needed to emulate its saved game serialiser will be deferred until two posts down the line. ‘undefined’ shall have to suffice for now!
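A hand-written placeholder instance along these lines might look like the sketch below. It assumes the dataless Region type from earlier, and it type-checks while deliberately doing nothing useful: forcing either method throws an exception at runtime.

```haskell
import Data.Binary (Binary (..))

-- The dataless Region type, repeated so the sketch stands alone.
data Region = Region
  deriving (Show, Eq)

-- Placeholder instance: both methods are stubs for now.
-- Evaluating either one raises an exception (Prelude.undefined).
instance Binary Region where
  put = undefined
  get = undefined
```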
‘put’ and ‘get’ are actually monadic computations (yes, monadic; don’t run away yet!), which restricts the types of operations permitted within the serialisation and deserialisation procedures. This is a design choice of Data.Binary.
Imagine for a moment that bytestrings were top hats: some sort of container for a piece of data.
Putting magical thinking aside, if you ‘put’ a Region into a top hat, the only thing that you can possibly hope to ‘get’ back out of the top hat is that same Region, right?
If we manage to pull anything but that same Region out of the top hat, chances are it’s not even a rabbit; it’s going to be an ugly, nasty bug. Euck.
Point being, ‘put’ and ‘get’ should clearly be inverses. For any Region r,
where runPut runs a Put computation that takes a Region and produces a bytestring, while runGet runs a Get computation on a bytestring and produces a Region. This can be written more succinctly as
or, ideally, as a point-free property
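The formulations above can be sketched as follows. This is an illustration rather than the post’s exact code: the function names are made up, and the properties are stated for any type with Binary and Eq instances so that they can be tried on types that already serialise, such as Int or String.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (runGet)
import Data.Binary.Put (runPut)

-- Long form: run the Put computation to get a bytestring,
-- then run the Get computation on that bytestring.
roundTripLong :: (Binary a, Eq a) => a -> Bool
roundTripLong x = runGet get (runPut (put x)) == x

-- Short form: encode and decode wrap runPut/put and runGet/get.
roundTripShort :: (Binary a, Eq a) => a -> Bool
roundTripShort x = decode (encode x) == x

-- Point-free flavour: the composition should behave like id.
roundTripPointFree :: (Binary a, Eq a) => a -> Bool
roundTripPointFree x = (decode . encode) x == id x
```

For example, roundTripShort (42 :: Int) and roundTripLong "top hat" can both be evaluated in GHCi, since Int and String ship with Binary instances.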
How do we make sure of this? Do fuzz testing with QuickCheck! (Frank’s alluded to this nice tool in his sort of recent post).
What might be surprising is that while Haskell has a good number of testing libraries covering a range of uses:
… there hasn’t been a decent unified test runner. Programmers have so far resorted to writing, compiling and invoking their own ‘test programs’ for test coverage.
… until Max Bolingbroke’s recent excellent GSoC work on test-framework, that is! This test library integrates QuickCheck and HUnit tests with cabal so that you can write something like
to run all your tests.
Finally, a simple command-line test runner. Yes please. Get it installed like so:
The latter two are known as test providers, one for each of QuickCheck 2 and HUnit.
While test-framework does seem to make cabal a bit more comprehensive like Maven or NAnt, lots of work still needs to be done to improve the state of Haskell tooling.
We create a TestMain.hs file and start with some imports
A simple entry point of the test runner is provided by test-framework, and is called defaultMain.
Then, we define the test-suite, test-groups, and finally tests that belong in those groups, to construct a hierarchically structured test run. We call our test-group “Properties” just for clarity.
The property propDecEncRegion can be defined as
This is simply a Bool-valued function. We cannot write this using the point-free style easily, because the ‘point’ (r) is applied on both sides of the equivalence, though I would say it’s clear enough.
QuickCheck 2 is a fuzzer, right? That means that for our custom Region type we have to provide a generator for it to pull values out of. Again, if you’re lazy, this can be derived using Data.Derive.Arbitrary, as long as you also derive Data and Typeable instances. For us, it’s so simple that the compile-time cost isn’t even justified:
There can only be one value (Region), so return that value every time. We can (and will!) elaborate on this later.
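Putting the property and the generator together, a self-contained sketch might look like this. It assumes the dataless Region type and the ‘undefined’ placeholder Binary instance described earlier, and it uses plain QuickCheck rather than the test-framework wiring, so it can be tried on its own.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Test.QuickCheck

-- The dataless Region type, repeated so the sketch stands alone.
data Region = Region
  deriving (Show, Eq)

-- Placeholder instance: the real (de)serialiser comes later.
instance Binary Region where
  put = undefined
  get = undefined

-- Only one value inhabits Region, so the generator always returns it.
instance Arbitrary Region where
  arbitrary = return Region

-- Decoding an encoded Region should give back the same Region.
propDecEncRegion :: Region -> Bool
propDecEncRegion r = decode (encode r) == r
```

Running quickCheck propDecEncRegion in GHCi should report a failure for now, because the placeholder instance throws as soon as ‘put’ or ‘get’ is forced.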
Okay, the test is ready for a run. Any Haskeller would use cabal for build management, so simply set up a cabal project by using
then fill in the appropriate fields when prompted. Update the build dependencies, and you will end up with a *.cabal file looking a bit like this (minus generated comments for your viewing pleasure).
Note that the Build-Depends section declares the test libraries required. Don’t worry if you forget, because cabal will complain about hidden packages when you try to configure the project.
Since testing is a relatively new feature, cabal configure still requires the explicit flag --enable-tests. Passing --show-details=always will ensure the full test reporting output is printed, as below.
Test failure; as expected!
Fixing this test is a big job. We start doing this in the next instalment.