Pig lovers meet TOP

Have you ever needed to get the top n items for a key in Pig? For instance, the three most popular items in each country for an online store? You could always solve this the hard way by calculating a threshold per country and then filtering on that threshold. This is neither easy to write nor to execute. What you […]

Crash course in Erlang

This is a summary of a talk I gave on Monday, May 14, 2012, at an XP Meetup in Trondheim. It is meant as a teaser for listeners to play with Erlang themselves. First, some basic concepts. Erlang has a form of constant called an atom, which is defined on first use. Atoms are typically used as […]

Case statement pitfall when migrating to Ruby 1.9.2

Note that the pitfall is limited to MRI (standard Ruby) version 1.9.2. MRI 1.9.3, JRuby and Rubinius do not have this behavior. I have been using Rubinius 2.0 to run machine learning experiments with libsvm lately. When running in Ruby 1.9.2, I noticed that my classifier always classified all samples as negative. I thought this […]

CouchDB and the web

This is my presentation from JavaZone 2010. Note that during my presentation, I showed the view section and basic replication directly in Futon instead of showing the fallback in the slides. What I did show was mostly the same, but naturally I showed some variations on the mappers as well.

CouchDB on Amazon EC2 CentOS server with Sprinkle

Read the Getting Started part of Till Klampäckel’s CouchDB on Ubuntu on AWS blog post for some general information. I see no reason to repeat those things here. Till stresses the need for a security group opening port 80, but you should also enable SSH on port 22, otherwise it will be impossible to install […]

Camping with CouchDB

When developing a new system, getting end-to-end functionality and being able to demonstrate it as soon as possible is important. While doing so, it’s an added benefit if you do not spend a lot of time writing throwaway code. I have a set of scripts that let me test and use the system that […]