As my plans for a startup are about to become reality, I think it’s time to start telling the world beyond my Twitter followers that something is about to happen. My main worry was whether I could get one or two interesting people to join me. My first choice of partner worked out, so we’re rolling. In the coming weeks, I will focus on funding and basic infrastructure for a prototype.
I did a proof of concept a few weeks ago, and it was a great success. The method worked even with statistically poor data: the idf values were generated from a very small document corpus. Getting results despite this indicates that my approach is robust, which is always a plus when you are doing things on a big scale.
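
For readers who have not met idf (inverse document frequency) before, here is a minimal sketch of how such values can be computed from a corpus. It assumes the common `log(N / df)` form and naive whitespace tokenization; the function name `idf_values` and the three-sentence corpus are made up for illustration and are not the code or data from the proof of concept.

```python
import math
from collections import Counter


def idf_values(corpus):
    """Return {term: idf} with idf = log(N / df).

    N is the number of documents and df is the number of documents
    that contain the term. Hypothetical helper, for illustration only.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        # Count each term once per document, however often it repeats.
        doc_freq.update(set(doc.lower().split()))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# A deliberately tiny, made-up corpus.
tiny_corpus = [
    "the startup builds a prototype",
    "the prototype needs funding",
    "the method worked",
]

idf = idf_values(tiny_corpus)
for term in ("the", "prototype", "funding"):
    print(term, round(idf[term], 3))
```

With only three documents, a term's idf can take just three distinct values, so the weights are coarse and a single extra document can shift them noticeably. That is the sense in which a very small corpus gives statistically poor idf values.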