So I implemented batch gradient descent in Haskell, to simultaneously solidify my understanding of the algorithm and work on learning Haskell.
It got a bit bumpy. I've preserved my realtime notes of the mess. But the short version is that after a certain number of iterations that was an increasing function the learning rate, the model would just terminate in weights of Infinity for all features.Continue reading →
n.b. the math is all in code blocks because the markdown processor screws with underscores and carets otherwise, and mathjax can't handle that. This is making me insane and I might actually write some kind of post-processor to jerk around the generated html files to fix this, but it'll have to do for now.Continue reading →
Nonparametric algorithms reduce "the need to choose features very carefully" (I guess that makes sense if you think of features as mathematical transformations on stuff observed rather than stuff observed in general... a nonparemetric algorithm surely can't avoid the fact that you left something off, though I guess it can help avoid the fact that you threw a bunch of extra stuff in...)Continue reading →