The Kitchen Sink and Other Oddities

Atabey Kaygun

Online Perceptron

Description of the problem

A few days ago I described a machine learning algorithm that learns the coefficients of a regression. The setup I gave was far more general than it needed to be: if you read it carefully, you will see that what I described is a perceptron.
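For reference, a perceptron with activation function $f$ computes $y = f(w \cdot x)$, where the bias is folded into the weight vector $w$, and the classic online (delta-rule) update after seeing a target $t$ is

$$ w \leftarrow w + \eta \, (t - y) \, x $$

with learning rate $\eta$. The implementation below uses a variation of this update in which the step is also scaled by a numerical estimate of $f'$.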

Today, I am going to redo what I did, this time in Scala.

An implementation in Scala

Here is my implementation:

import $ivy.`org.scalanlp:breeze_2.11:1.0-RC2`
import breeze.linalg._
import math.random
import scala.util.Random.nextInt

def logit(x:Double):Double = 1.0/(1.0 + math.exp(-x))
defined function logit
case class perceptron(n: Int, fn: Double=>Double, eta: Double){

    private var derivative = 0.0
    private var input = DenseVector.zeros[Double](n+1)
    var weights = DenseVector.zeros[Double](n+1)

    // Prepend a constant 1.0 to the input for the bias term, compute the
    // activation, and estimate the derivative of fn numerically along the way.
    def forward(xs:DenseVector[Double]):Double = {
        input = DenseVector.vertcat(DenseVector(1.0),xs)
        val calc = input.dot(weights)
        derivative = (fn(calc + eta/2) - fn(calc - eta/2))/eta
        fn(calc)
    }

    // Online update: move the weights in the direction of the input, scaled
    // by the error and damped by the (noisy) derivative estimate.
    def train(ys:Double, xs:DenseVector[Double]): Unit = {
        val delta = ys - forward(xs)
        weights += delta*eta/(derivative + eta*(2.0-math.random))*input
    }
}
defined class perceptron
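
Before moving to real data, here is a quick sanity check of the class on a synthetic problem (this example is mine, not from the original post): the perceptron should learn to separate points in the unit square according to which side of the line x1 + x2 = 1 they fall on.

val toy = perceptron(2, logit, 4e-2)
(1 to 5000).foreach(i => {
    // random points in the unit square, labeled by the line x1 + x2 = 1
    val xs = DenseVector(math.random, math.random)
    val label = if (xs(0) + xs(1) > 1.0) 1.0 else 0.0
    toy.train(label, xs)
})
toy.forward(DenseVector(0.9, 0.9))  // should be close to 1.0
toy.forward(DenseVector(0.1, 0.1))  // should be close to 0.0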

The data I am going to use is the Skin Segmentation Data Set from the UCI Machine Learning Repository. The data consists of four columns: the first three are integers from 0 to 255 indicating pixel color (RGB) values, and the last is the class label, either 1 or 2.
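
Each line of the raw file is tab-separated, with the class label in the last column. Judging from the parsed output below, the first line of the file reads

74	85	123	1

and the code below reverses each row so that the label comes first.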

Let us read the data:

val raw = scala.io.Source.fromURL("https://archive.ics.uci.edu/ml/machine-learning-databases/00229/Skin_NonSkin.txt")
               .mkString
               .split("\n")
               .map(_.split("\t").map(_.toDouble).reverse) // reverse so that the label comes first
val n = raw.length-1
raw: Array[Array[Double]] = Array(
  Array(1.0, 123.0, 85.0, 74.0),
  Array(1.0, 122.0, 84.0, 73.0),
  Array(1.0, 121.0, 83.0, 72.0),
  Array(1.0, 119.0, 81.0, 70.0),
  Array(1.0, 119.0, 81.0, 70.0),
  Array(1.0, 118.0, 80.0, 69.0),
  Array(1.0, 119.0, 81.0, 70.0),
  Array(1.0, 119.0, 81.0, 70.0),
  Array(1.0, 125.0, 87.0, 76.0),
  Array(1.0, 125.0, 87.0, 76.0),
  Array(1.0, 126.0, 88.0, 77.0),
...
n: Int = 245056

There are approximately 250K data points. I am going to use only a random sample of 10K points to build a model. Since I am using the sigmoid function, the class labels 1 and 2 have to be normalized to 0 and 1.

val node = perceptron(3,logit,4e-2)

(1 to 10000).foreach(i=>{
    val x = raw(nextInt(n))
    node.train(x(0)-1.0, DenseVector(x.slice(1,x.length)))  // shift labels 1/2 to 0/1
})
node: perceptron = perceptron(3, <function1>, 0.04)
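
Since weights is a public field, one can also peek at the fitted coefficients, with the bias in the first slot:

node.weights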

Let us test the model on randomly chosen data points: 100 experiments of 100 points each, averaged.

def experiment(m: Int, data:Array[Array[Double]]): Double = {
    val n = data.length
    var res = 0.0
    (1 to m).foreach(i=>{
        val x = data(nextInt(n))
        val y = x(0)
        // shift the sigmoid output back into the 1-2 label range
        val u = 1.0 + node.forward(DenseVector(x.slice(1,4)))
        if(math.abs(u-y)>1e-2) res += 1  // count a miss unless within 0.01 of the label
    })
    1.0 - res/m
}

(1 to 100).map(i=>experiment(100,raw)).reduce(_+_)/100
defined function experiment
res97_1: Double = 0.9128999999999999

A success rate of 91.3%. Very nice!
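
One caveat: the 1e-2 tolerance above counts a prediction as correct only when the sigmoid output lands within 0.01 of the true label. A more conventional evaluation would threshold the output at 0.5. Here is a sketch of that variant under the same setup (my own variation, not from the original post):

def experiment05(m: Int, data: Array[Array[Double]]): Double = {
    val n = data.length
    val hits = (1 to m).count { _ =>
        val x = data(nextInt(n))
        // threshold the sigmoid output at 0.5, then map back to the 1-2 labels
        val predicted = if (node.forward(DenseVector(x.slice(1, 4))) > 0.5) 2.0 else 1.0
        predicted == x(0)
    }
    hits.toDouble / m
}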