More than often, I need random numbers from standard statistical distributions such as Gaussian, Poisson, T, F, or Beta. Since I like using Clojure so much, I have been looking for a satisfactory solution with as little friction as possible. Then I remember, Apache Commons Math has an excellent statistical sub-library that has all the distributions I need (and more) already implemented. All I need is to form a bridge between Clojure and Apache Commons Math.
Those who are interested in a pure Clojure solution should check out kixi.stats.
Let me start with my dependencies in my deps.edn:
{:deps {org.apache.commons/commons-math3 {:mvn/version "3.6.1"}}}
Next, my namespace:
(ns distributions.core
(:import [org.apache.commons.math3.distribution
BetaDistribution
BinomialDistribution
CauchyDistribution
ChiSquaredDistribution
ExponentialDistribution
FDistribution
GammaDistribution
GeometricDistribution
GumbelDistribution
HypergeometricDistribution
IntegerDistribution
KolmogorovSmirnovDistribution
LaplaceDistribution
LevyDistribution
LogNormalDistribution
LogisticDistribution
MixtureMultivariateNormalDistribution
MixtureMultivariateRealDistribution
MultivariateNormalDistribution
MultivariateRealDistribution
NakagamiDistribution
NormalDistribution
ParetoDistribution
PascalDistribution
PoissonDistribution
RealDistribution
TDistribution
TriangularDistribution
UniformIntegerDistribution
UniformRealDistribution
WeibullDistribution
ZipfDistribution]))I included all distributions in the library in case you’d like to experiment with them later on. Next, some utility functions. I am following R’s naming conventions:
(defn quartile [dist x]
(.inverseCumulativeProbability dist x))
(defn pdf [dist x]
(.density dist x))
(defn cdf [dist x]
(.cumulativeProbability dist x))
(defn sample [dist N]
(seq (.sample dist N)))Let us test the code. First, the normal distribution with 0 mean and 1.0 variance:
(def ndist (NormalDistribution. 0.0 1.0))
(quartile ndist 0.975)
(pdf ndist 12.0)
(sample ndist 10)#'distributions.core/ndist
1.959963984540054
2.1463837356630728E-32
(0.7328831550323387 -1.7564646871411873 0.3036384595319081 -0.6552956779199097 0.09651672484712626 2.100967417766368 -0.3455198591244783 0.4937778023356781 0.44520438525164974 -0.33095042749939735)Next, the Poisson distribution with p=0.05:
(def pdist (PoissonDistribution. 0.05))
(sample pdist 100)#'distributions.core/pdist
(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)And, finally the T-distribution with 2 degrees of freedom
(def tdist (TDistribution. 2))
(quartile tdist 0.975)
(pdf tdist 12.0)
(sample tdist 10)#'distributions.core/tdist
4.302652729698313
5.668533483577864E-4
(0.9571878079111966 0.22544989371671462 -1.4840659295754255 1.1260852156451235 0.38857568503851414 -1.4375968836484603 0.15007647730820867 -1.1068773851595448 0.0021123655126886205 -0.2345292434886456)