More than often, I need random numbers from standard statistical distributions such as Gaussian, Poisson, T, F, or Beta. Since I like using Clojure so much, I have been looking for a satisfactory solution with as little friction as possible. Then I remember, Apache Commons Math has an excellent statistical sub-library that has all the distributions I need (and more) already implemented. All I need is to form a bridge between Clojure and Apache Commons Math.
Those who are interested in a pure Clojure solution should check out kixi.stats.
Let me start with my dependencies in my deps.edn
:
{:deps {org.apache.commons/commons-math3 {:mvn/version "3.6.1"}}}
Next, my namespace:
ns distributions.core
(:import [org.apache.commons.math3.distribution
(
BetaDistribution
BinomialDistribution
CauchyDistribution
ChiSquaredDistribution
ExponentialDistribution
FDistribution
GammaDistribution
GeometricDistribution
GumbelDistribution
HypergeometricDistribution
IntegerDistribution
KolmogorovSmirnovDistribution
LaplaceDistribution
LevyDistribution
LogNormalDistribution
LogisticDistribution
MixtureMultivariateNormalDistribution
MixtureMultivariateRealDistribution
MultivariateNormalDistribution
MultivariateRealDistribution
NakagamiDistribution
NormalDistribution
ParetoDistribution
PascalDistribution
PoissonDistribution
RealDistribution
TDistribution
TriangularDistribution
UniformIntegerDistribution
UniformRealDistribution
WeibullDistribution ZipfDistribution]))
I included all distributions in the library in case you’d like to experiment with them later on. Next, some utility functions. I am following R’s naming conventions:
defn quartile [dist x]
(
(.inverseCumulativeProbability dist x))
defn pdf [dist x]
(
(.density dist x))
defn cdf [dist x]
(
(.cumulativeProbability dist x))
defn sample [dist N]
(seq (.sample dist N))) (
Let us test the code. First, the normal distribution with 0 mean and 1.0 variance:
def ndist (NormalDistribution. 0.0 1.0))
(0.975)
(quartile ndist 12.0)
(pdf ndist 10) (sample ndist
#'distributions.core/ndist
1.959963984540054
2.1463837356630728E-32
0.7328831550323387 -1.7564646871411873 0.3036384595319081 -0.6552956779199097 0.09651672484712626 2.100967417766368 -0.3455198591244783 0.4937778023356781 0.44520438525164974 -0.33095042749939735) (
Next, the Poisson distribution with p=0.05:
def pdist (PoissonDistribution. 0.05))
(100) (sample pdist
#'distributions.core/pdist
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0) (
And, finally the T-distribution with 2 degrees of freedom
def tdist (TDistribution. 2))
(0.975)
(quartile tdist 12.0)
(pdf tdist 10) (sample tdist
#'distributions.core/tdist
4.302652729698313
5.668533483577864E-4
0.9571878079111966 0.22544989371671462 -1.4840659295754255 1.1260852156451235 0.38857568503851414 -1.4375968836484603 0.15007647730820867 -1.1068773851595448 0.0021123655126886205 -0.2345292434886456) (