criterium

0.4.3


Benchmarking library

dependencies

org.clojure/clojure
1.3.0



(this space intentionally left almost blank)
 

Copyright (c) Hugo Duncan. All rights reserved.

The use and distribution terms for this software are covered by the Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) which can be found in the file epl-v10.html at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.

Criterium - measures expression computation time over multiple invocations

Inspired by Brent Broyer's http://www.ellipticgroup.com/html/benchmarkingArticle.html and also Haskell's Criterion

Unlike java solutions, this can benchmark general expressions rather than just functions.

Criterium measures the computation time of an expression. It is designed to address some of the pitfalls of benchmarking, and benchmarking on the JVM in particular.

This includes: - statistical processing of multiple evaluations - inclusion of a warm-up period, designed to allow the JIT compiler to optimise its code - purging of gc before testing, to isolate timings from GC state prior to testing - a final forced GC after testing to estimate impact of cleanup on the timing results

Usage: (use 'criterium.core) (bench (Thread/sleep 1000) :verbose) (with-progress-reporting (bench (Thread/sleep 1000) :verbose)) (report-result (benchmark (Thread/sleep 1000)) :verbose) (report-result (quick-bench (Thread/sleep 1000)))

References: See http://www.ellipticgroup.com/html/benchmarkingArticle.html for a Java benchmarking library. The accompanying article describes many of the JVM benchmarking pitfalls.

See http://hackage.haskell.org/package/criterion for a Haskell benchmarking library that applies many of the same statistical techniques.

(ns ^{:author "Hugo Duncan"
      :see-also
      [["http://github.com/hugoduncan/criterium" "Source code"]
       ["http://hugoduncan.github.com/criterium" "API Documentation"]]}
  criterium.core
  (:use clojure.set
         criterium.stats)
  (:require criterium.well)
  (:import (java.lang.management ManagementFactory)))
(def ^{:dynamic true} *use-mxbean-for-times* nil)

Fraction of excution time allowed for final cleanup before a warning is issued.

(def ^{:doc 
       :dynamic true}
  *final-gc-problem-threshold* 0.01)
(def s-to-ns (* 1000 1000 1000)) ; in ns
(def ns-to-s 1e-9) ; in ns

Time period used to let the code run so that jit compiler can do its work.

(def ^{:doc 
       :dynamic true}
  *warmup-jit-period* (* 10 s-to-ns)) ; in ns

Number of executions required

(def ^{:doc 
       :dynamic true} *sample-count* 60)

Target elapsed time for execution for a single measurement.

(def ^{:doc 
       :dynamic true}
  *target-execution-time* (* 1 s-to-ns)) ; in ns

Maximum number of attempts to run finalisers and gc.

(def ^{:doc 
       :dynamic true}
  *max-gc-attempts* 100)
(def ^{:dynamic true}
  *default-benchmark-opts*
  {:max-gc-attempts *max-gc-attempts*
   :samples *sample-count*
   :target-execution-time *target-execution-time*
   :warmup-jit-period *warmup-jit-period*
   :tail-quantile 0.025
   :bootstrap-size 1000})
(def ^{:dynamic true}
  *default-quick-bench-opts*
  {:max-gc-attempts *max-gc-attempts*
   :samples (/ *sample-count* 10)
   :target-execution-time (/ *target-execution-time* 10)
   :warmup-jit-period (/ *warmup-jit-period* 2)
   :tail-quantile 0.025
   :bootstrap-size 500})

Progress reporting

(def ^{:dynamic true} *report-progress* nil)

Conditionally report progress to out.

(defn #^{:skip-wiki true}
  progress
  [& message]
  (when *report-progress*
    (apply println message)))
(def ^{:dynamic true} *report-debug* nil)

Conditionally report debug to out.

(defn #^{:skip-wiki true}
  debug
  [& message]
  (when *report-debug*
    (apply println message)))
(def ^{:dynamic true} *report-warn* nil)

Conditionally report warn to out.

(defn #^{:skip-wiki true}
  warn
  [& message]
  (when *report-warn*
    (apply println "WARNING:" message)))

Interrogation of differences in a state.

Java Management interface

(defprotocol StateChanged
  (state-changed?
   [state]
   "Check to see if a state delta represents no change")
  (state-delta
   [state-1 state-2]
   "Return a state object for the difference between two states"))
(defrecord JvmClassLoaderState [loaded-count unloaded-count]
  StateChanged
  (state-changed?
   [state]
   (not (and (zero? (:loaded-count state)) (zero? (:unloaded-count state)))))
  (state-delta
   [state-1 state-2]
   (let [vals (map - (vals state-1) (vals state-2))]
     (JvmClassLoaderState. (first vals) (second vals)))))
(defn jvm-class-loader-state []
  (let [bean (.. ManagementFactory getClassLoadingMXBean)]
    (JvmClassLoaderState. (. bean getLoadedClassCount)
                          (. bean getUnloadedClassCount))))
(defrecord JvmCompilationState [compilation-time]
  StateChanged
  (state-changed?
   [state]
   (not (zero? (:compilation-time state))))
  (state-delta
   [state-1 state-2]
   (let [vals (map - (vals state-1) (vals state-2))]
     (JvmCompilationState. (first vals)))))

Returns the total compilation time for the JVM instance.

(defn jvm-compilation-state
  []
  (let [bean (.. ManagementFactory getCompilationMXBean)]
    (JvmCompilationState. (if (. bean isCompilationTimeMonitoringSupported)
                            (. bean getTotalCompilationTime)
                            -1))))

Returns the name of the JIT compiler.

(defn jvm-jit-name
  []
  (let [bean (.. ManagementFactory getCompilationMXBean)]
    (. bean getName)))

Return the operating system details as a hash.

(defn os-details
  []
  (let [bean (.. ManagementFactory getOperatingSystemMXBean)]
    {:arch (. bean getArch)
     :available-processors (. bean getAvailableProcessors)
     :name (. bean getName)
     :version (. bean getVersion)}))

Return the runtime details as a hash.

(defn runtime-details
  []
  (let [bean (.. ManagementFactory getRuntimeMXBean)
        props (. bean getSystemProperties)]
    {:input-arguments (. bean getInputArguments)
     :name (. bean getName)
     :spec-name (. bean getSpecName)
     :spec-vendor (. bean getSpecVendor)
     :spec-version (. bean getSpecVersion)
     :vm-name (. bean getVmName)
     :vm-vendor (. bean getVmVendor)
     :vm-version (. bean getVmVersion)
     :java-version (get props "java.version")
     :java-runtime-version (get props "java.runtime.version")
     :sun-arch-data-model (get props "sun.arch.data.model")
     :clojure-version-string (clojure-version)
     :clojure-version *clojure-version*}))

Return the operating system details.

(defn system-properties
  []
  (let [bean (.. ManagementFactory getRuntimeMXBean)]
    (. bean getSystemProperties)))

OS Specific Code

(defn clear-cache-mac []
  (.. Runtime getRuntime (exec "/usr/bin/purge") waitFor))
(defn clear-cache-linux []
  ;; not sure how to deal with the sudo
  (.. Runtime getRuntime
      (exec "sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'") waitFor))
(defn clear-cache []
  (condp #(re-find %1 %2) (.. System getProperties (getProperty "os.name"))
    #"Mac" (clear-cache-mac)
    :else (println "WARNING: don't know how to clear disk buffer cache for "
                   (.. System getProperties (getProperty "os.name")))))

Obtain a timestamp

Time reporting

(defmacro timestamp
  [] `(System/nanoTime))

Obtain a timestamp, possibly using MXBean.

(defn timestamp-2
  []
  (if *use-mxbean-for-times*
    (.. ManagementFactory getThreadMXBean getCurrentThreadCpuTime)
    (System/nanoTime)))

Returns a vector containing execution time and result of specified function.

Execution timing

(defmacro time-body
  ([expr pre]
     `(do ~pre
          (time-body ~expr)))
  ([expr]
     `(let [start# (timestamp)
            ret# ~expr
            finish# (timestamp)]
        [(- finish# start#) ret#])))
(defn replace-ret-val-in-time-body-result
  [[elapsed-time _] new-ret-val]
  [elapsed-time new-ret-val])

Returns a vector containing execution time, change in loaded and unloaded class counts, change in compilation time and result of specified function.

(defmacro time-body-with-jvm-state
  ([expr pre]
     `(do ~pre
          (time-body-with-jvm-state ~expr)))
  ([expr]
  `(let [cl-state# (jvm-class-loader-state)
         comp-state# (jvm-compilation-state)
         start# (timestamp)
         ret# ~expr
         finish# (timestamp)]
     [(- finish# start#)
      (merge-with - cl-state# (jvm-class-loader-state))
      (merge-with - comp-state# (jvm-compilation-state))
      ret#])))

Report a (inconsistent) snapshot of the heap memory used.

Memory reporting

(defn heap-used
  []
  (let [runtime (Runtime/getRuntime)]
    (- (.totalMemory runtime) (.freeMemory runtime))))

Report a (inconsistent) snapshot of the memory situation.

(defn memory
  []
  (let [runtime (Runtime/getRuntime)]
    [ (.freeMemory runtime) (.totalMemory runtime) (.maxMemory runtime)]))

Force garbage collection and finalisers so that execution time associated with this is not incurred later. Up to max-attempts are made.

Memory management

(defn force-gc
  ([] (force-gc *max-gc-attempts*))
  ([max-attempts]
     (debug "Cleaning JVM allocations ...")
     (loop [memory-used (heap-used)
            attempts 0]
       (System/runFinalization)
       (System/gc)
       (let [new-memory-used (heap-used)]
         (if (and (or (pos? (.. ManagementFactory
                                getMemoryMXBean
                                getObjectPendingFinalizationCount))
                      (> memory-used new-memory-used))
                  (< attempts max-attempts))
           (recur new-memory-used (inc attempts)))))))

Time a final clean up of JVM memory. If this time is significant compared to the runtime, then the runtime should maybe include this time.

(defn final-gc
  []
  (progress "Final GC...")
  (first (time-body (force-gc))))
(defn final-gc-warn
  [execution-time final-gc-time]
  (progress "Checking GC...")
  (let [fractional-time (/ final-gc-time execution-time)
        final-gc-result [(> fractional-time *final-gc-problem-threshold*)
                         fractional-time
                         final-gc-time]]
    (when (first final-gc-result)
      (println
       "WARNING: Final GC required"
       (* 100.0 (second final-gc-result))
       "% of runtime"))
    final-gc-result))

Core timing loop

A mutable field is used to store the result of each function call, to prevent JIT optimising away the expression entirely.

Provides a mutable place

(defprotocol MutablePlace
  (set-place [_ v] "Set mutable field to value.")
  (get-place [_] "Get mutable field value."))
(deftype Unsynchronized [^{:unsynchronized-mutable true :tag Object} v]
  MutablePlace
  (set-place [_ value] (set! v value))
  (get-place [_] v))
(def mutable-place (Unsynchronized. nil))

Performs the part of execute-expr where we actually measure the elapsed run time. Evaluates (f) n times, each time saving the return value as an Object in mutable-place.

The idea is that except for the call to (f), the only things done during each iteration are a few arithmetic operations and comparisons to 0 on primitive longs, and the storage of the return value.

The JVM is not free to optimize away the calls to f because the return values are saved in mutable-place.

This array is at most +max-obj-array-size+ elements long, to save memory. An artificially intelligent JVM might be able to determine that if n is larger than +max-obj-array-size+, some of the return values are overwritten and thus those calls need not be made. I doubt we will see that kind of optimization any time soon, and perhaps some JVM rules even prohibit doing so since the writes to ret-vals-arr could potentially be read by a different thread.

(defn execute-expr-core-timed-part
  [n f]
  (time-body
   (loop [i (long (dec n))
          v (f)]
     (set-place mutable-place v)
     (if (pos? i)
       (recur (unchecked-dec i) (f))
       v))))

Time the execution of n invocations of f. See execute-expr-core-timed-part.

Execution

(defn execute-expr
  [n f]
  (let [time-and-ret (execute-expr-core-timed-part n f)]
    (get-place mutable-place) ;; just for good measure, use the mutable value
    time-and-ret))
(defn collect-samples
  [sample-count execution-count f gc-before-sample]
  {:pre [(pos? sample-count)]}
  (let [result (object-array sample-count)]
    (loop [i (long 0)]
      (if (< i sample-count)
        (do
          (when gc-before-sample
            (force-gc))
          (aset result i (execute-expr execution-count f))
          (recur (unchecked-inc i)))
        result))))

Run expression for the given amount of time to enable JIT compilation.

Compilation

(defn warmup-for-jit
  [warmup-period f]
  (progress "Warming up for JIT optimisations" warmup-period "...")
  (let [cl-state (jvm-class-loader-state)
        comp-state (jvm-compilation-state)
        t (max 1 (first (time-body (f))))
        _ (debug "  initial t" t)
        [t n] (if (< t 100000)           ; 100us
                (let [n (/ 100000 t)]
                  [(first (execute-expr n f)) n])
                [t 1])
        p (/ warmup-period t)
        c (long (max 1 (* n (/ p 5))))]
    (debug "  using t" t "n" n)
    (debug "  using execution-count" c)
    (loop [elapsed (long t)
           count (long n)
           delta-free (long 0)
           old-cl-state cl-state
           old-comp-state comp-state]
      (let [new-cl-state (jvm-class-loader-state)
            new-comp-state (jvm-compilation-state)]
        (if (not= old-cl-state new-cl-state)
          (progress "  classes loaded before" count "iterations"))
        (if (not= old-comp-state new-comp-state)
          (progress "  compilation occured before" count "iterations"))
        (debug "  elapsed" elapsed " count" count)
        (if (and (> delta-free 2) (> elapsed warmup-period))
          [elapsed count
           (state-delta new-cl-state cl-state)
           (state-delta new-comp-state comp-state)]
          (recur (+ elapsed (long (first (execute-expr c f))))
                 (+ count c)
                 (if (and (= old-cl-state new-cl-state)
                          (= old-comp-state new-comp-state))
                   (unchecked-inc delta-free)
                   (long 0))
                 new-cl-state
                 new-comp-state))))))

Estimate the number of executions required in order to have at least the specified execution period, check for the jvm to have constant class loader and compilation state.

Execution parameters

(defn estimate-execution-count
  [period f gc-before-sample estimated-fn-time]
  (progress "Estimating execution count ...")
  (debug " estimated-fn-time" estimated-fn-time)
  (loop [n (max 1 (long (/ period estimated-fn-time 5)))
         cl-state (jvm-class-loader-state)
         comp-state (jvm-compilation-state)]
    (let [t (ffirst (collect-samples 1 n f gc-before-sample))
          ;; It is possible for small n and a fast expression to get
          ;; t=0 nsec back from collect-samples.  This is likely due
          ;; to how (System/nanoTime) quantizes the time on some
          ;; systems.
          t (max 1 t)
          new-cl-state (jvm-class-loader-state)
          new-comp-state (jvm-compilation-state)]
      (debug " ..." n)
      (when (not= comp-state new-comp-state)
        (warn "new compilations in execution estimation phase"))
      (if (and (>= t period)
               (= cl-state new-cl-state)
               (= comp-state new-comp-state))
        n
        (recur (if (>= t period)
                 n
                 (min (* 2 n) (inc (long (* n (/ period t))))))
               new-cl-state new-comp-state)))))

Benchmark an expression. This tries its best to eliminate sources of error. This also means that it runs for a while. It will typically take 70s for a quick test expression (less than 1s run time) or 10s plus 60 run times for longer running expressions.

benchmark

(defn run-benchmark
  [sample-count warmup-jit-period target-execution-time f gc-before-sample
   overhead]
  (force-gc)
  (let [first-execution (time-body (f))
        [warmup-t warmup-n cl-state comp-state] (warmup-for-jit
                                                 warmup-jit-period f)
        n-exec (estimate-execution-count
                target-execution-time f gc-before-sample
                (long (/ warmup-t warmup-n)))
        total-overhead (long (* (or overhead 0) 1e9 n-exec))
        _   (progress "Sampling ...")
        _   (debug
             "Running with\n sample-count" sample-count \newline
             "exec-count" n-exec \newline
             "overhead[s]" overhead \newline
             "total-overhead[ns]" total-overhead)
        _   (force-gc)
        samples (collect-samples sample-count n-exec f gc-before-sample)
        final-gc-time (final-gc)
        sample-times (->> samples
                          (map first)
                          (map #(- % total-overhead)))
        total (reduce + 0 sample-times)
        final-gc-result (final-gc-warn total final-gc-time)]
    {:execution-count n-exec
     :sample-count sample-count
     :samples sample-times
     :results (map second samples)
     :total-time (/ total 1e9)
     :warmup-time warmup-t
     :warmup-executions warmup-n
     :final-gc-time final-gc-time
     :overhead overhead}))

Benchmark multiple expressions in a 'round robin' fashion. Very similar to run-benchmark, except it takes multiple expressions in a sequence instead of only one (each element of the sequence should be a map with keys :f and :expr-string). It runs the following steps in sequence:

  1. Execute each expr once

  2. Run expression 1 for at least warmup-jit-period nanoseconds so the JIT has an opportunity to optimize it. Then do the same for each of the other expressions.

  3. Run expression 1 many times to estimate how many times it must be executed to take a total of target-execution-time nanoseconds. The result is a number of iterations n-exec1 for expression 1. Do the same for each of the other expressions, each with the same target-execution-time, each resulting in its own independent number of executions.

  4. Run expression 1 n-exec1 times, measuring the total elapsed time. Do the same for the rest of the expressions.

  5. Repeat step 4 a total of sample-count times.

(defn run-benchmarks-round-robin
  [sample-count warmup-jit-period target-execution-time exprs gc-before-sample]
  (force-gc)
  (let [first-executions (map (fn [{:keys [f]}] (time-body (f))) exprs)]
    (progress (format "Warming up %d expression for %.2e sec each:"
                      (count exprs) (/ warmup-jit-period 1.0e9)))
    (doseq [{:keys [f expr-string]} exprs]
      (progress (format "    %s..." expr-string))
      (warmup-for-jit warmup-jit-period f))
    (progress
     (format
      "Estimating execution counts for %d expressions.  Target execution time = %.2e sec:"
                      (count exprs) (/ target-execution-time 1.0e9)))
    (let [exprs (map-indexed
                 (fn [idx {:keys [f expr-string] :as expr}]
                   (progress (format "    %s..." expr-string))
                   (assoc expr :index idx
                          :n-exec (estimate-execution-count
                                   target-execution-time f
                                   gc-before-sample
                                   nil)))
                 exprs)
;;          _   (progress
;;               "Running with sample-count" sample-count
;;               "exec-count" n-exec  ; tbd: update)
          all-samples (doall
                       (for [i (range sample-count)]
                         (do
                           (progress
                            (format
                             "    Running sample %d/%d for %d expressions:"
                             (inc i) sample-count (count exprs)))
                           (doall
                            (for [{:keys [f n-exec expr-string] :as expr} exprs]
                              (do
                                (progress (format "        %s..." expr-string))
                                (assoc expr
                                  :sample (first
                                           (collect-samples
                                            1 n-exec f gc-before-sample)))))))))
          ;; 'transpose' all-samples so that all samples for a
          ;; particular expression are in a sequence together, and
          ;; all-samples is a sequence of one map per expression.
          all-samples (group-by :index (apply concat all-samples))
          all-samples
          (map (fn [[idx data-seq]]
                 (let [expr (dissoc (first data-seq) :sample)
                       n-exec (:n-exec expr)
                       samples (map :sample data-seq)
                       final-gc-time (final-gc)
                       sample-times (map first samples)
                       total (reduce + 0 sample-times)
                       ;; TBD: Doesn't make much sense to attach final
                       ;; GC warning to the expression that happened
                       ;; to be first in the sequence, but that is
                       ;; what this probably does right now.  Think
                       ;; what might be better to do.
                       final-gc-result (final-gc-warn total final-gc-time)]
                   {:execution-count n-exec
                    :sample-count sample-count
                    :samples sample-times
                    :results (map second samples)
                    :total-time (/ total 1e9)}))
               all-samples)]
      all-samples)))

Bootstrap a statistic. Statistic can produce multiple statistics as a vector so you can use juxt to pass multiple statistics. http://en.wikipedia.org/wiki/Bootstrapping_(statistics)

(defn bootstrap-bca
  [data statistic size alpha rng-factory]
  (progress "Bootstrapping ...")
  (let [bca (bca-nonparametric data statistic size alpha rng-factory)]
    (if (vector? bca)
      (bca-to-estimate alpha bca)
      (map (partial bca-to-estimate alpha) bca))))

Bootstrap a statistic. Statistic can produce multiple statistics as a vector so you can use juxt to pass multiple statistics. http://en.wikipedia.org/wiki/Bootstrapping_(statistics)

(defn bootstrap
  [data statistic size rng-factory]
  (progress "Bootstrapping ...")
  (let [samples (bootstrap-sample data statistic size rng-factory)
        transpose (fn [data] (apply map vector data))]
    (if (vector? (first samples))
      (map bootstrap-estimate samples)
      (bootstrap-estimate samples))))

Outliers

Return a keyword describing the effect of outliers on the estimate of mean runtime.

(defn outlier-effect
  [var-out-min]
  (cond
    (< var-out-min 0.01) :unaffected
    (< var-out-min 0.1) :slight
    (< var-out-min 0.5) :moderate
    :else :severe))
(defn point-estimate [estimate]
  (first estimate))
(defn point-estimate-ci [estimate]
  (last estimate))

Find the significance of outliers given boostrapped mean and variance estimates. See http://www.ellipticgroup.com/misc/article_supplement.pdf, p17.

(defn outlier-significance
  [mean-estimate variance-estimate n]
  (progress "Checking outlier significance")
  (let [mean-block (point-estimate mean-estimate)
        variance-block (point-estimate variance-estimate)
        std-dev-block (Math/sqrt variance-block)
        mean-action (/ mean-block n)
        mean-g-min (/ mean-action 2)
        sigma-g (min (/ mean-g-min 4) (/ std-dev-block (Math/sqrt n)))
        variance-g (* sigma-g sigma-g)
        c-max (fn [t-min]
                (let [j0 (- mean-action t-min)
                      k0 (- (* n n j0 j0))
                      k1 (+ variance-block (- (* n variance-g)) (* n j0 j0))
                      det (- (* k1 k1) (* 4 variance-g k0))]
                  (Math/floor (/ (* -2 k0) (+ k1 (Math/sqrt det))))))
        var-out (fn [c]
                  (let [nmc (- n c)]
                    (* (/ nmc n) (- variance-block (* nmc variance-g)))))
        min-f (fn [f q r]
                (min (f q) (f r)))
        ]
    (/ (min-f var-out 1 (min-f c-max 0 mean-g-min)) variance-block)))
(defrecord OutlierCount [low-severe low-mild high-mild high-severe])
(defn outlier-count
  [low-severe low-mild high-mild high-severe]
  (OutlierCount. low-severe low-mild high-mild high-severe))
(defn add-outlier [low-severe low-mild high-mild high-severe counts x]
  (outlier-count
   (if (<= x low-severe)
     (inc (:low-severe counts))
     (:low-severe counts))
   (if (< low-severe x low-mild)
     (inc (:low-mild counts))
     (:low-mild counts))
   (if (> high-severe x high-mild)
     (inc (:high-mild counts))
     (:high-mild counts))
   (if (>= x high-severe)
     (inc (:high-severe counts))
     (:high-severe counts))))

Find the outliers in the data using a boxplot technique.

(defn outliers
  [data]
  (progress "Finding outliers ...")
  (reduce (apply partial add-outlier
                 (apply boxplot-outlier-thresholds
                        ((juxt first last) (quartiles (sort data)))))
          (outlier-count 0 0 0 0)
          data))

overhead estimation

(declare benchmark*)

Calculate a conservative estimate of the timing loop overhead.

(defn estimate-overhead
  []
  (-> (benchmark*
       (fn [] 0)
       {:warmup-jit-period (* 10 s-to-ns)
        :samples 10
        :target-execution-time (* 0.5 s-to-ns)
        :overhead 0
        :supress-jvm-option-warnings true})
      :lower-q
      first))
(def estimatated-overhead-cache nil)

Sets the estimated overhead.

(defn estimatated-overhead!
  []
  (progress "Estimating sampling overhead")
  (alter-var-root
   #'estimatated-overhead-cache (constantly (estimate-overhead))))
(defn estimatated-overhead
  []
  (or estimatated-overhead-cache
      (estimatated-overhead!)))

Extract reporting options from the given options vector. Returns a two element vector containing the reporting options followed by the non-reporting options

options

(defn extract-report-options
  [opts]
  (let [known-options #{:os :runtime :verbose}
        option-set (set opts)]
    [(intersection known-options option-set)
     (remove #(contains? known-options %1) opts)]))
(defn add-default-options [options defaults]
  (let [time-periods #{:warmup-jit-period :target-execution-time}]
    (merge defaults
           (into {} (map #(if (contains? time-periods (first %1))
                            [(first %1) (* (second %1) s-to-ns)]
                            %1)
                         options)))))

Macro to enable progress reporting during the benchmark.

User top level functions

(defmacro with-progress-reporting
  [expr]
  `(binding [*report-progress* true]
     ~expr))
(defn benchmark-stats [times opts]
  (let [outliers (outliers (:samples times))
        tail-quantile (:tail-quantile opts)
        stats (bootstrap-bca
               (map double (:samples times))
               (juxt
                mean
                variance
                (partial quantile tail-quantile)
                (partial quantile (- 1.0 tail-quantile)))
               (:bootstrap-size opts) [0.5 tail-quantile (- 1.0 tail-quantile)]
               criterium.well/well-rng-1024a)
        analysis (outlier-significance (first stats) (second stats)
                                       (:sample-count times))
        sqr (fn [x] (* x x))
        m (mean (map double (:samples times)))
        s (Math/sqrt (variance (map double (:samples times))))]
    (merge times
           {:outliers outliers
            :mean (scale-bootstrap-estimate
                   (first stats) (/ 1e-9 (:execution-count times)))
            :sample-mean (scale-bootstrap-estimate
                          [m [(- m (* 3 s)) (+ m (* 3 s))]]
                          (/ 1e-9 (:execution-count times)))
            :variance (scale-bootstrap-estimate
                       (second stats) (sqr (/ 1e-9 (:execution-count times))))
            :sample-variance (scale-bootstrap-estimate
                              [ (sqr s) [0 0]]
                              (sqr (/ 1e-9 (:execution-count times))))
            :lower-q (scale-bootstrap-estimate
                       (nth stats 2) (/ 1e-9 (:execution-count times)))
            :upper-q (scale-bootstrap-estimate
                       (nth stats 3) (/ 1e-9 (:execution-count times)))
            :outlier-variance analysis
            :tail-quantile (:tail-quantile opts)
            :os-details (os-details)
            :options opts
            :runtime-details (->
                              (runtime-details)
                              (update-in [:input-arguments] vec))})))

Warn if the JIT options are suspicious looking.

(defn warn-on-suspicious-jvm-options
  []
  (let [compiler (jvm-jit-name)
        {:keys [input-arguments]} (runtime-details)]
    (when-let [arg (and (re-find #"Tiered" compiler)
                        (some #(re-find #"TieredStopAtLevel=(.*)" %)
                              input-arguments))]
      (println
       "WARNING: JVM argument" (first arg) "is active,"
       "and may lead to unexpected results as JIT C2 compiler may not be active."
       "See http://www.slideshare.net/CharlesNutter/javaone-2012-jvm-jit-for-dummies."))))

Benchmark a function. This tries its best to eliminate sources of error. This also means that it runs for a while. It will typically take 70s for a fast test expression (less than 1s run time) or 10s plus 60 run times for longer running expressions.

(defn benchmark*
  [f {:keys [samples warmup-jit-period target-execution-time gc-before-sample
             overhead supress-jvm-option-warnings] :as options}]
  (when-not supress-jvm-option-warnings
    (warn-on-suspicious-jvm-options))
  (let [{:keys [samples warmup-jit-period target-execution-time
                gc-before-sample overhead] :as opts}
        (merge *default-benchmark-opts*
               {:overhead (or overhead (estimatated-overhead))}
               options)
        times (run-benchmark
               samples warmup-jit-period target-execution-time f opts overhead)]
    (benchmark-stats times opts)))
(defn benchmark-round-robin*
  [exprs options]
  (let [opts (merge *default-benchmark-opts* options)
        times (run-benchmarks-round-robin
               (:samples opts)
               (:warmup-jit-period opts)
               (:target-execution-time opts)
               exprs
               (:gc-before-sample opts))]
    (map #(benchmark-stats % opts) times)))

Benchmark an expression. This tries its best to eliminate sources of error. This also means that it runs for a while. It will typically take 70s for a fast test expression (less than 1s run time) or 10s plus 60 run times for longer running expressions.

(defmacro benchmark
  [expr options]
  `(benchmark* (fn [] ~expr) ~options))
(defmacro benchmark-round-robin
  [exprs options]
  (let [wrap-exprs (fn [exprs]
                     (cons 'list
                           (map (fn [expr]
                                  {:f `(fn [] ~expr)
                                   :expr-string (str expr)})
                                exprs)))]
    `(benchmark-round-robin* ~(wrap-exprs exprs) ~options)))

Benchmark an expression. Less rigorous benchmark (higher uncertainty).

(defn quick-benchmark*
  [f {:as options}]
  (benchmark* f (merge *default-quick-bench-opts* options)))

Benchmark an expression. Less rigorous benchmark (higher uncertainty).

(defmacro quick-benchmark
  [expr options]
  `(quick-benchmark* (fn [] ~expr) ~options))

Print format output

(defn report
  [format-string & values]
  (print (apply format format-string values)))

Determine a scale factor and unit for displaying a time.

(defn scale-time
  [measurement]
  (cond
   (> measurement 60) [(/ 60) "min"]
   (< measurement 1e-6) [1e9 "ns"]
   (< measurement 1e-3) [1e6 "µs"]
   (< measurement 1) [1e3 "ms"]
   :else [1 "sec"]))
(defn format-value [value scale unit]
  (format "%f %s" (* scale value) unit))
(defn report-estimate
  [msg estimate significance]
  (let [mean (first estimate)
        [factor unit] (scale-time mean)]
    (apply
     report "%32s : %s  %2.1f%% CI: (%s, %s)\n"
     msg
     (format-value mean factor unit)
     (* significance 100)
     (map #(format-value % factor unit) (last estimate)))))
(defn report-point-estimate
  ([msg estimate]
     (let [mean (first estimate)
           [factor unit] (scale-time mean)]
       (report "%32s : %s\n" msg (format-value mean factor unit))))
  ([msg estimate quantile]
     (let [mean (first estimate)
           [factor unit] (scale-time mean)]
       (report
        "%32s : %s (%4.1f%%)\n"
        msg (format-value mean factor unit) (* quantile 100)))))
(defn report-estimate-sqrt
  [msg estimate significance]
  (let [mean (Math/sqrt (first estimate))
        [factor unit] (scale-time mean)]
    (apply
     report "%32s : %s  %2.1f%% CI: (%s, %s)\n"
     msg
     (format-value mean factor unit)
     (* significance 100)
     (map #(format-value (Math/sqrt %) factor unit) (last estimate)))))
(defn report-point-estimate-sqrt
  [msg estimate]
  (let [mean (Math/sqrt (first estimate))
        [factor unit] (scale-time mean)]
    (report "%32s : %s\n" msg (format-value mean factor unit))))
(defn report-outliers [results]
  (let [outliers (:outliers results)
        values (vals outliers)
        labels {:unaffected "unaffected"
                :slight "slightly inflated"
                :moderate "moderately inflated"
                :severe "severely inflated"}
        sample-count (:sample-count results)
        types ["low-severe" "low-mild" "high-mild" "high-severe"]]
    (when (some pos? values)
      (let [sum (reduce + values)]
        (report
         "\nFound %d outliers in %d samples (%2.4f %%)\n"
         sum sample-count (* 100.0 (/ sum sample-count))))
      (doseq [[v c] (partition 2 (interleave (filter pos? values) types))]
        (report "\t%s\t %d (%2.4f %%)\n" c v (* 100.0 (/ v sample-count))))
      (report " Variance from outliers : %2.4f %%"
              (* (:outlier-variance results) 100.0))
      (report " Variance is %s by outliers\n"
              (-> (:outlier-variance results) outlier-effect labels)))))
(defn report-result [results & opts]
  (let [verbose (some #(= :verbose %) opts)
        show-os (or verbose (some #(= :os %) opts))
        show-runtime (or verbose (some #(= :runtime %) opts))]
    (when show-os
      (apply println
             (->  (map
                   #(%1 (:os-details results))
                   [:arch :name :version :available-processors])
                  vec (conj "cpu(s)"))))
    (when show-runtime
      (let [runtime-details (:runtime-details results)]
        (apply println (map #(%1 runtime-details) [:vm-name :vm-version]))
        (apply println "Runtime arguments:"
               (:input-arguments runtime-details))))
    (println "Evaluation count :" (* (:execution-count results)
                                     (:sample-count results))
             "in" (:sample-count results) "samples of"
             (:execution-count results) "calls.")
    (when verbose
      (report-point-estimate
       "Execution time sample mean" (:sample-mean results)))
    (report-point-estimate "Execution time mean" (:mean results))
    (when verbose
      (report-point-estimate-sqrt
       "Execution time sample std-deviation" (:sample-variance results)))
    (report-point-estimate-sqrt
     "Execution time std-deviation" (:variance results))
    (report-point-estimate
     "Execution time lower quantile"
     (:lower-q results) (:tail-quantile results))
    (report-point-estimate
     "Execution time upper quantile"
     (:upper-q results) (- 1.0 (:tail-quantile results)))
    (when-let [overhead (:overhead results)]
      (when (pos? overhead)
        (report-point-estimate "Overhead used" [overhead])))
    (report-outliers results)))

Convenience macro for benchmarking an expression, expr. Results are reported to out in human readable format. Options for report format are: :os, :runtime, and :verbose.

(defmacro bench
  [expr & opts]
  (let [[report-options options] (extract-report-options opts)]
    `(report-result
      (benchmark
       ~expr
       ~(when (seq options) (apply hash-map options)))
      ~@report-options)))

Convenience macro for benchmarking an expression, expr. Results are reported to out in human readable format. Options for report format are: :os, :runtime, and :verbose.

(defmacro quick-bench
  [expr & opts]
  (let [[report-options options] (extract-report-options opts)]
    `(report-result
      (quick-benchmark
       ~expr
       ~(when (seq options) (apply hash-map options)))
      ~@report-options)))
 

Copyright (c) Hugo Duncan. All rights reserved.

The use and distribution terms for this software are covered by the Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) which can be found in the file epl-v10.html at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.

A collection of statistical methods used by criterium

(ns criterium.stats)

(set! warn-on-reflection true)

Transpose a vector of vectors.

Utilities

(defn transpose
  [data]
  (if (vector? (first data))
    (apply map vector data)
    data))

Square of argument

(defn sqr
  [x] (* x x))

Square of argument

(defn cube
  [x] (* x x x))

Arithmetic mean of data.

Statistics

(defn mean
  [data]
  (/ (reduce + data) (count data)))

Sum of each data point.

(defn sum
  [data] (reduce + data))

Sum of the squares of each data point.

(defn sum-of-squares
  [data]
  (reduce
   (fn [s v]
     (+ s (* v v))) 0.0 data))

Sample variance. Returns variance. Ref: Chan et al. Algorithms for computing the sample variance: analysis and recommendations. American Statistician (1983).

(defn variance
  ([data] (variance data 1))
  ([data df]
   ;; Uses a single pass, non-pairwise algorithm, without shifting.
   (letfn [(update-estimates [[m q k] x]
             [(+ m (/ (- x m) (inc k)))
              (+ q (/ (* k (sqr (- x m))) (inc k)))
              (inc k)])]
     (let [[m q k] (reduce update-estimates [0.0 0.0 0.0] data)]
       (/ q (- k df))))))

Calculate the median of a sorted data set References: http://en.wikipedia.org/wiki/Median

For the moment we take the easy option of sorting samples

(defn median
  [data]
  (let [n (count data)
        i (bit-shift-right n 1)]
    (if (even? n)
      [(/ (+ (nth data (dec i)) (nth data i)) 2)
       (take i data)
       (drop i data)]
      [(nth data (bit-shift-right n 1))
       (take i data)
       (drop (inc i) data)])))

Calculate the quartiles of a sorted data set References: http://en.wikipedia.org/wiki/Quartile

(defn quartiles
  [data]
  (let [[m lower upper] (median data)]
    [(first (median lower)) m (first (median upper))]))

Calculate the quantile of a sorted data set References: http://en.wikipedia.org/wiki/Quantile

(defn quantile
  [quantile data]
  (let [n (dec (count data))
        interp (fn [x]
                 (let [f (Math/floor x)
                       i (long f)
                       p (- x f)]
                   (+ (* p (nth data (inc i))) (* (- 1.0 p) (nth data i)))))]
    (interp (* quantile n))))

Outlier thresholds for given quartiles.

(defn boxplot-outlier-thresholds
  [q1 q3]
  (let [iqr (- q3 q1)
        severe (* iqr 3)
        mild (* iqr 1.5)]
    [(- q1 severe)
     (- q1 mild)
     (+ q3 mild)
     (+ q3 severe)]))

Return uniformly distributed deviates on 0..max-val use the specified rng.

(defn uniform-distribution
  [max-val rng]
  (map (fn [x] (* x max-val)) rng))

Provide n samples from a uniform distribution on 0..max-val

(defn sample-uniform
  [n max-val rng]
  (take n (uniform-distribution max-val rng)))

Sample with replacement.

(defn sample
  [x rng]
  (let [n (count x)]
    (map #(nth x %1) (sample-uniform n n rng))))

Bootstrap sampling of a statistic, using resampling with replacement.

(defn bootstrap-sample
  [data statistic size rng-factory]
  (transpose
   (for [_ (range size)] (statistic (sort (sample data (rng-factory)))))))

Find the significance of outliers gicen boostrapped mean and variance estimates. This uses the bootstrapped statistic's variance, but we should use BCa of ABC.

(defn confidence-interval
  [mean variance]
  (let [n-sigma 1.96                    ; use 95% confidence interval
        delta (* n-sigma (Math/sqrt variance))]
    [(- mean delta) (+ mean delta)]))

Mean, variance and confidence interval. This uses the bootstrapped statistic's variance for the confidence interval, but we should use BCa of ABC.

(defn bootstrap-estimate
  [sampled-stat]
  (let [stats ((juxt mean variance ) sampled-stat)]
    (conj stats (apply confidence-interval stats))))
(defn scale-bootstrap-estimate [estimate scale]
  [(* (first estimate) scale)
   (map #(* scale %1) (last estimate))])

Evaluate a polynomial at the given value x, for the coefficients given in descending order (so the last element of coefficients is the constant term).

(defn polynomial-value
  [x coefficients]
  (reduce #(+ (* x %1) %2) (first coefficients) (rest coefficients)))

erf polynomial approximation. Maximum error is 1.5e-7. Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. Milton Abramowitz (Editor), Irene A. Stegun (Editor), 7.1.26

(defn erf
  [x]
  (let [x (double x)
        sign (Math/signum x)
        x (Math/abs x)
        a [1.061405429 -1.453152027 1.421413741 -0.284496736 0.254829592 0.0]
        p 0.3275911
        t (/ (+ 1.0 (* p x)))
        value (- 1.0 (* (polynomial-value t a) (Math/exp (- (* x x)))))]
    (* sign value)))

Probability p(X

(defn normal-cdf
  [x]
  (* 0.5 (+ 1.0 (erf (/ x (Math/sqrt 2.0))))))

Normal quantile function. Given a quantile in (0,1), return the normal value for that quantile.

Wichura, MJ. 'Algorithm AS241' The Percentage Points of the Normal Distribution. Applied Statistics, 37, 477-484

(defn normal-quantile
  [x]
  (let [x (double x)
        a [2509.0809287301226727
           33430.575583588128105
           67265.770927008700853
           45921.953931549871457
           13731.693765509461125
           1971.5909503065514427
           133.14166789178437745
           3.3871328727963666080]
        b [5226.4952788528545610
           28729.085735721942674
           39307.895800092710610
           21213.794301586595867
           5394.1960214247511077
           687.18700749205790830
           42.313330701600911252
           1.0]
        c [0.000774545014278341407640
           0.0227238449892691845833
           0.241780725177450611770
           1.27045825245236838258
           3.64784832476320460504
           5.76949722146069140550
           4.63033784615654529590
           1.42343711074968357734]
        d [1.05075007164441684324e-9
           0.000547593808499534494600
           0.0151986665636164571966
           0.148103976427480074590
           0.689767334985100004550
           1.67638483018380384940
           2.05319162663775882187
           1.0]
        e [
           2.01033439929228813265e-7
           0.0000271155556874348757815
           0.00124266094738807843860
           0.0265321895265761230930
           0.296560571828504891230
           1.78482653991729133580
           5.46378491116411436990
           6.65790464350110377720
           ]
        f [2.04426310338993978564e-15
           1.42151175831644588870e-7
           1.84631831751005468180e-5
           0.000786869131145613259100
           0.0148753612908506148525
           0.136929880922735805310
           0.599832206555887937690
           1.0]]
    (if (<= 0.075 x 0.925)
      (let [v (- x 0.5)
            r (- 180625e-6 (* v v))]
        (* v (/ (polynomial-value r a) (polynomial-value r b))))
      (let [r (if (< x 0.5) x (- 1.0 x))
            r (Math/sqrt (- (Math/log r)))]
        (if (<= r 5.0)
          (let [r (- r (double 16/10))]
            (* (Math/signum (double (- x 0.5)))
               (/ (polynomial-value r c) (polynomial-value r d))))
          (let [r (- r 5.0)]
            (* (Math/signum (double (- x 0.5)))
               (/ (polynomial-value r e) (polynomial-value r f)))))))))
(defn drop-at [n coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (concat (take n s) (next (drop n s))))))

Round towards zero to an integeral value.

(defn trunc
  [x] (if (pos? x)
        (Math/floor x)
        (Math/ceil x)))

Jacknife statistics on data.

(defn jacknife
  [data statistic]
  (transpose
   (map #(statistic (drop-at %1 data)) (range (count data)))))

Calculate bootstrap values for given estimate and samples

(defn bca-nonparametric-eval
  [n size data z-alpha estimate samples jack-samples]
  (let [z0 (normal-quantile
            (/ (count (filter (partial > estimate) samples)) size))
        jack-mean (mean jack-samples)
        jack-deviation (map #(- jack-mean %1) jack-samples)
        acc (/ (reduce + 0.0 (map cube jack-deviation))
               (* 6.0 (Math/pow (reduce + 0.0 (map sqr jack-deviation)) 1.5)))
        tt (map
            #(normal-cdf (+ z0 (/ (+ z0 %1) (- 1.0 (* acc (+ z0 %1))))))
            z-alpha)
        ooo (map #(trunc (* %1 size)) tt)
        sorted-samples (sort samples)
        confpoints (map (partial nth sorted-samples) ooo)]
    [confpoints z0 acc jack-mean jack-samples]))

Non-parametric BCa estimate of a statistic on data. Size bootstrap samples are used. Confidence values are returned at the alpha normal quantiles. rng-factory is a method that returns a random number generator to use for the sampling.

An introduction to the bootstrap. Efron, B., & Tibshirani, R. J. (1993).

See http://lib.stat.cmu.edu/S/bootstrap.funs for Efron's original implementation.

(defn bca-nonparametric
  [data statistic size alpha rng-factory]
  (let [n (count data)
        data (sort data)
        estimate (statistic data)
        samples (bootstrap-sample data statistic size rng-factory)
        jack-samples (jacknife data statistic)
        alpha (if (vector? alpha) alpha [alpha])
        z-alpha (map normal-quantile alpha)]
    (if (vector? estimate)
      (map
       (partial bca-nonparametric-eval n size data z-alpha)
       estimate samples jack-samples)
      (bca-nonparametric-eval
       n size data z-alpha estimate samples jack-samples))))
(defn bca-to-estimate [alpha bca-estimate]
  [(first (first bca-estimate)) (next (first bca-estimate))])

Nonparametric assessment of multimodality for univariate data. Salgado-Ugarte IH, Shimizu M. 1998

Maximum likelihood kernel density estimation: On the potential of convolution sieves. Jones and Henderson. Computational Statistics and Data Analysis (2009)

Kernel function for estimation of multi-modality. h-k is the critical bandwidth, sample-variance is the observed sample variance. Equation 7, Nonparametric assessment of multimodality for univariate data. Salgado-Ugarte IH, Shimizu M

(defn modal-estimation-constant
  [h-k sample-variance]
  (Math/sqrt (+ 1 (/ (sqr h-k) sample-variance))))

Smoothed estimation function.

(defn smoothed-sample
  [c-k h-k data deviates]
  (lazy-seq
    (cons
     (* c-k (+ (take 1 data) (* h-k (take 1 deviates))))
     (if-let [n (next data)]
       (smoothed-sample c-k h-k n (next deviates))))))

Weight function for gaussian kernel.

(defn gaussian-weight
  [t]
  (let [k (Math/pow (* 2 Math/PI) -0.5)]
    (* k (Math/exp (/ (* t t) -2)))))

Kernel density estimator for x, given n samples X, weights K and width h.

(defn kernel-density-estimator
  [h K n X x]
  (/ (reduce #(+ %1 (K (/ (- x %2) h))) 0 X) (* n h)))
 

Copyright (c) Hugo Duncan. All rights reserved.

The use and distribution terms for this software are covered by the Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) which can be found in the file epl-v10.html at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.

Improved Long-Period Generators Based on Linear Recurrences Modulo 2 F. Panneton, P. L'Ecuyer and M. Matsumoto http://www.iro.umontreal.ca/~panneton/WELLRNG.html

(ns criterium.well)

A bit shift that doesn't do sign extension.

Macros to help convert unsigned algorithm to our implementation with signed integers. unsign is used to convert the [0.5,-0.5] range back onto [1,0]

(defmacro bit-shift-right-ns
  [a b]
  `(let [n# ~b]
     (if (neg? n#)
       (bit-shift-left ~a (- n#))
       (bit-and
        (bit-shift-right Integer/MAX_VALUE (dec n#))
        (bit-shift-right ~a n#)))))

Convert a result based on a signed integer, and convert it to what it would have been for an unsigned integer.

(defmacro unsign
  [x]
  `(let [v# ~x]
     (if (neg? v#) (+ 1 v#) v#)))
(def int-max (bit-or (bit-shift-left Integer/MAX_VALUE 1) 1))
(defmacro limit-bits [x]
  `(bit-and int-max ~x))
(defmacro mat0-pos [t v]
  `(let [v# ~v] (bit-xor v# (bit-shift-right v# ~t))))
(defmacro mat0-neg [t v]
  `(let [v# ~v]
     (long (bit-xor v# (limit-bits (bit-shift-left v# (- ~t)))))))
(defmacro add-mod-32 [a b]
  `(long (bit-and (+ ~a ~b) 0x01f)))

Well RNG 1024a See: Improved Long-Period Generators Based on Linear Recurrences Modulo 2 F. Panneton, P. L'Ecuyer and M. Matsumoto http://www.iro.umontreal.ca/~panneton/WELLRNG.html

(defn well-rng-1024a
  ([] (well-rng-1024a
       (long-array 32 (take 32 (repeatedly #(rand-int Integer/MAX_VALUE))))
       (rand-int 32)))
  ([^longs state ^long index]
     {:pre (>= 0 index 32)}
     (let [m1 3
           m2 24
           m3 10
           fact 2.32830643653869628906e-10
           new-index (add-mod-32 index 31)
           z0 (aget state new-index)
           z1 (bit-xor (aget state index)
                 (mat0-pos 8 (aget state (add-mod-32 index m1))))
           z2 (bit-xor (mat0-neg -19 (aget state (add-mod-32 index m2)))
                       (mat0-neg -14 (aget state (add-mod-32 index m3))))]
       (aset state index (bit-xor z1 z2))
       (aset state new-index
             (bit-xor (bit-xor (mat0-neg -11 z0) (mat0-neg -7 z1))
                      (mat0-neg -13 z2)))
       (let  []
         (lazy-seq
          (cons (unsign (* (aget state new-index) fact))
                (well-rng-1024a state new-index)))))))
 

Copyright (c) Hugo Duncan. All rights reserved.

The use and distribution terms for this software are covered by the Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) which can be found in the file epl-v10.html at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.

Implementation of ZIGNOR An improved Ziggurat method to generate normal random samples, Doornik, 2005

(ns criterium.ziggurat
  (:require criterium.well))
(def ^:dynamic *zignor-c* 128 ) ; "Number of blocks."
(def ^:dynamic *zignor-r* 3.442619855899e0)
(def ^:dynamic *zignor-v* 9.91256303526217e-3)
(defn- sqr [x] (* x x))

Initialise tables.

(defn zignor-init
  [c r v]
  (let [c (int c)
        r (double r)
        v (double v)
        #^doubles s-adzigx (double-array (inc c))
        #^doubles s-adzigr (double-array c)
        f (Math/exp (* -0.5e0 r r))]
    (aset s-adzigx 0 (/ v f)) ;; [0] is bottom block: V / f(R)
    (aset s-adzigx 1 r)
    (aset s-adzigx c (double 0.0))
    (loop [i (int 2)
           f f]
      (aset s-adzigx i
            (Math/sqrt (* -2e0 (Math/log (+ (/ v (aget s-adzigx (dec i))) f)))))
      (when (< i c)
        (recur
         (inc i)
         (Math/exp (* -0.5e0 (aget s-adzigx i) (aget s-adzigx i))))))
    (for [#^Integer i (range c)]
      (let [j (int i)]
        (aset s-adzigr j (/ (aget s-adzigx (inc j)) (aget s-adzigx j)))))
    [s-adzigr s-adzigx r (dec c)]))

Pseudo-random normal variates. An implementation of ZIGNOR See: An improved Ziggurat method to generate normal random samples, Doornik, 2005

(defn random-normal-zig
  ([]
     (random-normal-zig (criterium.well/well-rng-1024a)
                        (zignor-init *zignor-c* *zignor-r* *zignor-v*)))
  ([rng-seq]
     (random-normal-zig rng-seq (zignor-init *zignor-c* *zignor-r* *zignor-v*)))
  ([rng-seq c r v] (random-normal-zig rng-seq (zignor-init c r v)))
  ([c r v]
     (random-normal-zig (criterium.well/well-rng-1024a) (zignor-init c r v)))
  ([rng-seq [#^doubles s-adzigr #^doubles s-adzigx zignor-r mask]]
     (letfn [(random-normal-tail
               [min negative rng-seq]
               (loop [rng-seq rng-seq]
                 (let [x (/ (Math/log (first rng-seq)) min)
                       y (Math/log (first (next rng-seq)))]
                   (if (>= (* -2e0 y) (* x x))
                     (if negative
                       [(- x min) (drop 2 rng-seq)]
                       [(- min x) (drop 2 rng-seq)])
                     (recur (drop 2 rng-seq))))))]
       (let [[deviate rng-seq]
             (loop [rng-seq rng-seq]
               (let [r  (first rng-seq)
                     u  (double (- (* 2e0 r) 1e0))
                     i  (bit-and
                         (int (* Integer/MAX_VALUE (first (drop 1 rng-seq))))
                         mask)]
                 ;; first try the rectangular boxes
                 (if (< (Math/abs u) (nth s-adzigr i))
                   [(* u (nth s-adzigx i)) (drop 2 rng-seq)]
                   ;; bottom box: sample from the tail
                   (if (zero? i)
                     (random-normal-tail zignor-r (neg? u) (drop 2 rng-seq))
                     ;; is this a sample from the wedges?
                     (let [x (* u (nth s-adzigx i))
                           f0 (Math/exp
                               (* -0.5e0
                                  (- (Math/pow (nth s-adzigx i) 2) (sqr x))))
                           f1 (Math/exp
                               (* -0.5e0
                                  (- (Math/pow (nth s-adzigx (inc i)) 2)
                                     (sqr x))))]
                       (if  (< (+ f1 (* (first (drop 2 rng-seq) ) (- f0 f1)))
                               1.0)
                         [x (drop 3 rng-seq)]
                         (recur (drop 3 rng-seq) )))))))]
         (lazy-seq (cons deviate (random-normal-zig rng-seq)))))))