Data science trick #1: Don’t roll your own basic statistics functions — use these libraries instead!

by Joey deVilla on May 18, 2018

With data science and machine learning being hot topics these days (and possibly a path to the hottest job at the moment), you may be experimenting with statistics, and in doing so, you may be rolling your own statistics methods. Don’t!

You wouldn’t chop down trees for lumber for a home renovation project; you’d go to Home Depot or a lumber store and get standard cuts of wood. In the same vein, you should make use of ready-made statistics libraries, which are proven, road-tested, and let you focus on what your application actually does.

These are the ones I use:

JavaScript: jStat

If you’re doing stats in JavaScript, you want jStat. It provides not just the basic statistical functions, but all manner of distributions, including Weibull (β), Cauchy, Poisson, hypergeometric, and beta distributions. These come with probability density functions (pdf), cumulative distribution functions (cdf), inverse, mean, mode, variance, and a sample function, allowing for more complex calculations.

jStat is contained in a single file: jstat.js; there’s also the minified version, jstat.min.js.

You can also get the most up-to-date version from jsDelivr’s content delivery network at http://cdn.jsdelivr.net/npm/jstat@latest/dist/jstat.min.js

To install it via npm, run npm install jstat on the command line…

…and if you’re loading it while in Node, reference the jStat child object of the module that require returns. Here’s a session in Node:

Python: Python’s statistics library

For more in-depth statistics functions, you’ll want to go with SciPy. For the basics, namely averages and measures of central location (mean, mode, median, and so on) and measures of spread (variance and standard deviation), you might just want to use Python’s statistics library, which was introduced with Python 3.4.

To use it, import it first, and then you’re good to go! Here’s a session in the Python REPL:
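As a minimal sketch of what such a session might look like (written as a script rather than a REPL transcript, and using made-up sample data that is purely an assumption for illustration), the basic functions of the statistics module can be exercised like this:

```python
import statistics

# Hypothetical sample data, chosen only for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Averages and measures of central location
mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # average of the two middle values here, since len(data) is even
mode = statistics.mode(data)      # most common value

# Measures of spread
pvariance = statistics.pvariance(data)  # population variance (divides by n)
pstdev = statistics.pstdev(data)        # population standard deviation
variance = statistics.variance(data)    # sample variance (divides by n - 1)
stdev = statistics.stdev(data)          # sample standard deviation

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("population variance:", pvariance)
print("population standard deviation:", pstdev)
print("sample variance:", variance)
print("sample standard deviation:", stdev)
```

Note the two flavors of variance and standard deviation: the p-prefixed functions treat the data as the whole population, while the unprefixed ones treat it as a sample.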

Swift: SigmaSwiftStatistics

iOS, macOS, watchOS, tvOS, and server-side Swift developers can add statistical goodness to their projects with SigmaSwiftStatistics.

You can add it to your project in a number of ways:

  1. Including the SigmaDistrib.swift file in your project.
  2. Using Carthage.
  3. Using CocoaPods.
  4. Using Swift Package Manager.

Here it is in action, in a Swift playground:

Kotlin: Kotlin Statistics

If Kotlin’s your jam and you want to do stats, you want Kotlin Statistics.

I use Kotlin primarily in Android Studio, so I use Gradle to include it:

Here it is, inside an Android app written in Kotlin:
