Turing Finance | December 4, 2016


Machine Learning Software

Machine learning has many near-synonyms and closely related fields including, but not limited to, computational statistics, data mining, artificial intelligence, computational intelligence, and most recently deep learning (which is better seen as a specific instance of machine learning). Put simply, machine learning is the construction of algorithms which enable models to learn the hidden patterns in data. Popular machine learning techniques include decision tree learning, association rule learning, neural networks, support vector machines, clustering, Bayesian networks, meta-heuristic algorithms, and more. Because implementing your own models is typically quite time-consuming, there are a number of packages available which implement these techniques for us. Some of the machine learning software I have used is listed below.

For a much more comprehensive list, check out this GitHub list of machine learning frameworks!

Data Mining / Statistical Analysis Software

scikit-learn provides Python tools which support classification, regression, clustering, dimensionality reduction, model selection, and data analysis.
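To give a feel for the classification and model selection tools, here is a minimal sketch, assuming scikit-learn is installed. The iris dataset, the decision tree model, and the split parameters are my own choices for illustration, not anything the library prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset (150 flowers, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a shallow decision tree and measure accuracy on the held-out data.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping in a different estimator usually only means changing the line that constructs the model, since scikit-learn estimators share the same fit and score interface.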

Weka is a collection of machine learning algorithms for data mining tasks. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization.

Gretl is a cross-platform software package for econometric analysis. It supports a wide variety of estimators, time series methods, and limited dependent variable models.

Python Packages

I am a big proponent of Python because of its readability, scalability (especially when coupled with systems like Apache Spark), and the depth of functionality offered by Python packages. The following packages are essential for the statistical analysis of data, which is a fundamental aspect of machine learning.

SciPy is designed for mathematics, science, and engineering applications. The SciPy package is the core package which brings the others together.
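As one small example of the mathematical routines on offer, the sketch below uses SciPy's numerical integration to evaluate the Gaussian integral, whose exact value is the square root of pi. The integrand and the helper name gaussian are my own choices for the example.

```python
import math

from scipy import integrate


def gaussian(x):
    """Unnormalised Gaussian bump, exp(-x^2)."""
    return math.exp(-x * x)


# Integrate over the whole real line; quad returns the estimate
# together with an estimate of the absolute error.
value, abs_error = integrate.quad(gaussian, -math.inf, math.inf)

print(f"Numerical estimate: {value:.10f}")
print(f"Exact value:        {math.sqrt(math.pi):.10f}")
```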

Pandas provides data structures and data analysis tools including operations for manipulating numerical tables (matrices) and doing (financial) time series analysis.
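Here is a minimal sketch of the kind of financial time series work this makes easy, assuming pandas is installed. The prices are made up for the example and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical daily closing prices, invented for this example.
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0],
    index=pd.date_range("2016-12-01", periods=5, freq="D"),
    name="close",
)

# Simple daily returns: p_t / p_(t-1) - 1. The first value is NaN
# because there is no previous price to compare against.
returns = prices.pct_change()

# Three-day rolling mean of the closing price (NaN until 3 values exist).
rolling_mean = prices.rolling(window=3).mean()

# Collect everything into one table indexed by date.
summary = pd.DataFrame(
    {"close": prices, "return": returns, "rolling_mean_3d": rolling_mean}
)
print(summary)
```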


NumPy is used for scientific computing and it provides an N-dimensional array object, linear algebra routines, Fourier transforms, and random number generators (useful for quants).
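The short sketch below touches each of those four features in turn, assuming a reasonably recent NumPy (the default_rng generator was added in version 1.17). The matrix, the signal, and the seed are arbitrary values picked for the example.

```python
import numpy as np

# N-dimensional arrays: a 2x2 matrix and a vector.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Linear algebra: solve the system A @ x = b.
x = np.linalg.solve(A, b)

# Fourier transform: find the dominant frequency of a sine wave
# that completes 5 cycles over 64 samples.
t = np.arange(64) / 64.0
signal = np.sin(2 * np.pi * 5 * t)
spectrum = np.abs(np.fft.rfft(signal))
dominant_bin = int(np.argmax(spectrum))

# Random numbers: a seeded generator gives reproducible normal draws.
rng = np.random.default_rng(seed=42)
draws = rng.standard_normal(1000)

print("Solution:", x)
print("Dominant frequency bin:", dominant_bin)
print("Sample mean of draws:", draws.mean())
```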

SymPy is a library for symbolic mathematics. It has features related to polynomials, combinatorics, equation solving, discrete math, matrices, and more.
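A few of those features can be sketched in a handful of lines, assuming SymPy is installed. The polynomial and the matrix are arbitrary examples of my own.

```python
import sympy as sp

x = sp.symbols("x")
poly = x**2 - 5 * x + 6

# Polynomials: factor the quadratic into linear terms.
factored = sp.factor(poly)

# Equation solving: find the exact roots of poly = 0.
roots = sp.solve(sp.Eq(poly, 0), x)

# Combinatorics: number of ways to choose 2 items from 5.
n_choose_k = sp.binomial(5, 2)

# Matrices: exact (integer) determinant, no floating point involved.
M = sp.Matrix([[1, 2], [3, 4]])
det = M.det()

print(factored, roots, n_choose_k, det)
```

Because everything is computed symbolically, the results are exact integers and expressions rather than floating-point approximations.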

Artificial Neural Networks Software

Encog is Java-based and supports data pre-processing, support vector machines, neural networks, Bayesian networks, hidden Markov models, and genetic algorithms (including genetic programming).

PyBrain is a modular machine learning library for Python. It contains algorithms for neural networks, for reinforcement learning (and the combination of the two), for unsupervised learning, and for evolution.

Neuroph is a lightweight Java neural network framework for common neural network architectures. It contains open source Java libraries with classes which correspond to basic NN concepts.

Optimization Software

DEAP (Distributed Evolutionary Algorithms in Python) is a framework for rapidly prototyping and testing evolutionary algorithms. It works with parallelization mechanisms such as multiprocessing and the SCOOP framework for efficient parallel execution.

JOptimizer is a Java-based open source library for constrained optimization. It includes gradient descent algorithms, linear programming, quadratic programming, etc. (great for benchmarks).

OAT (the Optimization Algorithm Toolkit) is a Java-based library with implementations of evolutionary algorithms, swarm intelligence (including ant colony optimization), and other computational intelligence (CI) algorithms.