The Accord.NET Framework is not only an image processing and computer vision framework, but also a machine learning framework for .NET. One of its features is that it implements the same algorithms found in other libraries, such as LIBLINEAR, and offers them in .NET within a common interface, ready to be incorporated into your application.
- Download the machine learning framework.
- Browse the source code online.
What is LIBLINEAR?
As its authors put it, LIBLINEAR is a library for large linear classification. It is intended to tackle classification and regression problems with millions of instances and features, although it can only produce linear models, such as linear support vector machines.
The framework now offers all liblinear algorithms in C# except for one. Numbered by liblinear's command-line solver option, they are:
- 0 — L2-regularized logistic regression (primal)
- 1 — L2-regularized L2-loss support vector classification (dual)
- 2 — L2-regularized L2-loss support vector classification (primal)
- 3 — L2-regularized L1-loss support vector classification (dual)
- 4 — Support vector classification by Crammer and Singer (not available in the framework)
- 5 — L1-regularized L2-loss support vector classification
- 6 — L1-regularized logistic regression
- 7 — L2-regularized logistic regression (dual)
- 11 — L2-regularized L2-loss support vector regression (primal)
- 12 — L2-regularized L2-loss support vector regression (dual)
- 13 — L2-regularized L1-loss support vector regression (dual)
As can be seen, command-line option 4 is the one left out. It refers to Crammer and Singer’s formulation for multi-class classification. However, the framework already provides other ways to obtain both multi-class and multi-label classifiers, through Voting and Decision Directed Acyclic Graph (DDAG) mechanisms.
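For reference, each solver name in the list pairs a regularizer with a loss. As a sketch in the notation of the LIBLINEAR paper (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, cost parameter $C > 0$, bias term omitted for brevity), the L2-regularized classifiers solve

$$
\min_{w} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi(w; x_i, y_i),
\qquad
\xi(w; x_i, y_i) =
\begin{cases}
\max(0,\, 1 - y_i w^T x_i) & \text{L1-loss SVC} \\
\max(0,\, 1 - y_i w^T x_i)^2 & \text{L2-loss SVC} \\
\log\left(1 + e^{-y_i w^T x_i}\right) & \text{logistic regression}
\end{cases}
$$

The L1-regularized variants replace $\frac{1}{2} w^T w$ with $\lVert w \rVert_1$, and the regression solvers use the $\epsilon$-insensitive loss $\max(0,\, |y_i - w^T x_i| - \epsilon)$ or its square. "Primal" and "dual" indicate which form of the same optimization problem the solver works on. The cost $C$ is presumably what the `Complexity` property controls in the examples further down; that mapping is an assumption here, not something stated in this post.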
Additionally, the framework also offers:
- Sequential Minimal Optimization
- Sequential Minimal Optimization for Regression
- Least-Squares Learning (LS-SVMs)
- Probabilistic Output Learning (Platt’s algorithm)
The framework can also load data, and load and save support vector machines, using the LIBSVM format. This means it should be straightforward to create or learn your models using one tool and run them on the other, should that be necessary. For example, given that Accord.NET can run on mobile applications, it is possible to create and learn your models in a computing grid using liblinear and then integrate them into your Windows Phone application by loading them in Accord.NET.
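To make the file format concrete: each line of a LIBSVM-format data file holds a label followed by `index:value` pairs, with 1-based feature indices and zero-valued features left out. The sketch below parses one such line into a dense vector. It is a hand-written illustration only; `LibSvmLineSketch` and `ParseLine` are hypothetical names, and this is not how the framework's own reader is implemented.

```csharp
using System;
using System.Globalization;

public static class LibSvmLineSketch
{
    // Parses one line such as "+1 1:0.5 3:2" into an integer label and a
    // dense feature vector of the given length. Indices in the file are
    // 1-based; features that are not listed keep the value zero.
    public static double[] ParseLine(string line, int dimensions, out int label)
    {
        string[] tokens = line.Split(new[] { ' ', '\t' },
            StringSplitOptions.RemoveEmptyEntries);

        label = int.Parse(tokens[0], CultureInfo.InvariantCulture);

        double[] features = new double[dimensions];
        for (int i = 1; i < tokens.Length; i++)
        {
            string[] pair = tokens[i].Split(':');
            int index = int.Parse(pair[0], CultureInfo.InvariantCulture);
            double value = double.Parse(pair[1], CultureInfo.InvariantCulture);
            features[index - 1] = value; // convert to a 0-based position
        }

        return features;
    }

    public static void Main()
    {
        int label;
        double[] x = ParseLine("+1 1:0.5 3:2", 4, out label);
        Console.WriteLine("label = " + label + ", features = " + x.Length);
    }
}
```

Because both tools agree on this plain-text layout, a data file prepared for one can be read by the other without conversion.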
The advantages are that:
- Learning algorithms implement one common interface, rather than several functions scattered throughout the code;
- Algorithms are available as a concise library, ready to be integrated in your existing or new applications, instead of being part of a black-box command line tool (but it can also be used as a command line tool, see sample applications);
- The algorithms can run on Windows, Windows RT, ASP.NET, Windows Phone, Android, iOS, Linux and MacOS (through Mono/Xamarin);
- They can be combined with other meta-algorithms available in the framework to create multi-class and multi-label support vector machines, as well as be part of cross-validation, bootstrapping and split-set validation techniques.
While studying and porting the algorithms, I also set up a liblinear GitHub repository page to track changes between versions. I hope this repository can also be helpful for other people willing to track modifications made to the liblinear project.
Example
Learning linearly separable problems with a linear machine
In the following example we will create a linear machine to learn a simple linearly separable binary AND problem.
```csharp
// Namespaces assumed from the Accord.NET 2.x releases
using System;
using Accord.Controls;
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Math;

// Create a simple binary AND
// classification problem:

double[][] problem =
{
    //             a    b    a AND b
    new double[] { 0,   0,     0    },
    new double[] { 0,   1,     0    },
    new double[] { 1,   0,     0    },
    new double[] { 1,   1,     1    },
};

// Get the two first columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("AND", inputs, outputs).Hold();
```
```csharp
// However, SVMs expect the output value to be
// either -1 or +1. As such, we have to convert
// it so the vector contains { -1, -1, -1, +1 }:
//
outputs = outputs.Apply(x => x == 0 ? -1 : 1);

// Create a new linear-SVM for two inputs (a and b)
SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);

// Create a L2-regularized L2-loss support vector classification
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
double error = teacher.Run(computeError: true);

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();
```
The linear SVM’s answer to the linearly separable AND problem. As can be seen, a linear SVM can correctly predict the colors of all the points in the original problem. This happens because the SVM learning algorithm is able to find a line that separates the blue points from the red points.
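To see what "finding a separating line" means here, the sketch below evaluates one hand-picked line, a + b - 1.5 = 0, on the four AND points. The weights are chosen by hand for illustration and are not the values the learning algorithm would return; `AndByHandSketch` and `Decide` are hypothetical names.

```csharp
using System;

public static class AndByHandSketch
{
    // Decision function of a linear machine: sign(w . x + b)
    public static int Decide(double[] w, double b, double[] x)
    {
        double sum = b;
        for (int i = 0; i < w.Length; i++)
            sum += w[i] * x[i];
        return Math.Sign(sum);
    }

    public static void Main()
    {
        // Hand-picked (not learned) separating line: a + b - 1.5 = 0
        double[] w = { 1.0, 1.0 };
        double b = -1.5;

        double[][] inputs =
        {
            new double[] { 0, 0 },
            new double[] { 0, 1 },
            new double[] { 1, 0 },
            new double[] { 1, 1 },
        };

        foreach (double[] x in inputs)
            Console.WriteLine(x[0] + " AND " + x[1] + " -> " + Decide(w, b, x));
    }
}
```

Only the point (1, 1) falls on the positive side of this line, which reproduces the labels -1, -1, -1, +1 used in the example.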
Learning non-linearly separable problems with a linear machine through kernel expansions
Now, we will move a bit further. We will use an explicit kernel expansion to learn the non-linearly separable exclusive or (XOR) problem.
```csharp
// Create a simple binary XOR
// classification problem:

double[][] problem =
{
    //             a    b    a XOR b
    new double[] { 0,   0,     0    },
    new double[] { 0,   1,     1    },
    new double[] { 1,   0,     1    },
    new double[] { 1,   1,     0    },
};

// Get the two first columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("XOR", inputs, outputs).Hold();
```
The binary XOR problem. The XOR problem is not linearly separable, because it is not possible to draw a line separating the blue points from the red points. In this setting, we should expect a linear SVM learning algorithm to fail, because it cannot find a line that does not exist.
```csharp
// However, SVMs expect the output value to be
// either -1 or +1. As such, we have to convert
// it so the vector contains { -1, +1, +1, -1 }:
//
outputs = outputs.Apply(x => x == 0 ? -1 : 1);

// Create a new linear-SVM for two inputs (a and b)
SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);

// Create a L2-regularized L2-loss support vector classification
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
double error = teacher.Run(computeError: true);

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();
```
As we can see, the linear SVM failed to predict the correct colors for each of the points. The problem is that the decision boundaries of a linear SVM are constrained to be hyperplanes (in this 2D case, lines). Because there is no line that separates the blue points from the red points in the XOR problem, the linear SVM learning algorithm does its best and finds an approximate solution, but not the XOR solution we were looking for.
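The claim that no such line exists can be checked directly. Suppose a linear decision function $f(a, b) = w_1 a + w_2 b + c$ reproduced the XOR labels $\{-1, +1, +1, -1\}$. The four points would then require

$$
\begin{aligned}
f(0,0) &= c < 0, \\
f(0,1) &= w_2 + c > 0, \\
f(1,0) &= w_1 + c > 0, \\
f(1,1) &= w_1 + w_2 + c < 0.
\end{aligned}
$$

Adding the second and third inequalities gives $w_1 + w_2 + 2c > 0$, hence $w_1 + w_2 + c > -c > 0$, which contradicts the fourth. No choice of $w_1$, $w_2$ and $c$ satisfies all four conditions, so whatever line the learning algorithm returns must misclassify at least one point.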
```csharp
// Use an explicit kernel expansion to transform the
// non-linear classification problem into a linear one
//
// Create a quadratic kernel
Quadratic quadratic = new Quadratic(constant: 1);

// Project the inputs into a higher dimensionality space
double[][] expansion = inputs.Apply(quadratic.Transform);

// Create a new linear-SVM for the transformed input space
svm = new SupportVectorMachine(inputs: expansion[0].Length);

// Create the same learning algorithm in the expanded input space
teacher = new LinearDualCoordinateDescent(svm, expansion, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
error = teacher.Run(computeError: true);

// Compute the machine's answers for the learned inputs
answers = expansion.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();
```
By using an explicit kernel expansion, we can use a linear SVM to learn a non-linearly separable problem. This is possible because the kernel transformation projects the data into a higher-dimensional space where the data is indeed linearly separable. For an intuition on how this is possible, please check the blog post kernel functions for machine learning applications.
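The sketch below spells out one such expansion by hand for two inputs, using the textbook feature map of the quadratic kernel k(x, y) = (x . y + c)^2. It checks that the dot product of the expanded vectors equals the kernel value, and that a hand-picked linear function in the expanded space reproduces XOR. The names `QuadraticExpansionSketch`, `Expand` and `Dot` are hypothetical, and the ordering and scaling of the features produced by the framework's own `Transform` method may differ from this one.

```csharp
using System;

public static class QuadraticExpansionSketch
{
    // Textbook feature map for k(x, y) = (x . y + c)^2 with two inputs:
    // phi(x) = (x1^2, x2^2, sqrt(2) x1 x2, sqrt(2c) x1, sqrt(2c) x2, c)
    public static double[] Expand(double[] x, double c)
    {
        double s = Math.Sqrt(2 * c);
        return new double[]
        {
            x[0] * x[0],
            x[1] * x[1],
            Math.Sqrt(2) * x[0] * x[1],
            s * x[0],
            s * x[1],
            c
        };
    }

    public static double Dot(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i] * b[i];
        return sum;
    }

    public static void Main()
    {
        double c = 1;
        double r = Math.Sqrt(2);

        // Hand-picked (not learned) weights in the expanded space. They
        // encode f(x) = x1 + x2 - 2 x1 x2 - 0.5, which is positive
        // exactly when x1 XOR x2 is true.
        double[] w = { 0, 0, -r, 1 / r, 1 / r, 0 };
        double b = -0.5;

        double[][] inputs =
        {
            new double[] { 0, 0 },
            new double[] { 0, 1 },
            new double[] { 1, 0 },
            new double[] { 1, 1 },
        };

        foreach (double[] x in inputs)
        {
            int answer = Math.Sign(Dot(w, Expand(x, c)) + b);
            Console.WriteLine(x[0] + " XOR " + x[1] + " -> " + answer);
        }
    }
}
```

The cross term x1 x2 is what makes the difference: it is not available to a linear machine in the original two-dimensional space, but it becomes an ordinary input coordinate after the expansion.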
Learning liblinear problems from libsvm format
Now, we move even further, and train a linear machine on one of the toy LIBLINEAR problems, available in LibSVM format, loading it with the framework’s SparseReader class.
```csharp
// Create a new LibSVM sparse format data reader
// to read the Wisconsin's Breast Cancer dataset
//
var reader = new SparseReader("examples-sparse.txt");

int[] outputs; // Read the classification problem into dense memory
double[][] inputs = reader.ReadToEnd(sparse: false, labels: out outputs);

// The dataset has output labels as 4 and 2. We have to convert them
// into negative and positive labels so they can be properly processed.
//
outputs = outputs.Apply(x => x == 2 ? -1 : +1);

// Create a new linear-SVM for the problem dimensions
var svm = new SupportVectorMachine(inputs: reader.Dimensions);

// Create a learning algorithm for the problem's dimensions
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the classification
double error = teacher.Run();

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Create a confusion matrix to show the machine's performance
var m = new ConfusionMatrix(predicted: answers, expected: outputs);

// Show it onscreen
DataGridBox.Show(new ConfusionMatrixView(m));
```
The confusion matrix for the binary classification problem. As can be seen, the higher values concentrate on the diagonal. Those values indicate how many hits (correct guesses) the machine made. The values off the diagonal indicate how many errors (and of what kind) the machine made in this classification problem.
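For readers who want to see what goes into those four cells, here is a minimal sketch that tallies them by hand from -1/+1 label vectors. The data is made up for illustration, `ConfusionCountsSketch` and `Count` are hypothetical names, and this does not reproduce the framework's own confusion matrix class.

```csharp
using System;

public static class ConfusionCountsSketch
{
    // Returns { truePositives, falseNegatives, falsePositives, trueNegatives }
    // for label vectors where positive values mean the positive class.
    // Zero or negative values are counted as the negative class.
    public static int[] Count(int[] expected, int[] predicted)
    {
        int tp = 0, fn = 0, fp = 0, tn = 0;

        for (int i = 0; i < expected.Length; i++)
        {
            if (expected[i] > 0)
            {
                if (predicted[i] > 0) tp++; else fn++;
            }
            else
            {
                if (predicted[i] > 0) fp++; else tn++;
            }
        }

        return new[] { tp, fn, fp, tn };
    }

    public static void Main()
    {
        // Made-up labels, for illustration only
        int[] expected  = { +1, +1, -1, -1, -1 };
        int[] predicted = { +1, -1, -1, -1, +1 };

        int[] cells = Count(expected, predicted);

        // The diagonal cells (hits) are the true positives and true negatives
        double accuracy = (cells[0] + cells[3]) / (double)expected.Length;

        Console.WriteLine("TP, FN, FP, TN = " + string.Join(", ", cells));
        Console.WriteLine("Hits on the diagonal: " + (cells[0] + cells[3]));
    }
}
```

With these made-up labels there is one true positive, one false negative, one false positive and two true negatives, so three of the five answers lie on the diagonal.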
The latest version of this tutorial can also be found on the project’s wiki pages for linear machines.
References
- R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874. Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear