Wrapping up!

As many of you might have noticed by now, the Accord.NET project has been archived on GitHub. The project had grown way beyond what I can maintain at this point, and I felt my hiatus from the project was more detrimental to the community than simply letting it pass into the hands of the next amazing developers, who might find it useful, take up the torch, and reuse its best parts in whatever projects they see fit.

The machine learning landscape has greatly evolved since I started this project ~15 years ago. I am certain that the knowledge stored in the form of code within the framework will be useful to many people, and I cannot wait to see what more advanced things come up in the future.

If you would like to, feel free to grab the framework and tear it apart: take the parts that worked best for you and fork them! As mentioned in the now-permanent notice on the framework page, if I (Cesar De Souza) am the sole developer of any of the classes you would like to reuse, I hereby grant you an irrevocable license to do so. If I am not the only author, and the current license of the file you would like to port does not suit your needs, I can help you contact its original developers so you can reuse it within your open- or closed-source applications.

These past years have been an amazing ride. Let’s keep ML in .NET evolving!

Using virtual data to solve real problems

It seems one of the next frontiers in advancing machine learning is data generation.

It is no secret that deep learning has been advancing object classification, language translation, speech recognition, and virtually every other task where large amounts of labeled data are available or can be easily collected. However, not all tasks can benefit from such immediate availability of labeled data, and for some of them, such data can be almost impossible to acquire. For example, imagine annotating every single pixel in every frame of a video, indicating which object in the scene each pixel belongs to, such as a lamp, a ball, or a road; and imagine repeating this process for thousands or even millions of videos.

It might seem surprising to those not in the field, but one of the most common ways to obtain labeled data for training machine learning systems today is Amazon’s Mechanical Turk, a platform where human workers get paid to manually label training data for computers. Needless to say, human labor is not exactly cheap (nor should it be).

On the other hand, if there is anything computers really excel at, it is replacing humans in the most repetitive and boring tasks. But wait. If this is the case, then… are there ways we could use computers to generate the training data needed to train other computers?

Well, the answer is yes. And while techniques for doing so have been known for a long time, they have been receiving ever-increasing interest over the past few years, and data generation became one of the recurring topics in the talks at the Thirtieth Annual Conference on Neural Information Processing Systems (NIPS2016), one of the most important conferences in artificial intelligence and machine learning, which was taking place in Barcelona within a few days of when this post was first published.

Virtual Worlds and Human Actions for Video Understanding

On Wednesday, December 7th, we presented a demonstration at NIPS2016 showing how synthetic human action videos can be generated using game engines, procedural animation, and limited-time physics simulations. To the best of our knowledge, our work represents the first time that synthetic, physically-plausible human action videos have been procedurally generated from the ground up to successfully train vision-based human action recognition systems.

Reading matrices from MATLAB (.mat) in C#

Since last year, it has been possible to read MATLAB files from .NET applications using the Accord.NET Framework. Let’s say you have a .mat file stored on your desktop. You can load it in C# using:
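A minimal sketch of how this could look, assuming a file named test.mat on the desktop containing a matrix variable named "A" (both names are assumptions for this example; the MatReader class comes from the Accord.IO package, and exact member names may vary slightly between framework versions):

```csharp
using System;
using Accord.IO;

class ReadMatExample
{
    static void Main()
    {
        // Path to the .mat file stored on the desktop (illustrative only).
        string path = Environment.GetFolderPath(
            Environment.SpecialFolder.Desktop) + @"\test.mat";

        // Open the MATLAB file.
        var reader = new MatReader(path);

        // Read a variable called "A" as a 2-D double matrix.
        // The variable name "A" is an assumption for this example.
        double[,] matrix = reader.Read<double[,]>("A");

        Console.WriteLine(matrix[0, 0]);
    }
}
```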

To install the Accord.NET Framework into your C# application, install Accord.Math and Accord.IO through NuGet.

C# equivalent to std::nth_element

Let’s say you would like to determine what is the second largest element in an array, but you don’t want to pay the cost of sorting the entire array just to discover which value would fall into this position.

In those situations, you would use std::nth_element to partially sort the array such that the element that you want will fall in the n-th position that you need.

The std::nth_element function also goes one step further: it guarantees that all elements on the left of the nth position that you asked for will be less than the value in that position. Those values, however, can be in arbitrary order.

Unfortunately, knowing the niftiness of this function doesn’t help much if you can’t call it from the environment you are programming in. The good news is that, if you are in .NET, you can call Accord.Sort.NthElement (overloads) from C#, VB.NET or any other .NET language using Accord.NET.

An example can be seen below:
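A sketch of what such an example might look like, assuming the Accord.Math package is installed and using the overload of Accord.Sort.NthElement that takes the array and the desired position (the exact overload set may vary by framework version):

```csharp
using System;

class NthElementExample
{
    static void Main()
    {
        int[] values = { 5, 2, 8, 1, 9, 3 };

        // Partially sorts 'values' so that values[2] holds the element that
        // would be there after a full sort; elements to its left are smaller,
        // though in arbitrary order.
        int element = Accord.Sort.NthElement(values, 2);

        // For this input the third-smallest value is 3.
        Console.WriteLine(element);
    }
}
```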

The careful reader might have noticed that the entire array has been sorted, instead of just the left part as previously advertised. However, this is on purpose: for very small arrays (like the one in this example), it is more advantageous to use a full, but simpler, sorting algorithm than a partial quicksort. When arrays are large enough (the current threshold is 32 elements), a partial quicksort with a median-of-three pivoting strategy kicks in.

The source code for the function can be found here.

Video classification with hybrid models

Deep learning reigns undisputed as the new de-facto method for image classification. However, at least until the beginning of 2016, it was not yet clear whether deep learning methods were also better than more traditional approaches based on handcrafted features for the specific task of video classification. While most research interest is certainly directed towards deep learning, we thought it would be fun to play on the handcrafted-features side for a while, and we actually made some pretty interesting discoveries along the way.

Disclaimer: this post is directed towards beginners in computer vision and may contain very loose descriptions or simplifications of some topics in order to spark interest in the field among inexperienced readers. For a succinct and peer-reviewed discussion of the topic, the reader is strongly advised to refer to https://arxiv.org/abs/1608.07138 instead.

Three (non-exhaustive) approaches for classification

In the following paragraphs, we will briefly explore three different ways of performing video classification: the task of determining to which of a finite set of labels a video belongs.

Handcrafted features: designing image features yourself

Initial video classification systems were based on handcrafted features: techniques for extracting useful information from a video that were discovered or invented thanks to the expertise of the researcher.

For example, a very basic approach would be to consider that whatever could be regarded as a “corner” in an image (see link for an example) would be quite relevant for determining what is inside this image. We could then present this collection of points (which is much smaller than the image itself) to a classifier and hope it would be able to determine the image’s contents from this simplified representation of the data.

In the case of video, the most successful example of those is certainly the Dense Trajectories approach of Wang et al., that captures frame-level descriptors over pixel trajectories determined by optical flow.

Deep learning: letting networks figure out features for you

Well, some might think that using corners or other features to guess what is in an image or video is not straightforward at all. Why not simply use the image or video itself as a whole and present it to the classifier to see what it comes up with? Well, the truth is that this had been tried, but until a few years ago it simply wouldn’t work except for simple problems. For some time, we didn’t know how to make classifiers that could handle extremely large amounts of data and still extract anything useful from them.

This started to change in 2006, when interest in training large neural networks began to rise. In 2012, a deep convolutional network with 8 layers managed to win the ImageNet challenge, far outperforming other approaches based on Fisher Vector encodings of handcrafted features. Since then, many works have shown that deep nets can perform far better than other methods in image classification and other domains.

However, while deep nets have certainly been shown to be the undisputed winners in image classification, the same could not yet be said about video classification, at least not at the beginning of 2016. Moreover, training deep neural networks for video can also be extremely costly, both in terms of the computational power needed and the number and sheer size of the examples needed to train those networks.

Hybrid models: getting the best of both worlds 😉

So, could we design classification architectures that are both powerful and easy to train? Architectures that would not need to rely on huge amounts of labelled data to learn even the most basic, pixel-level aspects of a video, but could instead leverage this knowledge from techniques that are already known to work fairly well?

Fisher with a Deep Net
Fisher and a deep net – although not quite the same variety found in video classification.

Well, indeed, the answer seems to be yes, as long as you pay attention to some details that can actually make a huge difference.

This is shown in the paper “Sympathy for Details: Hybrid Classification Architectures for Action Recognition.” The paper is mostly centered on two things: a) showing that it is possible to make standard methods perform as well as deep nets for video; and b) showing that combining Fisher Vectors of traditional handcrafted video features with a multi-layer architecture can actually perform better than most deep nets for video, achieving state-of-the-art results as of the date of submission.

The paper, co-written with colleagues and advisors from the Xerox Research Centre and the Computer Vision Center of the Autonomous University of Barcelona, has been accepted for publication at the 14th European Conference on Computer Vision (ECCV’16), to be held in Amsterdam, The Netherlands, this October, and can be found here.

Surface Pro 3 is not charging; power connector LED is blinking

Is your Surface Pro 3 charging intermittently, i.e. switching from charging to not charging every 10 seconds? Check the white LED on the power connector attached to the right side of your Surface, and see if it’s blinking.

If it is, check whether you have other devices attached to your Surface charger (such as your Lumia phone), and disconnect them.

If this solves your issue, the problem was that the charger was not managing to charge both devices at the same time (especially if you were charging your cellphone through a USB 3.0 cable).

PowerPoint is huge on a second monitor

If you are running Windows 10 on a high-DPI device such as a Surface Pro 3, connect a second monitor using a Mini-DisplayPort adapter, and then open PowerPoint, it’s very likely that you will find this:

PowerPoint DPI problems Surface Pro 3, Windows 10, Office 365

If you haven’t seen the problem in person, it might be difficult to guess from the picture what is going on. The problem is that PowerPoint’s Ribbon is huge, given that it is now running on a 21″ monitor and not on a tablet anymore.

The problem doesn’t seem to occur with Word or Excel when they are moved from the Surface screen to the external monitor. It seems to be exclusively related to PowerPoint.

Fortunately, there is a solution for this problem. If you have Office 365, open the file

C:\Program Files\Microsoft Office 15\root\office15\powerpnt.exe.manifest

using a text editor. Then, look for the word True/PM in the following block:
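The relevant fragment of the manifest looks roughly like this (reproduced from memory of the Office 2013 manifest; the surrounding elements may differ slightly between installations):

```xml
<asmv3:application>
  <asmv3:windowsSettings xmlns="http://schemas.microsoft.com/SMI/2005/WindowsSettings">
    <dpiAware>True/PM</dpiAware>
  </asmv3:windowsSettings>
</asmv3:application>
```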

And change it to:
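That is, the dpiAware setting is switched off so that Windows scales PowerPoint itself (again, the surrounding elements shown here are approximate):

```xml
<asmv3:application>
  <asmv3:windowsSettings xmlns="http://schemas.microsoft.com/SMI/2005/WindowsSettings">
    <dpiAware>False</dpiAware>
  </asmv3:windowsSettings>
</asmv3:application>
```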

Now save and open PowerPoint. PowerPoint should now scale properly when you move its window from the Surface Pro 3 screen to the external monitor, and vice-versa:

PowerPoint Windows 10 Surface Pro 3 - normal DPI


How to fix blurry Windows Forms Windows in high-dpi settings

If you have just opened one of your older applications on a brand new computer with a very high-DPI monitor, you may find that an interface that previously worked perfectly is now utterly broken and/or blurry.

This might be the case, for example, if you just tried to open an old Windows.Forms application in your brand new Surface Pro computer.

Windows.Forms (WinForms) application in a high-dpi display. If you have your DPI set to 150%, your form might now look like this.


Same WinForms application after applying the fix detailed here.


How to fix it

  1. Go to the Forms designer, then select your Form (by clicking its title bar)
  2. Press F4 to open the Properties window, then locate the AutoScaleMode property
  3. Change it from Font (default) to Dpi.


Now, go to Program.cs (or the file where your Main method is located) and change it to look like
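A sketch of what Program.cs could look like after the change. It declares the process DPI-aware through the Win32 SetProcessDPIAware call before any window is created (Form1 is a placeholder for your own main form class):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

static class Program
{
    // Tell Windows this process handles DPI scaling itself, so the
    // system does not bitmap-stretch (and blur) the UI.
    [DllImport("user32.dll")]
    private static extern bool SetProcessDPIAware();

    [STAThread]
    static void Main()
    {
        // SetProcessDPIAware is available on Windows Vista (6.0) and later.
        if (Environment.OSVersion.Version.Major >= 6)
            SetProcessDPIAware();

        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);
        Application.Run(new Form1()); // Form1 is a placeholder name
    }
}
```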

Save and compile. Now your form should look crisp again.

I encountered this problem while opening and editing Accord.NET sample applications in Visual Studio on a Surface Pro 3.


Hidden Conditional Random Fields

If you are attempting to solve a sequence classification problem, you might be interested in learning about Hidden Conditional Random Fields (HCRFs). They are the discriminative counterpart of classifiers based on sets of hidden Markov models (HMMs).

Hidden Conditional Random Field
Credit: theaucitron, wheat fields.

Hidden what?

Hidden Markov models are simple models that can be created to recognize whether a sequence of observations is similar to sequences the model has seen before. If we create one HMM for each type of sequence that we are trying to distinguish, and then individually ask each model whether it recognizes a given sequence, we have just created a hidden Markov model classifier.

However, there might be a slightly better way of classifying those sequences. The method just described (creating individual models for each sequence type, then asking each model how strongly it recognizes a new sequence) is known as generative learning. But we could also have created, from the ground up, a single model focused only on distinguishing between sequence types, without modeling each of them first. This would be known as discriminative learning.
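The generative recipe above can be sketched in a few lines of C#. The ISequenceModel interface and all names here are hypothetical, standing in for any per-class model (such as an HMM) that can score a sequence:

```csharp
using System;

// Hypothetical interface: any per-class model able to score a sequence.
interface ISequenceModel
{
    double LogLikelihood(int[] sequence);
}

static class GenerativeClassifier
{
    // Ask every class model how strongly it recognizes the sequence,
    // and return the index of the most confident one.
    public static int Classify(ISequenceModel[] modelPerClass, int[] sequence)
    {
        int best = 0;
        double bestScore = double.NegativeInfinity;

        for (int c = 0; c < modelPerClass.Length; c++)
        {
            double score = modelPerClass[c].LogLikelihood(sequence);
            if (score > bestScore)
            {
                bestScore = score;
                best = c;
            }
        }

        return best;
    }
}
```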

And as mentioned in the tagline for this article, HCRFs are the discriminative doppelganger of the HMM classifier. Let’s see then how we can use them.

Creating HCRFs sequence classifiers in C#/.NET

If you would like to explore them in your projects, the Accord.NET Framework provides Hidden Markov Models, Hidden Markov Model Classifiers, Conditional Random Fields and Hidden Conditional Random Fields.
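A rough sketch of how an HCRF sequence classifier could be set up with the framework. The type and member names (MarkovDiscreteFunction, HiddenResilientGradientLearning, Compute) are recalled from the framework's documentation and may differ between versions, so treat this as an outline rather than a definitive recipe:

```csharp
using System;
using Accord.Statistics.Models.Fields;
using Accord.Statistics.Models.Fields.Functions;
using Accord.Statistics.Models.Fields.Learning;

class HcrfExample
{
    static void Main()
    {
        // Toy training data: two classes of discrete sequences.
        int[][] sequences =
        {
            new[] { 0, 1, 2 }, new[] { 0, 0, 1, 2 },  // class 0
            new[] { 2, 1, 0 }, new[] { 2, 2, 1, 0 },  // class 1
        };
        int[] labels = { 0, 0, 1, 1 };

        // Potential function over hidden states, discrete symbols and 2 classes.
        var function = new MarkovDiscreteFunction(
            states: 2, symbols: 3, outputClasses: 2);
        var hcrf = new HiddenConditionalRandomField<int>(function);

        // Discriminative training with resilient gradient descent.
        var teacher = new HiddenResilientGradientLearning<int>(hcrf);
        for (int i = 0; i < 100; i++)
            teacher.Run(sequences, labels);

        // Classify a new sequence.
        int predicted = hcrf.Compute(new[] { 0, 1, 2 });
        Console.WriteLine(predicted);
    }
}
```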


Version invariant deserialization in .NET

If you have serialized an object using a previous version of a library or program, you might encounter the following exception when you try to deserialize it again:

{“Exception has been thrown by the target of an invocation.”}

The inner exception might read:

[System.IO.FileLoadException] = {“Could not load file or assembly ‘Accord.Math, Version=, Culture=neutral, PublicKeyToken=fa1a88e29555ccf7’ or one of its dependencies. The located assembly’s manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)”:”Accord.Math, Version=, Culture=neutral, PublicKeyToken=fa1a88e29555ccf7″}

In this case, it is very likely that those exceptions are occurring because the .NET runtime is looking for an assembly with the specific version indicated above. Even if you have new assemblies with exactly the same name and the same public key token, .NET might still refuse to deserialize the object.

In order to get around this, put the following static class into your application:
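A sketch of what such a class could look like. It provides an extension method named DeserializeAnyVersion that temporarily hooks AppDomain.AssemblyResolve so that, while deserializing, assemblies are resolved by their simple name alone, ignoring the version recorded in the serialized stream (a simplified take on the idea, not necessarily the framework's exact implementation):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization.Formatters.Binary;

public static class BinaryFormatterExtensions
{
    public static object DeserializeAnyVersion(this BinaryFormatter formatter,
        Stream stream)
    {
        ResolveEventHandler resolver = (sender, args) =>
        {
            // Keep only the simple name, e.g. "Accord.Math", dropping
            // Version, Culture and PublicKeyToken from the request.
            string simpleName = new AssemblyName(args.Name).Name;

            return AppDomain.CurrentDomain.GetAssemblies()
                .FirstOrDefault(a => a.GetName().Name == simpleName);
        };

        AppDomain.CurrentDomain.AssemblyResolve += resolver;
        try
        {
            return formatter.Deserialize(stream);
        }
        finally
        {
            // Unhook so the relaxed resolution only applies here.
            AppDomain.CurrentDomain.AssemblyResolve -= resolver;
        }
    }
}
```

With this in place, replacing a formatter.Deserialize(stream) call with formatter.DeserializeAnyVersion(stream) should let the runtime bind to whatever compatible assembly is already loaded.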

Now, go back to where you were using your deserializer and getting that exception, and instead of calling formatter.Deserialize, call formatter.DeserializeAnyVersion:


Deserialization should now work as expected; but please keep in mind that we might be losing some security here. However, this should only be a concern if your application dynamically loads assemblies at run-time.


This extension method will also be included in the Accord.NET Framework.