Principal Component Analysis (PCA) is a technique for exploratory data analysis with many successful applications in several research fields. It is often used in image processing, data analysis, data pre-processing, and visualization, and it serves as one of the most basic building blocks of many complex algorithms.

One of the most popular resources for learning about PCA is the excellent tutorial by Lindsay I Smith. In her tutorial, Lindsay gives an example application of PCA, presenting and discussing the steps involved in the analysis.

Souza, C. R. “A Tutorial on Principal Component Analysis with the Accord.NET Framework”. Department of Computing, Federal University of São Carlos, Technical Report, 2012.

The technical report above aims to introduce the reader to Principal Component Analysis while also reproducing all of Lindsay’s example calculations using the Accord.NET Framework. The report comes with complete C# source code listings and a companion Visual Studio solution file containing all of the sample source code, ready to be tinkered with inside a debug application.

Great Job, César!

Just let me know if you need any help.

Hello Cezar, good afternoon. I would like to know whether the reflow you did on your HP DV6000 is still working. Have any other problems occurred with the board in the meantime?

Hi Cezar,

Would it be possible to have a quick email chat re your markov code?

xmoon2000@googlemail.com

Hello Cesar,

I cannot add Accord.Math or Accord.Statistics as references. I was, however, able to add AForge and AForge.math without any problems. What would you recommend to identify the problem?

Regards, and thank you for sharing your work

Armando Scalise

Hello Armando,

Thank you for your interest in the framework! Have you tried following the getting started installation guide? If it still does not work after following the steps shown there, please make sure that you are also using a .NET 4 solution. If you try to add the assemblies to a .NET 3.5 project, you will most likely run into the problem you are describing.

If that is the case, you can download the special binaries for .NET 3.5 from the framework's page.

I hope this helps!

Regards,

Cesar

PCA is a very useful technique in data mining. I have often used it for noise filtering and trend identification. Recently I applied it to Indian stock markets and came up with some insights. Details here:

http://quantcity.blogspot.in/2013/12/pca-on-nifty-stocks.html

This is just a beautiful tutorial. I want to thank you for your work; it means a lot to those of us who are just starting to learn this stuff.

Hello Cesar,

Please help me with the solution of my problem.

I am a beginner in MLBP implementation. I have 200 input vectors, each with 4 real-valued variables (components), and one output corresponding to each input vector. I have selected one hidden layer with a sigmoid function and one output layer with a purelin function. I have obtained the error of the net using feed forward for each input vector. I am following Hagan’s book on LMBP. I am stuck on back-propagating the sensitivities and obtaining the Jacobian matrix. My problem is how to obtain the derivatives δv1/δx1, δv1/δx2, etc. for each element of the Jacobian matrix. For example, the book states that δv1/δx1=δe1,1/δw11,1, etc. I understand the notation, but the problem is that the e and w terms are pure numbers. If e contained an expression that includes an activation function, then its derivative would be possible. Am I right? My second question: I have assumed the same weight value for each component of each input vector; for example, I have taken w1 = 0.20 for all the components of the input vectors in the first layer, with a bias of 1 for each input (and not for the components), and w2 = 0.1 in the output layer with bias = 0. Is there any error in this assumption? The third point is how many elements I should expect in the Jacobian matrix of the input-output combinations. Should it be 800? Please explain. Also, please suggest a book or tutorial that works through a problem of the type I have described.

Hi there,

The Jacobian matrix must contain the derivatives of the error with respect to each network parameter, for each input sample in your training problem. This means that, to figure out the size of your matrix, the first thing to do is to figure out the number of parameters in your network.

The number of parameters can be obtained by counting how many free weights you have in your network (that is, all the neural weights plus their biases).

If in your example you have just one hidden layer with, let’s say, H neurons, then you would have (4+1)*H + (H+1) parameters. Each hidden neuron has one weight for each component of your input vectors (4) plus a bias (+1), and there is one such set for each hidden neuron, thus the (4+1)*H part. Each hidden neuron is connected to a single output neuron, and this output neuron has its own bias term, thus (H+1).

If you have 200 input vectors, the size of your Jacobian matrix will be 200 x ((4+1)*H + (H+1)).
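To make the count above concrete, here is a minimal Python sketch of the arithmetic. The function names and the choice H = 3 are only illustrative assumptions; they are not part of the Accord.NET Framework.

```python
def count_parameters(n_inputs, n_hidden, n_outputs=1):
    """Free parameters of a one-hidden-layer network, biases included."""
    hidden = (n_inputs + 1) * n_hidden    # input weights + bias, per hidden neuron
    output = (n_hidden + 1) * n_outputs   # hidden weights + bias, per output neuron
    return hidden + output


def jacobian_shape(n_samples, n_inputs, n_hidden, n_outputs=1):
    """(rows, columns): one row per sample/output pair, one column per parameter."""
    rows = n_samples * n_outputs
    return rows, count_parameters(n_inputs, n_hidden, n_outputs)


# 200 samples, 4 inputs, and an arbitrarily chosen H = 3 hidden neurons
print(jacobian_shape(200, 4, 3))  # (200, 19)
```

With a single output neuron, the number of rows equals the number of input vectors, so only the number of columns depends on H.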

In order to obtain the derivatives, you will have to know the derivative of your activation function and use the chain rule to obtain a formula for them. Try to compute it manually for a few functions; afterwards you might start to see a pattern that you can code. Note: it is actually very similar to how backpropagation works.

Sorry for the lateness in replying to this question, but I hope it can still help you or others facing the same issue!
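As a rough illustration of that chain-rule pattern, the Python sketch below computes one row of the Jacobian for a network with sigmoid hidden neurons and a single linear output. It assumes the error is defined as e = t - y, so the target t drops out of the derivatives; the function names and the parameter ordering (hidden weights, hidden biases, output weights, output bias) are hypothetical choices made for this example only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, w2, b2):
    """Return the hidden activations and the linear output for one input vector."""
    a = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * aj for w, aj in zip(w2, a)) + b2
    return a, y


def jacobian_row(x, W1, b1, w2, b2):
    """Derivatives of e = t - y with respect to every parameter, for one sample.

    Assumed ordering: W1 (row by row), b1, w2, b2.
    """
    a, _ = forward(x, W1, b1, w2, b2)
    # Chain rule through the sigmoid: de/dz_j = -w2_j * a_j * (1 - a_j)
    s = [-w2[j] * a[j] * (1.0 - a[j]) for j in range(len(a))]
    row = [s[j] * xi for j in range(len(a)) for xi in x]  # de/dW1[j][i]
    row += s                                              # de/db1[j]
    row += [-aj for aj in a]                              # de/dw2[j]
    row.append(-1.0)                                      # de/db2
    return row


# Arbitrary illustrative values: 4 inputs, H = 2 hidden neurons
x = [0.5, -1.0, 2.0, 0.1]
W1 = [[0.2, -0.3, 0.1, 0.4], [0.05, 0.6, -0.2, 0.3]]
b1 = [0.1, -0.1]
w2 = [0.7, -0.5]
b2 = 0.2

print(len(jacobian_row(x, W1, b1, w2, b2)))  # 13
```

Stacking one such row per input vector gives the full Jacobian. The weights are deliberately different from one another here, so that each column of the row is a distinct derivative.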

Best regards,

Cesar

Hello Cezar,

thanks for sharing your code with the community.

I have a question regarding your submission to the CodeProject on Accord.NET KernelPrincipalComponentAnalysis: in the Compute() method you normalize the eigenvectors and have the following line:

double eig = System.Math.Sqrt(System.Math.Abs(evals[j]));

Why would eigenvectors be normalized by something related to eigenvalues?

Cheers,

Yuri Rzhanov

Hi Yuri,

This is done simply to enforce some normalization on the eigenvectors. Eigenvectors are not unique (you can multiply an eigenvector by a nonzero scalar and still have an eigenvector), so you can attempt to reduce this variability by enforcing some kind of rule on their norm. In this case, the eigenvectors are being normalized using the square root of the absolute value of their corresponding eigenvalue. You might see that other packages do the same, so that we can have some kind of convention on how different software reports its results.

Unfortunately, even if we normalize the eigenvectors in this way, there is still one issue that might lead to slightly different results across software packages: the sign of the vectors. There is no consensus on how different packages (such as Matlab, Octave, etc.) should report the sign of the eigenvectors, and the signs often do vary between them. But please note that this should not matter at all, because even if the signs (or scales) change, the packages reporting different answers are all reporting correct answers.
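The point about scale and sign can be checked numerically. The Python sketch below uses a toy 2x2 symmetric matrix chosen for illustration; the helper names are hypothetical, and dividing by the square root of the absolute eigenvalue is shown only as one possible convention, not necessarily the exact rule applied inside the framework.

```python
import math


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_eigenpair(A, v, lam, tol=1e-9):
    """True if A v equals lam v, component by component, within tol."""
    return all(abs(av - lam * x) <= tol for av, x in zip(matvec(A, v), v))


A = [[2.0, 1.0], [1.0, 2.0]]  # toy matrix; [1, 1] is an eigenvector with eigenvalue 3
lam = 3.0
v = [1.0, 1.0]

norm = math.sqrt(sum(x * x for x in v))
unit = [x / norm for x in v]                         # unit-length version
scaled = [x / math.sqrt(abs(lam)) for x in unit]     # rescaled by sqrt(|eigenvalue|)
flipped = [-x for x in unit]                         # opposite sign

for candidate in (v, unit, scaled, flipped):
    print(is_eigenpair(A, candidate, lam))  # True each time
```

All four vectors satisfy the eigenvector equation, which is why two packages can report different numbers and both be correct.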

Hope it helps!

Best regards,

Cesar

Hi Cesar

I’ve stumbled upon your work quite a few times in the last few days while doing some research on classification. You recently commented on a question I had about PCA on stackoverflow. I was wondering if you have any idea how to combine PCA with a classification algorithm such as Naive Bayes (if there are better ways to classify using PCA, please let me know). I’m currently struggling to achieve this using the Accord.NET framework. I also posted a question about this on Stackoverflow (see: http://stackoverflow.com/questions/29696565/how-to-classify-documents-using-naive-bayes-and-principal-component-analysis-c).

I’m not sure whether you can help me with this, but I would definitely appreciate it if you could have a look at my question, or tell me if you know how to classify using the results of PCA.

Hi Cesar

First of all, thank you for helping me solve my previous problem. It has allowed me to make progress with my work on PCA. Since I can’t contact you personally, I hope that by posting here I will be able to reach you.

Have you ever run into the problem where pca.compute() simply takes too long or where you receive an OutOfMemory exception? I’m currently facing this problem and have just posted a question about it on stackoverflow, which can be viewed here: https://stackoverflow.com/questions/30122738/how-to-solve-outofmemoryexception-that-is-thrown-using-principal-component-analy

I tried solving it by using a covariance matrix as presented on the website of the Accord.NET framework. But this raises an OutOfMemory exception.

Hi Cesar,

thank you for your excellent tutorial.

Is there any way to calculate the contribution of each variable to the principal components?

Best,

Rocco