# Principal Component Analysis

This post summarizes what I currently understand about Principal Component Analysis (PCA). I will update it later.

The code is on GitHub. It runs, but I suspect the eigenvalues could be wrong, so I have to test it further.
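One way I could test the eigenvalues is to check them against properties that any correct decomposition must satisfy. The sketch below is only an illustration, not the code from the repository: `check_eigen` and the sample data are made-up names, and it assumes the data is an n × d NumPy array. It compares the covariance with `np.cov` (which divides by n - 1 unless `bias=True` is passed, whereas the mean of outer products divides by n), then checks that every eigenpair satisfies the defining relation and that the eigenvalues sum to the trace.

```python
import numpy as np


def check_eigen(data, atol=1e-8):
    """Hypothetical helper: sanity-check an eigendecomposition of the covariance."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Mean of outer products of centered rows equals (X^T X) / n.
    cov = centered.T.dot(centered) / data.shape[0]
    # np.cov divides by n - 1 by default, so bias=True is needed to match.
    assert np.allclose(cov, np.cov(data, rowvar=False, bias=True), atol=atol)

    eig_vals, eig_vecs = np.linalg.eigh(cov)
    # Each column v of eig_vecs should satisfy cov . v == lambda * v.
    assert np.allclose(cov.dot(eig_vecs), eig_vecs * eig_vals, atol=atol)
    # Eigenvalues of a covariance matrix sum to its trace (total variance).
    assert np.isclose(eig_vals.sum(), np.trace(cov), atol=atol)
    return eig_vals[::-1]  # eigh returns ascending order; put largest first


# Made-up sample data, for illustration only.
sample = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 7.0], [8.0, 8.0]])
eigenvalues = check_eigen(sample)
```

If any of these assertions fails on the real data, the problem is more likely in the covariance step (for example n versus n - 1 scaling) than in `eigh` itself.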

These are the two main functions, `estimateCovariance` and `pca`.

```
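import numpy as np
from numpy.linalg import eigh  # assumed source of `eigh`; scipy.linalg.eigh behaves the same here

# `getmean` and `validate` are helpers that are not shown in this post.


def estimateCovariance(data):
    """Compute the covariance matrix for a given dataset."""
    print(data)
    mean = getmean(data)
    print(mean)
    # Center the data by subtracting the mean from every row.
    dataZeroMean = [x - mean for x in data]
    print(dataZeroMean)
    # The covariance is the mean of the outer products of the centered rows.
    covar = [np.outer(x, x) for x in dataZeroMean]
    covariance = getmean(covar)
    print(covariance)
    return covariance


def pca(data, k=2):
    """Compute the top `k` principal components, corresponding scores, and all eigenvalues."""
    d = estimateCovariance(data)

    # eigh returns eigenvalues in ascending order, with eigenvectors as columns.
    eigVals, eigVecs = eigh(d)

    validate(eigVals, eigVecs)
    # Reverse the sort order so the largest eigenvalue comes first.
    inds = np.argsort(eigVals)[::-1]
    topComponent = eigVecs[:, inds[:k]]
    print('\nTop Component: \n{0}'.format(topComponent))

    # Project each row of the data onto the top `k` components.
    correlatedDataScores = [np.dot(x, topComponent) for x in data]
    print('\nScores : \n{0}'
          .format('\n'.join(map(str, correlatedDataScores))))
    print('\n eigenvalues: \n{0}'.format(eigVals[inds]))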