statlearning class

This will be very useful for people like me who want to apply statistical learning to Capacity Planning.

Rob Tibshirani and I are offering a MOOC in January on Statistical Learning.
This "massive open online course" is free, and is based entirely on our new book
"An Introduction to Statistical Learning with Applications in R"
(James, Witten, Hastie, Tibshirani 2013, Springer). http://www-bcf.usc.edu/~gareth/ISL/
The PDF of the book will also be free.

The course, hosted on Open edX, consists of video lecture segments, quizzes, video R sessions, interviews with famous statisticians,
lecture notes, and more. The course starts on January 22 and runs for 10 weeks.

Please consult the course webpage http://statlearning.class.stanford.edu/ to enroll and for further details.
----------------------------------------------------------------------------------------
Trevor Hastie  hastie@stanford.edu
Professor, Department of Statistics, Stanford University
URL: http://www.stanford.edu/~hastie
----------------------------------------------------------------------------------------

Statistics of agreement

I found this formula for calculating the percentage of agreement between two raters quite interesting, and coded the following simple steps in R. The statistic is called Cohen's kappa, and even though there is nothing original about this entry, it is very useful. I wrote the simple R code myself because I am learning R.

It was also surprising that I didn't know about it, and that our teams are not technical enough to use even these foundational principles. As is evident, it has wide application wherever percentage agreement has to be calculated, such as when two teams disagree or when auditors disagree with each other. Whither will our antagonistic attitude towards good calculations in technical and project management drive us?

The other point worth highlighting is that I found the description of this formula in a paper dealing with the Architecture Trade-off Analysis Method (ATAM).

The matrix created below records the points on which two raters agree with each other and those on which they disagree. The formula for the level of agreement is

        Observed percentage of agreement - Expected percentage of agreement
kappa = --------------------------------------------------------------------
                      1 - Expected percentage of agreement
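
Before the step-by-step walkthrough, here is the whole calculation wrapped in a small R function. This is my own sketch (the name cohens_kappa is mine, for illustration; base R has no built-in kappa function), and the steps below unpack it one piece at a time.

# Sketch: Cohen's kappa for a square contingency table of counts
cohens_kappa <- function(counts) {
  p  <- counts / sum(counts)          # cell proportions
  po <- sum(diag(p))                  # observed agreement (diagonal)
  pe <- sum(rowSums(p) * colSums(p))  # agreement expected by chance
  (po - pe) / (1 - pe)
}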

R code

# 2x2 table of counts: rows are one rater, columns the other
kappa <- matrix(c(5, 2, 1, 2), ncol = 2)
colnames(kappa) <- c("Disagree", "Agree")
rownames(kappa) <- c("Disagree", "Agree")
kappa

(I have formatted the R output as a table.)



           Disagree   Agree
Disagree          5       1
Agree             2       2

# Convert the counts to proportions of the grand total
# (margin.table(kappa) with no margin argument returns the total count;
# prop.table(kappa) is the idiomatic equivalent)
kappamargin <- kappa / margin.table(kappa)
kappamargin

(I have formatted the R output, the matrix of proportions, as a table.)



           Disagree   Agree
Disagree        0.5     0.1
Agree           0.2     0.2

Observed percentage of agreement = 0.5 + 0.2 = 0.7
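
In R, the observed agreement is just the sum of the diagonal of the proportion matrix; this one-liner is my own shorthand, using the kappamargin object defined above.

sum(diag(kappamargin))   # 0.7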

Now we want the row and column totals, as this table shows. The expected agreement comes from multiplying each row total by the matching column total.



           Disagree   Agree   Total
Disagree        0.5     0.1     0.6
Agree           0.2     0.2     0.4
Total           0.7     0.3     1.0

For illustration, I have used this line of code to put the totals into a matrix.

# Column 1 holds the row totals, column 2 the column totals
marginals <- matrix(c(margin.table(kappamargin, 1), margin.table(kappamargin, 2)), ncol = 2)
marginals

(I have formatted the R output as a table.)


           Row total   Column total
Disagree         0.6            0.7
Agree            0.4            0.3

Expected percentage of agreement = (0.6 * 0.7) + (0.4 * 0.3) = 0.42 + 0.12 = 0.54
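
Again as a one-liner of my own, using the marginals object above (first column the row totals, second the column totals):

sum(marginals[, 1] * marginals[, 2])   # 0.54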

So the final kappa value is

(0.7 - ((marginals[1, 1] * marginals[1, 2]) + (marginals[2, 1] * marginals[2, 2]))) /
  (1 - ((marginals[1, 1] * marginals[1, 2]) + (marginals[2, 1] * marginals[2, 2])))

0.3478261

i.e.

0.7 - ((0.6 * 0.7) + (0.4 * 0.3))     0.16
---------------------------------  =  ----  =  0.35 (approximately)
 1 - ((0.6 * 0.7) + (0.4 * 0.3))      0.46
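
As a sanity check, the cohens_kappa sketch from the top of this post gives the same value directly from the table of counts; if you have the psych package installed, its cohen.kappa function should report the same unweighted kappa (I have commented it out since it needs the extra package).

cohens_kappa(kappa)           # 0.3478261
# psych::cohen.kappa(kappa)   # should agree, if psych is installed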