My first AWS cluster

I have deployed to the cloud before, but this time it is on AWS.

A billing alarm for safety.

Steve Vinoski recommends these papers

One does not need a degree from an IIT to read and understand these papers. They are accessible; one only has to persevere. In India, technical work is considered taboo because society thinks it is the prerogative of people with advanced degrees.

“Eventual Consistency Today: Limitations, Extensions, and Beyond”, Peter Bailis, Ali Ghodsi. This article provides an excellent description of eventual consistency and recent work on eventually consistent systems.

“A comprehensive study of Convergent and Commutative Replicated Data Types”, M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski. This paper explores and details data types that work well for applications built on eventually consistent systems.

“Notes on Distributed Systems for Young Bloods”, J. Hodges. This excellent blog post succinctly summarizes the past few decades of distributed systems research and discoveries, and also explains some implementation concerns we’ve learned along the way to keep in mind when building distributed applications.

“Impossibility of Distributed Consensus with One Faulty Process”, M. Fischer, N. Lynch, M. Paterson. This paper is nearly 30 years old but is critical to understanding fundamental properties of distributed systems.

“Dynamo: Amazon’s Highly Available Key-value Store”, G. DeCandia, et al. A classic paper detailing trade-offs for high availability distributed systems.

Matrix multiplication

This is simplified code that multiplies two matrices – a and b – using data stored in a file.

["a", 0, 0, 63]
["a", 0, 1, 45]
["a", 0, 2, 93]
["a", 0, 3, 32]
["a", 0, 4, 49]
["a", 1, 0, 33]
["a", 1, 3, 26]
["a", 1, 4, 95]
["a", 2, 0, 25]
["a", 2, 1, 11]
["a", 2, 3, 60]
["a", 2, 4, 89]
["a", 3, 0, 24]
["a", 3, 1, 79]
["a", 3, 2, 24]
["a", 3, 3, 47]
["a", 3, 4, 18]
["a", 4, 0, 7]
["a", 4, 1, 98]
["a", 4, 2, 96]
["a", 4, 3, 27]
["b", 0, 0, 63]
["b", 0, 1, 18]
["b", 0, 2, 89]
["b", 0, 3, 28]
["b", 0, 4, 39]
["b", 1, 0, 59]
["b", 1, 1, 76]
["b", 1, 2, 34]
["b", 1, 3, 12]
["b", 1, 4, 6]
["b", 2, 0, 30]
["b", 2, 1, 52]
["b", 2, 2, 49]
["b", 2, 3, 3]
["b", 2, 4, 95]
["b", 3, 0, 77]
["b", 3, 1, 75]
["b", 3, 2, 85]
["b", 4, 1, 46]
["b", 4, 2, 33]
["b", 4, 3, 69]
["b", 4, 4, 88]

I loaded this data into a dict whose keys are (row, column) tuples.
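The loading step is not shown in this post. A minimal sketch of it might look like the following, assuming each line of the file is one JSON array of the form [name, row, column, value]. The function name load_matrices and the three-line sample are made up for illustration.

```python
import json


def load_matrices(lines):
    """Parse records like ["a", 0, 0, 63] into {name: {(row, col): value}}."""
    matrices = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, row, col, value = json.loads(line)
        # One dict per matrix name, keyed by the (row, col) tuple.
        matrices.setdefault(name, {})[(row, col)] = value
    return matrices


# Hypothetical three-record sample in the same format as the file above.
sample = ['["a", 0, 0, 63]', '["a", 0, 1, 45]', '["b", 0, 0, 63]']
matrices = load_matrices(sample)
matrixa = matrices["a"]
matrixb = matrices["b"]
print(sorted(matrixa.items()))  # [((0, 0), 63), ((0, 1), 45)]
```

Passing an open file object instead of the list should work the same way, since both yield one line at a time.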

    for x in sorted(matrixa.items()):
        print x

((0, 0), 63)
((0, 1), 45)
((0, 2), 93)
((0, 3), 32)
((0, 4), 49)
((1, 0), 33)
((1, 3), 26)
((1, 4), 95)
((2, 0), 25)
((2, 1), 11)
((2, 3), 60)
((2, 4), 89)
((3, 0), 24)
((3, 1), 79)
((3, 2), 24)
((3, 3), 47)
((3, 4), 18)
((4, 0), 7)
((4, 1), 98)
((4, 2), 96)
((4, 3), 27)

It is a sparse matrix, so some of the values are missing. The code has to check for that and treat a missing value as 0.

    l = []

    for i in range(0, 5):
        for j in range(0, 5):
            z = 0  # running sum for cell (i, j)
            for k in range(0, 5):
                # a missing entry in a sparse matrix counts as 0
                x = matrixa.get((i, k), 0)
                y = matrixb.get((k, j), 0)
                z += x * y
            l.append([i, j, z])
    for i in l:
        print i

The result is this.

[0, 0, 11878]
[0, 1, 14044]
[0, 2, 16031]
[0, 3, 5964]
[0, 4, 15874]
[1, 0, 4081]
[1, 1, 6914]
[1, 2, 8282]
[1, 3, 7479]
[1, 4, 9647]
[2, 0, 6844]
[2, 1, 9880]
[2, 2, 10636]
[2, 3, 6973]
[2, 4, 8873]
[3, 0, 10512]
[3, 1, 12037]
[3, 2, 10587]
[3, 3, 2934]
[3, 4, 5274]
[4, 0, 11182]
[4, 1, 14591]
[4, 2, 10954]
[4, 3, 1660]
[4, 4, 9981]
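Since the dicts hold only the entries that are present, the same product can also be computed by looping over the stored entries instead of over every (i, j, k) triple. This is an alternative sketch, not the code used above. The function name sparse_multiply is made up, and the sample uses only row 0 of a and column 0 of b.

```python
from collections import defaultdict


def sparse_multiply(a, b):
    """Multiply two sparse matrices stored as {(row, col): value} dicts."""
    # Group the entries of b by row so each entry of a can find its partners.
    b_rows = defaultdict(list)
    for (k, j), y in b.items():
        b_rows[k].append((j, y))

    product = defaultdict(int)
    for (i, k), x in a.items():
        # Entry a[i][k] only contributes to cells (i, j) where b[k][j] exists.
        for j, y in b_rows.get(k, []):
            product[(i, j)] += x * y
    return dict(product)


# Row 0 of a and column 0 of b from the data above.
a = {(0, 0): 63, (0, 1): 45, (0, 2): 93, (0, 3): 32, (0, 4): 49}
b = {(0, 0): 63, (1, 0): 59, (2, 0): 30, (3, 0): 77}
print(sparse_multiply(a, b))  # {(0, 0): 11878}
```

Cells that receive no contribution are simply absent from the result, which keeps the output sparse as well.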

Filter a Python dictionary

((u'Valjean', u'MmeDeR'), (u'MmeDeR', u'Valjean'))
((u'Valjean', u'Montparnasse'), (u'Montparnasse', u'Valjean'))
((u'Myriel', u'Count'), (u'Count', u'Myriel'))
((u'Valjean', u'Judge'), (u'Judge', u'Valjean'))
((u'Napoleon', u'Myriel'), (u'Myriel', u'Napoleon'))
((u'Myriel', u'Valjean'), (u'Valjean', u'Myriel'))
((u'Champtercier', u'Myriel'), (u'Myriel', u'Champtercier'))
((u'Myriel', u'OldMan'), (u'OldMan', u'Myriel'))
((u'Valjean', u'Woman1'), (u'Woman1', u'Valjean'))
((u'Valjean', u'Babet'), (u'Babet', u'Valjean'))
((u'MlleBaptistine', u'Valjean'), (u'Valjean', u'MlleBaptistine'))
((u'MmeMagloire', u'Myriel'), (u'Myriel', u'MmeMagloire'))
((u'Valjean', u'Labarre'), (u'Labarre', u'Valjean'))
((u'Valjean', u'Woman2'), (u'Woman2', u'Valjean'))
((u'Myriel', u'Geborand'), (u'Geborand', u'Myriel'))
((u'Valjean', u'Marguerite'), (u'Marguerite', u'Valjean'))
((u'MlleBaptistine', u'MmeMagloire'), (u'MmeMagloire', u'MlleBaptistine'))
((u'Valjean', u'Isabeau'), (u'Isabeau', u'Valjean'))
((u'Valjean', u'Simplice'), (u'Simplice', u'Valjean'))
((u'Valjean', u'Gillenormand'), (u'Gillenormand', u'Valjean'))
((u'Valjean', u'MlleGillenormand'), (u'MlleGillenormand', u'Valjean'))
((u'Myriel', u'Champtercier'), (u'Champtercier', u'Myriel'))
((u'Valjean', u'MmeMagloire'), (u'MmeMagloire', u'Valjean'))
((u'Valjean', u'Fantine'), (u'Fantine', u'Valjean'))
((u'MlleBaptistine', u'Myriel'), (u'Myriel', u'MlleBaptistine'))
((u'Valjean', u'Myriel'), (u'Myriel', u'Valjean'))
((u'Valjean', u'Cosette'), (u'Cosette', u'Valjean'))

I am used to the verbosity of Java, so Python is like a breath of fresh air. But this Python code stumped me for the better part of two days.

I was trying to filter the keys in the dictionary shown above based on its own values. For example, (u'Myriel', u'Valjean') appears both as a key and as a value. If a key matches a value, then the dictionary row with that particular key should be removed. The row with the matching value is untouched.

It is literally one line of Python.

    for v in friends.iteritems():
        print v
    # keep the keys whose value is not itself a key of the dictionary
    result = [k for k, v in friends.iteritems() if v not in friends]
    for v in result:
        print v

Python function to find the country of origin of a Tweet

This is a rudimentary function to trace the origin from the tweet’s JSON structure. I am using a dictionary of states and abbreviations.

  # needs "import re" at the top of the module
  def checklocation(self, data):
      # return the first state (lowercased) that the place or user location starts with
      location = (data.get('user') or {}).get('location')
      place = data.get('place')
      if place:
          country = place.get('country')
          full_name = place.get('full_name')
          for c in self.states.itervalues():
              # escape the name so it is matched literally
              p = re.compile(re.escape(c.lower()))
              for text in (country, full_name, location):
                  if text:
                      m = p.match(text.lower())
                      if m:
                          return m.group()
      return None
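A standalone variant that can be tried without the surrounding class might look like this. It is only a sketch and differs from the method above in a few ways. It uses re.search with word boundaries, so the state can appear anywhere in the text. It checks the user location even when the tweet has no place. It returns the name as stored in the table. The STATES table, the function name find_state and the sample tweet are made up for illustration.

```python
import re

# Hypothetical lookup table; a real one would cover every state.
STATES = {"TX": "Texas", "CA": "California", "NY": "New York"}


def find_state(tweet, states=STATES):
    """Return the first state name found in the tweet's place or user location."""
    place = tweet.get("place") or {}
    user = tweet.get("user") or {}
    candidates = [place.get("country"), place.get("full_name"), user.get("location")]
    for name in states.values():
        # \b keeps "Texas" from matching inside a longer word.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for text in candidates:
            if text and pattern.search(text):
                return name
    return None


tweet = {"user": {"location": "Austin, Texas"}, "place": None}
print(find_state(tweet))  # Texas
```

A tweet with no recognizable state yields None, which the caller can treat as "origin unknown".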