Provision a VM using Packer and Vagrant

About 8 years back I worked for a company serving customers of the Payment Card Industry. They had a dire need of Infrastructure as Code(IaC) to build a Windows Active-Passive Cluster with Connect:Direct and engineers spent day and night to set it up manually. The ruckus created by that is still etched in my mind.

Now when I tried a simple recipe it worked like a charm. It isn’t very complicated as it is a simple test.

I started with this repo.

C:\Packer\ubuntu\ubuntu>packer build -only=vmware-iso -var='ssh_fullname=mirage' -var='ssh_password=mirage' -var-file=ubuntu1804.json ubuntu.json
vmware-iso: output will be in this color.

Warnings for build 'vmware-iso':

* A checksum type of 'none' was specified. Since ISO files are so big,
a checksum is highly recommended.
* Your vmx data contains the following variable(s), which Packer normally sets when it generates its own default vmx template. This may cause your build to fail or behave unpredictably: numvcpus, memsize

==> vmware-iso: Retrieving ISO
==> vmware-iso: Trying /Volumes/Storage/software/ubuntu/ubuntu-18.04.4-server-amd64.iso
==> vmware-iso: Trying /Volumes/Storage/software/ubuntu/ubuntu-18.04.4-server-amd64.iso?checksum=a5b0ea5918f850124f3d72ef4b85bda82f0fcd02ec721be19c1a6952791c8ee8
==> vmware-iso: /Volumes/Storage/software/ubuntu/ubuntu-18.04.4-server-amd64.iso?checksum=a5b0ea5918f850124f3d72ef4b85bda82f0fcd02ec721be19c1a6952791c8ee8 => C:/Packer/ubuntu/ubuntu/Volumes/Storage/software/ubuntu/ubuntu-18.04.4-server-amd64.iso
==> vmware-iso: Creating floppy disk...
vmware-iso: Copying files flatly from floppy_files
vmware-iso: Copying file: http/preseed.cfg
vmware-iso: Done copying files from floppy_files
vmware-iso: Collecting paths from floppy_dirs
vmware-iso: Resulting paths from floppy_dirs : []
vmware-iso: Done copying paths from floppy_dirs

Add box to Vagrant

C:\Packer\ubuntu\ubuntu\box\vmware>vagrant box add ubuntu1804-0.1.0.box --name vmwarepackeransible
==> box: Box file was not detected as metadata. Adding it directly...
==> box: Adding box 'vmwarepackeransible' (v0) for provider:
box: Unpacking necessary files from: file://C:/Packer/ubuntu/ubuntu/box/vmware/ubuntu1804-0.1.0.box
box:
==> box: Successfully added box 'vmwarepackeransible' (v0) for 'vmware_desktop'!

Initialize

C:\Packer\ubuntu\ubuntu\box\vmware>vagrant init vmwarepackeransible
A `Vagrantfile` has been placed in this directory. You are now
ready to `vagrant up` your first virtual environment! Please read
the comments in the Vagrantfile as well as documentation on
`vagrantup.com` for more information on using Vagrant.

C:\Packer\ubuntu\ubuntu\box\vmware>vagrant up
Bringing machine 'default' up with 'vmware_desktop' provider...
==> default: Cloning VMware VM: 'vmwarepackeransible'. This can take some time...
==> default: Verifying vmnet devices are healthy...
==> default: Preparing network adapters...
==> default: Starting the VMware VM...
==> default: Waiting for the VM to receive an address...
==> default: Forwarding ports...
default: -- 22 => 2222
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
default:
default: Vagrant insecure key detected. Vagrant will automatically replace
default: this with a newly generated keypair for better security.
default:
default: Inserting generated public key within guest...
default: Removing insecure key from the guest if it's present...
default: Key inserted! Disconnecting and reconnecting using new SSH key...
==> default: Machine booted and ready!
==> default: Configuring network adapters within the VM...
==> default: Waiting for HGFS to become available...
==> default: Enabling and configuring shared folders...
default: -- C:/Packer/ubuntu/ubuntu/box/vmware: /vagrant

Shell provisioner in Vagrantfile

config.vm.provision "shell", inline: <<-SHELL
add-apt-repository ppa:openjdk-r/ppa -y
apt-get update
echo "\n----- Installing Java 8 ------\n"
apt-get -y install  openjdk-8-jdk
update-alternatives --config java

SSH into vagrant and check

SHELLvagrant@vagrant:~$ java -version
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)

There are other scenarious that are complicated but a simple test like this works as expected.

Dune is a Ocaml build system

Here is my attempt to properly build a toy Ocaml project using Dune.

Since this is the learning phase the Ocaml code may not be idiomatic.

My unit test framework is Alcotest

As is the case with other Ocaml tools and techniques information about this is sketchy. I wish there were more articles and examples as it is fun to work with this language..

I will add more details as I research this further. But for now here is a brief description of the dune build file.

  • dune runtest executes  all the tests
  • Dune does not install dependencies automatically . So, for example, I have to execute ‘opam install alcotest’. That is how one installs any opam package we need generally.

graph.opam

This looks like the file in which one specifies the framework versions and build instructions.

opam-version: "2.0"
authors: [ "Mohan" ]
synopsis: "Learning Dune"
description: """
Learning Dune
"""
tags: []
depends: [
"ocaml" {
>= "4.02.3"}
"dune"
"alcotest" {with-test}
]
build: [
["dune" "subst"] {pinned}
["dune" "build" "-p" name "-j" jobs]
["dune" "runtest" "-p" name "-j" jobs] {with-test}
]

Dune dependencies

This specifies the dependencies and the modules. My code module is ‘graph‘ and my test module is ‘kruskaltest‘.

  •  
(library
(modules graph)
(name graph)
)
(test
(name kruskaltest)
(modules kruskaltest)
(libraries alcotest graph))
view raw dune.lisp hosted with ❤ by GitHub

Main module

type edge = ( int * int * int ) list
module type EDGECOMPARE = sig
type t
val compare : 'a list -> 'a list ->bool
end
module EDGECOMPARE_weight : EDGECOMPARE with type t = edge = struct
type t = edge
let compare l r = ( List.nth l ( List.length l/ 2 )) == ( List.nth r (List.length r/2))
end
view raw kruskal.ml hosted with ❤ by GitHub

Test

let testcompare () = Alcotest.(check bool) "Test for weight comparison" true (Graph.EDGECOMPARE_weight.compare [1,2,4] [ 5,6,7] )
let () =
Alcotest.run "Weights"
[
("test compare weights of edges", [
Alcotest.test_case "Compare weights" `Quick testcompare;
]);
]
view raw kruskaltest.ml hosted with ❤ by GitHub

mirage@mirage:~/theorem$ dune runtest
kruskaltest alias runtest
Testing Weights.
This run has ID `D7DAB7A8-A60A-4522-9732-54FAE2331A72`.
[OK] test compare weights of edges 0 Compare weights.
The full test results are available in `/home/mirage/theorem/_build/default/_build/_tests/D7DAB7A8-A60A-4522-9732-54FAE2331A72`.
Test Successful in 0.000s. 1 test run.

R Reference classes

A pure OO approach and a functional representation of it are at loggerheads. That is evident when one tries to adopt an OO approach using a powerful functional language. That is my personal opinion.

R has many Object-oriented features built into it.

R has three object oriented (OO) systems: [[S3]], [[S4]] and [[R5]].

Reference classes are one such feature.

OO

Let us consider this data. The id is that of a Subject who is in
a room where monitoring equipment gathers some data. There are several visits to gather this data.

id visit room value timepoint
14 0 bedroom 6 53
14 0 bedroom 6 54
15 0 bedroom 2.75 56

The idea that this code is based on is from Martin Fowler’s book Analysis Patterns Reusable Object Models. The chapter on Observations and Measurements has a diagram roughly equivalent to the
one shown at the top.

The code is lightly tested several times but without unit tests.

library(plyr)
library(dplyr)
library(purrr)

CompoundUnit <- setRefClass("CompoundUnit",
fields = list(micrograms = 'numeric',
cubicmeter = 'numeric'))

Location <- setRefClass("Location",
fields = list( room = 'character'),
methods=list(getlocation = function(){
room
},
summary = function(){
paste('Room [' , room , ']')
}))

library(objectProperties)
# An Enum which could have behaviour associated with it.
# This is convoluted but the only way I know to represent constants and validate them.
#
###############################################################################

MeasurementVisitEnum.gen <- setSingleEnum("MeasurementVisit",levels = c('0', '1', '2'))
par.gen <- setRefClass("Visit",
properties(fields = list(visit = "MeasurementVisitSingleEnum"),
prototype = list(visit =
new("MeasurementVisitSingleEnum",
'0'))))

What is the significance of this convoluted code ?

It restricts the values that are set to 0.1 and 2. It is like the Java enum

But this is not strictly a requirement here. It is just that there is a facility to identify erroneous data if we need it.

> MeasurementVisitEnum.gen par.gen visits visits$visit visits$visit visits$visit visits$visit <- as.character(3)
Error in (function (val) :
Attempt to set invalid value on 'visit': value '3' does not belong to level set
( 0, 1, 2 )



TimePoint <- setRefClass("TimePoint",
fields = list(time = 'numeric'))

Quantity <- setRefClass("Quantity",
fields = list(amount = "numeric",
units = CompoundUnit))

Measurement encapsulates the quantity, the time point and the visit number. So, for example, during visit 0, at this time point the quantity was observed. This type of encapsulation in the true spirit of OO has its
disadvantages as we will see later.

Measurement <- setRefClass("Measurement",
fields = list(
quantity = "Quantity",
timepoint = "TimePoint",
visit = "Visit"),
methods=list(getvisit = function(){
visit$visit
},getquantity = function(){
quantity
})
)

Subject <- setRefClass("Subject",
fields = list( id = "numeric",
measurement = "Measurement",
location = "Location"),
methods=list(getmeasurement = function()
{
measurement
},
getid = function()
{
id
},
getlocation = function()
{
location
},
summary = function()#Implement other summary methods in appropriate objects as per their responsibilities
{
paste("Subject summary ID [",id,"] Location [",location$summary(),"]")
},show = function(){
cat("Subject summary ID [",id,"] Location [",location$summary(),"]\n")
})
)

LongitudinalDatum is the class LongitudinalData inherits from. This inheritance is shown as an example. Not all methods that should belong in the super class are properly added. There are many methods in the sub class that can be moved a level up.

subsummary in the super class can be called from the sub class. The line if( subject(x) == id){ in the sub class LongitudinalData calls this super class method.

LongitudinalDatum  datum

measurements <<- list()
load(datum)

},load = function( df ){
by(df, 1:nrow(df), function(row) {
visits <- par.gen$new()
visits$visit <- as.character(row$visit)

u <- CompoundUnit$new( micrograms = 1,
cubicmeter = 1 )

q <- Quantity$new(amount = row$value,
units = u )

t <- TimePoint$new(time = row$timepoint)

m <- Measurement$new(
quantity = q,
timepoint = t,
visit = visits)

l <- Location$new( room = as.character(row$room))

s <- Subject$new( id = row$id,
measurement = m,
location = l)
measurements <<- c( measurements, s )

})

},
getmeasurementslength = function(){
length(measurements)
},
findsubject = function( id ){
result % map(., function(x) {
if( subject(x) == id){
result <<- x # Warning message is benign for this example. result
#cannot be a class state. It is really local.
}
}
)
result

},
visit = function( sub,v ){
measurementsvisit % map(., function(x) {
m <- x$getmeasurement()
if (m$getvisit() == v && x$getid() == sub$getid() ){
measurementsvisit <<- c(measurementsvisit,x)
}
}

)

list(visit = measurementsvisit )
}
},
room = function( t, room ){
if( length( t) == 0 ){
c('NA')
}else{
measurementsvisitroom % map(., function(x) {
if( x$getlocation()$getlocation() == room )
measurementsvisitroom <% map(., function(y) {
if (x$getid() == y$getid() ){
m <<- x$getmeasurement()
summaries <% summary
}
},subjectsummary = function( subject ){
filteredmeasurements <-
keep(measurements, function(x){
x$getid() == subject$getid()
})
groupedmeasurements % lapply(function(x){
m <% rbind_all()
dataColumns <- c('amount')

ddply(groupedmeasurements,c('visit','location'),function(x)
colSums(x[dataColumns]))
}

)
)

How does this work ?

The data is loaded into an object hierarchy in the load function. I did observe that it was slow most probably because my Eclipse StatET for R setup needs more memory.

Since the methods are all encapsulated by the class I am using the reference to call methods. The result of findsubject is passed to subjectsummary because I am piping the result of one method to the next.

ld <- LongitudinalData$new()

out % ld$subjectsummary()
print(out)

So here the result of findsubject(14) is passed as the first parameter when visit(0) is called. 0 becomes the second parameter.

out % ld$visit(0) %>% ld$room("bedroom")

The final result from this pipeline is whatever is returned by the last method room("bedroom").

I would like to reassert that this is just one way of combining multiple methods using Reference classes. There are much more powerful functional approaches that don’t require this many lines of code. This example illustrates a particular Object-oriented approach.

Flattening the Reference classes

The OO hierarchy here does not seem to be malleable when used with some R packagea like dplyr. Try as I may, I cannot coerce the Reference classes into a R data frame and pipe it through stages using dplyr. Remember I want to use functions like map and filter to get the data out of these reference clasees in a shape that I want.

So I abandon my OO approach and flatten the objects and create a data frame. Now I get back the data in the shape I want.

groupedmeasurements %
lapply(
function(x){
m <% rbind_all()

This is how one gets the following output.

out % ld$subjectsummary()
print(out)

visit location amount
0 bedroom 12.00
0 dining room 2.75
0 living room 2.75
0 room 5.50
0 tv room 2.75
1 room 2.75

Conclusion

This exercise has not helped me determine in which context R’s Reference classes are specifically used. The other OO systems like S3 and S4 may be more useful but this article is about RC’s. Why should I flatten my object hierarchy to reshape my data in a convenient way ? There may be specialized R packages that use the OO approach and expose API’s but I am not aware of them. So at this time I understand that there is a dichotomy between RC’s and the powerful functional approach. I personally like to use the functional programming paradigm when dealing with data.

Joy of OCaml

screenshot-from-2016-12-03-19-14-34

I have spent most of last week with my Emacs editor and the OCaml development environment. Since I have some OCaml code to complete I will add more details soon.

Suffice it to say that this setup taxed me so much. OPAM does not seem to install easily in Windows. As is my wont in such cases I started with Cygwin and after two days switched to a Ubuntu VM. I didn’t think I was gaining much by reporting Cygwin permission issues to owners of OPAM Windows installers.

Emacs company mode for autocompletion

The toolchain includes company as well as Merlin
and Tuareg.

screenshot-from-2016-12-03-19-00-53

Utop is a toplevel for OCaml

screenshot-from-2016-12-03-19-10-58

Emacs elisp

It looks like this at this time and I use Gist because WordPress does not support Lisp or OCaml or Haskell yet. Filed a support ticket.

(package-initialize)
(load "/home/mohan/.opam/4.02.1/share/emacs/site-lisp/tuareg-site-file")
(let ((opam-share (ignore-errors (car (process-lines "opam" "config" "var"
"share")))))
(when (and opam-share (file-directory-p opam-share))
;; Register Merlin
(add-to-list 'load-path (expand-file-name "emacs/site-lisp" opam-share))
(autoload 'merlin-mode "merlin" nil t nil)
;; Automatically start it in OCaml buffers
(add-hook 'tuareg-mode-hook 'merlin-mode t)
(add-hook 'caml-mode-hook 'merlin-mode t)
;; Use opam switch to lookup ocamlmerlin binary
(setq merlin-command 'opam)))
(company-mode)
(add-to-list 'load-path "/home/mohan/.opam/4.02.1/share/emacs/site-lisp")
(require 'ocp-indent)
(autoload 'utop-minor-mode "utop" "Minor mode for utop" t)
(autoload 'utop-setup-ocaml-buffer "utop" "Toplevel for OCaml" t)
(autoload 'merlin-mode "merlin" "Merlin mode" t)
(utop-minor-mode)
;; Important to note that setq-local is a macro and it needs to be
;; separate calls, not like setq
(setq-local merlin-completion-with-doc t)
(setq-local indent-tabs-mode nil)
(setq-local show-trailing-whitespace t)
(setq-local indent-line-function 'ocp-indent-line)
(setq-local indent-region-function 'ocp-indent-region)
(custom-set-variables
;; custom-set-variables was added by Custom.
;; If you edit it by hand, you could mess it up, so be careful.
;; Your init file should contain only one such instance.
;; If there is more than one, they won't work right.
'(package-selected-packages (quote (company))))
(custom-set-faces
;; custom-set-faces was added by Custom.
;; If you edit it by hand, you could mess it up, so be careful.
;; Your init file should contain only one such instance.
;; If there is more than one, they won't work right.
)
; Make company aware of merlin
(with-eval-after-load 'company
(add-to-list 'company-backends 'merlin-company-backend))
; Enable company on merlin managed buffers
(add-hook 'merlin-mode-hook 'company-mode)

view raw
ocaml.lisp
hosted with ❤ by GitHub

More about OCaml code later. This creates an associative list of tuples containing characters and the number of times they occur in a String. MultiSet is a module that is not shown either but as I mentioned I have more to write about this wonderful programming language.

let insert l a =
if List.mem_assoc a l
then
let n = List.assoc a l in (a, n+1)::(List.remove_assoc a l)
else (a, 1)::l
let letters (word : string) : char MultiSet.t =
let rec insert (l : char MultiSet.t) (c : string) (i : int) : char MultiSet.t =
if ( String.length c > 1 ) then
insert ( MultiSet.insert l (String.get c i) ) ( String.sub c 1 ((String.length c) 1) ) 0
else
MultiSet.insert l (String.get c 0)
in insert MultiSet.empty word 0
;;

view raw
letter.ml
hosted with ❤ by GitHub

Polyglot programming using Jenkins

jenkins Facility for languages develops when one does not squander existing opportunities to code. That is what I think.

Jenkins, the CI enabler supports a few languages like Python and Groovy. The Python package I used to make the Rest API calls is ‘Python Jenkins’.It is interesting to note that run_script executes Groovy code.

I didn’t test it exactly when the Unix server runs out of disk space but assumed the text from the console output will match.Moreover the encryption routine works as expected but the decryption function doesn’t work. It seems that since I call the Rest API there could be a encryption/decryption key mismatch.


'''
Created on Oct 12, 2016

@author: Mohan Radhakrishnan

This python module gets the console output of the latest
build and if the text 'No space left on device' is found in
the output it sends a mail.
I've taken liberties with the 'functional paradigm'

'''
import smtplib
import jenkins
import os
def main():

overrideenvironmentvariables()

server = jenkins.Jenkins('http://localhost:8080', username='Mohan', password='Welcome1')

notifydisaster(server)

'''
Notify
'''
def notifydisaster( server ):
print( getconsoleoutput(server) )
name,buildnumber,consoleoutput = getconsoleoutput(server)
if (consoleoutput.find("Caused by: java.io.IOException: No space left on device") != -1):
print("Caused by: java.io.IOException: No space left on device")
sendmail( name,buildnumber )

'''
Notify
Password Encryption/decryption code has to be tested and used
'''
def sendmail(name,buildnumber):
smtp = smtplib.SMTP('smtp.gmail.com', 587)
smtp.ehlo()
smtp.starttls()
smtp.login("x.y@z.com","Password")
smtp.sendmail('x.y@z.com', 'x.y@z.com', 'Subject: No space left on device\n \
Job ' + name + ' Build ' + str(buildnumber) + ' fails due to lack of disk space')

'''
Get the console output of the particular
Job's build
'''
def getconsoleoutput(server):
information = getJobName(server)
if information:
return information[ 0 ]['name'] ,getlastjobDetails(server),server.get_build_console_output(information[ 0 ]['name'], getlastjobDetails(server))

'''
Get Job and other details
and filter the Job we are interested in
'''
def getJobName(server):
jobs = server.get_all_jobs(0)
filtercriterion = ['CITestPipeline']

return list(filter( lambda d: d['fullname'] in filtercriterion, jobs))

'''
Get Job and other details
Return '0' as the build number assuming
it signifies that there is no such build number
'''

def getlastjobDetails(server):
information = getJobName(server)
if information:
last_build_number = server.get_job_info(information[ 0 ]['name'])['lastCompletedBuild']['number']
return last_build_number
else:
return 0

'''
Attempt here to encrypt Passwords using Jenkins' key
Not tested properly
'''
def encrypt(server ):
value = server.run_script("""
secret = hudson.util.Secret.fromString("Password")
println secret.getEncryptedValue()
println secret.getPlainText()
""")
print (value)

def decrypt(server ):
decryptedvalue = server.run_script("""
secret = hudson.util.Secret.fromString("aiJREkuBjWHX9UWIyhEzwnnAJReuZnQVEtUr0KgvXKg")
println hudson.util.Secret.toString(secret)
""")
print (decryptedvalue)
return decryptedvalue
'''
Override this proxy setting as we don't
need it and it causes an error.
'''
def overrideenvironmentvariables():
os.environ["HTTP_PROXY"] = ''

if __name__=="__main__":
main()

Spacemacs

I will update this post soon as my day job leaves little time for fun aspects like this.

Spacemacs’ new Haskell layer is what I like now eventhough the Haskell editor setup is not easy for the novice.

After installing Spacemacs these are the basic steps I followed.

Added the Haskell layer

;; —————————————————————-
;; Example of useful layers you may want to use right away.
;; Uncomment some layer names and press (Vim style) or
;; (Emacs style) to install them.
;; —————————————————————-
;; auto-completion
;; better-defaults
emacs-lisp
(haskell .variables haskell-enable-shm-support t)
;; git
;; markdown
;; org
;; (shell :variables
;; shell-default-height 30
;; shell-default-position ‘bottom)
;; spell-checking
;; syntax-checking
;; version-control

Added this to .spacemacs

(defun dotspacemacs/user-init ()
“Initialization function for user code.
It is called immediately after `dotspacemacs/init’, before layer configuration
executes.
This function is mostly useful for variables that need to be set
before packages are loaded. If you are unsure, you should try in setting them in
`dotspacemacs/user-config’ first.”
(add-hook ‘haskell-mode-hook ‘turn-on-haskell-indentation)
(add-to-list ‘exec-path “C:/Users/476458/AppData/Roaming/local/bin/”)
)

C:/Users/476458/AppData/Roaming/local/bin/ contains other tools installed by Stack.

Stack is a cross-platform program for developing Haskell projects. It is aimed at Haskellers both new and experienced.

Spacemacs

Suggested by spacemacs Reddit group

(defun dotspacemacs/user-config ()
“Configuration function for user code.
This function is called at the very end of Spacemacs initialization after
layers configuration.
This is the place where most of your configurations should be done. Unless it is
explicitly specified that a variable should be set before a package is loaded,
you should place your code here.”
(spacemacs/set-leader-keys-for-major-mode ‘haskell-mode
“cx” ‘inferior-haskell-load-and-run
)
)

This helps me compile and execute the Haskell program by using the keystrokes

SPC m c x

Spacemacs1

Getting introduced to Matlab

There was a time when I thought Matlab is a tool used by engineers. Is it even possible for a student of humanities to get access to such tools ? It is and there are academic licenses.

Matrix

matrix = zeros(10,10);

matrix(1,2) = 3
matrix(2,2) = 30
matrix(1,3) = 2
matrix(4,3) = 14
matrix(5,2) = 199
matrix(6,2) = 733

\begin{bmatrix} 0 & 3 & 2 & 0 & 0\\ 0 & 30 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 14 & 0 & 0\\ 0 & 199 & 0 & 0 & 0\\ 0 & 733 & 0 & 0 & 0\\ \end{bmatrix}

Maximum value from a matrix
[max_value, node] = max(matrix(:));

fprintf ('Maximum value is %d and node is %d\n', max_value, node);

node seems to be a reference to the element which is used below to get the row and column using ind2sub.


[i, j] = ind2sub(size(matrix), node);

fprintf ('Row is %d and column is %d\n', i, j);

sub2ind gives the linear indice of the element when we have the row and column of the element.


linearindice = sub2ind(size(matrix), 1, 2);

fprintf ('Linear Indice is %d \n', linearindice);

Sort

I get the sorted matrix and also the matrix of indices of the sorted elements. Very useful.


[values, indices] = sort( matrix );
Sorted values

\begin{bmatrix} 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 3 & 0 & 0 & 0\\ 0 & 30 & 0 & 0 & 0\\ 0 & 199 & 2 & 0 & 0\\ 0 & 733 & 14 & 0 & 0\\ \end{bmatrix}

Indices of sorted values

\begin{bmatrix} 1 & 3 & 2 & 1 & 1\\ 2 & 4 & 3 & 2 & 2\\ 3 & 1 & 5 & 3 & 3\\ 4 & 2 & 6 & 4 & 4\\ 5 & 5 & 1 & 5 & 5\\ 6 & 6 & 4 & 6 & 6\\ \end{bmatrix}

Java 8 Optional

I think there are more elegant ways to check if Optional is empty or not but here I have to collect everything in a ArrayList. So I wasn’t able to include isPresent() in the lambda pipeline.

package com.test;

import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class OptionalTest {

    private static List<String> newImports = new ArrayList<>();

    public static Optional<List<String>> getOptionalNewImports() {

        return Optional.ofNullable(newImports);
    }

    public static void main( String... argv ){

        newImports.add("org.apache.struts");

        if( getOptionalNewImports().isPresent() ) {
            List<String> imports = new ArrayList<>();
                    getOptionalNewImports().get().stream()
                    .map(p -> "import " + p + ";")
                    .collect(Collectors.toCollection(() -> imports));
            imports.forEach( System.out::println);
        }

    }
}

This is the relevant method.

        public Optional<List<String>> getOptionalNewImports() {

            return Optional.ofNullable(newImports);
        }

This is a proper usage of ifPresent. I assign a value to a variable if the value is present.

                        rules.getOptionalClassIdentifier().ifPresent( a -> {this.classIdentifier = a;});


Notes about Machine Learning fundamentals

I have decided to try a different tack in this post. Gradually as I learn some basic ideas about statistics and Machine Learning I will update this post with code, graphs or procedures used to configure tools. So in a few weeks I will have charted a simple course through the basic Machine Learning terrain. I hope. But these are just basic ideas to prepare oneself to read a more advanced math text.

To be updated …   I will add more details in subsequent posts.

Tools

Anaconda based on Anaconda

GraphLab based on GraphLab

ipython based on ipython

The installation process was tortuous because I work in a corporate environment.

Install GraphLab Create with Command Line

The installation is based on dato’s instructions.

Step 1: Ensure Python 2.7.x

Anaconda with Python 2.x installation didn’t complete in my Windows 7 machine due to some access restriction. It couldn’t set this version of Python as the default.
So I installed Anaconda with Python 3.x. GraphLab works with only Python 2.x

In order to create a Python 2.7 environment the command used is

conda create -n dato-env python=2.7 anaconda

This was blocked by my Virus Scanner and I had to coax our security team to update my policy settings to allow this.

Traceback (most recent call last):
File “D:\Continuum\Anaconda3.4\Scripts\conda-script.py”, line 4, in <module>
sys.exit(main())
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\cli\main.py”, line 202,
in main
args_func(args, p)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\cli\main.py”, line 207,
in args_func
args.func(args, p)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\cli\main_create.py”, li
ne 50, in execute
install.install(args, parser, ‘create’)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\cli\install.py”, line 4
20, in install
plan.execute_actions(actions, index, verbose=not args.quiet)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\plan.py”, line 502, in
execute_actions
inst.execute_instructions(plan, index, verbose)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\instructions.py”, line
140, in execute_instructions
cmd(state, arg)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\instructions.py”, line
55, in EXTRACT_CMD
install.extract(config.pkgs_dirs[0], arg)
File “D:\Continuum\Anaconda3.4\lib\site-packages\conda\install.py”, line 448,
in extract
t.extractall(path=path)
File “D:\Continuum\Anaconda3.4\lib\tarfile.py”, line 1980, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir())
File “D:\Continuum\Anaconda3.4\lib\tarfile.py”, line 2019, in extract
set_attrs=set_attrs)
File “D:\Continuum\Anaconda3.4\lib\tarfile.py”, line 2088, in _extract_member
self.makefile(tarinfo, targetpath)
File “D:\Continuum\Anaconda3.4\lib\tarfile.py”, line 2128, in makefile
with bltn_open(targetpath, “wb”) as target:
PermissionError: [Errno 13] Permission denied: ‘D:\\Continuum\\Anaconda3.4\\pkgs
\\python-2.7.11-4\\Lib\\pdb.doc’

The last line shown above is what I presume was blocked by the Virus scanner. When the logs were shown to the security team who updated the scanner rules.

Learning some of these topics may be difficult if we don’t read a more advanced book. So I am constrained by the lack of deep knowledge of a related math subject.

But a question like this one must be simple. Right ?

Identify which model performs better when you have the intercept, slope and Residual Sum of squares.

No data point is given.One can plot the lines when their intercepts and slopes are known but I don’t know how that helps.

Plot some lines when we know their intercepts and slopes. Data points are random though and are irrevelant at this time.
from ggplot import *
import pandas as pd
data = {'x': [0, 2, 3, 4, 5, 4, 3.2, 3.3, 2.6, 8.4],
'y': [4.2, 2.6, 1.2, 23, 23, 42, 1.2, 63, 2.3, 2.1],
}
df = pd.DataFrame(data)
g = ggplot(df, aes(x='x', y='y')) + \
geom_point() + \
geom_abline(intercept=0, slope=1.4, colour=&amp;quot;red&amp;quot;) \
+ geom_abline(intercept=3.1, slope=1.4, colour=&amp;quot;blue&amp;quot;) \
+ geom_abline(intercept=2.7, slope=1.9, colour=&amp;quot;green&amp;quot;) \
+ geom_abline(intercept=0, slope=2.3, colour=&amp;quot;black&amp;quot;)
print(g)

 

Q2

Emacs haskell-mode

I set about installing emacs and its haskell-mode based on emacs-haskell-tutorial. But this is what worked for me. This is only part of the process and I will add any new information I come across.

Eval: (find-file user-init-file)

Press Enter. It loads my .emacs file if it is there.Create one and save it if it isn’t there.

 

This section should be enough if there is no proxy.

(require ‘package)
(add-to-list ‘package-archives
‘(“melpa-stable” . “http://stable.melpa.org/packages/&#8221;) t)
(package-initialize)

Corporate proxy

If there is a proxy add this section too.

(setq url-proxy-services
‘((“no_proxy” . “^\\(localhost\\|10.*\\)”)
(“http” . “proxy.cognizant.com:6050”)
(“https” . “proxy.cognizant.com:6050”)))

(setq url-http-proxy-basic-auth-storage
(list (list “proxy.cognizant.com:6050”
(cons “Credentials !”
(base64-encode-string “user:password”)))))

cabal

Cabal seems to be the package manager for Haskell libraries.

I had cabal.exe locally and realized it is not the correct way.

D:\Frege>..\Downloads\cabal update
Downloading the latest package list from hackage.haskell.org
Note: there is a new version of cabal-install available.
To upgrade, run: cabal install cabal-install

D:\Frege>ghc-pkg list Cabal
D:/Haskell Platform/7.10.3\lib\package.conf.d:
Cabal-1.22.5.0

D:\Frege>..\Downloads\cabal install happy
cabal: The program ghc version >=6.4 is required but it could not be found.

So I installed Haskell Platform and added the bin folder to the PATH.

Everything worked after that.

D:\Frege>cabal install cabal-install
Resolving dependencies…
Downloading cabal-install-1.22.9.0…
Configuring cabal-install-1.22.9.0…
Building cabal-install-1.22.9.0…
Linking dist\build\cabal\cabal.exe …
Installing executable(s) in C:\Users\476458\AppData\Roaming\cabal\bin
Installed cabal-install-1.22.9.0

D:\Frege>cabal
cabal: no command given (try –help)

D:\Frege>cabal –version
cabal-install version 1.22.6.0
using version 1.22.5.0 of the Cabal library

D:\Frege>ls
HaskellPlatform-7.10.3-x86_64-setup.exe InstallCert.java
InstallCert$SavingTrustManager.class emacs-24.5-bin-i686-mingw32
InstallCert.class emacs-24.5-bin-i686-mingw32.zip

D:\Frege>cabal update
Downloading the latest package list from hackage.haskell.org

D:\Frege>cabal install happy
Resolving dependencies…
Downloading happy-1.19.5…
[1 of 1] Compiling Main ( C:\Users\476458\AppData\Local\Temp\cabal-t
Linking dist\build\happy\happy.exe …
Installing executable(s) in C:\Users\476458\AppData\Roaming\cabal\bin
Installed happy-1.19.5

Haskell-mode

Not yet sure if the mode is effective. I don’t see any syntax highlighting yet.

haskell-mode

Update : Now it looks better

proper-haskell-mode