‘R’ regex

My Java Virtual Machine memory pool dump is shown below. I had earlier isolated only the Code Cache memory pool for analysis, which is the section that follows, and I first converted it to a data frame.

1            Peak Usage    : init:2359296, used:13914944, committed:13959168, max:50331648Current Usage : init:2359296, used:13913536, committed:13959168, max:50331648|------------------| committed:13.31Mb+---------------------------------------------------------------------+|//////////////////|                                                  | max:48Mb+---------------------------------------------------------------------+|------------------| used:13.27Mb
2            Peak Usage    : init:2359296, used:13916608, committed:13959168, max:50331648Current Usage : init:2359296, used:13915200, committed:13959168, max:50331648|------------------| committed:13.31Mb+---------------------------------------------------------------------+|//////////////////|                                                  | max:48Mb+---------------------------------------------------------------------+|------------------| used:13.27Mb
3             Peak Usage    : init:2359296, used:13949120, committed:13991936, max:50331648Current Usage : init:2359296, used:13947712, committed:13991936, max:50331648|------------------| committed:13.34Mb+---------------------------------------------------------------------+|//////////////////|                                                  | max:48Mb+---------------------------------------------------------------------+|------------------| used:13.3Mb
4            Peak Usage    : init:2359296, used:13968576, committed:13991936, max:50331648Current Usage : init:2359296, used:13956224, committed:13991936, max:50331648|------------------| committed:13.34Mb+---------------------------------------------------------------------+|//////////////////|                                                  | max:48Mb+---------------------------------------------------------------------+|------------------| used:13.31Mb

I tried the following to cut out only the Current Usage statistics. The regex is non-greedy (lazy) because of the `?` after `.*`, so it matches only up to the first ‘pipe’ (or forward slash) character:

> library(stringr)  # provides str_extract
> y <- apply( y, 1, function(z) str_extract(z,"Current.*?[/|]"))
  [1] "Current Usage : init:2359296, used:13913536, committed:13959168, max:50331648|"
  [2] "Current Usage : init:2359296, used:13915200, committed:13959168, max:50331648|"
  [3] "Current Usage : init:2359296, used:13947712, committed:13991936, max:50331648|"
  [4] "Current Usage : init:2359296, used:13956224, committed:13991936, max:50331648|"
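To see why the `?` matters, here is a minimal sketch in base R (so stringr is not needed to run it) that applies the lazy pattern and a greedy variant to a shortened stand-in for one row. The variable names and the shortened string are my own, not part of the original dump.

```r
# Shortened stand-in for one row of the dump (the bar chart part is abbreviated).
x <- paste0(
  "Peak Usage    : init:2359296, used:13914944, committed:13959168, max:50331648",
  "Current Usage : init:2359296, used:13913536, committed:13959168, max:50331648",
  "|------------------| committed:13.31Mb+----+|//////////////////|      | max:48Mb"
)

# Lazy: .*? stops at the first '/' or '|' after "Current".
lazy <- regmatches(x, regexpr("Current.*?[/|]", x, perl = TRUE))

# Greedy: .* runs on to the last '/' or '|' in the string.
greedy <- regmatches(x, regexpr("Current.*[/|]", x, perl = TRUE))

print(lazy)
print(nchar(greedy) > nchar(lazy))
```

The lazy match ends at the pipe right after `max:50331648`, while the greedy one swallows most of the bar chart as well.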

Arun, a member of the R users’ list, recommends this alternative:

> as.matrix(gsub("^.*(Current.*?[/|]).*","\\1",y$y))
       [,1]                                                                            
  [1,] "Current Usage : init:2359296, used:13913536, committed:13959168, max:50331648|"
  [2,] "Current Usage : init:2359296, used:13915200, committed:13959168, max:50331648|"
  [3,] "Current Usage : init:2359296, used:13947712, committed:13991936, max:50331648|"
  [4,] "Current Usage : init:2359296, used:13956224, committed:13991936, max:50331648|"
  [5,] "Current Usage : init:2359296, used:13968832, committed:14024704, max:50331648|"

Both work as expected.
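One difference worth keeping in mind: a substitution with a backreference returns the input unchanged when the pattern does not match, whereas `str_extract` returns `NA`. The sketch below, in base R only, shows one way to guard against that. The second row is a hypothetical non-matching line that I added for illustration, and `perl = TRUE` is my choice here rather than something the original call used.

```r
rows <- c(
  paste0(
    "Peak Usage    : init:2359296, used:13914944, committed:13959168, max:50331648",
    "Current Usage : init:2359296, used:13913536, committed:13959168, max:50331648",
    "|------------------| committed:13.31Mb"
  ),
  "a line without the marker"  # hypothetical row, not in the original data
)

pattern <- "^.*(Current.*?[/|]).*"

# Keep only the captured group; non-matching rows pass through untouched.
extracted <- sub(pattern, "\\1", rows, perl = TRUE)

# Flag the rows that never matched so they are not mistaken for results.
matched <- grepl(pattern, rows, perl = TRUE)
extracted[!matched] <- NA_character_

print(extracted)
```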
