Custom Code to create Ragged Tensor

I have been preparing to write a longer piece about TensorFlow with TikZ diagrams. Eventually there will be enough pages for a short book, and I have been looking for tools that can turn the book’s text, TikZ diagrams and code into a PDF.

I know that descriptions are important too and that colorful diagrams alone won’t cut it, but I am trying. I will add more descriptions and diagrams to this same post until I am satisfied.

RaggedTensor is a tensor with one or more ragged dimensions, which are dimensions whose slices may have different lengths.

tf.RaggedTensor is part of the TensorFlow library. The code in this post attempts to build the same kind of structure by hand, padding the short rows with a filler value.
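For comparison, here is a minimal sketch of what the built-in class does with a small set of values and row ids. It assumes `tf.RaggedTensor.from_value_rowids` to build the ragged rows and `to_tensor(default_value=...)` to pad them; the variable names are only illustrative.

```python
import tensorflow as tf

# values holds the flat data; value_rowids says which row each value belongs to.
values = [3, 1, 4, 2, 5, 9, 2]
value_rowids = [0, 0, 0, 0, 1, 1, 2]

ragged = tf.RaggedTensor.from_value_rowids(values=values,
                                           value_rowids=value_rowids)

# Pad the short rows with -999 to get a rectangular tensor.
dense = ragged.to_tensor(default_value=-999)

print(ragged.to_list())
print(dense.numpy().tolist())
```

The hand-written code in the rest of this post aims at the same padded result without using the ragged class.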

We start with the source [3, 1, 4, 2, 5, 9, 2] and a template that gives the row position of each source value: [0, 0, 0, 0, 1, 1, 2].

Our map is like this.

The value that repeats most often in the template is 0, which occurs 4 times, so every row needs 4 values. We store the first 4 values (3, 1, 4, 2) from the source in row 1. Row 2 has the values 5 and 9; since we need 4 values, we fill the next two positions in row 2 with -999. Row 3 has only the value 2, and its other 3 positions are filled with -999.
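The steps above can be sketched in plain Python before bringing TensorFlow into it. This is only an illustration of the padding rule; `pad_rows` is a hypothetical helper name, not part of any library.

```python
def pad_rows(source, template, filler=-999):
    """Group source values by row id and pad every row to the longest row."""
    rows = {}
    for value, row_id in zip(source, template):
        rows.setdefault(row_id, []).append(value)
    # The widest row decides how many positions every row must have.
    width = max(len(row) for row in rows.values())
    return [rows[r] + [filler] * (width - len(rows[r])) for r in sorted(rows)]


padded = pad_rows([3, 1, 4, 2, 5, 9, 2], [0, 0, 0, 0, 1, 1, 2])
for row in padded:
    print(row)
```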

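There are many ways to code this, but if you start with

elements, index, count = tf.unique_with_counts([0, 0, 0, 0, 1, 1, 2])
print('Elements ',elements)

you get all the data you need: the distinct row values in elements ([0, 1, 2]), the position of each template entry in index, and the number of occurrences of each value in count ([4, 2, 1]). The code below then fills up the ‘ragged’ tensor with the ‘filler’.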

Note: I have hard-coded the row width 4 in if( slice.shape[0] < 4):. This is the number of times the most frequent value occurs in the template, and you can obtain it from the count returned by tf.unique_with_counts and pass it in. I also don’t account for missing row values, as in [0, 0, 0, 0, 2]. But elements in the code above tells you which values are present, so you could add a row of ‘fillers’ using a simple loop when you find a value missing.
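One possible way to get both pieces of information, sketched here under the assumption that the row ids are small non-negative integers, is `tf.math.bincount`. This is a different call from the `tf.unique_with_counts` used in this post: it counts every id from 0 up to the largest one, so a missing id shows up with a count of 0.

```python
import tensorflow as tf

template = tf.constant([0, 0, 0, 0, 2])  # row id 1 is missing

# One count per row id, including ids that never occur.
counts = tf.math.bincount(template)

# The largest count is the row width to pad to (instead of a hard-coded 4).
width = tf.reduce_max(counts)

# Row ids with a zero count need a full row of fillers.
missing = tf.where(tf.equal(counts, 0))[:, 0]

print(counts.numpy().tolist(), int(width), missing.numpy().tolist())
```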


import tensorflow as tf

fill_value = tf.constant([-999]) # value to insert
elements, index, count = tf.unique_with_counts([0, 0, 0, 0, 1, 1, 2])
print('Elements ',elements)
values = [3, 1, 4, 2, 5, 9, 2]  # the source values described above

ta = tf.TensorArray(dtype=tf.int32,size=1, dynamic_size=True,clear_after_read=False)

def fill_values(slice,i):
    slices = slice
    if( slice.shape[0] < 4):
        for j in range( 4 - slice.shape[0] ):
            slices = tf.concat([slices,fill_value],0)
            tf.print('Fill ',slices)
    return ta.write(i,slices)

def slices( begin, c, i, filler ):
    slice = tf.slice(  values,
                       begin=[ begin ],
                       size=[ c[i] ])
    begin = begin + c[i]
    tf.print('Slice' , slice)
    ta = fill_values(slice,i)
    print('TensorArray ', ta.stack())
    # Note: The output of this function should be used.
    # If it is not, a warning will be logged or an error may be raised.
    # To mark the output as used, call its .mark_used() method.
    return [begin , c, tf.add(i, 1), filler]

def condition( begin, c, i, _ ):
    return tf.less(i, tf.size(c))

i = tf.constant(0)
filler = tf.constant(-999)
r = tf.while_loop(  condition,slices,[0, count, i, filler ])
print('TensorArray ', ta.stack())

Write logic using loop using TensorFlow

The programming paradigm one adopts when coding TensorFlow is not the one I normally use, and one has to learn a few tricks to get used to it. It gets harder when you also consider eager execution, which became the default in TensorFlow 2.

Recently I answered a question on Stack Overflow. The question was about writing a loop to take advantage of the GPU. My desktop has an old NVIDIA GPU and my Mac has an AMD GPU, so neither was useful to test this code. But I managed to rewrite the loop using TensorFlow 2.

The original code is this.


import numpy as np

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
  data = []
  labels = []
  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size
  #print(history_size)
  for i in range(start_index, end_index):
    indices = range(i-history_size, i, step)
    data.append(dataset[indices])
    if single_step:
      labels.append(target[i+target_size])
    else:
      labels.append(target[i:i+target_size])
  return np.array(data), np.array(labels)

I will add a diagram or two with some explanation later on. This type of diagram is drawn using /Library/TeX/texbin/pdflatex and my TikZ editor. I plan to generate a PDF from the text and diagrams later.

My rewrite creates an empty 1-D tensor and fills it with values based on conditions in the loop. It is as simple as it gets, but it can be used to understand how loops operate in TensorFlow.
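A smaller sketch of the same idea: start from an empty 1-D tensor and grow it inside `tf.while_loop`. Unlike the class-based code in this post, this version carries the growing tensor as a loop variable, which is an assumption on my part about what is easier to follow. `collect_even` is a hypothetical name, and the even-number condition only stands in for whatever condition you need.

```python
import tensorflow as tf

def collect_even(limit):
    """Append every even i in [0, limit) to an initially empty 1-D tensor."""
    i = tf.constant(0)
    out = tf.zeros([0], dtype=tf.int32)  # empty 1-D int32 tensor

    def cond(i, out):
        return tf.less(i, limit)

    def body(i, out):
        # Append i only when the condition holds; otherwise pass out through.
        out = tf.cond(tf.equal(i % 2, 0),
                      lambda: tf.concat([out, [i]], 0),
                      lambda: out)
        return tf.add(i, 1), out

    # out changes length on every append, so its shape invariant is [None].
    _, out = tf.while_loop(
        cond, body, [i, out],
        shape_invariants=[i.get_shape(), tf.TensorShape([None])])
    return out

result = collect_even(6)
print(result.numpy().tolist())
```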

Notice that it is also possible to pick values or ranges from the source and append them to the target like this. This line of code begs for a diagram, because the higher the rank of a tensor, the more complicated it is to visualize what is happening. Remember that this is a 1-D (rank 1) tensor.

self._data = tf.concat([self._data,[tf.gather(dataset, i)]],0)
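To see what that line does in isolation, here is a minimal sketch that appends first a single element and then a range of elements to an empty tensor. The indices 3 and 1 to 2 are chosen only for illustration.

```python
import tensorflow as tf

dataset = tf.constant([1, 88, 99, 4, 5, 6, 7, 8, 9])
data = tf.zeros([0], dtype=tf.int32)  # empty 1-D target

# A single gathered element is a scalar, so wrap it in a list before concat.
data = tf.concat([data, [tf.gather(dataset, 3)]], 0)

# A gathered range is already 1-D and can be concatenated directly.
data = tf.concat([data, tf.gather(dataset, tf.range(1, 3))], 0)

print(data.numpy().tolist(), int(tf.rank(data)))
```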

The final code is this.

import tensorflow as tf

class MultiVariate():
    def __init__(self):
        self._data = None
        self._labels = None

    def multivariate_data(self,
                          dataset,
                          start_index,
                          end_index,
                          history_size,
                          target_size,
                          single_step=False):
         start_index = start_index + history_size
         print("end_index ", end_index)
         print("start_index ", start_index)
         if self._data is None:
             # Start with an empty 1-D int32 tensor
             self._data = tf.zeros([0], dtype=tf.int32)
         if self._labels is None:
             self._labels = tf.zeros([0], dtype=tf.int32)
         if end_index is None:
            end_index = len(dataset) - target_size

         def cond(i, j):
             return tf.less(i, j)

         def body(i, j):
             #A range of values are gathered
             self._data = tf.concat([self._data,[tf.gather(dataset, i)]],0)
             if ( i == start_index ): #Showing how A range of values are gathered and appended
                self._data = tf.concat([self._data,tf.gather(dataset, tf.range(1, 3, 1))],0)
             return tf.add( i , 1 ), j

         _,_ = tf.while_loop(cond, body, [start_index,end_index],shape_invariants=[start_index.get_shape(), end_index.get_shape()])
         return self._data

mv = MultiVariate()
d =    mv.multivariate_data(
                      tf.constant([1,88,99,4,5,6,7,8,9]),
                      tf.constant(2),
                      tf.constant(8),
                      tf.constant(1),
                      tf.constant(2),
                      tf.constant(2))
print("print ",d)