DGIM Algorithm

I think I understood the basic Datar-Gionis-Indyk-Motwani Algorithm which is explained in the book “Mining of massive datasets” by Jure Leskovec(Stanford Univ.),Anand Rajaraman(Milliway Labs) and Jeffrey D. Ullman(Stanford Univ.)

I will add more details later but the diagram below explains it. I used Tikz to draw this picture. I will check-in the tikz code to my github and post the link.

Update: https://github.com/mohanr/tikz/blob/master/dgim

Suppose we are using the DGIM algorithm of Section 4.6.2 to estimate the number of 1's in suffixes of a sliding window of length 40. The current timestamp is 100.
Note: we are showing timestamps as absolute values, rather than modulo the window size, as DGIM would do.
Suppose that at times 101 through 105, 1's appear in the stream. Compute the set of buckets that would exist in the system at time 105.

dgim

6 Responses to DGIM Algorithm

  1. Celia says:

    You post interesting content here. Your page deserves much more visitors.
    It can go viral if you give it initial boost, i know
    useful tool that can help you, simply search in google: svetsern traffic
    tips

  2. Uday says:

    Illustrated in Grate way. Without reading any stuff, I have just understood the algorithm because of your figure.

  3. Good to know it was useful.

  4. Binit says:

    For time 105. You are merging 95&87. But image shows merging of 102&95.

  5. @Binit I have to check that. Been a long time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: