DGIM Algorithm
October 25, 2014 6 Comments
I think I understood the basic Datar-Gionis-Indyk-Motwani Algorithm which is explained in the book “Mining of massive datasets” by Jure Leskovec(Stanford Univ.),Anand Rajaraman(Milliway Labs) and Jeffrey D. Ullman(Stanford Univ.)
I will add more details later but the diagram below explains it. I used Tikz to draw this picture. I will check-in the tikz code to my github and post the link.
Update: https://github.com/mohanr/tikz/blob/master/dgim
Suppose we are using the DGIM algorithm of Section 4.6.2 to estimate the number of 1's in suffixes of a sliding window of length 40. The current timestamp is 100.
Note: we are showing timestamps as absolute values, rather than modulo the window size, as DGIM would do.
Suppose that at times 101 through 105, 1's appear in the stream. Compute the set of buckets that would exist in the system at time 105.
You post interesting content here. Your page deserves much more visitors.
It can go viral if you give it initial boost, i know
useful tool that can help you, simply search in google: svetsern traffic
tips
Illustrated in Grate way. Without reading any stuff, I have just understood the algorithm because of your figure.
That is wonderful.
Good to know it was useful.
For time 105. You are merging 95&87. But image shows merging of 102&95.
@Binit I have to check that. Been a long time.