Exposing the getEstimate to test distinct values against a pre-built frequent items sketch #2

@cccs-jc

I'm interested in the frequent items (heavy hitters) sketch. I would like to build such a sketch on, say, a month of data.

I would then use this sketch on newly arriving events. What I would like to do is enrich my stream of events with a new column that states whether the current event is a heavy hitter or not. One way I was thinking of doing this is by calling getEstimate of the ItemsSketch.

https://github.com/apache/datasketches-java/blob/2e9b84e8230e1067bfc5e977de4e3a3a13445169/src/main/java/org/apache/datasketches/frequencies/ItemsSketch.java#L338
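A minimal sketch of that flow, to make the idea concrete. It does not use the DataSketches classes: `HeavyHitterSummary` is a hypothetical stand-in (a plain Misra-Gries summary), and `maxCounters`, `isHeavyHitter`, and the threshold value are assumptions chosen for illustration. Only the method name `getEstimate` mirrors the call I would like to have exposed.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for a frequent-items sketch (Misra-Gries summary).
 * This is NOT the DataSketches ItemsSketch; it only illustrates the lookup pattern.
 */
class HeavyHitterSummary {
    private final int maxCounters;                       // assumed capacity parameter
    private final Map<String, Long> counters = new HashMap<>();

    HeavyHitterSummary(int maxCounters) {
        this.maxCounters = maxCounters;
    }

    /** Feed one historical item into the summary. */
    void update(String item) {
        counters.merge(item, 1L, Long::sum);
        if (counters.size() > maxCounters) {
            // Too many tracked items: decrement every counter and drop those that reach zero.
            Iterator<Map.Entry<String, Long>> it = counters.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Long> e = it.next();
                if (e.getValue() <= 1L) {
                    it.remove();
                } else {
                    e.setValue(e.getValue() - 1L);
                }
            }
        }
    }

    /** Estimated frequency of an item; 0 if the item is not tracked. Never overestimates. */
    long getEstimate(String item) {
        return counters.getOrDefault(item, 0L);
    }

    /** The enrichment step: flag an incoming event value against the pre-built summary. */
    boolean isHeavyHitter(String item, long threshold) {
        return getEstimate(item) >= threshold;
    }
}

class HeavyHitterDemo {
    /** Build the summary once from historical data (e.g. a month of events). */
    static HeavyHitterSummary build(List<String> history, int maxCounters) {
        HeavyHitterSummary summary = new HeavyHitterSummary(maxCounters);
        for (String item : history) {
            summary.update(item);
        }
        return summary;
    }

    public static void main(String[] args) {
        HeavyHitterSummary summary =
            build(List.of("a", "a", "a", "b", "a", "c", "a", "d"), 2);
        // Enrich each newly arriving event with a heavy-hitter flag.
        for (String event : List.of("a", "d", "z")) {
            System.out.println(event + " heavy_hitter=" + summary.isHeavyHitter(event, 2L));
        }
    }
}
```

The summary is read-only once built, so each new event costs a single lookup. How to pick the threshold (and whether an estimate, a lower bound, or an upper bound is the right quantity to compare against it) is exactly the part I am unsure about.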

Unfortunately this function is not exposed in your library, but it seems like it would be easy to add?

Does it make sense to use getEstimate to achieve what I want, or is there an even better way that I did not think of?

Thanks in advance. @maropu
