…t reflected on the shark UI
|
Hi Sundeep, the current Shark master doesn't include support for partitioned cached tables. |
|
Hi Harvey, the current patch is meant to let users track storage/memory usage on the Shark Storage UI per table, as opposed to per 'rdd_###'. Inserts/overwrites to cached tables make the current Storage UI quite hard to follow. The patch does not handle drop partitions and overwrites in any special way, but it does guarantee that each block of data is identified by a unique number and has the table name associated with it on the UI. Once we have partition support, I am planning to submit another patch with naming conventions derived from Hive's partition information. |
|
Can one of the admins verify this patch? |
|
Yeah, the storage UI is a bit confusing right now :( |
|
Based on Hive's documentation, shouldn't an insert overwrite on a table unpersist the existing RDDs? (For partitions, it would unpersist just the overwritten partitions.) If this is the case, I can push a fix on that front. |
|
Yeah, that sounds good - created a ticket for that here: https://spark-project.atlassian.net/browse/SHARK-202. |
|
Sure. I do not seem to have permissions to assign myself the ticket. If you can help with that, I will take on the ticket. :) |
|
Done - assigned it to you. Thx! |
|
Oh, it looks like the assignments were concurrent.... |
|
What's the status of this PR?
Fix to ensure table names for multi-insert/partitioned cached tables get reflected on the Shark UI.