Conversation
fbertsch commented Jan 20, 2026
- Extract the latest instance of that column name from historical schemas
- Required columns will become non-required
- The parent field must already be present in the schema
- Add Spark procedure call to run undelete (a rough invocation sketch follows this list)
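For illustration, the procedure invocation might look roughly like the sketch below. The catalog, table, procedure name, and argument names are placeholders, not the actual signature this PR registers; only the `CALL catalog.system.<procedure>(...)` shape follows Iceberg's existing Spark procedure convention.

```java
import org.apache.spark.sql.SparkSession;

class UndeleteProcedureSketch {
  // Placeholder procedure name and arguments; the real signature comes from this PR.
  static void undeleteUserId(SparkSession spark) {
    spark.sql("CALL my_catalog.system.undelete_column(table => 'db.events', column => 'user_id')");
  }
}
```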
@bryanck - could you take a look? This API is imperfect, since it doesn't allow an undelete of a field that has been recreated, i.e.:
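Something like the following history, sketched here with assumed details (the field name x and its original string type come from the sentence below; the second type is just a stand-in):

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.types.Types;

class RecreatedFieldExample {
  // "table" is any existing Iceberg table; the long type for the second x is assumed.
  static void recreateX(Table table) {
    // x created as a string, then deleted
    table.updateSchema().addColumn("x", Types.StringType.get()).commit();
    table.updateSchema().deleteColumn("x").commit();

    // x recreated (getting a new field id), then deleted again
    table.updateSchema().addColumn("x", Types.LongType.get()).commit();
    table.updateSchema().deleteColumn("x").commit();

    // undeleting x by name can only recover the most recent incarnation
  }
}
```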
The string version of x is unrecoverable. I doubt this is such a large use case. To support that, we'd need an API to expose the historical fields of a table, and then undelete by ID, which is probably too much complexity for a good API.

Fixes #14488
  old: "method org.apache.iceberg.orc.ORC.WriteBuilder org.apache.iceberg.orc.ORC.WriteBuilder::config(java.lang.String,\
    \ java.lang.String)"
  justification: "Removing deprecations for 1.2.0"
"1.10.0":
This generated YAML changed the structure; I can try to move it back around to minimize this diff.
import org.apache.iceberg.types.Types;
import org.junit.jupiter.api.Test;

public class TestSchemaUndelete extends HadoopTableTestBase {
Test coverage looks good. The only corner case I could add would be:
- add and delete a column named count twice (so the two incarnations have different field ids)
- verify that the undeleted column is the most recent one

This would be a simple addition to testUndeletePreservesFieldId; a rough sketch follows below.
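For concreteness, a sketch of that case as it might be added to TestSchemaUndelete. undeleteColumn is an assumed name for the new API in this PR, and table is assumed to be the handle provided by the test base:

```java
  // assumes: import static org.junit.jupiter.api.Assertions.assertEquals;
  @Test
  public void testUndeleteReturnsMostRecentColumn() {
    // first incarnation of "count"
    table.updateSchema().addColumn("count", Types.IntegerType.get()).commit();
    table.updateSchema().deleteColumn("count").commit();

    // second incarnation, which gets a new field id
    table.updateSchema().addColumn("count", Types.LongType.get()).commit();
    int secondId = table.schema().findField("count").fieldId();
    table.updateSchema().deleteColumn("count").commit();

    // undelete should restore the most recent "count", not the first one
    table.updateSchema().undeleteColumn("count").commit();  // assumed API name
    assertEquals(secondId, table.schema().findField("count").fieldId());
    assertEquals(Types.LongType.get(), table.schema().findField("count").type());
  }
```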
Thanks. The new test looks good; it ensures the scan happens in the correct order. You'll need actual Iceberg committers to review and approve the rest of the patch now...
This is common logic that is a little confusing at first glance. It also adds a hard failure in case the parent field is, for some reason, not available in the schema (this should be impossible, and without the check it would surface as a null pointer).
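A minimal sketch of the kind of guard being described, with hypothetical names (this is not the PR's actual code); the point is to fail with a clear message rather than let a missing parent surface later as a NullPointerException:

```java
import org.apache.iceberg.Schema;
import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
import org.apache.iceberg.types.Types;

class ParentGuardSketch {
  // Hypothetical helper: look up the parent field and fail loudly if it is missing.
  static Types.NestedField requireParent(Schema schema, String parentName) {
    Types.NestedField parent = schema.findField(parentName);
    Preconditions.checkArgument(
        parent != null, "Cannot undelete: parent field %s is not present in the schema", parentName);
    return parent;
  }
}
```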