-
Notifications
You must be signed in to change notification settings - Fork 1.7k
AVRO-4226: Missing doc attributes for record fields #3624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
AVRO-4226: Missing doc attributes for record fields #3624
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes the missing doc attribute support for Avro record fields in the PHP implementation. The doc attribute was being ignored when parsing schemas and when converting schemas back to Avro format.
Changes:
- Added
docattribute parsing and serialization support in AvroField class - Added a test to verify doc attributes are preserved during parse and serialization round-trips
- Improved type hints in test code following modern PHP standards
- Updated Composer version in CI workflow
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lang/php/lib/Schema/AvroField.php | Added doc property and constructor parameter; included doc in toAvro() output; added return type hints to methods |
| lang/php/lib/Schema/AvroRecordSchema.php | Parse doc attribute from field definitions and pass to AvroField constructor |
| lang/php/test/SchemaTest.php | Added test for doc attribute preservation; improved type hints; renamed method to follow PSR standards |
| .github/workflows/test-lang-php.yml | Updated Composer version from 2.8.12 to 2.9.3 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * @throws AvroSchemaParseException | ||
| */ | ||
| public static function fromFieldDefinition(array $avro, ?string $defaultNamespace, AvroNamedSchemata $schemata): self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-g I dicovered a bug when adding this method, basically one of the tests (this one) was failing with the error "MD5 is not a schema we know about.".
This happened because in this method I'm not passing the AvroNamedSchemata as a reference, so my fear is that at some point we are not referring to the same instance. This could be happening because this method was calling the cloneWithNewSchema function, which clones the schemata, but then it was assigning the cloned schemata to the reference, overriding it.
Since all the PHP object are passed by reference by default, I think we should do a cleanup of all these function which have as a parameter something like this ?AvroNamedSchemata &$schemata = null and change them to AvroNamedSchemata $schemata if it's a strong requirment, instead of relying on a reference which, at some point, it will be overwritten by someone changing it from null to something else.
IMHO it's better to have an initialized empty AvroNamedSchemata that will be updated during parsing, this is why I've changed the naming of the method to registerNamedSchema (reference)
Edit:
I did the cleanup in this commit 377c963
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| */ | ||
| public function __construct( | ||
| ?string $name, | ||
| private function __construct( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why private ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to private in order to be sure we initialize a validated AvroField using the static method, which validates the data before initializing it.
Do you prefer to revert this change and move everything in the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is only about breaking the API. I'd prefer to avoid any breaks if possible.
But 1.13.0 is considered as a major version according to Avro version policy, so breaks are expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, my main concern is that, with the current version, some validation on AvroField are made in the foreach inside AvroRecordSchema::parseFields, other ones in the AvroField class.
With these new changes we are sure that:
- only validated data can reach the method
__constructorand you cannot initialize an invalid class calling directly the__construtor - we can strictly type the parameters in the AvroField
__construct, instead having mixed types
|
Next time please try to not mix non-functional changes like renamings with bug-fixes/features. |
Co-authored-by: Martin Grigorov <[email protected]>
What is the purpose of the change
The
docattribute is ignored when a an AVRO schema is parsed and when encoding an AvroField instance to AVRO.Verifying this change
This change is already covered by existing tests, and a new test is introduced in this PR (
test_doc_attribute_on_primitive_fields).Documentation