-
Notifications
You must be signed in to change notification settings - Fork 16.1k
Description
Components of top-level options settings (i.e., not within a textproto literal after the = sign) may contain fully-qualified names, most commonly used for extensions, although this is not a requirement, so long that field is directly or indirectly import option'd.
For example, we can write:
edition = "2024";
import "proto2/descriptor.proto"; // Or google.protobuf, if you prefer.
message A {
option (proto2.FieldOptions.deprecated) = true;
}Both protoc and Buf's implementation accept this, and everything works as expected in all backends.
However, due to protoc performing various validity checks in the parser, before name resolution, it incorrectly accepts the following program that our implementation rejects:
edition = "2024";
import "proto2/descriptor.proto"; // Or google.protobuf, if you prefer.
message A {
// Note that `option map_entry = true;` is rejected by both compilers.
option (proto2.FieldOptions.map_entry) = true;
}This affects five options-related checks that occur too early, within the parser:
map_entrynot being user-settable, as above. This seems to segfault some plugins, while causing others to exit with an error from deep inside the C++ runtime. Generating a descriptor works fine and will cause loading errors at runtime.- Part of
allow_aliasvalidation, specifically errors around whether aliases actually exist. The part that rejects aliases correctly accepts the program ifoption (proto2.EnumOptions.allow_alias)is set. featurescopying from a map-typed field to the corresponding entry type's fields, the one place where feature inheritance resolution is not implemented correctly in most runtimes. If using(proto2.FieldOptions.features), the explicitly-set features are not copied correctly.- The check for whether a message is a
MessageSet. - A check for
enforce_utf8, which IIRC was never open-sourced, sincefeatures.utf8_validationreplaced it.
The fix is not simple, because it requires name resolution: (proto2.FieldOptions.map_field) could refer to an extension named my_package.proto2.FieldOptions.map_field when used in package my_package, while (FieldOptions.map_field) can refer to proto2.FieldOptions.map_field when used in package proto. This is further complicated outside of google3, where the package has multiple components.
I suspect that the main reason this is broken is that this language feature is not particularly well-known. I am somewhat to blame because I never got around to rewriting protoc's frontend.
I discovered this bug by staring at parser.cc to validate that we are covering every map entry descriptor generating case correctly.