diff --git a/.gitmodules b/.gitmodules
index 458114b..4abbf24 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,6 @@
 [submodule "spec/fixtures/metaschema"]
 	path = spec/fixtures/metaschema
 	url = git@github.com:metaschema-framework/metaschema.git
+[submodule "spec/fixtures/oscal"]
+	path = spec/fixtures/oscal
+	url = https://github.com/usnistgov/OSCAL.git
diff --git a/.rubocop_todo.yml b/.rubocop_todo.yml
index 7661274..d1af6b8 100644
--- a/.rubocop_todo.yml
+++ b/.rubocop_todo.yml
@@ -1,6 +1,6 @@
 # This configuration was generated by
 # `rubocop --auto-gen-config`
-# on 2026-04-14 10:25:31 UTC using RuboCop version 1.86.1.
+# on 2026-04-21 23:09:15 UTC using RuboCop version 1.86.1.
 # The point is for the user to remove these configuration records
 # one by one as the offenses are removed from the code base.
 # Note that changes in the inspected code, or installation of new
@@ -11,49 +11,366 @@ Gemspec/RequiredRubyVersion:
   Exclude:
     - 'metaschema.gemspec'
 
-# Offense count: 14
+# Offense count: 21
 # This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: EnforcedStyle, IndentationWidth.
 # SupportedStyles: with_first_argument, with_fixed_indentation
 Layout/ArgumentAlignment:
   Exclude:
-    - 'lib/metaschema.rb'
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+    - 'spec/model_generator_spec.rb'
+    - 'spec/ruby_source_emitter_spec.rb'
+
+# Offense count: 10
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, IndentationWidth.
+# SupportedStyles: with_first_element, with_fixed_indentation
+Layout/ArrayAlignment:
+  Exclude:
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 6
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: IndentationWidth.
+Layout/AssignmentIndentation:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 74
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyleAlignWith.
+# SupportedStylesAlignWith: either, start_of_block, start_of_line
+Layout/BlockAlignment:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 71
+# This cop supports safe autocorrection (--autocorrect).
+Layout/BlockEndNewline:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 7
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowForAlignment.
+Layout/CommentIndentation:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 10
+# This cop supports safe autocorrection (--autocorrect).
+Layout/ElseAlignment:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 40
+# This cop supports safe autocorrection (--autocorrect).
+Layout/EmptyLineAfterGuardClause:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
 
 # Offense count: 1
 # This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EmptyLineBetweenMethodDefs, EmptyLineBetweenClassDefs, EmptyLineBetweenModuleDefs, DefLikeMacros, AllowAdjacentOneLineDefs, NumberOfEmptyLines.
+Layout/EmptyLineBetweenDefs:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+Layout/EmptyLines:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+Layout/EmptyLinesAroundMethodBody:
+  Exclude:
+    - 'lib/metaschema/metapath_evaluator.rb'
+
+# Offense count: 11
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyleAlignWith.
+# SupportedStylesAlignWith: keyword, variable, start_of_line
+Layout/EndAlignment:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowForAlignment, AllowBeforeTrailingComments, ForceEqualSignAlignment.
+Layout/ExtraSpacing:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 33
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowMultipleStyles, EnforcedHashRocketStyle, EnforcedColonStyle, EnforcedLastArgumentHashStyle.
+# SupportedHashRocketStyles: key, separator, table
+# SupportedColonStyles: key, separator, table
+# SupportedLastArgumentHashStyles: always_inspect, always_ignore, ignore_implicit, ignore_explicit
+Layout/HashAlignment:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'spec/model_generator_spec.rb'
+
+# Offense count: 33
+# This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: EnforcedStyle.
-# SupportedStyles: empty_lines, empty_lines_except_namespace, empty_lines_special, no_empty_lines
-Layout/EmptyLinesAroundModuleBody:
+# SupportedStyles: normal, indented_internal_methods
+Layout/IndentationConsistency:
   Exclude:
-    - 'lib/metaschema.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
 
-# Offense count: 16
+# Offense count: 174
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: Width, EnforcedStyleAlignWith, AllowedPatterns.
+# SupportedStylesAlignWith: start_of_line, relative_to_receiver
+Layout/IndentationWidth:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 208
 # This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: Max, AllowHeredoc, AllowURI, AllowQualifiedName, URISchemes, AllowRBSInlineAnnotation, AllowCopDirectives, AllowedPatterns, SplitStrings.
 # URISchemes: http, https
 Layout/LineLength:
   Exclude:
-    - 'lib/metaschema.rb'
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
     - 'spec/metaschema_spec.rb'
+    - 'spec/model_generator_spec.rb'
+    - 'spec/ruby_source_emitter_spec.rb'
+
+# Offense count: 2
+# This cop supports safe autocorrection (--autocorrect).
+Layout/MultilineBlockLayout:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 5
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, IndentationWidth.
+# SupportedStyles: aligned, indented, indented_relative_to_receiver
+Layout/MultilineMethodCallIndentation:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
 
-# Offense count: 14
+# Offense count: 7
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, IndentationWidth.
+# SupportedStyles: aligned, indented
+Layout/MultilineOperationIndentation:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 57
 # This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: AllowInHeredoc.
 Layout/TrailingWhitespace:
   Exclude:
-    - 'lib/metaschema.rb'
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+    - 'spec/ruby_source_emitter_spec.rb'
+
+# Offense count: 4
+# Configuration parameters: IgnoreLiteralBranches, IgnoreConstantBranches, IgnoreDuplicateElseBranch.
+Lint/DuplicateBranch:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
 
 # Offense count: 1
+Lint/IneffectiveAccessModifier:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 6
+# This cop supports unsafe autocorrection (--autocorrect-all).
+# Configuration parameters: AllowedMethods, InferNonNilReceiver, AdditionalNilMethods.
+# AllowedMethods: instance_of?, kind_of?, is_a?, eql?, respond_to?, equal?
+# AdditionalNilMethods: present?, blank?, try, try!
+Lint/RedundantSafeNavigation:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 2
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: IgnoreEmptyBlocks, AllowUnusedKeywordArguments.
+Lint/UnusedBlockArgument:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 10
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowUnusedKeywordArguments, IgnoreEmptyMethods, IgnoreNotImplementedMethods, NotImplementedExceptions.
+# NotImplementedExceptions: NotImplementedError
+Lint/UnusedMethodArgument:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: ContextCreatingMethods, MethodCreatingMethods.
+Lint/UselessAccessModifier:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 4
+# This cop supports safe autocorrection (--autocorrect).
+Lint/UselessAssignment:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 82
+# Configuration parameters: AllowedMethods, AllowedPatterns, CountRepeatedAttributes, Max.
+Metrics/AbcSize:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+    - 'lib/metaschema/type_mapper.rb'
+
+# Offense count: 15
 # Configuration parameters: CountComments, CountAsOne, AllowedMethods, AllowedPatterns, inherit_mode.
 # AllowedMethods: refine
 Metrics/BlockLength:
-  Max: 28
+  Max: 52
+
+# Offense count: 86
+# Configuration parameters: AllowedMethods, AllowedPatterns, Max.
+Metrics/CyclomaticComplexity:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 106
+# Configuration parameters: CountComments, CountAsOne, AllowedMethods, AllowedPatterns.
+Metrics/MethodLength:
+  Max: 283
 
 # Offense count: 2
+# Configuration parameters: CountKeywordArgs, MaxOptionalParameters.
+Metrics/ParameterLists:
+  Max: 7
+
+# Offense count: 72
+# Configuration parameters: AllowedMethods, AllowedPatterns, Max.
+Metrics/PerceivedComplexity:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 20
+# Configuration parameters: MinNameLength, AllowNamesEndingInNumbers, AllowedNames, ForbiddenNames.
+# AllowedNames: as, at, by, cc, db, id, if, in, io, ip, of, on, os, pp, to
+Naming/MethodParameterName:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+
+# Offense count: 23
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Performance/MapCompact:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 8
+# This cop supports safe autocorrection (--autocorrect).
+Performance/StringIdentifierArgument:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 1
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Performance/StringInclude:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 1
+RSpec/DescribeMethod:
+  Exclude:
+    - 'spec/model_generator_spec.rb'
+
+# Offense count: 8
 # Configuration parameters: CountAsOne.
 RSpec/ExampleLength:
   Max: 10
 
+# Offense count: 9
+RSpec/MultipleExpectations:
+  Max: 2
+
+# Offense count: 4
+# Configuration parameters: AllowSubject.
+RSpec/MultipleMemoizedHelpers:
+  Max: 7
+
 # Offense count: 1
 # Configuration parameters: AllowedPatterns.
 # AllowedPatterns: ^expect_, ^assert_
@@ -62,9 +379,190 @@ RSpec/NoExpectationExample:
     - 'spec/metaschema_spec.rb'
 
 # Offense count: 2
+# Configuration parameters: CustomTransform, IgnoreMethods, IgnoreMetadata, InflectorPath, EnforcedInflector.
+# SupportedInflectors: default, active_support
+RSpec/SpecFilePathFormat:
+  Exclude:
+    - 'spec/model_generator_spec.rb'
+    - 'spec/ruby_source_emitter_spec.rb'
+
+# Offense count: 137
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, ProceduralMethods, FunctionalMethods, AllowedMethods, AllowedPatterns, AllowBracesOnProceduralOneLiners, BracesRequiredMethods.
+# SupportedStyles: line_count_based, semantic, braces_for_chaining, always_braces
+# ProceduralMethods: benchmark, bm, bmbm, create, each_with_object, measure, new, realtime, tap, with_object
+# FunctionalMethods: let, let!, subject, watch
+# AllowedMethods: lambda, proc, it
+Style/BlockDelimiters:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 3
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Style/CombinableLoops:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 4
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, SingleLineConditionsOnly, IncludeTernaryExpressions.
+# SupportedStyles: assign_to_condition, assign_inside_condition
+Style/ConditionalAssignment:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle, AllowComments.
+# SupportedStyles: empty, nil, both
+Style/EmptyElse:
+  Exclude:
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 3
+# This cop supports unsafe autocorrection (--autocorrect-all).
+# Configuration parameters: AllowedReceivers.
+# AllowedReceivers: Thread.current
+Style/HashEachMethods:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 1
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Style/HashExcept:
+  Exclude:
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 16
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Style/IdenticalConditionalBranches:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 3
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Style/MapIntoArray:
+  Exclude:
+    - 'lib/metaschema/markdown_doc_generator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+Style/ModuleMemberExistenceCheck:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 18
+# This cop supports safe autocorrection (--autocorrect).
+Style/MultilineIfModifier:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 4
+# This cop supports safe autocorrection (--autocorrect).
+Style/MultilineTernaryOperator:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowMethodComparison, ComparisonsThreshold.
+Style/MultipleComparison:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 6
+# This cop supports unsafe autocorrection (--autocorrect-all).
+# Configuration parameters: EnforcedStyle, AllowedMethods, AllowedPatterns.
+# SupportedStyles: predicate, comparison
+Style/NumericPredicate:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/markdown_doc_generator.rb'
+    - 'lib/metaschema/metapath_evaluator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 1
+# Configuration parameters: AllowedMethods.
+# AllowedMethods: respond_to_missing?
+Style/OptionalBooleanParameter:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+Style/RedundantAssignment:
+  Exclude:
+    - 'lib/metaschema/metapath_evaluator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowedMethods.
+# AllowedMethods: infinite?, nonzero?
+Style/RedundantCondition:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 3
+# This cop supports safe autocorrection (--autocorrect).
+Style/RedundantParentheses:
+  Exclude:
+    - 'lib/metaschema/json_schema_generator.rb'
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: EnforcedStyle.
+# SupportedStyles: implicit, explicit
+Style/RescueStandardError:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 2
+# This cop supports unsafe autocorrection (--autocorrect-all).
+# Configuration parameters: ConvertCodeThatCanStartToReturnNil, AllowedMethods, MaxChainLength.
+# AllowedMethods: present?, blank?, presence, try, try!
+Style/SafeNavigation:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 1
+# This cop supports unsafe autocorrection (--autocorrect-all).
+Style/SlicingWithRange:
+  Exclude:
+    - 'lib/metaschema/constraint_validator.rb'
+
+# Offense count: 1
+# This cop supports safe autocorrection (--autocorrect).
+# Configuration parameters: AllowModifier.
+Style/SoleNestedConditional:
+  Exclude:
+    - 'lib/metaschema/model_generator.rb'
+
+# Offense count: 12
+# This cop supports unsafe autocorrection (--autocorrect-all).
+# Configuration parameters: Mode.
+Style/StringConcatenation:
+  Exclude:
+    - 'lib/metaschema/ruby_source_emitter.rb'
+
+# Offense count: 4
 # This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: EnforcedStyle, ConsistentQuotesInMultiline.
 # SupportedStyles: single_quotes, double_quotes
 Style/StringLiterals:
   Exclude:
-    - 'lib/metaschema.rb'
+    - 'lib/metaschema/constraint_validator.rb'
+    - 'lib/metaschema/model_generator.rb'
+    - 'spec/ruby_source_emitter_spec.rb'
diff --git a/lib/metaschema.rb b/lib/metaschema.rb
index 469eb62..9292871 100644
--- a/lib/metaschema.rb
+++ b/lib/metaschema.rb
@@ -97,4 +97,12 @@ def self.validate(file_path)
   autoload :FormalName, "metaschema/formal_name"
   autoload :SchemaVersion, "metaschema/schema_version"
   autoload :ShortName, "metaschema/short_name"
+  autoload :AugmentType, "metaschema/augment_type"
+  autoload :ConstraintValidator, "metaschema/constraint_validator"
+  autoload :JsonSchemaGenerator, "metaschema/json_schema_generator"
+  autoload :MarkdownDocGenerator, "metaschema/markdown_doc_generator"
+  autoload :MetapathEvaluator, "metaschema/metapath_evaluator"
+  autoload :ModelGenerator, "metaschema/model_generator"
+  autoload :RubySourceEmitter, "metaschema/ruby_source_emitter"
+  autoload :TypeMapper, "metaschema/type_mapper"
 end
diff --git a/lib/metaschema/allowed_value_type.rb b/lib/metaschema/allowed_value_type.rb
index 6d13408..629aeac 100644
--- a/lib/metaschema/allowed_value_type.rb
+++ b/lib/metaschema/allowed_value_type.rb
@@ -2,7 +2,7 @@
 
 module Metaschema
   class AllowedValueType < Lutaml::Model::Serializable
-    attribute :content, :string
+    attribute :content, :string, collection: true
     attribute :value, :string
     attribute :deprecated, :string
     attribute :a, AnchorType, collection: true
diff --git a/lib/metaschema/anchor_type.rb b/lib/metaschema/anchor_type.rb
index 4486905..be6cf65 100644
--- a/lib/metaschema/anchor_type.rb
+++ b/lib/metaschema/anchor_type.rb
@@ -2,7 +2,7 @@
 
 module Metaschema
   class AnchorType < Lutaml::Model::Serializable
-    attribute :content, :string
+    attribute :content, :string, collection: true
     attribute :href, :string
     attribute :title, :string
     attribute :code, CodeType, collection: true
diff --git a/lib/metaschema/augment_type.rb b/lib/metaschema/augment_type.rb
new file mode 100644
index 0000000..9bcc8b3
--- /dev/null
+++ b/lib/metaschema/augment_type.rb
@@ -0,0 +1,39 @@
+# frozen_string_literal: true
+
+module Metaschema
+  # Represents an <augment> element in a metaschema document.
+  # Augments add documentation, flags, or properties to definitions
+  # from imported modules without modifying the original module.
+  #
+  # Example:
+  #
+  #   <augment name="...">
+  #     <formal-name>Document Metadata</formal-name>
+  #     <description>Provides information about the document.</description>
+  #   </augment>
+  #
+  class AugmentType < Lutaml::Model::Serializable
+    attribute :name, :string
+    attribute :formal_name, FormalName
+    attribute :description, MarkupLineDatatype
+    attribute :prop, PropertyType, collection: true
+    attribute :remarks, RemarksType
+    attribute :example, ExampleType, collection: true
+    attribute :flag, FlagReferenceType, collection: true
+    attribute :define_flag, InlineFlagDefinitionType, collection: true
+
+    xml do
+      element "augment"
+      ordered
+      namespace ::Metaschema::Namespace
+
+      map_attribute "name", to: :name
+      map_element "formal-name", to: :formal_name
+      map_element "description", to: :description
+      map_element "prop", to: :prop
+      map_element "remarks", to: :remarks
+      map_element "example", to: :example
+      map_element "flag", to: :flag
+      map_element "define-flag", to: :define_flag
+    end
+  end
+end
diff --git a/lib/metaschema/code_type.rb b/lib/metaschema/code_type.rb
index 8d804f1..8c98023 100644
--- a/lib/metaschema/code_type.rb
+++ b/lib/metaschema/code_type.rb
@@ -2,7 +2,7 @@
 
 module Metaschema
   class CodeType < Lutaml::Model::Serializable
-    attribute :content, :string
+    attribute :content, :string, collection: true
     attribute :klass, :string
     attribute :a, AnchorType, collection: true
     attribute :insert, InsertType, collection: true
diff --git a/lib/metaschema/constraint_validator.rb b/lib/metaschema/constraint_validator.rb
new file mode 100644
index 0000000..06ce7e8
--- /dev/null
+++ b/lib/metaschema/constraint_validator.rb
@@ -0,0 +1,483 @@
+# frozen_string_literal: true
+
+module Metaschema
+  class ConstraintValidator
+    attr_reader :errors
+
+    def initialize
+      @errors = []
+    end
+
+    # Validate a generated class instance against its metaschema constraints.
+    # Returns an array of ConstraintError objects.
+    def validate(instance, constraint_def)
+      @errors = []
+      return @errors unless constraint_def
+
+      validate_allowed_values(instance, constraint_def)
+      validate_matches(instance, constraint_def)
+      if constraint_def.respond_to?(:has_cardinality)
+        validate_has_cardinality(instance,
+                                 constraint_def)
+      end
+      if constraint_def.respond_to?(:is_unique)
+        validate_is_unique(instance,
+                           constraint_def)
+      end
+      if constraint_def.respond_to?(:expect)
+        validate_expect(instance,
+                        constraint_def)
+      end
+      if constraint_def.respond_to?(:index_has_key)
+        validate_index_has_key(instance,
+                               constraint_def)
+      end
+
+      @errors
+    end
+
+    # Recursively validate an entire instance tree.
+    # Validates each node's own constraints, then recurses into children.
+    def self.validate_tree(instance)
+      errors = []
+
+      if instance.is_a?(Lutaml::Model::Serializable)
+        # Validate this instance's own constraints
+        if instance.respond_to?(:validate_constraints)
+          errors.concat(instance.validate_constraints)
+        end
+
+        # Validate occurrence constraints (min/max-occurs)
+        if instance.respond_to?(:validate_occurrences)
+          errors.concat(instance.validate_occurrences)
+        end
+
+        # Recurse into all attribute values
+        instance.class.attributes.each_key do |attr_name|
+          value = instance.send(attr_name)
+          next if value.nil?
+
+          if value.is_a?(Array)
+            value.each { |v| errors.concat(validate_tree(v)) if v.is_a?(Lutaml::Model::Serializable) }
+          elsif value.is_a?(Lutaml::Model::Serializable)
+            errors.concat(validate_tree(value))
+          end
+        end
+      end
+
+      errors
+    end
+
+    private
+
+    # ── allowed-values ────────────────────────────────────────────────
+
+    def validate_allowed_values(instance, constraint_def)
+      constraints = Array(constraint_def.allowed_values)
+      constraints.each do |c|
+        target = c.target || "."
+        values = resolve_target_values(instance, target)
+        allowed = Array(c.enum).filter_map(&:value)
+        allow_other = c.allow_other == "yes"
+        level = c.level || "ERROR"
+
+        values.each do |val|
+          next if val.nil? || val.to_s.empty?
+          next if allow_other
+          next if allowed.include?(val.to_s)
+
+          @errors << ConstraintError.new(
+            constraint_type: :allowed_values,
+            level: level,
+            message: "Value '#{val}' not in allowed values: #{allowed.join(', ')}",
+            target: target,
+          )
+        end
+      end
+    end
+
+    # ── matches ───────────────────────────────────────────────────────
+
+    def validate_matches(instance, constraint_def)
+      constraints = Array(constraint_def.matches)
+      constraints.each do |c|
+        target = c.target || "."
+        values = resolve_target_values(instance, target)
+        level = c.level || "ERROR"
+
+        values.each do |val|
+          next if val.nil? || val.to_s.empty?
+
+          if c.regex
+            unless val.to_s.match?(Regexp.new(c.regex))
+              @errors << ConstraintError.new(
+                constraint_type: :matches,
+                level: level,
+                message: "Value '#{val}' does not match regex '#{c.regex}'",
+                target: target,
+              )
+            end
+          elsif c.datatype
+            unless datatype_matches?(val, c.datatype)
+              @errors << ConstraintError.new(
+                constraint_type: :matches,
+                level: level,
+                message: "Value '#{val}' does not match datatype '#{c.datatype}'",
+                target: target,
+              )
+            end
+          end
+        end
+      end
+    end
+
+    # ── has-cardinality ──────────────────────────────────────────────
+
+    def validate_has_cardinality(instance, constraint_def)
+      constraints = Array(constraint_def.has_cardinality)
+      constraints.each do |c|
+        target = c.target || "."
+        level = c.level || "ERROR"
+        count = count_target_items(instance, target)
+
+        # Coerce with to_i so string-valued min-occurs does not raise on compare
+        if c.min_occurs && count < c.min_occurs.to_i
+          @errors << ConstraintError.new(
+            constraint_type: :has_cardinality,
+            level: level,
+            message: "Expected at least #{c.min_occurs} items at '#{target}', got #{count}",
+            target: target,
+          )
+        end
+
+        if c.max_occurs && c.max_occurs != "unbounded" && count > c.max_occurs.to_i
+          @errors << ConstraintError.new(
+            constraint_type: :has_cardinality,
+            level: level,
+            message: "Expected at most #{c.max_occurs} items at '#{target}', got #{count}",
+            target: target,
+          )
+        end
+      end
+    end
+
+    # ── is-unique ────────────────────────────────────────────────────
+
+    def validate_is_unique(instance, constraint_def)
+      constraints = Array(constraint_def.is_unique)
+      constraints.each do |c|
+        target = c.target || "."
+        level = c.level || "ERROR"
+        key_fields = Array(c.key_field).map(&:target)
+
+        items = resolve_target_collection(instance, target)
+        next unless items.is_a?(Array) && items.length > 1
+
+        # Build key tuples for each item
+        seen = {}
+        items.each_with_index do |item, idx|
+          key = if key_fields.empty?
+                  extract_value(item)
+                else
+                  key_fields.map do |kf|
+                    resolve_flag_value(item, kf)
+                  end
+                end
+          key_str = Array(key).join("|")
+
+          if seen.key?(key_str)
+            @errors << ConstraintError.new(
+              constraint_type: :is_unique,
+              level: level,
+              message: "Duplicate key '#{key_str}' at '#{target}' (items #{seen[key_str]} and #{idx})",
+              target: target,
+            )
+          else
+            seen[key_str] = idx
+          end
+        end
+      end
+    end
+
+    # ── expect ───────────────────────────────────────────────────────
+
+    def validate_expect(_instance, constraint_def)
+      # expect constraints use XPath test expressions which are complex
+      # to evaluate without a full XPath engine. Log as WARNING for now.
+      constraints = Array(constraint_def.expect)
+      constraints.each do |c|
+        # Future: evaluate c.test against instance
+      end
+    end
+
+    # ── index-has-key ────────────────────────────────────────────────
+
+    def validate_index_has_key(_instance, constraint_def)
+      # index-has-key requires an index registry which is complex.
+      # Stub for now.
+      constraints = Array(constraint_def.index_has_key)
+      constraints.each do |c|
+        # Future: look up index by c.name and validate keys
+      end
+    end
+
+    # ── Target Resolution ────────────────────────────────────────────
+
+    # Resolve a Metaschema target expression to values from an instance.
+    # Delegates to MetapathEvaluator for complex expressions.
+    def resolve_target_values(instance, target)
+      return [extract_value(instance)] if target == "."
+
+      # Use MetapathEvaluator for complex patterns
+      if complex_target?(target)
+        evaluator = MetapathEvaluator.new(instance)
+        return evaluator.resolve(target)
+      end
+
+      # .//name — descendant search
+      if target.start_with?(".//")
+        path = target[3..]
+        return resolve_descendant_values(instance, path)
+      end
+
+      # .[@flag='value']/rest — conditional
+      if target.start_with?(".[@") && target.include?("]/")
+        return resolve_conditional_path(instance, target)
+      end
+
+      # @flag-name — flag value
+      if target.start_with?("@")
+        flag_name = target[1..].gsub("-", "_")
+        return [resolve_flag_value(instance, flag_name)]
+      end
+
+      # field-name — child field value
+      [resolve_child_value(instance, target)]
+    end
+
+    # Determine if a target expression requires MetapathEvaluator.
+    def complex_target?(target)
+      target.include?("has-oscal-namespace") ||
+        target.include?("starts-with") ||
+        target.include?(" and ") ||
+        target.include?(" or ") ||
+        target.include?("(.)") ||
+        target.match?(/\w+\[.*\]/) ||
+        (target.include?("/@") && !target.start_with?(".[@"))
+    end
+
+    # Count items at a target path (for cardinality checks).
+    def count_target_items(instance, target)
+      if complex_target?(target)
+        evaluator = MetapathEvaluator.new(instance)
+        items = evaluator.resolve_collection(target)
+        return items.compact.length
+      end
+
+      return 1 unless target.include?("/") || target.start_with?(".")
+
+      # Handle conditional paths like ".[@type='quatrain']/line"
+      if target.start_with?(".[@") && target.include?("]/")
+        filter_attr, filter_val, rest = parse_conditional(target)
+        flag_val = resolve_flag_value(instance, filter_attr)
+        return 0 unless flag_val.to_s == filter_val
+
+        child_name = rest.gsub("-", "_").to_sym
+        child = get_child(instance, child_name)
+        return 0 unless child
+        return child.length if child.is_a?(Array)
+
+        return 1
+      end
+
+      # .//name — count all descendants
+      if target.start_with?(".//")
+        path = target[3..]
+        values = resolve_descendant_values(instance, path)
+        return values.length
+      end
+
+      0
+    end
+
+    # Resolve a collection of items at a target path (for uniqueness checks).
+    def resolve_target_collection(instance, target)
+      return [instance] if target == "."
+ + if complex_target?(target) + evaluator = MetapathEvaluator.new(instance) + return evaluator.resolve_collection(target) + end + + # Simple child name + child_name = target.gsub("-", "_").to_sym + child = get_child(instance, child_name) + return child if child.is_a?(Array) + + child ? [child] : [] + end + + def extract_value(item) + return item unless item.is_a?(Lutaml::Model::Serializable) + + # Try common value attributes + if item.respond_to?(:content) + val = item.content + return val unless using_default?(item, :content) + end + + item + end + + def resolve_flag_value(instance, flag_name) + return instance unless instance.is_a?(Lutaml::Model::Serializable) + + sym = flag_name.to_s.gsub("-", "_").to_sym + return instance.send(sym) if instance.respond_to?(sym) + + nil + end + + def resolve_child_value(instance, child_name) + return instance unless instance.is_a?(Lutaml::Model::Serializable) + + sym = child_name.to_s.gsub("-", "_").to_sym + child = get_child(instance, sym) + return extract_value(child) if child + + nil + end + + def resolve_descendant_values(instance, path) + # Simplified: split path and search recursively + parts = path.split("/") + collect_descendants(instance, parts) + end + + def collect_descendants(instance, parts) + return [] unless instance.is_a?(Lutaml::Model::Serializable) + + current_name = parts[0].gsub("-", "_").to_sym + rest = parts[1..] + + child = get_child(instance, current_name) + return [] unless child + + items = child.is_a?(Array) ? child : [child] + + if rest.empty? 
+        items.map { |i| extract_value(i) }
+      else
+        items.flat_map { |i| collect_descendants(i, rest) }
+      end
+    end
+
+    def resolve_conditional_path(instance, target)
+      filter_attr, filter_val, rest = parse_conditional(target)
+
+      flag_val = resolve_flag_value(instance, filter_attr)
+      return [] unless flag_val.to_s == filter_val
+
+      resolve_target_values(instance, rest)
+    end
+
+    def parse_conditional(target)
+      # Parse ".[@attr='value']/rest"; attr may contain hyphens, which map to
+      # underscores on the Ruby model.
+      match = target.match(/\.\[@([\w-]+)='([^']+)'\]\/(.+)/)
+      return [nil, nil, target] unless match
+
+      [match[1].gsub("-", "_"), match[2], match[3]]
+    end
+
+    def get_child(instance, sym)
+      return nil unless instance.respond_to?(sym)
+
+      instance.send(sym)
+    end
+
+    def using_default?(instance, attr_name)
+      instance.respond_to?(:using_default?) && instance.using_default?(attr_name)
+    rescue NoMethodError
+      false
+    end
+
+    def datatype_matches?(value, datatype)
+      case datatype
+      when "string" then true
+      when "integer", "int" then value.to_s.match?(/\A-?\d+\z/)
+      when "positive-integer" then value.to_s.match?(/\A[1-9]\d*\z/)
+      when "boolean" then ["true", "false", "1", "0"].include?(value.to_s)
+      when "date" then value.to_s.match?(/\A\d{4}-\d{2}-\d{2}\z/)
+      when "datetime" then value.to_s.match?(/\A\d{4}-\d{2}-\d{2}T/)
+      when "uri" then value.to_s.match?(/\A[a-zA-Z][a-zA-Z0-9+\-.]*:/)
+      when "uuid" then value.to_s.match?(/\A[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-/)
+      else true # Unknown datatype, pass by default
+      end
+    end
+
+    # Validate min/max occurrence constraints on an instance.
+    # occurrence_constraints is a Hash of {attr_name => {min: N, max: N}}
+    def self.validate_occurrences(instance, occurrence_constraints)
+      errors = []
+      return errors unless occurrence_constraints && !occurrence_constraints.empty?
+
+      occurrence_constraints.each do |attr_name, constraints|
+        value = instance.respond_to?(attr_name) ? 
instance.send(attr_name) : nil + count = case value + when nil then 0 + when Array then value.length + else 1 + end + + min = constraints[:min] + max = constraints[:max] + + if min&.positive? && count < min + errors << ConstraintError.new( + constraint_type: :occurrence, + level: "ERROR", + message: "Expected at least #{min} '#{attr_name}', got #{count}", + target: attr_name.to_s, + ) + end + + if max && count > max + errors << ConstraintError.new( + constraint_type: :occurrence, + level: "ERROR", + message: "Expected at most #{max} '#{attr_name}', got #{count}", + target: attr_name.to_s, + ) + end + end + + errors + end + + # Simple wrapper for constraint error info + class ConstraintError + attr_reader :constraint_type, :level, :message, :target + + def initialize(constraint_type:, level:, message:, target:) + @constraint_type = constraint_type + @level = level + @message = message + @target = target + end + + def to_s + "[#{level}] #{constraint_type}: #{message} (target: #{target})" + end + + def error? + level == "ERROR" + end + + def warning? + level == "WARNING" + end + end + end +end diff --git a/lib/metaschema/inline_markup_type.rb b/lib/metaschema/inline_markup_type.rb index 99db5c4..bc0b272 100644 --- a/lib/metaschema/inline_markup_type.rb +++ b/lib/metaschema/inline_markup_type.rb @@ -9,7 +9,7 @@ class CodeType < Lutaml::Model::Serializable end class InlineMarkupType < Lutaml::Model::Serializable - attribute :content, :string + attribute :content, :string, collection: true attribute :a, AnchorType, collection: true attribute :insert, InsertType, collection: true attribute :br, :string, collection: true diff --git a/lib/metaschema/json_schema_generator.rb b/lib/metaschema/json_schema_generator.rb new file mode 100644 index 0000000..f75e553 --- /dev/null +++ b/lib/metaschema/json_schema_generator.rb @@ -0,0 +1,456 @@ +# frozen_string_literal: true + +require "json" + +module Metaschema + # Generates JSON Schema (draft-07) from a parsed Metaschema document. 
+ # + # Usage: + # ms = Metaschema::Root.from_xml(File.read("metaschema.xml")) + # schema = JsonSchemaGenerator.generate(ms) + # puts JSON.pretty_generate(schema) + # + # The generator walks the metaschema definition tree and emits a JSON Schema + # with a top-level object for each root assembly, and shared $defs for all + # referenced types. + class JsonSchemaGenerator + SCHEMA_URI = "http://json-schema.org/draft-07/schema#" + + # Maps metaschema as-type to JSON Schema type. + TYPE_MAP = { + "string" => { "type" => "string" }, + "markup-line" => { "type" => "string" }, + "markup-multiline" => { "type" => "string" }, + "boolean" => { "type" => "boolean" }, + "integer" => { "type" => "integer" }, + "positive-integer" => { "type" => "integer", "minimum" => 1 }, + "non-negative-integer" => { "type" => "integer", "minimum" => 0 }, + "decimal" => { "type" => "number" }, + "date" => { "type" => "string", "format" => "date" }, + "date-time" => { "type" => "string", "format" => "date-time" }, + "dateTime" => { "type" => "string", "format" => "date-time" }, + "dateTime-with-timezone" => { "type" => "string", + "format" => "date-time" }, + "uri" => { "type" => "string", "format" => "uri" }, + "uri-reference" => { "type" => "string" }, + "uuid" => { "type" => "string", + "pattern" => "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$" }, + "base64" => { "type" => "string", "contentEncoding" => "base64" }, + "token" => { "type" => "string" }, + "email" => { "type" => "string", "format" => "email" }, + "ip-v4-address" => { "type" => "string", "format" => "ipv4" }, + "ip-v6-address" => { "type" => "string", "format" => "ipv6" }, + }.freeze + + def self.generate(metaschema, id: nil) + new(metaschema, id: id).generate + end + + def initialize(metaschema, id: nil) + @metaschema = metaschema + @id = id + @definitions = {} + @field_defs = {} + @assembly_defs = {} + @flag_defs = {} + end + + def generate + collect_definitions + + 
@metaschema.define_assembly&.each { |a| build_assembly_schema(a) }
+      @metaschema.define_field&.each { |f| build_field_def_schema(f) }
+      @metaschema.define_flag&.each { |f| build_flag_schema(f) }
+
+      root_assemblies = (@metaschema.define_assembly || []).select do |a|
+        a.root_name&.content
+      end
+      if root_assemblies.one?
+        root = root_assemblies.first
+        root_name = root.root_name.content
+        @definitions[root.name] ||= { "type" => "object" }
+
+        schema = {
+          "$schema" => SCHEMA_URI,
+          "$id" => @id,
+          "type" => "object",
+          "properties" => { root_name => { "$ref" => "#/$defs/#{root.name}" } },
+          "required" => [root_name],
+          "additionalProperties" => false,
+          "$defs" => @definitions,
+        }
+        schema.delete("$id") unless @id
+        schema
+      else
+        {
+          "$schema" => SCHEMA_URI,
+          "$id" => @id,
+          "$defs" => @definitions,
+        }.compact
+      end
+    end
+
+    private
+
+    def collect_definitions
+      @metaschema.define_assembly&.each do |a|
+        @assembly_defs[a.name] = a if a.name
+      end
+      @metaschema.define_field&.each { |f| @field_defs[f.name] = f if f.name }
+      @metaschema.define_flag&.each { |f| @flag_defs[f.name] = f if f.name }
+    end
+
+    # ── Assembly ───────────────────────────────────────────────────────
+
+    def build_assembly_schema(assembly_def)
+      return @definitions[assembly_def.name] if @definitions.key?(assembly_def.name)
+
+      # Placeholder to prevent cycles
+      @definitions[assembly_def.name] = { "type" => "object" }
+
+      props = {}
+      required = []
+      pattern_props = {}
+
+      # Flags → object properties
+      (assembly_def.define_flag || []).each do |fl|
+        name = fl.name
+        next unless name
+
+        props[name] = build_flag_type_schema(fl)
+        required << name if fl.required == "yes"
+      end
+
+      (assembly_def.flag || []).each do |fr|
+        ref = fr.ref
+        next unless ref
+
+        fd = @flag_defs[ref]
+        next unless fd
+
+        props[ref] = build_flag_type_schema(fd)
+        required << ref if fr.required == "yes"
+      end
+
+      # Model children
+      if assembly_def.model
+        model = assembly_def.model
+        collect_model_children(model, props, 
required, pattern_props) + end + + schema = { "type" => "object", "properties" => props } + schema["required"] = required unless required.empty? + schema["additionalProperties"] = false + schema["patternProperties"] = pattern_props unless pattern_props.empty? + + if assembly_def.formal_name && !assembly_def.formal_name.is_a?(TrueClass) + title = assembly_def.formal_name.is_a?(String) ? assembly_def.formal_name : assembly_def.formal_name.content + schema["title"] = title if title && !title.empty? + end + if assembly_def.description.respond_to?(:content) + desc = assembly_def.description.content + schema["description"] = desc if desc && !desc.empty? + end + + @definitions[assembly_def.name] = schema + end + + def collect_model_children(model, props, required, pattern_props) + (model.field || []).each do |fr| + add_field_ref(fr, props, required, pattern_props) + end + (model.assembly || []).each do |ar| + add_assembly_ref(ar, props, required, pattern_props) + end + (model.define_field || []).each do |fd| + add_inline_field(fd, props, required) + end + (model.define_assembly || []).each do |ad| + add_inline_assembly(ad, props, required) + end + (model.choice || []).each do |c| + collect_choice_children(c, props, required, pattern_props) + end + (model.choice_group || []).each do |cg| + collect_choice_group_children(cg, props, required, pattern_props) + end + end + + def collect_choice_children(choice, props, required, pattern_props) + (choice.field || []).each do |fr| + add_field_ref(fr, props, required, pattern_props) + end + (choice.assembly || []).each do |ar| + add_assembly_ref(ar, props, required, pattern_props) + end + (choice.define_field || []).each do |fd| + add_inline_field(fd, props, required) + end + (choice.define_assembly || []).each do |ad| + add_inline_assembly(ad, props, required) + end + end + + def collect_choice_group_children(cg, props, _required, pattern_props) + group_as = cg.group_as + json_name = group_as&.name + + child_field_refs = cg.field 
|| [] + child_asm_refs = cg.assembly || [] + + if group_as&.in_json == "BY_KEY" && json_name + # BY_KEY: object with pattern properties + inner_props = {} + child_field_refs.each do |fr| + ref = fr.ref + fd = @field_defs[ref] + inner_props.merge!(build_field_by_key_schema(fd)) if fd + end + pattern_props[json_name] = inner_props unless inner_props.empty? + elsif group_as&.in_json == "ARRAY" && json_name && child_field_refs.one? + # Array of single field type + fr = child_field_refs.first + ref = fr.ref + fd = @field_defs[ref] + if fd + items = build_field_items_schema(fd) + props[json_name] = { "type" => "array", "items" => items } + end + elsif group_as&.in_json == "ARRAY" && json_name && child_asm_refs.one? + ar = child_asm_refs.first + ref = ar.ref + props[json_name] = + { "type" => "array", "items" => { "$ref" => "#/$defs/#{ref}" } } + build_assembly_schema(@assembly_defs[ref]) if @assembly_defs[ref] + end + end + + # ── Field Ref ────────────────────────────────────────────────────── + + def add_field_ref(fr, props, required, pattern_props) + ref = fr.ref + return unless ref + + fd = @field_defs[ref] + return unless fd + + group_as = fr.group_as + json_name = fr.use_name&.content || group_as&.name || ref + + if group_as&.in_json == "BY_KEY" + key_flag = fd.json_key&.flag_ref + if key_flag + inner = build_field_object_schema(fd) + pattern_props[".*"] = inner + end + elsif group_as && %w[ARRAY SINGLETON_OR_ARRAY].include?(group_as.in_json) + items = build_field_items_schema(fd) + arr = { "type" => "array", "items" => items } + if group_as.in_json == "SINGLETON_OR_ARRAY" + arr = { "oneOf" => [items, arr] } + end + props[json_name] = arr + else + # Singleton field + has_flags = (fd.define_flag || []).any? || (fd.flag || []).any? 
+ has_vk = fd.json_value_key || fd.json_value_key_flag + props[json_name] = if has_flags || has_vk + build_field_object_schema(fd) + else + build_field_scalar_schema(fd) + end + end + + required << json_name if fr.min_occurs&.to_i&.> 0 + end + + # ── Assembly Ref ─────────────────────────────────────────────────── + + def add_assembly_ref(ar, props, required, _pattern_props) + ref = ar.ref + return unless ref + + build_assembly_schema(@assembly_defs[ref]) if @assembly_defs[ref] + + group_as = ar.group_as + json_name = group_as&.name || ref + + if group_as&.in_json == "BY_KEY" + # BY_KEY: object whose keys are dynamic + props[json_name] = { + "type" => "object", + "additionalProperties" => { "$ref" => "#/$defs/#{ref}" }, + } + elsif group_as && %w[ARRAY SINGLETON_OR_ARRAY].include?(group_as.in_json) + arr = { "type" => "array", "items" => { "$ref" => "#/$defs/#{ref}" } } + if group_as.in_json == "SINGLETON_OR_ARRAY" + arr = { "oneOf" => [{ "$ref" => "#/$defs/#{ref}" }, arr] } + end + props[json_name] = arr + else + props[json_name] = { "$ref" => "#/$defs/#{ref}" } + end + + required << json_name if ar.min_occurs&.to_i&.> 0 + end + + # ── Inline Definitions ───────────────────────────────────────────── + + def add_inline_field(fd, props, _required) + return unless fd.name + + name = fd.name + props[name] = build_field_scalar_schema(fd) + end + + def add_inline_assembly(ad, props, _required) + return unless ad.name + + name = ad.name + build_assembly_schema(ad) if ad.model + props[name] = { "$ref" => "#/$defs/#{ad.name}" } + end + + # ── Field Schema Builders ────────────────────────────────────────── + + def build_field_scalar_schema(fd) + schema = type_for(fd.as_type) + apply_field_constraints(schema, fd) + schema + end + + def build_field_items_schema(fd) + has_flags = (fd.define_flag || []).any? || (fd.flag || []).any? 
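+      # Illustrative: a field carrying flags serializes as a JSON object, so
+      # its items schema is an object schema; a plain field collapses to its
+      # scalar schema, e.g. { "type" => "string" } for as-type "string".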
+ if has_flags + build_field_object_schema(fd) + else + build_field_scalar_schema(fd) + end + end + + def build_field_object_schema(fd) + obj = { "type" => "object", "properties" => {} } + required = [] + value_key = fd.json_value_key || "STRVALUE" + + # Value property + value_schema = type_for(fd.as_type) + apply_field_constraints(value_schema, fd) + obj["properties"][value_key] = value_schema + required << value_key + + # Flags + (fd.define_flag || []).each do |fl| + next unless fl.name + + obj["properties"][fl.name] = build_flag_type_schema(fl) + required << fl.name if fl.required == "yes" + end + + (fd.flag || []).each do |fr| + next unless fr.ref + + fdef = @flag_defs[fr.ref] + obj["properties"][fr.ref] = + fdef ? build_flag_type_schema(fdef) : { "type" => "string" } + required << fr.ref if fr.required == "yes" + end + + obj["required"] = required unless required.empty? + obj["additionalProperties"] = false + obj + end + + def build_field_by_key_schema(fd) + build_field_object_schema(fd) + end + + # ── Field Schema Builder (standalone definitions) ────────────────── + + def build_field_def_schema(fd) + return @definitions[fd.name] if @definitions.key?(fd.name) + + has_flags = (fd.define_flag || []).any? || (fd.flag || []).any? + schema = if has_flags + build_field_object_schema(fd) + else + build_field_scalar_schema(fd) + end + + if fd.formal_name && !fd.formal_name.is_a?(TrueClass) + title = fd.formal_name.is_a?(String) ? fd.formal_name : fd.formal_name.content + schema["title"] = title if title && !title.empty? + end + if fd.description.respond_to?(:content) + desc = fd.description.content + schema["description"] = desc if desc && !desc.empty? 
+ end + + @definitions[fd.name] = schema + end + + # ── Flag Schema Builders ─────────────────────────────────────────── + + def build_flag_schema(flag_def) + return @definitions[flag_def.name] if @definitions.key?(flag_def.name) + + schema = build_flag_type_schema(flag_def) + @definitions[flag_def.name] = schema + schema + end + + def build_flag_type_schema(flag_or_def) + schema = type_for(flag_or_def.as_type) + + # Apply constraints + constraint = flag_or_def.constraint + if constraint + apply_allowed_values(schema, constraint.allowed_values) + apply_matches(schema, constraint.matches) + end + + if flag_or_def.formal_name && !flag_or_def.formal_name.is_a?(TrueClass) + title = flag_or_def.formal_name.is_a?(String) ? flag_or_def.formal_name : flag_or_def.formal_name.content + schema["title"] = title if title && !title.empty? + end + if flag_or_def.description.respond_to?(:content) + desc = flag_or_def.description.content + schema["description"] = desc if desc && !desc.empty? + end + schema + end + + # ── Constraints ──────────────────────────────────────────────────── + + def apply_field_constraints(schema, fd) + constraint = fd.constraint + return unless constraint + + apply_allowed_values(schema, constraint.allowed_values) + apply_matches(schema, constraint.matches) + end + + def apply_allowed_values(schema, constraints) + return unless constraints + + Array(constraints).each do |c| + enum_values = Array(c.enum).filter_map(&:value) + schema["enum"] = enum_values unless enum_values.empty? 
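+        # e.g. an allowed-values constraint enumerating "yes"/"no" yields
+        #   schema["enum"] = ["yes", "no"]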
+ end + end + + def apply_matches(schema, constraints) + return unless constraints + + Array(constraints).each do |c| + schema["pattern"] = c.regex if c.regex + end + end + + # ── Type Mapping ─────────────────────────────────────────────────── + + def type_for(as_type) + TYPE_MAP[as_type]&.dup || { "type" => "string" } + end + end +end diff --git a/lib/metaschema/list_item_type.rb b/lib/metaschema/list_item_type.rb index 33c1a89..04299ae 100644 --- a/lib/metaschema/list_item_type.rb +++ b/lib/metaschema/list_item_type.rb @@ -6,7 +6,7 @@ class ListType < Lutaml::Model::Serializable; end class BlockQuoteType < Lutaml::Model::Serializable; end class ListItemType < Lutaml::Model::Serializable - attribute :content, :string + attribute :content, :string, collection: true attribute :a, AnchorType, collection: true attribute :insert, InsertType, collection: true attribute :br, :string, collection: true diff --git a/lib/metaschema/markdown_doc_generator.rb b/lib/metaschema/markdown_doc_generator.rb new file mode 100644 index 0000000..c89c42d --- /dev/null +++ b/lib/metaschema/markdown_doc_generator.rb @@ -0,0 +1,354 @@ +# frozen_string_literal: true + +module Metaschema + # Generates human-readable Markdown documentation from a parsed Metaschema document. 
+ # + # Usage: + # ms = Metaschema::Root.from_xml(File.read("metaschema.xml")) + # markdown = MarkdownDocGenerator.generate(ms) + # File.write("docs.md", markdown) + # + # The generator walks the metaschema definition tree and emits Markdown with: + # - Schema title and version + # - Table of contents + # - Assembly, field, and flag definitions with descriptions + # - Property tables showing types, constraints, and cardinality + # - Examples from elements + class MarkdownDocGenerator + def self.generate(metaschema) + new(metaschema).generate + end + + def initialize(metaschema) + @metaschema = metaschema + @output = [] + end + + def generate + header + table_of_contents + definitions + @output.join("\n") + end + + private + + def header + title = extract_text(@metaschema.schema_name) || "Metaschema" + version = extract_text(@metaschema.schema_version) + @output << "# #{title}" + @output << "" + @output << "**Version:** #{version}" if version + @output << "" if version + end + + def table_of_contents + assemblies = @metaschema.define_assembly || [] + fields = @metaschema.define_field || [] + flags = @metaschema.define_flag || [] + + items = assemblies.map do |a| + "- [#{a.name} (Assembly)](##{anchor(a.name)})" + end + fields.each { |f| items << "- [#{f.name} (Field)](##{anchor(f.name)})" } + flags.each { |f| items << "- [#{f.name} (Flag)](##{anchor(f.name)})" } + + return if items.empty? 
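+      # Each entry renders as e.g. "- [catalog (Assembly)](#catalog)"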
+ + @output << "## Table of Contents" + @output << "" + items.each { |i| @output << i } + @output << "" + end + + def definitions + (@metaschema.define_assembly || []).each { |a| assembly_section(a) } + (@metaschema.define_field || []).each { |f| field_section(f) } + (@metaschema.define_flag || []).each { |f| flag_section(f) } + end + + # ── Assembly ─────────────────────────────────────────────────────── + + def assembly_section(asm) + @output << "## #{asm.name}" + @output << "" + formal_name_and_description(asm) + + # Flags + flag_rows = (asm.define_flag || []).map do |f| + flag_row(f, inline: true) + end + (asm.flag || []).each do |f| + flag_rows << ["`#{f.ref}`", "flag", f.required == "yes" ? "Yes" : "No", + "-"] + end + + # Model children + model = asm.model + child_rows = [] + if model + (model.field || []).each { |fr| child_rows << field_ref_row(fr) } + (model.assembly || []).each { |ar| child_rows << assembly_ref_row(ar) } + (model.define_field || []).each do |fd| + child_rows << inline_field_row(fd) + end + (model.define_assembly || []).each do |ad| + child_rows << inline_assembly_row(ad) + end + (model.choice || []).each do |c| + (c.field || []).each do |fr| + child_rows << field_ref_row(fr, choice: true) + end + (c.assembly || []).each do |ar| + child_rows << assembly_ref_row(ar, choice: true) + end + (c.define_field || []).each do |fd| + child_rows << inline_field_row(fd, choice: true) + end + (c.define_assembly || []).each do |ad| + child_rows << inline_assembly_row(ad, choice: true) + end + end + (model.choice_group || []).each do |cg| + child_rows << choice_group_row(cg) + end + end + + unless flag_rows.empty? && child_rows.empty? 
+ @output << "### Properties" + @output << "" + @output << "| Name | Type | Required | Description |" + @output << "|------|------|----------|-------------|" + flag_rows.each { |r| @output << "| #{r.join(' | ')} |" } + child_rows.each { |r| @output << "| #{r.join(' | ')} |" } + @output << "" + end + + constraints_section(asm.constraint) + examples_section(asm.example) + + @output << "---" + @output << "" + end + + # ── Field ────────────────────────────────────────────────────────── + + def field_section(fd) + @output << "## #{fd.name}" + @output << "" + formal_name_and_description(fd) + + @output << "- **Type:** `#{fd.as_type || 'string'}`" + @output << "- **Collapsible:** #{fd.collapsible == 'yes' ? 'Yes' : 'No'}" if fd.collapsible == "yes" + + # Flags on this field + flag_rows = (fd.define_flag || []).map { |f| flag_row(f, inline: true) } + (fd.flag || []).each do |f| + flag_rows << ["`#{f.ref}`", "flag", f.required == "yes" ? "Yes" : "No", + "-"] + end + + if flag_rows.any? + @output << "" + @output << "### Flags" + @output << "" + @output << "| Name | Type | Required | Description |" + @output << "|------|------|----------|-------------|" + flag_rows.each { |r| @output << "| #{r.join(' | ')} |" } + end + + @output << "" + constraints_section(fd.constraint) + examples_section(fd.example) + + @output << "---" + @output << "" + end + + # ── Flag ─────────────────────────────────────────────────────────── + + def flag_section(fl) + @output << "## #{fl.name}" + @output << "" + formal_name_and_description(fl) + + @output << "- **Type:** `#{fl.as_type || 'string'}`" + @output << "" + + constraints_section(fl.constraint) + @output << "---" + @output << "" + end + + # ── Constraint helpers ───────────────────────────────────────────── + + def constraints_section(constraint) + return unless constraint + + allowed = constraint.allowed_values + matches = constraint.matches + + parts = [] + + if allowed + Array(allowed).each do |av| + target = av.respond_to?(:target) ? 
(av.target || ".") : "." + values = Array(av.enum).filter_map(&:value) + allow_other = av.allow_other == "yes" + level = av.level || "ERROR" + next if values.empty? + + desc = "Allowed values for `#{target}`: #{values.map do |v| + "`#{v}`" + end.join(', ')}" + desc += " (or other)" if allow_other + desc += " [#{level}]" + parts << desc + end + end + + if matches + Array(matches).each do |m| + target = m.target || "." + if m.regex + parts << "Matches regex `/#{m.regex}/` on `#{target}`" + elsif m.datatype + parts << "Matches datatype `#{m.datatype}` on `#{target}`" + end + end + end + + return if parts.empty? + + @output << "### Constraints" + @output << "" + parts.each { |p| @output << "- #{p}" } + @output << "" + end + + # ── Examples ─────────────────────────────────────────────────────── + + def examples_section(examples) + return unless examples && !examples.empty? + + @output << "### Examples" + @output << "" + + Array(examples).each_with_index do |ex, i| + name = ex.description&.content || "Example #{i + 1}" + @output << "#### #{name}" + @output << "" + if ex.remarks&.content + @output << ex.remarks.content + @output << "" + end + end + end + + # ── Row builders ─────────────────────────────────────────────────── + + def field_ref_row(fr, choice: false) + ref = fr.ref + group_as = fr.group_as + json_name = group_as&.name || fr.use_name&.content || ref + cardinality = cardinality_str(fr.min_occurs, fr.max_occurs, group_as) + prefix = choice ? "*choice* " : "" + ["`#{json_name}`", "#{prefix}field `#{ref}`", cardinality, ""] + end + + def assembly_ref_row(ar, choice: false) + ref = ar.ref + group_as = ar.group_as + json_name = group_as&.name || ref + cardinality = cardinality_str(ar.min_occurs, ar.max_occurs, group_as) + prefix = choice ? "*choice* " : "" + ["`#{json_name}`", "#{prefix}assembly `#{ref}`", cardinality, ""] + end + + def inline_field_row(fd, choice: false) + return [] unless fd.name + + prefix = choice ? 
"*choice* " : "" + ["`#{fd.name}`", "#{prefix}field (inline)", "-", ""] + end + + def inline_assembly_row(ad, choice: false) + return [] unless ad.name + + prefix = choice ? "*choice* " : "" + ["`#{ad.name}`", "#{prefix}assembly (inline)", "-", ""] + end + + def choice_group_row(cg) + group_as = cg.group_as + json_name = group_as&.name || "choice-group" + ["`#{json_name}`", "choice group", cardinality_str(nil, nil, group_as), + ""] + end + + def flag_row(fl, inline: false) + name = fl.name + type = fl.as_type || "string" + desc = extract_description(fl) + ["`#{name}`", "flag `#{type}`", fl.required == "yes" ? "Yes" : "No", desc] + end + + # ── Helpers ──────────────────────────────────────────────────────── + + def formal_name_and_description(defn) + formal = defn.formal_name + if formal && !formal.is_a?(TrueClass) + text = formal.is_a?(String) ? formal : formal.content + @output << "**#{text}**" if text && !text.empty? + @output << "" + end + + desc = extract_description(defn) + if desc && !desc.empty? + @output << desc + @output << "" + end + end + + def extract_description(defn) + return nil unless defn.respond_to?(:description) && defn.description + + if defn.description.respond_to?(:content) + defn.description.content + else + defn.description.to_s + end + end + + def extract_text(value) + return nil unless value + + if value.respond_to?(:content) + value.content + elsif value.is_a?(String) + value + else + value.to_s + end + end + + def cardinality_str(min, max, group_as) + min_val = min.to_i + max_val = max == "unbounded" ? nil : max&.to_i + + if group_as + "1..#{max_val || '*'}" if min_val >= 0 + elsif min_val.positive? && max_val + "#{min_val}..#{max_val}" + elsif min_val.positive? 
+ "#{min_val}..*" + else + "0..1" + end + end + + def anchor(name) + name.downcase.gsub(/[^a-z0-9-]/, "-") + end + end +end diff --git a/lib/metaschema/markup_line_datatype.rb b/lib/metaschema/markup_line_datatype.rb index 31e26b5..521716c 100644 --- a/lib/metaschema/markup_line_datatype.rb +++ b/lib/metaschema/markup_line_datatype.rb @@ -2,7 +2,7 @@ module Metaschema class MarkupLineDatatype < Lutaml::Model::Serializable - attribute :content, :string + attribute :content, :string, collection: true attribute :a, AnchorType, collection: true attribute :insert, InsertType, collection: true attribute :br, :string, collection: true diff --git a/lib/metaschema/metapath_evaluator.rb b/lib/metaschema/metapath_evaluator.rb new file mode 100644 index 0000000..c667beb --- /dev/null +++ b/lib/metaschema/metapath_evaluator.rb @@ -0,0 +1,385 @@ +# frozen_string_literal: true + +module Metaschema + # Evaluates Metapath (XPath subset) expressions against Ruby object instances. + # + # Supported patterns (covering OSCAL constraint targets): + # "." — current instance + # "@flag-name" — flag value + # "child-name" — child field value + # "child-name/@attr" — child's flag value + # "//descendant" — descendant values + # "child[@attr='val']" — filtered children + # "child[@attr='val']/@attr2" — filtered child's attribute + # "child[func(...)]/@attr" — function-based filter + # ".[condition]/path" — conditional navigation + # ".[condition]" — filter current instance + # "(.)[condition]/path" — parenthesized self with filter + # + # Supported predicate functions: + # has-oscal-namespace('uri') — checks prop/element ns attribute + # starts-with(@attr, 'prefix') — string prefix check + # + # Supported predicate operators: + # @attr='value' — attribute equals + # @attr=('v1','v2',...) 
— attribute in set + # and / or — logical operators + # + class MetapathEvaluator + OSCAL_NS = "http://csrc.nist.gov/ns/oscal" + + attr_reader :context + + def initialize(context) + @context = context + end + + # Resolve a Metapath expression to values from the context instance. + # Returns an array of values. + def resolve(path) + return [extract_value(@context)] if path == "." + + path = normalize_path(path) + steps = parse_steps(path) + evaluate_steps(@context, steps) + end + + # Resolve a path to a collection of items (for uniqueness/cardinality checks). + def resolve_collection(path) + path = normalize_path(path) + steps = parse_steps(path) + evaluate_steps_collection(@context, steps) + end + + private + + # Normalize path patterns + def normalize_path(path) + # (.)[pred]/rest → .[pred]/rest + path.sub(/\A\(\.\)/, ".") + # // at start → descendant:: + end + + # Parse a Metapath expression into evaluation steps. + def parse_steps(path) + steps = [] + remaining = path + + while remaining && !remaining.empty? + # descendant-or-self //name + if remaining.start_with?(".//") + remaining = remaining[3..] 
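+          # e.g. ".//part/prop" has now been reduced to "part/prop"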
+ name, rest = split_step(remaining) + steps << { type: :descendant, name: name } + remaining = rest + next + end + + # .[predicate]/rest + if remaining.start_with?(".[") + pred, rest = extract_predicate_block(remaining[1..]) + inner_rest = extract_after_predicate(rest) + steps << { type: :filter_self, predicate: pred } + remaining = inner_rest + next + end + + # @attr — attribute access + if remaining.start_with?("@") + name, rest = split_step(remaining[1..]) + steps << { type: :attribute, name: name } + remaining = rest + next + end + + # child[predicate]/@attr — filtered child + if remaining.match?(/\A[\w-]+\[/) + m = remaining.match(/\A([\w-]+)\[/) + child_name = m[1] + pred, rest = extract_predicate_block(remaining[m[1].length..]) + steps << { type: :filtered_child, name: child_name, predicate: pred } + remaining = extract_after_predicate(rest) + next + end + + # child-name — simple child access + if remaining.match?(/\A[\w-]+/) + name, rest = split_step(remaining) + steps << { type: :child, name: name } + remaining = rest + next + end + + # Skip unrecognized prefix + remaining = remaining[1..] + end + + steps + end + + # Evaluate parsed steps against a context instance. + def evaluate_steps(context, steps) + return [extract_value(context)] if steps.empty? + + current_items = [context] + + steps.each do |step| + next_items = [] + current_items.each do |item| + next_items.concat(evaluate_step(item, step)) + end + current_items = next_items + end + + current_items + end + + def evaluate_steps_collection(context, steps) + return [context] if steps.empty? 
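+      # Illustrative: for "prop[@name='label']/@value" the parsed steps are
+      # roughly [{ type: :filtered_child, name: "prop",
+      #            predicate: "@name='label'" },
+      #          { type: :attribute, name: "value" }]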
+ + current_items = [context] + + steps.each do |step| + next_items = [] + current_items.each do |item| + case step[:type] + when :child + children = get_children(item, step[:name]) + next_items.concat(children) + when :descendant + next_items.concat(find_descendants(item, step[:name])) + when :attribute + next_items << resolve_attr(item, step[:name]) + when :filtered_child + children = get_children(item, step[:name]) + filtered = children.select do |c| + evaluate_predicate(c, step[:predicate]) + end + next_items.concat(filtered) + when :filter_self + if evaluate_predicate(item, step[:predicate]) + next_items << item + end + else + next_items << item + end + end + current_items = next_items + end + + current_items + end + + def evaluate_step(item, step) + case step[:type] + when :attribute + [resolve_attr(item, step[:name])] + when :child + children = get_children(item, step[:name]) + children.map { |c| extract_value(c) } + when :descendant + find_descendants(item, step[:name]).map { |d| extract_value(d) } + when :filtered_child + children = get_children(item, step[:name]) + children.select { |c| evaluate_predicate(c, step[:predicate]) } + when :filter_self + evaluate_predicate(item, step[:predicate]) ? [item] : [] + else + [item] + end + end + + # ── Predicate Evaluation ────────────────────────────────────────── + + def evaluate_predicate(item, predicate) + return true unless predicate + + # Handle "and" operators (simple split) + if predicate.include?(" and ") + parts = split_logical(predicate, " and ") + return parts.all? { |p| evaluate_single_predicate(item, p.strip) } + end + + # Handle "or" operators + if predicate.include?(" or ") + parts = split_logical(predicate, " or ") + return parts.any? 
{ |p| evaluate_single_predicate(item, p.strip) } + end + + evaluate_single_predicate(item, predicate) + end + + def evaluate_single_predicate(item, pred) + pred = pred.strip + + # @attr='value' — simple attribute equals + if (m = pred.match(/\A@([\w-]+)\s*=\s*'([^']+)'\z/)) + attr_val = resolve_attr(item, m[1]) + return attr_val.to_s == m[2] + end + + # @attr=('v1','v2',...) — value in set + if (m = pred.match(/\A@([\w-]+)\s*=\s*\(([^)]+)\)\z/)) + attr_val = resolve_attr(item, m[1]) + values = m[2].scan(/'([^']+)'/).flatten + return values.include?(attr_val.to_s) + end + + # has-oscal-namespace('uri') — check ns attribute against OSCAL namespace + if (m = pred.match(/\Ahas-oscal-namespace\(\s*'([^']+)'\s*\)\z/)) + ns_uri = m[1] + ns_val = resolve_attr(item, "ns") + return ns_val.to_s == ns_uri || (ns_uri == OSCAL_NS && (ns_val.nil? || ns_val.to_s.empty?)) + end + + # starts-with(@attr, 'prefix') — string prefix check + if (m = pred.match(/\Astarts-with\(\s*@([\w-]+)\s*,\s*'([^']+)'\s*\)\z/)) + attr_val = resolve_attr(item, m[1]) + return attr_val.to_s.start_with?(m[2]) + end + + # Combining functions with @attr='val' using 'and' + if pred.include?(" and ") + parts = split_logical(pred, " and ") + return parts.all? 
{ |p| evaluate_single_predicate(item, p.strip) } + end + + false + end + + # ── Instance Navigation ─────────────────────────────────────────── + + def resolve_attr(instance, attr_name) + return instance unless instance.is_a?(Lutaml::Model::Serializable) + + sym = attr_name.gsub("-", "_").to_sym + return instance.send(sym) if instance.respond_to?(sym) + + nil + end + + def get_children(instance, child_name) + return [] unless instance.is_a?(Lutaml::Model::Serializable) + + sym = child_name.gsub("-", "_").to_sym + return [] unless instance.respond_to?(sym) + + child = instance.send(sym) + case child + when Array then child + when nil then [] + else [child] + end + end + + def find_descendants(instance, name) + results = [] + return results unless instance.is_a?(Lutaml::Model::Serializable) + + sym = name.gsub("-", "_").to_sym + + instance.class.attributes.each_key do |attr_name| + value = instance.send(attr_name) + next if value.nil? + + items = value.is_a?(Array) ? value : [value] + items.each do |item| + next unless item.is_a?(Lutaml::Model::Serializable) + + if attr_name == sym + results << item + end + + results.concat(find_descendants(item, name)) + end + end + + results + end + + def extract_value(item) + return item unless item.is_a?(Lutaml::Model::Serializable) + return item.content if item.respond_to?(:content) && item.content + + item + end + + # ── Parsing Helpers ─────────────────────────────────────────────── + + def split_step(path) + idx = path.index("/") + idx ? [path[0...idx], path[(idx + 1)..]] : [path, nil] + end + + def extract_predicate_block(str) + # str starts with "[..." + depth = 0 + i = 0 + while i < str.length + case str[i] + when "[" + depth += 1 + when "]" + depth -= 1 + return [str[1...i], str[(i + 1)..]] if depth.zero? 
+ when "'" + # Skip string literal + i += 1 + while i < str.length && str[i] != "'" + i += 1 + end + end + i += 1 + end + [str[1..], ""] + end + + def extract_after_predicate(rest) + return nil unless rest + + rest.start_with?("/") ? rest[1..] : rest + end + + # Split on logical operators respecting parentheses and quotes + def split_logical(expr, op) + parts = [] + depth = 0 + current = +"" + i = 0 + in_string = false + + while i < expr.length + ch = expr[i] + + if ch == "'" && depth >= 0 + in_string = !in_string + current << ch + i += 1 + next + end + + unless in_string + case ch + when "(", "[" + depth += 1 + when ")", "]" + depth -= 1 + end + + if depth.zero? && expr[i, op.length + 2] == " #{op} " + parts << current.strip + current = +"" + i += op.length + 2 + next + end + end + + current << ch + i += 1 + end + + parts << current.strip + parts + end + end +end diff --git a/lib/metaschema/model_generator.rb b/lib/metaschema/model_generator.rb new file mode 100644 index 0000000..dfdc91f --- /dev/null +++ b/lib/metaschema/model_generator.rb @@ -0,0 +1,2175 @@ +# frozen_string_literal: true + +module Metaschema + class ModelGenerator + class << self + def generate_from_file(metaschema_path, base_path: nil) + base_path ||= File.dirname(File.expand_path(metaschema_path)) + generate_from_xml(File.read(metaschema_path), base_path: base_path) + end + + def generate_from_xml(xml_string, base_path: nil) + metaschema = Metaschema::Root.from_xml(xml_string) + new.generate(metaschema, base_path: base_path) + end + + def generate_from_metaschema(metaschema, base_path: nil) + new.generate(metaschema, base_path: base_path) + end + + def to_ruby_source(metaschema_path, module_name:, base_path: nil, +split: false) + classes = generate_from_file(metaschema_path, base_path: base_path) + emitter = RubySourceEmitter.new(classes, module_name, self) + split ? emitter.emit_split : emitter.emit + end + end + + RESERVED_WORDS = %i[class module method hash object_id nil? is_a? kind_of? 
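The evaluator's core strategy — parse the path into `{type:, name:}` steps, then fold each step over the current item set — can be sketched against plain hashes. This is a simplified illustration, not the gem's API: the real evaluator walks `Lutaml::Model::Serializable` instances and supports predicates, while this sketch only handles child and `@attribute` steps (both of which index into a hash here).

```ruby
# Parse a simplified path into steps: "metadata/prop/@name" becomes a child
# step, a child step, and an attribute step.
def parse_steps(path)
  path.split("/").map do |seg|
    if seg.start_with?("@")
      { type: :attribute, name: seg[1..] }
    else
      { type: :child, name: seg }
    end
  end
end

# Evaluate by expanding the current item set one step at a time; array-valued
# children fan out into multiple items, nil values are dropped.
def evaluate_steps(context, steps)
  steps.reduce([context]) do |items, step|
    items.flat_map do |item|
      value = item[step[:name]]
      value.is_a?(Array) ? value : [value].compact
    end
  end
end
```

The fold over `steps` is what makes descendant and filtered steps easy to add later: each step type only has to map one item to zero or more next items.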
+ instance_of? respond_to? send].freeze + + def generate(metaschema, base_path: nil) + @classes = {} + @flag_defs = {} + @assembly_defs = {} + @field_defs = {} + @namespace = metaschema.namespace + + # Resolve imports — merge definitions from imported modules + resolve_and_merge_imports(metaschema, base_path) + + collect_flag_definitions(metaschema) + collect_definition_registries(metaschema) + + # Apply augments — add docs/flags to imported definitions + apply_augments(metaschema) + + # Phase 1: Create field classes for all definitions (top-level + imported) + @field_defs.each_value do |fd| + create_field_class(fd) unless @classes.key?("Field_#{safe_attr(fd.name)}") + end + + # Phase 1: Create assembly placeholders for all definitions (top-level + imported) + @assembly_defs.each_value do |ad| + create_assembly_placeholder(ad) unless @classes.key?("Assembly_#{safe_attr(ad.name)}") + + # Phase 2: Populate assembly classes for all definitions + populate_assembly_class(ad) unless @classes["Assembly_#{safe_attr(ad.name)}"]&.instance_variable_get(:@populated) + end + + @classes + end + + private + + def safe_attr(name) + sym = name.gsub("-", "_").to_sym + RESERVED_WORDS.include?(sym) ? :"#{sym}_attr" : sym + end + + # ── Import Resolution ────────────────────────────────────────────── + + def resolve_and_merge_imports(metaschema, base_path) + imported_defs = resolve_imports(metaschema, base_path) + + # Merge imported definitions — first definition wins (top-level takes priority) + imported_defs.each do |defs| + defs[:flags].each { |name, defn| @flag_defs[name] ||= defn } + defs[:assemblies].each { |name, defn| @assembly_defs[name] ||= defn } + defs[:fields].each { |name, defn| @field_defs[name] ||= defn } + end + end + + def resolve_imports(metaschema, base_path, visited: Set.new) + imports = metaschema.import + return [] unless imports && !imports.empty? 
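The reserved-word mangling done by `safe_attr` can be shown in isolation. This is a minimal sketch: the `_attr` suffix and dash-to-underscore rule mirror the code above, but the word list here is abbreviated for illustration.

```ruby
# Map a metaschema name to a safe Ruby attribute symbol: dashes become
# underscores, and names colliding with core Object methods get an "_attr"
# suffix so generated classes don't shadow them.
RESERVED = %i[class module method hash send].freeze

def safe_attr(name)
  sym = name.gsub("-", "_").to_sym
  RESERVED.include?(sym) ? :"#{sym}_attr" : sym
end
```

So a metaschema flag named `class` becomes the attribute `:class_attr`, while an ordinary name like `formal-name` becomes `:formal_name`.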
+ + imports.flat_map do |import_elem| + href = import_elem.href + next [] unless href + + # Resolve relative to the importing file's directory + import_path = if base_path + File.expand_path(href, + base_path) + else + File.expand_path(href) + end + next [] unless File.exist?(import_path) + + # Cycle detection — skip already-visited files + next [] if visited.include?(import_path) + + visited.add(import_path) + + # Parse the imported metaschema + imported = Metaschema::Root.from_xml(File.read(import_path)) + + # Recursively resolve transitive imports + transitive = resolve_imports(imported, File.dirname(import_path), + visited: visited) + + # Collect definitions from this imported module + defs = { flags: {}, assemblies: {}, fields: {} } + imported.define_flag&.each { |f| defs[:flags][f.name] = f if f.name } + imported.define_assembly&.each do |a| + defs[:assemblies][a.name] = a if a.name + end + imported.define_field&.each { |f| defs[:fields][f.name] = f if f.name } + + transitive + [defs] + end + end + + # ── Augment Application ───────────────────────────────────────────── + + def apply_augments(metaschema) + return unless metaschema.respond_to?(:augment) + + augments = metaschema.augment + return unless augments && !augments.empty? 
+ + augments.each do |aug| + name = aug.name + next unless name + + # Try to find the definition to augment + target = @assembly_defs[name] || @field_defs[name] || @flag_defs[name] + next unless target + + # Apply documentation augmentations + apply_augment_docs(target, aug) + apply_augment_flags(target, aug) + end + end + + def apply_augment_docs(target, augment) + # Add formal-name if provided and target doesn't have one + if augment.formal_name && !target.formal_name + target.formal_name = augment.formal_name + end + + # Add description if provided and target doesn't have one + if augment.description && (!target.respond_to?(:description) || !target.description) && target.respond_to?(:description=) + target.description = augment.description + end + end + + def apply_augment_flags(target, augment) + # Add flag references to assembly/field definitions + return unless augment.flag&.any? || augment.define_flag&.any? + + # Add flag references + if target.respond_to?(:flag) + existing_refs = (target.flag || []).map(&:ref) + augment.flag.each do |fr| + next if existing_refs.include?(fr.ref) + + target.flag = (target.flag || []) + [fr] + end + end + + # Add inline flag definitions + if target.respond_to?(:define_flag) + existing_names = (target.define_flag || []).map(&:name) + augment.define_flag.each do |fd| + next if existing_names.include?(fd.name) + + target.define_flag = (target.define_flag || []) + [fd] + end + end + end + + # ── Flag Definitions ────────────────────────────────────────────── + + def collect_flag_definitions(metaschema) + metaschema.define_flag&.each do |flag_def| + @flag_defs[flag_def.name] = flag_def if flag_def.name + end + end + + def collect_definition_registries(metaschema) + metaschema.define_assembly&.each do |ad| + @assembly_defs[ad.name] = ad if ad.name + end + metaschema.define_field&.each do |fd| + @field_defs[fd.name] = fd if fd.name + end + end + + # Resolve the XML element name for an assembly reference + def 
assembly_xml_element_name(assembly_ref) + ref_name = assembly_ref.ref + return ref_name unless ref_name + + # Local override takes priority + return assembly_ref.use_name.content if assembly_ref.use_name&.content + + # Check definition's use_name + defn = @assembly_defs[ref_name] + return defn.use_name.content if defn&.use_name&.content + + ref_name + end + + # Resolve the XML element name for a field reference + def field_xml_element_name(field_ref) + ref_name = field_ref.ref + return ref_name unless ref_name + + return field_ref.use_name.content if field_ref.use_name&.content + + defn = @field_defs[ref_name] + return defn.use_name.content if defn&.use_name&.content + + ref_name + end + + # ── Field Class Generation ──────────────────────────────────────── + + def create_field_class(field_def) + return unless field_def.name + + klass_name = "Field_#{field_def.name.gsub('-', '_')}" + klass = Class.new(Lutaml::Model::Serializable) + @classes[klass_name] = klass + + is_markup = TypeMapper.markup?(field_def.as_type) + is_multiline = TypeMapper.multiline?(field_def.as_type) + content_type = TypeMapper.map(field_def.as_type) + + if is_multiline + apply_markup_multiline_attributes(klass) + elsif is_markup + apply_markup_attributes(klass) + elsif field_def.collapsible == "yes" + klass.attribute :content, content_type, collection: true + else + klass.attribute :content, content_type + end + + field_def.define_flag&.each { |f| add_inline_flag(klass, f) } + field_def.flag&.each { |f| add_flag_reference(klass, f) } + + build_field_xml(klass, field_def.name, is_markup || is_multiline, + field_def, is_multiline) + build_field_json(klass, field_def) + + # Allow string-based deserialization: lutaml-model's of_json expects a + # Hash, but fields can appear as plain strings in JSON (when no flags are + # set, per NIST convention). Override of_json/from_json to handle both. + has_flags = field_def.define_flag&.any? || field_def.flag&.any? 
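The string-or-hash deserialization convention described above (a field with no flags set may appear in JSON as a bare string) can be sketched without lutaml-model. `SketchField` is a hypothetical stand-in for a generated field class; the `STRVALUE` key and the `content` attribute mirror the code above, everything else is illustrative.

```ruby
# Minimal stand-in for a generated field class: of_json accepts either a bare
# string (no flags set, per the NIST JSON convention) or a hash carrying the
# value key plus flag keys.
class SketchField
  attr_accessor :content, :id

  def self.of_json(data)
    obj = new
    if data.is_a?(String)
      obj.content = data
    else
      obj.content = data["STRVALUE"]
      obj.id = data["id"]
    end
    obj
  end
end
```

The generated classes do the same branching inside the overridden `of_json` / `from_json` singleton methods, falling back to the regular hash path via `super`.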
+ has_json_vk = field_def.json_value_key || field_def.json_value_key_flag + is_collapsible = field_def.collapsible == "yes" + value_key = field_def.json_value_key || "STRVALUE" + + klass.define_singleton_method(:of_json) do |data| + if data.is_a?(String) + new(content: data) + else + super(data) + end + end + + klass.define_singleton_method(:from_json) do |data| + if data.is_a?(String) + new(content: data) + else + super(data) + end + end + + if has_flags || has_json_vk || is_collapsible + flag_attr_names = (field_def.define_flag || []).filter_map do |f| + safe_attr(f.name) if f.name + end + + (field_def.flag || []).filter_map do |f| + safe_attr(f.ref) if f.ref + end + + orig_as_json = klass.method(:as_json) + klass.define_singleton_method(:as_json) do |instance, options = {}| + result = orig_as_json.call(instance, options) + + # Collapsible: unwrap single-element content arrays + if is_collapsible && result.is_a?(Hash) && result[value_key].is_a?(Array) && result[value_key].length == 1 + result[value_key] = result[value_key].first + end + + # Fields with flags: when no flags are set, serialize as plain value + if (has_flags || has_json_vk) && result.is_a?(Hash) && result.key?(value_key) + flags_present = flag_attr_names.any? do |attr| + val = instance.send(attr) + val && !(val.respond_to?(:using_default?) && val.using_default?) 
+ end + unless flags_present + return result[value_key] + end + end + + result + end + end + + apply_constraint_validation(klass, field_def.constraint) + end + + def apply_markup_attributes(klass) + klass.attribute :content, :string, collection: true + klass.attribute :a, AnchorType, collection: true + klass.attribute :insert, InsertType, collection: true + klass.attribute :br, :string, collection: true + klass.attribute :code, CodeType, collection: true + klass.attribute :em, InlineMarkupType, collection: true + klass.attribute :i, InlineMarkupType, collection: true + klass.attribute :b, InlineMarkupType, collection: true + klass.attribute :strong, InlineMarkupType, collection: true + klass.attribute :sub, InlineMarkupType, collection: true + klass.attribute :sup, InlineMarkupType, collection: true + klass.attribute :q, InlineMarkupType, collection: true + klass.attribute :img, ImageType, collection: true + end + + def apply_markup_multiline_attributes(klass) + apply_markup_attributes(klass) + klass.attribute :p, InlineMarkupType, collection: true + klass.attribute :h1, InlineMarkupType, collection: true + klass.attribute :h2, InlineMarkupType, collection: true + klass.attribute :h3, InlineMarkupType, collection: true + klass.attribute :h4, InlineMarkupType, collection: true + klass.attribute :h5, InlineMarkupType, collection: true + klass.attribute :h6, InlineMarkupType, collection: true + klass.attribute :ul, ListType, collection: true + klass.attribute :ol, OrderedListType, collection: true + klass.attribute :pre, PreformattedType, collection: true + klass.attribute :hr, :string, collection: true + klass.attribute :blockquote, BlockQuoteType, collection: true + klass.attribute :table, TableType, collection: true + end + + def build_field_xml(klass, xml_element, is_markup, field_def, +is_multiline = false) + flag_defs = field_def.define_flag || [] + flag_refs = field_def.flag || [] + + # Precompute safe attribute names for XML mapping + flag_attr_maps = 
flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + flag_ref_maps = flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + klass.class_eval do + xml do + element xml_element + mixed_content if is_markup + ordered if is_markup + + map_content to: :content + + if is_markup + map_element "a", to: :a + map_element "insert", to: :insert + map_element "br", to: :br + map_element "code", to: :code + map_element "em", to: :em + map_element "i", to: :i + map_element "b", to: :b + map_element "strong", to: :strong + map_element "sub", to: :sub + map_element "sup", to: :sup + map_element "q", to: :q + map_element "img", to: :img + end + + if is_multiline + map_element "p", to: :p + map_element "h1", to: :h1 + map_element "h2", to: :h2 + map_element "h3", to: :h3 + map_element "h4", to: :h4 + map_element "h5", to: :h5 + map_element "h6", to: :h6 + map_element "ul", to: :ul + map_element "ol", to: :ol + map_element "pre", to: :pre + map_element "hr", to: :hr + map_element "blockquote", to: :blockquote + map_element "table", to: :table + end + + flag_attr_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + + flag_ref_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + end + end + end + + # ── Key-Value Mapping Generation (JSON / YAML / TOML) ─────────── + # lutaml-model's key_value DSL generates mappings shared by all + # key-value formats (JSON, YAML, TOML). of_json / as_json / etc. + # continue to work because they delegate to the same mappings. + + def build_field_json(klass, field_def) + flag_defs = field_def.define_flag || [] + flag_refs = field_def.flag || [] + has_flags = flag_defs.any? || flag_refs.any? 
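The precomputation of `[serialized-name, attr-name]` pairs before entering the mapping DSL block (necessary because `class_eval` changes `self`, hiding the generator's helpers) can be sketched standalone. The `FlagDef`/`FlagRef` structs are illustrative stand-ins for the parsed metaschema objects.

```ruby
# Flags come either from inline definitions (with .name) or references
# (with .ref). filter_map drops entries without a name and pairs the wire
# name with the mangled Ruby attribute name in one pass.
FlagDef = Struct.new(:name)
FlagRef = Struct.new(:ref)

def attr_pairs(flag_defs, flag_refs)
  pairs = flag_defs.filter_map { |f| [f.name, f.name.gsub("-", "_").to_sym] if f.name }
  pairs + flag_refs.filter_map { |f| [f.ref, f.ref.gsub("-", "_").to_sym] if f.ref }
end
```

Inside the `xml do ... end` / `key_value do ... end` blocks these precomputed pairs are then simply iterated with `map_attribute` / `map`.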
+ json_vk = field_def.json_value_key + json_vk_flag = field_def.json_value_key_flag&.flag_ref + + if json_vk_flag + build_field_json_value_key_flag(klass, field_def, json_vk_flag) + return + end + + value_key = json_vk || "STRVALUE" + + flag_attr_maps = flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + flag_ref_maps = flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + klass.class_eval do + key_value do + root field_def.name + + if has_flags || json_vk + map value_key, to: :content + else + map "content", to: :content + end + + flag_attr_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + flag_ref_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + end + end + end + + # json-value-key-flag: the flag value becomes the JSON key for content. + # E.g. {"prop1": "value1", "id": "id1"} where "prop1" is the name flag value. + # We store metadata on the field class and handle serialization via + # custom with: callbacks at the assembly level. + def build_field_json_value_key_flag(klass, field_def, key_flag_ref) + key_attr = safe_attr(key_flag_ref) + flag_defs = field_def.define_flag || [] + flag_refs = field_def.flag || [] + + other_flag_maps = flag_defs.reject { |f| f.name == key_flag_ref } + .filter_map do |f| + if f.name + [f.name, + safe_attr(f.name)] + end + end + + flag_refs.reject { |f| f.ref == key_flag_ref } + .filter_map do |f| + if f.ref + [f.ref, + safe_attr(f.ref)] + end + end + + # Store metadata: pairs of [json_key, attr_name] for other flags + klass.instance_variable_set(:@json_vk_flag_key_attr, key_attr) + klass.instance_variable_set(:@json_vk_flag_other_flag_maps, + other_flag_maps) + + klass.class_eval do + key_value do + root field_def.name + other_flag_maps.each do |json_name, attr_name| + map json_name, to: attr_name + end + end + end + end + + # Build custom with: callbacks for a field that uses json-value-key-flag. 
+ # Called from build_assembly_json when the referenced field has this pattern. + def build_vk_flag_field_callbacks(parent_klass, field_klass, json_name, +attr_sym) + key_attr = field_klass.instance_variable_get(:@json_vk_flag_key_attr) + other_flag_maps = field_klass.instance_variable_get(:@json_vk_flag_other_flag_maps) + known_json_keys = other_flag_maps.map(&:first) + + from_method = :"json_from_vkf_#{attr_sym}_#{json_name.gsub('-', '_')}" + to_method = :"json_to_vkf_#{attr_sym}_#{json_name.gsub('-', '_')}" + + parent_klass.define_method(from_method) do |instance, value| + items = case value + when Array then value + when Hash then [value] + when nil then [] + else [value] + end + parsed = items.map do |item| + item = item.dup + key_val = nil + content_val = nil + item.each do |k, v| + unless known_json_keys.include?(k) + key_val = k + content_val = v + end + end + obj = field_klass.allocate + obj.instance_variable_set(:@using_default, {}) + obj.instance_variable_set(:@lutaml_register, :default) + obj.instance_variable_set("@#{key_attr}", key_val) + obj.instance_variable_set(:@content, content_val) + other_flag_maps.each do |json_key, attr_name| + if item.key?(json_key) + obj.instance_variable_set("@#{attr_name}", + item[json_key]) + end + end + obj + end + instance.instance_variable_set("@#{attr_sym}", parsed) + end + + parent_klass.define_method(to_method) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + doc[json_name] = current.map do |item| + key_val = item.instance_variable_get("@#{key_attr}") + content_val = item.instance_variable_get(:@content) + result = { key_val => content_val } + other_flag_maps.each do |json_key, attr_name| + val = item.instance_variable_get("@#{attr_name}") + result[json_key] = val if val + end + result + end + end + end + + { from_method: from_method, to_method: to_method } + end + + # Build custom with: callbacks for BY_KEY group-as. 
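The json-value-key-flag decoding — any key not claimed by another flag is treated as the key-flag's value, and that key's value becomes the field content — can be sketched as a plain-hash transformation. The key names here are illustrative, following the `{"prop1": "value1", "id": "id1"}` example above.

```ruby
# Given a JSON object where one key is dynamic (the value of the key flag)
# and the rest are known flag keys, recover key, content, and flags.
def decode_vk_flag(item, known_flag_keys)
  key_val = nil
  content_val = nil
  flags = {}
  item.each do |k, v|
    if known_flag_keys.include?(k)
      flags[k] = v
    else
      key_val = k
      content_val = v
    end
  end
  { key: key_val, content: content_val, flags: flags }
end
```

Serialization is the inverse: emit `{ key_val => content_val }` and merge the remaining flag keys back in, which is what the `to_method` callback above does.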
+ # JSON format: {"key1": "val1", "key2": "val2"} — a map keyed by json-key flag. + # Internal format: array of field instances, each with the key flag set. + def build_by_key_field_callbacks(parent_klass, field_klass, json_name, +attr_sym, json_key_flag) + key_attr = safe_attr(json_key_flag) + field_klass && field_klass.instance_variable_get(:@json_vk_flag_key_attr).nil? && + field_klass.attributes.any? do |k, _| + k != :content && k.to_s != key_attr.to_s + end + + from_method = :"json_from_bykey_#{attr_sym}_#{json_name.gsub('-', '_')}" + to_method = :"json_to_bykey_#{attr_sym}_#{json_name.gsub('-', '_')}" + + parent_klass.define_method(from_method) do |instance, value| + return unless value.is_a?(Hash) + + parsed = value.map do |k, v| + obj = if field_klass + field_klass.allocate.tap do |o| + o.instance_variable_set(:@using_default, {}) + o.instance_variable_set(:@lutaml_register, :default) + o.instance_variable_set("@#{key_attr}", k) + if v.is_a?(Hash) + # Field with flags — deserialize from hash + v.each do |vk, vv| + attr_sym_local = vk.gsub("-", "_").to_sym + begin + o.instance_variable_set("@#{attr_sym_local}", vv) + rescue StandardError + # skip unknown attributes + end + end + else + o.instance_variable_set(:@content, v) + end + end + else + k + end + obj + end + instance.instance_variable_set("@#{attr_sym}", parsed) + end + + parent_klass.define_method(to_method) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + result = {} + current.each do |item| + if field_klass + key_val = item.instance_variable_get("@#{key_attr}") + content_val = item.instance_variable_get(:@content) + if field_klass.attributes.keys.any? 
do |k| + k != :content && k.to_s != key_attr.to_s && item.instance_variable_get("@#{k}") + end + # Has other flags — serialize as object + obj = {} + field_klass.attributes.each_key do |attr_k| + next if attr_k.to_s == key_attr.to_s + + v = item.instance_variable_get("@#{attr_k}") + obj[attr_k.to_s] = v if v + end + result[key_val] = obj + else + result[key_val] = content_val + end + end + end + doc[json_name] = result + end + end + + { from_method: from_method, to_method: to_method } + end + + # Handles BY_KEY group-as for assembly references. + # In JSON, assemblies are keyed by their json-key flag value: + # {"en": {...}, "de": {...}} + # On parse (from): deserialize each value into the assembly class, + # setting the key flag attribute on each instance. + # On serialize (to): extract the key flag value and build a Hash. + def build_by_key_assembly_callbacks(parent_klass, asm_klass, json_name, +attr_sym, json_key_flag, grouped: false, child_attr: nil) + key_attr = safe_attr(json_key_flag) + + from_method = :"json_from_bykey_asm_#{attr_sym}_#{json_name.gsub('-', + '_')}" + to_method = :"json_to_bykey_asm_#{attr_sym}_#{json_name.gsub('-', '_')}" + + parent_klass.define_method(from_method) do |instance, value| + return unless value.is_a?(Hash) + + parsed = value.map do |k, v| + if asm_klass + obj = if v.is_a?(Hash) + asm_klass.of_json(v) + else + asm_klass.new + end + obj.instance_variable_set("@#{key_attr}", k) + obj + else + k + end + end + + if grouped && child_attr + # GROUPED wrapper: create wrapper instance containing the array + wrapper = instance.instance_variable_get("@#{attr_sym}") + unless wrapper + attr_type = instance.class.attributes[attr_sym] + wrapper = attr_type.type.new + end + wrapper.instance_variable_set("@#{child_attr}", parsed) + instance.instance_variable_set("@#{attr_sym}", wrapper) + else + instance.instance_variable_set("@#{attr_sym}", parsed) + end + end + + parent_klass.define_method(to_method) do |instance, doc| + current = 
instance.instance_variable_get("@#{attr_sym}") + items = if grouped && current && child_attr + current.send(child_attr) + else + current + end + + if items.is_a?(Array) + result = {} + items.each do |item| + next unless asm_klass + + key_val = item.instance_variable_get("@#{key_attr}") + if item.is_a?(Lutaml::Model::Serializable) + sub = asm_klass.as_json(item) + # Remove the key flag from the sub-hash (it's the outer key) + key_json_name = asm_klass.mappings_for(:json).instance_variable_get(:@mappings) + .find do |_map_key, rule| + rule.to.to_s == key_attr.to_s + end&.first + sub.delete(key_json_name) if key_json_name + result[key_val] = sub.empty? ? {} : sub + else + result[key_val] = {} + end + end + doc[json_name] = result + end + end + + { from_method: from_method, to_method: to_method } + end + + def build_assembly_json(klass, root_name, assembly_def) + flag_defs = assembly_def.define_flag || [] + flag_refs = assembly_def.flag || [] + + flag_attr_maps = flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + flag_ref_maps = flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + json_field_mappings = collect_json_field_mappings(assembly_def) + json_assembly_mappings = collect_json_assembly_mappings(assembly_def) + + # Separate vk_flag, by_key, and singleton_or_array mappings for custom handling + vk_flag_mappings = json_field_mappings.select { |m| m[:vk_flag] } + by_key_mappings = json_field_mappings.select { |m| m[:by_key] } + soa_mappings = json_field_mappings.select { |m| m[:singleton_or_array] } + regular_field_mappings = json_field_mappings.reject do |m| + m[:vk_flag] || m[:by_key] || m[:singleton_or_array] + end + + # Separate assembly SOA from regular assembly mappings + assembly_by_key_mappings = json_assembly_mappings.select do |m| + m[:by_key] + end + assembly_soa_mappings = json_assembly_mappings.select do |m| + m[:singleton_or_array] + end + regular_assembly_mappings = json_assembly_mappings.reject do |m| + 
m[:by_key] || m[:singleton_or_array] + end + + klass.class_eval do + key_value do + root root_name + + flag_attr_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + flag_ref_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + regular_field_mappings.each do |mapping| + if mapping[:scalar] + map mapping[:json_name], to: mapping[:attr_name], + with: { to: mapping[:to_method], from: mapping[:from_method] } + else + map mapping[:json_name], to: mapping[:attr_name], + render_empty: true + end + end + + regular_assembly_mappings.each do |mapping| + map mapping[:json_name], to: mapping[:attr_name], render_empty: true + end + end + end + + # Define with: callback methods for scalar field mappings + regular_field_mappings.each do |mapping| + next unless mapping[:scalar] + + field_klass = mapping[:field_klass] + attr_sym = mapping[:attr_name] + + has_flags = mapping[:has_flags] + + klass.define_method(mapping[:from_method]) do |instance, value| + if value.is_a?(Array) + parsed = value.map do |v| + has_flags ? field_klass.of_json(v) : field_klass.new(content: v) + end + instance.instance_variable_set("@#{attr_sym}", parsed) + elsif value.is_a?(Hash) + if value.empty? + inst = field_klass.new(content: "") + inst.instance_variable_set(:@_was_empty_hash, true) + instance.instance_variable_set("@#{attr_sym}", inst) + else + instance.instance_variable_set("@#{attr_sym}", + field_klass.of_json(value)) + end + elsif value + instance.instance_variable_set("@#{attr_sym}", + has_flags ? field_klass.of_json(value) : field_klass.new(content: value)) + end + end + + klass.define_method(mapping[:to_method]) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + result = current.map do |item| + if has_flags && item.is_a?(Lutaml::Model::Serializable) + field_klass.as_json(item) + else + item.respond_to?(:content) ? 
item.content : item + end + end + doc[mapping[:json_name]] = result + elsif current + if current.instance_variable_get(:@_was_empty_hash) + doc[mapping[:json_name]] = {} + elsif has_flags && current.is_a?(Lutaml::Model::Serializable) + doc[mapping[:json_name]] = field_klass.as_json(current) + else + val = current.respond_to?(:content) ? current.content : current + doc[mapping[:json_name]] = val + end + end + end + end + + # Handle SINGLETON_OR_ARRAY non-scalar field mappings with custom with: callbacks + soa_mappings.each do |mapping| + attr_sym = mapping[:attr_name] + json_name = mapping[:json_name] + from_m = mapping[:from_method] + to_m = mapping[:to_method] + field_klass = mapping[:field_klass] + + klass.define_method(from_m) do |instance, value| + items = case value + when Hash then [value] + when Array then value + when String then [value] + else return + end + parsed = items.map do |item| + case item + when Hash then field_klass.of_json(item) + when String then field_klass.of_json(item) + else item + end + end + instance.instance_variable_set("@#{attr_sym}", parsed) + end + + klass.define_method(to_m) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + result = current.map do |item| + if item.is_a?(Lutaml::Model::Serializable) + field_klass.as_json(item) + else + item + end + end + doc[json_name] = result.length == 1 ? 
result.first : result + end + end + + klass.class_eval do + key_value do + map json_name, to: attr_sym, + with: { to: to_m, from: from_m } + end + end + + # Add alias mapping for ref name if it differs from group-as name + if mapping[:alt_json_name] + klass.class_eval do + key_value do + map mapping[:alt_json_name], to: attr_sym, + with: { to: to_m, from: from_m } + end + end + end + end + + # Handle json-value-key-flag fields with custom with: callbacks + vk_flag_mappings.each do |mapping| + callbacks = build_vk_flag_field_callbacks( + klass, mapping[:field_klass], mapping[:json_name], mapping[:attr_name] + ) + # Re-open json block to add the mapping with custom with: + klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + # Handle BY_KEY group-as with custom with: callbacks + by_key_mappings.each do |mapping| + # Ensure the mapping target attribute exists (GROUPED wrappers may not + # register the child attr name as a top-level attribute) + unless klass.attributes.key?(mapping[:attr_name]) + klass.attribute mapping[:attr_name], mapping[:field_klass], + collection: true + end + callbacks = build_by_key_field_callbacks( + klass, mapping[:field_klass], mapping[:json_name], + mapping[:attr_name], mapping[:json_key_flag] + ) + klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + # Handle BY_KEY assembly mappings with custom with: callbacks + assembly_by_key_mappings.each do |mapping| + unless klass.attributes.key?(mapping[:attr_name]) + asm_type = mapping[:asm_klass] || Lutaml::Model::Serializable + klass.attribute mapping[:attr_name], asm_type, collection: true + end + callbacks = build_by_key_assembly_callbacks( + klass, mapping[:asm_klass], mapping[:json_name], + mapping[:attr_name], mapping[:json_key_flag], + grouped: 
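The SINGLETON_OR_ARRAY round-trip handled above — accept either a lone object or an array on read, collapse one-element collections back to the bare element on write — reduces to two small normalizers. This is a sketch of the shape convention only; the real callbacks also deserialize each element into its model class.

```ruby
# Read: normalize a JSON value that may be a single object or an array.
def soa_read(value)
  value.is_a?(Array) ? value : [value]
end

# Write: a one-element collection serializes back to the bare element.
def soa_write(items)
  items.length == 1 ? items.first : items
end
```

Keeping the internal representation as an array regardless of the wire shape is what lets the rest of the generator treat these attributes as ordinary collections.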
mapping[:grouped] || false, + child_attr: mapping[:child_attr] + ) + klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + # Handle SINGLETON_OR_ARRAY assembly mappings with custom with: callbacks + assembly_soa_mappings.each do |mapping| + attr_sym = mapping[:attr_name] + json_name = mapping[:json_name] + from_m = mapping[:from_method] + to_m = mapping[:to_method] + asm_klass = mapping[:asm_klass] + + # Typed instances for all SOA (both explicit and implicit) + klass.define_method(from_m) do |instance, value| + items = case value + when Hash then [value] + when Array then value + else return + end + parsed = if asm_klass + items.map do |item| + asm_klass.of_json(item.is_a?(Hash) ? item : {}) + end + else + items + end + # For singleton attributes (collection: false), unwrap single-item arrays + attr_def = klass.attributes[attr_sym] + if parsed.length == 1 && attr_def && !attr_def.collection + instance.instance_variable_set("@#{attr_sym}", parsed.first) + else + instance.instance_variable_set("@#{attr_sym}", parsed) + end + end + + klass.define_method(to_m) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + result = current.map do |item| + if asm_klass && item.is_a?(Lutaml::Model::Serializable) + asm_klass.as_json(item) + else + item + end + end + doc[json_name] = result.length == 1 ? result.first : result + elsif current + doc[json_name] = if asm_klass && current.is_a?(Lutaml::Model::Serializable) + asm_klass.as_json(current) + else + current + end + end + end + + klass.class_eval do + key_value do + map json_name, to: attr_sym, render_empty: true, + with: { to: to_m, from: from_m } + end + end + end + + # Collapsible BY_KEY: when an assembly has no flags and only one BY_KEY + # child, the NIST toolchain outputs the BY_KEY map directly without the + # group-as name wrapper (e.g. 
author-index JSON is {"archimedes": {...}} + # not {"authors": {"archimedes": {...}}}). + if flag_defs.empty? && flag_refs.empty? && + json_assembly_mappings.length == 1 && + json_assembly_mappings.first[:by_key] + + by_key_json_name = json_assembly_mappings.first[:json_name] + + orig_of_json = klass.method(:of_json) + klass.define_singleton_method(:of_json) do |data, options = {}| + if data.is_a?(Hash) && !data.key?(by_key_json_name) + orig_of_json.call({ by_key_json_name => data }, options) + else + orig_of_json.call(data, options) + end + end + + orig_as_json = klass.method(:as_json) + klass.define_singleton_method(:as_json) do |instance, options = {}| + result = orig_as_json.call(instance, options) + if result.is_a?(Hash) && result.key?(by_key_json_name) + result[by_key_json_name] + else + result + end + end + end + end + + def collect_json_field_mappings(assembly_def) + mappings = [] + model = assembly_def.model + return mappings unless model + + mappings.concat(collect_model_json_field_mappings(model)) + mappings + end + + def collect_model_json_field_mappings(model) + mappings = [] + + model.field&.each { |fr| mappings << build_field_json_mapping(fr) } + model.define_field&.each do |fd| + mappings << build_inline_field_json_mapping(fd) if fd.name + end + model.choice&.each do |c| + c.field&.each { |fr| mappings << build_field_json_mapping(fr) } + c.define_field&.each do |fd| + mappings << build_inline_field_json_mapping(fd) if fd.name + end + end + model.choice_group&.each do |cg| + cg.field&.each do |fr| + mappings << build_field_json_mapping(fr, cg.group_as) + end + cg.define_field&.each do |fd| + mappings << build_inline_field_json_mapping(fd) if fd.name + end + end + + mappings + end + + def build_field_json_mapping(field_ref, override_group_as = nil) + ref_name = field_ref.ref + return nil unless ref_name + + group_as = override_group_as || field_ref.group_as + field_def = @field_defs[ref_name] + field_klass = @classes["Field_#{ref_name.gsub('-', 
'_')}"] + has_flags = field_has_flags?(field_def) + + json_name = if group_as + group_as.name + else + field_ref.use_name&.content || ref_name + end + attr_name = safe_attr(ref_name) + + # Check for BY_KEY group-as + if group_as&.in_json == "BY_KEY" + json_key_flag = field_def&.json_key&.flag_ref + return { + json_name: json_name, attr_name: attr_name, + by_key: true, field_klass: field_klass, + json_key_flag: json_key_flag + } + end + + # Check for json-value-key-flag pattern + if field_klass&.instance_variable_get(:@json_vk_flag_key_attr) + return { + json_name: json_name, attr_name: attr_name, + vk_flag: true, field_klass: field_klass + } + end + + if has_flags + is_soa = group_as && ["SINGLETON_OR_ARRAY", + "ARRAY"].include?(group_as.in_json) + method_suffix = "#{attr_name}_#{json_name.gsub('-', '_')}" + if is_soa + result = { + json_name: json_name, attr_name: attr_name, scalar: false, + singleton_or_array: true, field_klass: field_klass, + to_method: :"json_soa_to_#{method_suffix}", + from_method: :"json_soa_from_#{method_suffix}" + } + # Include ref_name for SOA fields with group-as, so we can also + # accept the ref name as a JSON key during deserialization (some + # NIST worked examples use ref name instead of group-as name). 
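The alternate-key behaviour described in the comment above — accepting the ref name in addition to the group-as name during deserialization — reduces to a keyed lookup with a fallback. A minimal plain-Ruby sketch (the helper name and the `"aliases"`/`"alias"` keys are illustrative, not part of the generated API):

```ruby
# Sketch: a field with group-as may appear under either the group-as name
# ("aliases") or the ref name ("alias") in worked examples. Hypothetical
# helper, not the generated classes' API.
def fetch_with_alias(data, json_name, alt_json_name)
  data.key?(json_name) ? data[json_name] : data[alt_json_name]
end

fetch_with_alias({ "aliases" => ["a"] }, "aliases", "alias") # => ["a"]
fetch_with_alias({ "alias" => ["a"] }, "aliases", "alias")   # => ["a"]
```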
+ if group_as && ref_name != json_name + result[:alt_json_name] = + ref_name + end + result + else + # Singleton field with flags: typed instance, no array wrapping + { + json_name: json_name, attr_name: attr_name, scalar: true, + has_flags: true, field_klass: field_klass, + to_method: :"json_to_#{method_suffix}", + from_method: :"json_from_#{method_suffix}" + } + end + else + method_suffix = "#{attr_name}_#{json_name.gsub('-', '_')}" + { + json_name: json_name, attr_name: attr_name, scalar: true, + field_klass: field_klass, + to_method: :"json_to_#{method_suffix}", + from_method: :"json_from_#{method_suffix}" + } + end + end + + def build_inline_field_json_mapping(field_def) + json_name = field_def.name + attr_name = safe_attr(field_def.name) + has_flags = field_has_flags?(field_def) + + if has_flags + field_klass = @classes[scoped_field_name(field_def.name)] + method_suffix = "#{attr_name}_#{json_name.gsub('-', '_')}" + { + json_name: json_name, attr_name: attr_name, scalar: false, + singleton_or_array: true, field_klass: field_klass, + to_method: :"json_soa_to_#{method_suffix}", + from_method: :"json_soa_from_#{method_suffix}" + } + else + { json_name: json_name, attr_name: attr_name, scalar: false } + end + end + + def field_has_flags?(field_def, _field_ref = nil) + return false unless field_def + + field_def.define_flag&.any? || field_def.flag&.any? || field_def.json_value_key || field_def.json_value_key_flag + end + + def collect_json_assembly_mappings(assembly_def) + mappings = [] + model = assembly_def.model + return mappings unless model + + mappings.concat(collect_model_json_assembly_mappings(model)) + mappings + end + + def collect_model_json_assembly_mappings(model) + mappings = [] + + model.assembly&.each do |ar| + ref_name = ar.ref + next unless ref_name + + group_as = ar.group_as + json_name = group_as&.name || ar.use_name&.content || ref_name + # When GROUPED in XML, the attribute is the group-as name (wrapper). 
+ # Otherwise it's the ref name (direct collection). + attr_name = group_as&.in_xml == "GROUPED" ? safe_attr(group_as.name) : safe_attr(ref_name) + mapping = { json_name: json_name, attr_name: attr_name } + if group_as&.in_json == "BY_KEY" + asm_def = @assembly_defs[ref_name] + json_key_flag = asm_def&.json_key&.flag_ref + asm_klass = @classes["Assembly_#{ref_name.gsub('-', '_')}"] + mapping[:by_key] = true + mapping[:asm_klass] = asm_klass + mapping[:json_key_flag] = json_key_flag + mapping[:grouped] = true if group_as&.in_xml == "GROUPED" + if group_as&.in_xml == "GROUPED" + mapping[:child_attr] = + safe_attr(ref_name) + end + else + check_assembly_soa(mapping, group_as, attr_name, json_name) + end + mappings << mapping + end + + model.define_assembly&.each do |ad| + next unless ad.name + + group_as = ad.group_as + json_name = group_as&.name || ad.name + attr_name = safe_attr(ad.name) + mapping = { json_name: json_name, attr_name: attr_name } + if group_as&.in_json == "BY_KEY" + json_key_flag = ad.json_key&.flag_ref + mapping[:by_key] = true + mapping[:json_key_flag] = json_key_flag + else + check_assembly_soa(mapping, group_as, attr_name, json_name) + end + mappings << mapping + end + + model.choice&.each do |c| + c.assembly&.each do |ar| + ref_name = ar.ref + next unless ref_name + + group_as = ar.group_as + json_name = group_as&.name || ar.use_name&.content || ref_name + attr_name = safe_attr(ref_name) + mapping = { json_name: json_name, attr_name: attr_name } + check_assembly_soa(mapping, group_as, attr_name, json_name) + mappings << mapping + end + c.define_assembly&.each do |ad| + next unless ad.name + + group_as = ad.group_as + json_name = group_as&.name || ad.name + attr_name = safe_attr(ad.name) + mapping = { json_name: json_name, attr_name: attr_name } + check_assembly_soa(mapping, group_as, attr_name, json_name) + mappings << mapping + end + end + + model.choice_group&.each do |cg| + group_as = cg.group_as + json_name = group_as&.name + 
cg.assembly&.each do |ar| + ref_name = ar.ref + next unless ref_name + + name = json_name || ar.use_name&.content || ref_name + attr_name = safe_attr(ref_name) + mapping = { json_name: name, attr_name: attr_name } + check_assembly_soa(mapping, group_as, attr_name, name) + mappings << mapping + end + cg.define_assembly&.each do |ad| + next unless ad.name + + name = json_name || ad.name + attr_name = safe_attr(ad.name) + mapping = { json_name: name, attr_name: attr_name } + check_assembly_soa(mapping, group_as, attr_name, name) + mappings << mapping + end + end + + mappings + end + + def check_assembly_soa(mapping, group_as, attr_name, json_name) + is_soa = group_as&.in_json == "SINGLETON_OR_ARRAY" || group_as.nil? + return unless is_soa + + method_suffix = "#{attr_name}_#{json_name.gsub('-', '_')}" + mapping[:singleton_or_array] = true + mapping[:to_method] = :"json_assembly_soa_to_#{method_suffix}" + mapping[:from_method] = :"json_assembly_soa_from_#{method_suffix}" + # Attach the assembly class for casting in from: callback + asm_klass = @classes["Assembly_#{attr_name.to_s.gsub('-', '_')}"] + mapping[:asm_klass] = asm_klass if asm_klass + end + + # ── Assembly Class Generation ───────────────────────────────────── + + def create_assembly_placeholder(assembly_def) + return unless assembly_def.name + + klass_name = "Assembly_#{assembly_def.name.gsub('-', '_')}" + @classes[klass_name] ||= Class.new(Lutaml::Model::Serializable) + end + + def populate_assembly_class(assembly_def) + return unless assembly_def.name + + klass_name = "Assembly_#{assembly_def.name.gsub('-', '_')}" + klass = @classes[klass_name] + return unless klass + + @current_assembly_name = assembly_def.name.gsub("-", "_") + + assembly_def.define_flag&.each { |f| add_inline_flag(klass, f) } + assembly_def.flag&.each { |f| add_flag_reference(klass, f) } + + process_model(klass, assembly_def.model) if assembly_def.model + + root_name = assembly_def.root_name&.content || assembly_def.name + 
build_assembly_xml(klass, root_name, assembly_def) + build_assembly_json(klass, root_name, assembly_def) + + if assembly_def.root_name&.content + add_json_root_handling(klass, + root_name) + end + + apply_constraint_validation(klass, assembly_def.constraint) + klass.instance_variable_set(:@populated, true) + ensure + @current_assembly_name = nil + end + + def build_assembly_xml(klass, root_name, assembly_def) + flag_defs = assembly_def.define_flag || [] + flag_refs = assembly_def.flag || [] + child_mappings = collect_child_mappings(assembly_def) + + # Precompute safe attribute names + flag_attr_maps = flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + flag_ref_maps = flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + klass.class_eval do + xml do + element root_name + ordered + + flag_attr_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + + flag_ref_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + + child_mappings.each do |mapping| + map_element mapping[:xml_name], to: mapping[:attr_name] + end + end + end + end + + def collect_child_mappings(assembly_def) + mappings = [] + model = assembly_def.model + return mappings unless model + + mappings.concat(collect_model_child_mappings(model)) + mappings + end + + def collect_model_child_mappings(model) + mappings = [] + + model.field&.each do |field_ref| + ref_name = field_ref.ref + next unless ref_name + + xml_name = field_ref.use_name&.content || ref_name + group_as = field_ref.group_as + grouped = group_as&.in_xml == "GROUPED" + + mappings << build_child_mapping(xml_name, group_as, grouped, ref_name) + end + + model.assembly&.each do |assembly_ref| + ref_name = assembly_ref.ref + next unless ref_name + + xml_name = assembly_xml_element_name(assembly_ref) + group_as = assembly_ref.group_as + grouped = group_as&.in_xml == "GROUPED" + + attr_name = grouped ? 
safe_attr(group_as.name) : safe_attr(ref_name) + mappings << { xml_name: grouped ? group_as.name : xml_name, + attr_name: attr_name, grouped: grouped } + end + + model.define_field&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + model.define_assembly&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + model.choice&.each do |c| + mappings.concat(collect_choice_child_mappings(c)) + end + model.choice_group&.each do |cg| + mappings.concat(collect_choice_group_child_mappings(cg)) + end + + mappings + end + + def collect_choice_child_mappings(choice) + mappings = [] + + choice.field&.each do |field_ref| + ref_name = field_ref.ref + next unless ref_name + + xml_name = field_ref.use_name&.content || ref_name + group_as = field_ref.group_as + grouped = group_as&.in_xml == "GROUPED" + + mappings << build_child_mapping(xml_name, group_as, grouped, ref_name) + end + + choice.assembly&.each do |assembly_ref| + ref_name = assembly_ref.ref + next unless ref_name + + xml_name = assembly_xml_element_name(assembly_ref) + group_as = assembly_ref.group_as + grouped = group_as&.in_xml == "GROUPED" + + attr_name = grouped ? safe_attr(group_as.name) : safe_attr(ref_name) + mappings << { xml_name: grouped ? 
group_as.name : xml_name, + attr_name: attr_name, grouped: grouped } + end + + choice.define_field&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + choice.define_assembly&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + mappings + end + + def collect_choice_group_child_mappings(choice_group) + mappings = [] + + choice_group.field&.each do |field_ref| + ref_name = field_ref.ref + next unless ref_name + + xml_name = field_ref.use_name&.content || ref_name + group_as = choice_group.group_as + grouped = group_as&.in_xml == "GROUPED" + mappings << build_child_mapping(xml_name, group_as, grouped, ref_name) + end + + choice_group.assembly&.each do |assembly_ref| + ref_name = assembly_ref.ref + next unless ref_name + + xml_name = assembly_xml_element_name(assembly_ref) + group_as = choice_group.group_as + grouped = group_as&.in_xml == "GROUPED" + attr_name = grouped ? safe_attr(group_as.name) : safe_attr(ref_name) + mappings << { xml_name: grouped ? 
group_as.name : xml_name, + attr_name: attr_name, grouped: grouped } + end + + choice_group.define_field&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + choice_group.define_assembly&.each do |inline_def| + next unless inline_def.name + + mappings << { xml_name: inline_def.name, + attr_name: safe_attr(inline_def.name), grouped: false } + end + + mappings + end + + def build_child_mapping(xml_name, group_as, grouped, ref_name = nil) + if grouped + { xml_name: group_as.name, attr_name: safe_attr(group_as.name), + grouped: true } + else + attr_name = safe_attr(ref_name || xml_name) + { xml_name: xml_name, attr_name: attr_name, grouped: false } + end + end + + # ── Model Processing ────────────────────────────────────────────── + + def process_model(klass, model) + # Initialize occurrence constraints registry + unless klass.instance_variable_defined?(:@occurrence_constraints) + klass.instance_variable_set(:@occurrence_constraints, + {}) + end + occ = klass.instance_variable_get(:@occurrence_constraints) + + model.field&.each do |fr| + add_field_reference(klass, fr) + record_occurrence_constraint(occ, fr) + end + model.assembly&.each do |ar| + add_assembly_reference(klass, ar) + record_occurrence_constraint(occ, ar) + end + model.define_field&.each { |fd| add_inline_field(klass, fd) } + model.define_assembly&.each { |ad| add_inline_assembly(klass, ad) } + model.choice&.each { |c| process_choice(klass, c) } + model.choice_group&.each { |cg| process_choice_group(klass, cg) } + add_any_content(klass) if model.any + + # Add validate_occurrences method if not already defined + unless klass.method_defined?(:validate_occurrences) + occ_ref = klass.instance_variable_get(:@occurrence_constraints) + klass.define_method(:validate_occurrences) do + Metaschema::ConstraintValidator.validate_occurrences(self, occ_ref) + end + end + end + + def 
record_occurrence_constraint(occ, ref) + ref_name = ref.ref + return unless ref_name + + attr_name = safe_attr(ref_name) + min = ref.min_occurs.to_i + max_raw = ref.max_occurs + max = max_raw == "unbounded" ? nil : max_raw&.to_i + + occ[attr_name] = { min: min, max: max } if min.positive? || max + end + + def add_field_reference(klass, field_ref) + ref_name = field_ref.ref + return unless ref_name + + field_klass = @classes["Field_#{ref_name.gsub('-', '_')}"] + return unless field_klass + + collection = unbounded?(field_ref.max_occurs) + group_as = field_ref.group_as + + if group_as&.in_xml == "GROUPED" + group_attr = safe_attr(group_as.name) + wrapper_klass = Class.new(Lutaml::Model::Serializable) + child_attr = safe_attr(ref_name) + wrapper_klass.attribute child_attr, field_klass, collection: true + wrapper_klass.class_eval do + xml do + element group_as.name + map_element ref_name, to: child_attr + end + end + klass.attribute group_attr, wrapper_klass + else + attr_name = safe_attr(ref_name) + klass.attribute attr_name, field_klass, collection: collection + end + end + + def add_assembly_reference(klass, assembly_ref) + ref_name = assembly_ref.ref + return unless ref_name + + assembly_klass = @classes["Assembly_#{ref_name.gsub('-', '_')}"] || + create_placeholder_assembly(ref_name) + + collection = unbounded?(assembly_ref.max_occurs) + group_as = assembly_ref.group_as + xml_name = assembly_xml_element_name(assembly_ref) + + if group_as&.in_xml == "GROUPED" + group_attr = safe_attr(group_as.name) + child_attr = safe_attr(ref_name) + wrapper_klass = Class.new(Lutaml::Model::Serializable) + wrapper_klass.attribute child_attr, assembly_klass, collection: true + wrapper_klass.class_eval do + xml do + element group_as.name + map_element xml_name, to: child_attr + end + end + klass.attribute group_attr, wrapper_klass + else + attr_name = safe_attr(ref_name) + klass.attribute attr_name, assembly_klass, collection: collection + end + end + + def add_inline_field(klass, 
field_def) + return unless field_def.name + + attr_name = safe_attr(field_def.name) + is_markup = TypeMapper.markup?(field_def.as_type) + is_multiline = TypeMapper.multiline?(field_def.as_type) + content_type = TypeMapper.map(field_def.as_type) + collection = unbounded?(field_def.max_occurs) + has_flags = field_def.define_flag&.any? || field_def.flag&.any? + + if is_markup || is_multiline + inline_klass = Class.new(Lutaml::Model::Serializable) + if is_multiline + apply_markup_multiline_attributes(inline_klass) + else + apply_markup_attributes(inline_klass) + end + + field_def.define_flag&.each { |f| add_inline_flag(inline_klass, f) } + field_def.flag&.each { |f| add_flag_reference(inline_klass, f) } + + inline_name = field_def.name + inline_flag_defs = field_def.define_flag || [] + inline_flag_refs = field_def.flag || [] + inline_flag_attr_maps = inline_flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + inline_flag_ref_maps = inline_flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + inline_klass.class_eval do + xml do + element inline_name + mixed_content + ordered + map_content to: :content + map_element "a", to: :a + map_element "insert", to: :insert + map_element "br", to: :br + map_element "code", to: :code + map_element "em", to: :em + map_element "i", to: :i + map_element "b", to: :b + map_element "strong", to: :strong + map_element "sub", to: :sub + map_element "sup", to: :sup + map_element "q", to: :q + map_element "img", to: :img + + if is_multiline + map_element "p", to: :p + map_element "h1", to: :h1 + map_element "h2", to: :h2 + map_element "h3", to: :h3 + map_element "h4", to: :h4 + map_element "h5", to: :h5 + map_element "h6", to: :h6 + map_element "ul", to: :ul + map_element "ol", to: :ol + map_element "pre", to: :pre + map_element "hr", to: :hr + map_element "blockquote", to: :blockquote + map_element "table", to: :table + end + + inline_flag_attr_maps.each do |xml_name, attr_name| + map_attribute 
xml_name, to: attr_name + end + + inline_flag_ref_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + end + end + + klass.attribute attr_name, inline_klass, collection: collection + elsif has_flags + # Non-markup field with flags needs its own class for flag attributes + inline_klass = Class.new(Lutaml::Model::Serializable) + inline_klass.attribute :content, content_type + field_def.define_flag&.each { |f| add_inline_flag(inline_klass, f) } + field_def.flag&.each { |f| add_flag_reference(inline_klass, f) } + + flag_attr_maps = field_def.define_flag&.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end || [] + flag_ref_maps = field_def.flag&.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end || [] + + inline_name = field_def.name + inline_klass.class_eval do + xml do + element inline_name + map_content to: :content + flag_attr_maps.each do |xml_name, attr_sym| + map_attribute xml_name, to: attr_sym + end + flag_ref_maps.each do |xml_name, attr_sym| + map_attribute xml_name, to: attr_sym + end + end + key_value do + root inline_name + map "STRVALUE", to: :content + flag_attr_maps.each do |xml_name, attr_sym| + map xml_name, to: attr_sym + end + flag_ref_maps.each do |xml_name, attr_sym| + map xml_name, to: attr_sym + end + end + end + + # Register inline field class for JSON mapping lookups (scoped to parent) + klass_name = scoped_field_name(field_def.name) + @classes[klass_name] = inline_klass + + klass.attribute attr_name, inline_klass, collection: collection + else + klass.attribute attr_name, content_type, collection: collection + end + end + + def add_inline_assembly(klass, assembly_def) + return unless assembly_def.name + + attr_name = safe_attr(assembly_def.name) + collection = unbounded?(assembly_def.max_occurs) + + inline_klass = Class.new(Lutaml::Model::Serializable) + + assembly_def.define_flag&.each { |f| add_inline_flag(inline_klass, f) } + assembly_def.flag&.each { |f| add_flag_reference(inline_klass, 
f) } + + process_model(inline_klass, assembly_def.model) if assembly_def.model + + inline_name = assembly_def.name + inline_flag_defs = assembly_def.define_flag || [] + inline_flag_refs = assembly_def.flag || [] + inline_child_mappings = assembly_def.model ? collect_inline_child_mappings(assembly_def) : [] + inline_flag_attr_maps = inline_flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + inline_flag_ref_maps = inline_flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + inline_klass.class_eval do + xml do + element inline_name + ordered + + inline_flag_attr_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + + inline_flag_ref_maps.each do |xml_name, attr_name| + map_attribute xml_name, to: attr_name + end + + inline_child_mappings.each do |mapping| + map_element mapping[:xml_name], to: mapping[:attr_name] + end + end + end + + klass.attribute attr_name, inline_klass, collection: collection + + # Add JSON mappings for the inline assembly + build_inline_assembly_json(klass, inline_klass, inline_name, assembly_def) + end + + def build_inline_assembly_json(_parent_klass, inline_klass, inline_name, +assembly_def) + flag_defs = assembly_def.define_flag || [] + flag_refs = assembly_def.flag || [] + + inline_flag_attr_maps = flag_defs.filter_map do |f| + [f.name, safe_attr(f.name)] if f.name + end + inline_flag_ref_maps = flag_refs.filter_map do |f| + [f.ref, safe_attr(f.ref)] if f.ref + end + + json_field_mappings = collect_json_field_mappings(assembly_def) + json_assembly_mappings = collect_json_assembly_mappings(assembly_def) + + # Check if this inline assembly has any nested assembly children + # that might be empty objects (choice assemblies). If so, we need + # custom JSON handling because lutaml-model skips empty nested models. + has_nested_asm = json_assembly_mappings.any? 
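The generated to-callbacks repeatedly use the expression `result.length == 1 ? result.first : result` to implement the SINGLETON_OR_ARRAY convention. Stripped of the lutaml-model plumbing, the round-trip can be sketched in plain Ruby (method names here are illustrative assumptions, not the generated method names):

```ruby
# Sketch of the SINGLETON_OR_ARRAY convention: a one-element collection
# collapses to a bare value on serialization and re-expands to an array
# on deserialization.
def soa_write(items)
  items.length == 1 ? items.first : items
end

def soa_read(value)
  value.is_a?(Array) ? value : [value]
end

soa_write([{ "id" => 1 }])                 # => {"id"=>1}
soa_write([{ "id" => 1 }, { "id" => 2 }])  # => [{"id"=>1}, {"id"=>2}]
soa_read(soa_write([{ "id" => 1 }]))       # => [{"id"=>1}]
```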
+ + if has_nested_asm + # Use custom of_json/to_json that handles empty nested assemblies + build_inline_assembly_json_custom( + inline_klass, inline_name, inline_flag_attr_maps, inline_flag_ref_maps, + json_field_mappings, json_assembly_mappings + ) + else + # Standard lutaml-model mapping approach + build_inline_assembly_json_standard( + inline_klass, inline_name, inline_flag_attr_maps, inline_flag_ref_maps, + json_field_mappings + ) + end + end + + def build_inline_assembly_json_standard(inline_klass, inline_name, + inline_flag_attr_maps, inline_flag_ref_maps, + json_field_mappings) + regular_field_mappings = json_field_mappings.reject do |m| + m[:vk_flag] || m[:by_key] + end + vk_flag_mappings = json_field_mappings.select { |m| m[:vk_flag] } + by_key_mappings = json_field_mappings.select { |m| m[:by_key] } + + inline_klass.class_eval do + key_value do + root inline_name + + inline_flag_attr_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + inline_flag_ref_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + regular_field_mappings.each do |mapping| + if mapping[:scalar] + map mapping[:json_name], to: mapping[:attr_name], + with: { to: mapping[:to_method], from: mapping[:from_method] } + else + map mapping[:json_name], to: mapping[:attr_name], + render_empty: true + end + end + end + end + + define_scalar_field_callbacks(inline_klass, regular_field_mappings) + + vk_flag_mappings.each do |mapping| + callbacks = build_vk_flag_field_callbacks( + inline_klass, mapping[:field_klass], mapping[:json_name], mapping[:attr_name] + ) + inline_klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + by_key_mappings.each do |mapping| + callbacks = build_by_key_field_callbacks( + inline_klass, mapping[:field_klass], mapping[:json_name], + mapping[:attr_name], mapping[:json_key_flag] + ) + inline_klass.class_eval 
do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + end + + def build_inline_assembly_json_custom(inline_klass, inline_name, + inline_flag_attr_maps, inline_flag_ref_maps, + json_field_mappings, json_assembly_mappings) + # Build full JSON mappings — include assembly mappings so lutaml-model's + # Transformation path can parse them when this class is nested in a parent. + regular_field_mappings = json_field_mappings.reject do |m| + m[:vk_flag] || m[:by_key] + end + vk_flag_mappings = json_field_mappings.select { |m| m[:vk_flag] } + by_key_mappings = json_field_mappings.select { |m| m[:by_key] } + + # Pre-generate method names for assembly mappings (only to: for serialization) + json_assembly_mappings.each do |mapping| + json_name = mapping[:json_name] + attr_sym = mapping[:attr_name] + mapping[:to_method] = + :"json_to_asm_#{attr_sym}_#{json_name.gsub('-', '_')}" + end + + inline_klass.class_eval do + key_value do + root inline_name + + inline_flag_attr_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + inline_flag_ref_maps.each do |xml_name, attr_name| + map xml_name, to: attr_name + end + + regular_field_mappings.each do |mapping| + if mapping[:scalar] + map mapping[:json_name], to: mapping[:attr_name], + with: { to: mapping[:to_method], from: mapping[:from_method] } + else + map mapping[:json_name], to: mapping[:attr_name], + render_empty: true + end + end + + # Assembly mappings use to: override for serialization. + # Default from: handles casting via lutaml-model's built-in mechanism. 
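The `with: { to: ... }` overrides registered here all follow the same shape: the callback receives the model instance and the output document hash, and writes the serialized value under the JSON name itself. A simplified stand-in for that contract, using a plain `Struct` and lambda rather than the real lutaml-model machinery (all names are illustrative assumptions):

```ruby
# Simplified stand-in for a with: { to: ... } serialization callback:
# it reads the attribute off the instance and writes into the doc hash,
# skipping nil so absent values are omitted. Not the real lutaml-model API.
Example = Struct.new(:address, keyword_init: true)

to_address = lambda do |instance, doc|
  current = instance.address
  doc["address"] = current unless current.nil?
end

doc = {}
to_address.call(Example.new(address: { "city" => "Syracuse" }), doc)
doc # => {"address"=>{"city"=>"Syracuse"}}
```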
+ json_assembly_mappings.each do |mapping| + map mapping[:json_name], to: mapping[:attr_name], + with: { to: mapping[:to_method] } + end + end + end + + # Define with: callback methods for scalar field mappings + define_scalar_field_callbacks(inline_klass, regular_field_mappings) + + vk_flag_mappings.each do |mapping| + callbacks = build_vk_flag_field_callbacks( + inline_klass, mapping[:field_klass], mapping[:json_name], mapping[:attr_name] + ) + inline_klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + by_key_mappings.each do |mapping| + callbacks = build_by_key_field_callbacks( + inline_klass, mapping[:field_klass], mapping[:json_name], + mapping[:attr_name], mapping[:json_key_flag] + ) + inline_klass.class_eval do + key_value do + map mapping[:json_name], to: mapping[:attr_name], + with: { to: callbacks[:to_method], from: callbacks[:from_method] } + end + end + end + + # Define to: callback methods for assembly mappings. + json_assembly_mappings.each do |mapping| + attr_sym = mapping[:attr_name] + to_method = mapping[:to_method] + json_name = mapping[:json_name] + + inline_klass.define_method(to_method) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current + if current.is_a?(Lutaml::Model::Serializable) + # Serialize the nested assembly's attributes into the doc + sub = {} + current.class.mappings_for(:json).instance_variable_get(:@mappings).each do |key, rule| + val = current.send(rule.to) + next if val.nil? + + sub[key] = val.respond_to?(:content) ? val.content : val + end + doc[json_name] = sub.empty? ? 
{} : sub + else + doc[json_name] = current + end + end + end + end + end + + def define_scalar_field_callbacks(klass, field_mappings) + field_mappings.each do |mapping| + next unless mapping[:scalar] + + field_klass = mapping[:field_klass] + attr_sym = mapping[:attr_name] + + klass.define_method(mapping[:from_method]) do |instance, value| + if value.is_a?(Array) + instance.instance_variable_set("@#{attr_sym}", value.map do |v| + field_klass.new(content: v) + end) + elsif value + instance.instance_variable_set("@#{attr_sym}", + field_klass.new(content: value)) + end + end + + klass.define_method(mapping[:to_method]) do |instance, doc| + current = instance.instance_variable_get("@#{attr_sym}") + if current.is_a?(Array) + doc[mapping[:json_name]] = current.map do |item| + item.respond_to?(:content) ? item.content : item + end + elsif current + doc[mapping[:json_name]] = + current.respond_to?(:content) ? current.content : current + end + end + end + end + + def collect_inline_child_mappings(assembly_def) + model = assembly_def.model + return [] unless model + + collect_model_child_mappings(model) + end + + # ── Flag Handling ───────────────────────────────────────────────── + + def add_inline_flag(klass, flag_def) + return unless flag_def.name + + attr_name = safe_attr(flag_def.name) + type = TypeMapper.map(flag_def.as_type) + klass.attribute attr_name, type + end + + def add_flag_reference(klass, flag_ref) + return unless flag_ref.ref + + flag_name = flag_ref.ref + flag_def = @flag_defs[flag_name] + attr_name = safe_attr(flag_name) + type = flag_def ? 
TypeMapper.map(flag_def.as_type) : :string + klass.attribute attr_name, type + end + + # ── Choice Handling ─────────────────────────────────────────────── + + def process_choice(klass, choice) + choice.assembly&.each { |ar| add_assembly_reference(klass, ar) } + choice.field&.each { |fr| add_field_reference(klass, fr) } + choice.define_assembly&.each { |ad| add_inline_assembly(klass, ad) } + choice.define_field&.each { |fd| add_inline_field(klass, fd) } + end + + def process_choice_group(klass, choice_group) + choice_group.assembly&.each do |ar| + add_grouped_assembly_reference(klass, ar) + end + choice_group.field&.each { |fr| add_grouped_field_reference(klass, fr) } + choice_group.define_assembly&.each { |ad| add_inline_assembly(klass, ad) } + choice_group.define_field&.each { |fd| add_inline_field(klass, fd) } + end + + def add_grouped_assembly_reference(klass, grouped_ref) + ref_name = grouped_ref.ref + return unless ref_name + + assembly_klass = @classes["Assembly_#{ref_name.gsub('-', '_')}"] || + create_placeholder_assembly(ref_name) + + attr_name = safe_attr(ref_name) + klass.attribute attr_name, assembly_klass + end + + def add_grouped_field_reference(klass, grouped_ref) + ref_name = grouped_ref.ref + return unless ref_name + + field_klass = @classes["Field_#{ref_name.gsub('-', '_')}"] + return unless field_klass + + attr_name = safe_attr(ref_name) + klass.attribute attr_name, field_klass + end + + # ── Helpers ─────────────────────────────────────────────────────── + + def scoped_field_name(field_name) + base = "Field_#{field_name.gsub('-', '_')}" + @current_assembly_name ? 
"#{base}_in_#{@current_assembly_name}" : base + end + + def unbounded?(max_occurs) + max_occurs == "unbounded" || (max_occurs.to_i > 1 && max_occurs != "1") + end + + def create_placeholder_assembly(name) + key = "Assembly_#{name.gsub('-', '_')}" + @classes[key] ||= Class.new(Lutaml::Model::Serializable) + end + + def add_any_content(klass) + klass.attribute :any_content, :string + end + + def add_json_root_handling(klass, json_root) + klass.instance_variable_set(:@json_root_name, json_root) + class << klass + attr_reader :json_root_name + end + + original_of_json = klass.method(:of_json) + klass.define_singleton_method(:of_json) do |doc, options = {}| + if doc.is_a?(Hash) && doc.key?(json_root_name) + original_of_json.call(doc[json_root_name], options) + else + original_of_json.call(doc, options) + end + end + + original_to_json = klass.method(:to_json) + klass.define_singleton_method(:to_json) do |instance, options = {}| + json_str = original_to_json.call(instance, options) + { json_root_name => JSON.parse(json_str) }.to_json + end + + klass.send(:define_method, :to_json) do |options = {}| + self.class.to_json(self, options) + end + + # YAML root wrapping — mirrors JSON root handling + original_of_yaml = klass.method(:of_yaml) + klass.define_singleton_method(:of_yaml) do |doc, options = {}| + if doc.is_a?(Hash) && doc.key?(json_root_name) + original_of_yaml.call(doc[json_root_name], options) + else + original_of_yaml.call(doc, options) + end + end + + original_to_yaml = klass.method(:to_yaml) + klass.define_singleton_method(:to_yaml) do |instance, options = {}| + yaml_str = original_to_yaml.call(instance, options) + data = YAML.safe_load(yaml_str, + permitted_classes: [Date, DateTime, Time, Symbol]) + { json_root_name => data }.to_yaml + end + + klass.send(:define_method, :to_yaml) do |options = {}| + self.class.to_yaml(self, options) + end + end + + # ── Constraint Validation Integration ────────────────────────────── + + def apply_constraint_validation(klass, 
constraint_def) + return unless constraint_def + + # Store the constraint definition on the class for later access + klass.instance_variable_set(:@metaschema_constraints, constraint_def) + klass.define_singleton_method(:metaschema_constraints) do + @metaschema_constraints + end + + klass.define_method(:validate_constraints) do + validator = ConstraintValidator.new + validator.validate(self, self.class.metaschema_constraints) + end + end + end +end diff --git a/lib/metaschema/preformatted_type.rb b/lib/metaschema/preformatted_type.rb index 36444a3..1e1f158 100644 --- a/lib/metaschema/preformatted_type.rb +++ b/lib/metaschema/preformatted_type.rb @@ -2,7 +2,7 @@ module Metaschema class PreformattedType < Lutaml::Model::Serializable - attribute :content, :string + attribute :content, :string, collection: true attribute :a, AnchorType, collection: true attribute :insert, InsertType, collection: true attribute :br, :string, collection: true diff --git a/lib/metaschema/root.rb b/lib/metaschema/root.rb index ff7225e..9aeee67 100644 --- a/lib/metaschema/root.rb +++ b/lib/metaschema/root.rb @@ -15,6 +15,7 @@ class Root < Lutaml::Model::Serializable attribute :define_assembly, GlobalAssemblyDefinitionType, collection: true attribute :define_field, GlobalFieldDefinitionType, collection: true attribute :define_flag, GlobalFlagDefinitionType, collection: true + attribute :augment, AugmentType, collection: true xml do element "METASCHEMA" @@ -34,6 +35,7 @@ class Root < Lutaml::Model::Serializable map_element "define-assembly", to: :define_assembly map_element "define-field", to: :define_field map_element "define-flag", to: :define_flag + map_element "augment", to: :augment end end end diff --git a/lib/metaschema/ruby_source_emitter.rb b/lib/metaschema/ruby_source_emitter.rb new file mode 100644 index 0000000..24a31f0 --- /dev/null +++ b/lib/metaschema/ruby_source_emitter.rb @@ -0,0 +1,869 @@ +# frozen_string_literal: true + +module Metaschema + # Emits Ruby source code from 
generated metaschema classes.
+ #
+ # After ModelGenerator#generate creates in-memory classes, this class
+ # introspects them and emits equivalent Ruby source code that can be
+ # saved to .rb files and loaded with `require`.
+ #
+ # Handles four kinds of type references:
+ # 1. Builtin types (:string, :integer, etc.) — emitted as symbol literals
+ # 2. Generated types (in @classes) — emitted as fully-qualified string refs
+ # 3. Framework types (named, from other gems) — emitted as bare class refs
+ # 4. Anonymous inline types — collected and emitted as separate named classes
+ #
+ # Usage:
+ # files = Metaschema::ModelGenerator.to_ruby_source(
+ # "oscal_complete_metaschema.xml",
+ # module_name: "Oscal::V1_2_1"
+ # )
+ # files.each { |name, source| File.write(name, source) }
+ #
+ class RubySourceEmitter
+ BUILTIN_TYPES = %i[string integer boolean float date time datetime
+ symbol].freeze
+ RESERVED_CLASS_NAMES = %w[Base Hash Method Object Class Module].freeze
+
+ def initialize(classes, module_name, generator)
+ @classes = classes
+ @module_name = module_name
+ @generator = generator
+ @class_name_cache = {}
+ @anon_name_map = {} # anonymous class → assigned name
+ end
+
+ def emit
+ sorted = sort_classes
+ collect_anonymous_types(sorted)
+ files = {}
+
+ source = emit_module_header
+
+ # Emit anonymous types first (they're dependencies of named classes)
+ @anon_name_map.each_value do |anon_name|
+ anon_class = @anon_name_map.key(anon_name)
+ source += "\n#{emit_anonymous_class(anon_name, anon_class)}"
+ end
+
+ sorted.each do |key, klass|
+ next unless klass.is_a?(Class) && klass < Lutaml::Model::Serializable
+
+ source += "\n#{emit_class(key, klass)}"
+ end
+ source += emit_module_footer
+ files["all_models.rb"] = source
+
+ files
+ end
+
+ # Emit as separate files per root model type.
+ def emit_split
+ sorted = sort_classes
+ collect_anonymous_types(sorted)
+ root_classes = find_root_classes
+ emitted = Set.new
+ files = {}
+
+ root_classes.each do |root_key, root_klass|
+ deps = find_dependencies(root_key, root_klass)
+ all_keys = ([root_key] + deps).uniq
+
+ source = emit_module_header
+
+ # Emit anonymous types needed by this root's dependency tree
+ emit_anon_deps_for(all_keys, source)
+
+ all_keys.each do |key|
+ klass = @classes[key]
+ next unless klass.is_a?(Class) && klass < Lutaml::Model::Serializable
+ next if emitted.include?(key)
+
+ source += "\n#{emit_class(key, klass)}"
+ emitted.add(key)
+ end
+ source += emit_module_footer
+
+ filename = clean_class_name(root_key).gsub(/([a-z])([A-Z])/,
+ '\1_\2').downcase + ".rb"
+ files[filename] = source
+ end
+
+ # Emit any remaining classes not covered by roots.
+ # sorted is an Array of [key, klass] pairs, so Hash#except is unavailable
+ remaining = sorted.reject { |key, _| emitted.include?(key) }
+ unless remaining.empty?
+ source = emit_module_header
+ remaining.each do |key, klass|
+ next unless klass.is_a?(Class) && klass < Lutaml::Model::Serializable
+
+ source += "\n#{emit_class(key, klass)}"
+ end
+ source += emit_module_footer
+ files["common.rb"] = source
+ end
+
+ files
+ end
+
+ private
+
+ def collect_anonymous_types(sorted)
+ used_names = Set.new(sorted.map { |key, _| clean_class_name(key) })
+
+ sorted.each do |key, klass|
+ next unless klass.is_a?(Class) && klass < Lutaml::Model::Serializable
+
+ klass.attributes.each do |attr_name, attr|
+ type = attr.type
+ next unless type.is_a?(Class) && type < Lutaml::Model::Serializable
+ next if @anon_name_map.key?(type)
+ next if @classes.any? { |_, v| v == type }
+ next if type.name && !type.name.empty?
# Named framework type
+
+ # Anonymous inline type — assign a name
+ parent_name = clean_class_name(key)
+ base = "#{parent_name}#{camelize(attr_name.to_s)}"
+ name = base
+ suffix = 2
+ while used_names.include?(name)
+ name = "#{base}#{suffix}"
+ suffix += 1
+ end
+ used_names.add(name)
+ @anon_name_map[type] = name
+ end
+ end
+ end
+
+ def emit_anon_deps_for(keys, source)
+ # Find anonymous types referenced by these classes, appending with <<
+ # so the caller's string is mutated (+= would only rebind the local)
+ keys.each do |key|
+ klass = @classes[key]
+ next unless klass
+
+ klass.attributes.each_value do |attr|
+ type = attr.type
+ next unless type.is_a?(Class) && type < Lutaml::Model::Serializable
+
+ anon_name = @anon_name_map[type]
+ next unless anon_name
+
+ source << "\n#{emit_anonymous_class(anon_name, type)}"
+ end
+ end
+ end
+
+ def sort_classes
+ flags = []
+ fields = []
+ assemblies = []
+
+ @classes.each do |key, klass|
+ case key
+ when /\AFlag_/ then flags << [key, klass]
+ when /\AField_/ then fields << [key, klass]
+ when /\AAssembly_/ then assemblies << [key, klass]
+ end
+ end
+
+ flags + fields + assemblies
+ end
+
+ def find_root_classes
+ @classes.select do |key, klass|
+ next unless key.start_with?("Assembly_")
+
+ klass.instance_variable_defined?(:@json_root_name) &&
+ klass.instance_variable_get(:@json_root_name)
+ end
+ end
+
+ def find_dependencies(_root_key, root_klass)
+ deps = Set.new
+ queue = [root_klass]
+
+ while (klass = queue.shift)
+ klass.attributes.each_value do |attr|
+ type = attr.type
+ next unless type.is_a?(Class) && type < Lutaml::Model::Serializable
+ next if type == klass
+
+ type_key = @classes.find { |_k, v| v == type }&.first
+ next unless type_key
+ next if deps.include?(type_key)
+
+ deps.add(type_key)
+ queue << type
+ end
+ end
+
+ deps.to_a
+ end
+
+ def clean_class_name(key)
+ parts = key.sub(/\A(Assembly|Field|Flag)_/, "").split("_")
+ name = parts.map(&:capitalize).join
+ name = "#{name}Field" if RESERVED_CLASS_NAMES.include?(name)
+ name
+ end
+
+ def camelize(str)
str.split("_").map(&:capitalize).join + end + + def type_reference(attr) + type = attr.type + if type.is_a?(Symbol) || BUILTIN_TYPES.include?(type) + ":#{type}" + elsif type.is_a?(Class) && type < Lutaml::Model::Serializable + key = @classes.find { |_, v| v == type }&.first + if key + # Generated type — use symbol for register-swappability + ":#{snake_case(clean_class_name(key))}" + elsif @anon_name_map.key?(type) + # Anonymous inline type — use symbol with assigned name + ":#{snake_case(@anon_name_map[type])}" + elsif type.name && !type.name.empty? + # Framework type from another gem — use bare class reference + type.name.to_s + else + ":string" + end + else + ":string" + end + end + + # Returns fully-qualified class name for use in method bodies (no quotes). + def type_constant(attr) + type = attr.type + if type.is_a?(Class) && type < Lutaml::Model::Serializable + key = @classes.find { |_, v| v == type }&.first + if key + "#{@module_name}::#{clean_class_name(key)}" + elsif @anon_name_map.key?(type) + "#{@module_name}::#{@anon_name_map[type]}" + else + type_name = type.name + type_name && !type_name.empty? ? 
type_name : nil + end + end + end + + def snake_case(str) + str + .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') + .gsub(/([a-z\d])([A-Z])/, '\1_\2') + .downcase + end + + def emit_module_header + register_id = derive_register_id + <<~RUBY + # frozen_string_literal: true + + module #{@module_name} + class Base < Lutaml::Model::Serializable + def self.lutaml_default_register + :#{register_id} + end + end + RUBY + end + + def derive_register_id + if @module_name.include?("::") + parts = @module_name.split("::") + ns = parts[0].downcase + ver = parts[1..].join("_").downcase.gsub(/^v/, "") + "#{ns}_#{ver}" + else + @module_name.downcase + end + end + + def emit_module_footer + "\nend\n" + end + + def emit_class(key, klass) + name = clean_class_name(key) + emit_named_class(name, klass) + end + + def emit_anonymous_class(name, klass) + emit_named_class(name, klass) + end + + def emit_named_class(name, klass) + lines = [] + lines << " class #{name} < Base" + + # Attributes + klass.attributes.each do |attr_name, attr| + type_ref = type_reference(attr) + opts = [] + opts << "collection: true" if attr.collection + lines << if opts.any? + " attribute :#{attr_name}, #{type_ref}, #{opts.join(', ')}" + else + " attribute :#{attr_name}, #{type_ref}" + end + end + + # XML mapping + xml_source = emit_xml_mapping(klass) + lines.concat(xml_source) if xml_source + + # Key-value mapping + kv_source = emit_key_value_mapping(klass) + lines.concat(kv_source) if kv_source + + # Custom methods for with: callbacks + custom_methods = emit_custom_methods(klass) + lines.concat(custom_methods) if custom_methods.any? + + # Root wrapping methods + root_methods = emit_root_wrapping(klass) + lines.concat(root_methods) if root_methods.any? + + # Constraint validation methods + constraint_methods = emit_constraint_methods(klass) + lines.concat(constraint_methods) if constraint_methods.any? 
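For reference, the two naming helpers above (`clean_class_name` and `snake_case`) can be exercised standalone. This sketch copies their bodies verbatim, except that the reserved-name suffixing is omitted; the key `"Assembly_plan_of_action"` is illustrative, not taken from the spec suite:

```ruby
# Standalone sketch of the emitter's naming helpers, copied from the
# methods above (minus RESERVED_CLASS_NAMES handling).

def clean_class_name(key)
  parts = key.sub(/\A(Assembly|Field|Flag)_/, "").split("_")
  parts.map(&:capitalize).join
end

def snake_case(str)
  str
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')
    .downcase
end

name = clean_class_name("Assembly_plan_of_action")
puts name             # "PlanOfAction"
puts snake_case(name) # "plan_of_action"
```

The first regex in `snake_case` splits runs of capitals ("XMLSchema" becomes "xml_schema"), the second splits ordinary camel-case boundaries.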
+ + # Occurrence validation + occ_methods = emit_occurrence_validation(klass) + lines.concat(occ_methods) if occ_methods + + lines << " end" + lines.join("\n") + end + + def emit_xml_mapping(klass) + xml_map = begin + klass.mappings_for(:xml) + rescue StandardError + nil + end + return nil unless xml_map + + lines = [] + lines << "" + lines << " xml do" + + element_name = xml_map.instance_variable_get(:@element_name) + lines << " element \"#{element_name}\"" if element_name + + if xml_map.instance_variable_get(:@mixed_content) + lines << " mixed_content" + end + + if xml_map.instance_variable_get(:@ordered) + lines << " ordered" + end + + # Content mapping + content = xml_map.instance_variable_get(:@content_mapping) + if content + lines << " map_content to: :#{content.to}" + end + + # Attribute mappings + xml_map.instance_variable_get(:@attributes)&.each do |xml_name, rule| + lines << " map_attribute \"#{xml_name}\", to: :#{rule.to}" + end + + # Element mappings + xml_map.instance_variable_get(:@elements)&.each do |xml_name, rule| + lines << " map_element \"#{xml_name}\", to: :#{rule.to}" + end + + lines << " end" + lines + end + + def emit_key_value_mapping(klass) + kv_map = begin + klass.mappings_for(:json) + rescue StandardError + nil + end + return nil unless kv_map + + mappings = kv_map.instance_variable_get(:@mappings) + return nil unless mappings && !mappings.empty? + + lines = [] + lines << "" + lines << " key_value do" + + root_name = kv_map.instance_variable_get(:@root_name) + lines << " root \"#{root_name}\"" if root_name && !root_name.empty? 
+ + mappings.each do |json_name, rule| + custom = rule.custom_methods + if custom && (custom[:from] || custom[:to]) + opts = [] + opts << "to: :#{rule.to}" + opts_parts = ["with: { "] + with_parts = [] + with_parts << "to: :#{custom[:to]}" if custom[:to] + with_parts << "from: :#{custom[:from]}" if custom[:from] + opts_parts << with_parts.join(", ") + opts_parts << " }" + opts << opts_parts.join + lines << " map \"#{json_name}\", #{opts.join(', ')}" + else + render_empty = rule.instance_variable_get(:@render_empty) + lines << if render_empty + " map \"#{json_name}\", to: :#{rule.to}, render_empty: true" + else + " map \"#{json_name}\", to: :#{rule.to}" + end + end + end + + lines << " end" + lines + end + + def emit_custom_methods(klass) + methods = [] + custom_method_names = (klass.instance_methods(false) - Lutaml::Model::Serializable.instance_methods) + .select { |m| m.to_s.start_with?("json_") } + + return methods if custom_method_names.empty? + + custom_method_names.each do |method_name| + ms = method_name.to_s + source = if ms.start_with?("json_assembly_soa_from_") + emit_assembly_soa_from_method(klass, method_name) + elsif ms.start_with?("json_assembly_soa_to_") + emit_assembly_soa_to_method(klass, method_name) + elsif ms.start_with?("json_soa_from_") + emit_field_soa_from_method(klass, method_name) + elsif ms.start_with?("json_soa_to_") + emit_field_soa_to_method(klass, method_name) + elsif ms.start_with?("json_from_bykey_asm_") + emit_bykey_asm_from_method(klass, method_name) + elsif ms.start_with?("json_to_bykey_asm_") + emit_bykey_asm_to_method(klass, method_name) + elsif ms.start_with?("json_from_bykey_") + emit_bykey_from_method(klass, method_name) + elsif ms.start_with?("json_to_bykey_") + emit_bykey_to_method(klass, method_name) + elsif ms.start_with?("json_from_vkf_") + emit_vkf_from_method(klass, method_name) + elsif ms.start_with?("json_to_vkf_") + emit_vkf_to_method(klass, method_name) + elsif ms.start_with?("json_from_") + 
emit_scalar_from_method(klass, method_name) + elsif ms.start_with?("json_to_") + emit_scalar_to_method(klass, method_name) + end + methods.concat(source) if source + end + + methods + end + + def emit_scalar_from_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + attr_sym = attr_name.to_sym + field_attr = klass.attributes[attr_sym] + return nil unless field_attr + + has_flags = field_attr.type.is_a?(Class) && field_attr.type < Lutaml::Model::Serializable + tc = type_constant(field_attr) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, value)" + + lines << " if value.is_a?(Array)" + if has_flags && tc + lines << " parsed = value.map { |v| #{tc}.of_json(v) }" + lines << " instance.instance_variable_set(:@#{attr_name}, parsed)" + lines << " elsif value.is_a?(Hash)" + lines << " if value.empty?" + lines << " inst = #{tc}.new(content: \"\")" + lines << " instance.instance_variable_set(:@#{attr_name}, inst)" + lines << " else" + lines << " instance.instance_variable_set(:@#{attr_name}, #{tc}.of_json(value))" + lines << " end" + lines << " elsif value" + lines << " instance.instance_variable_set(:@#{attr_name}, #{tc}.of_json(value))" + else + lines << " instance.instance_variable_set(:@#{attr_name}, value.map { |v| #{tc || 'String'}.new(content: v) })" + lines << " elsif value" + lines << " instance.instance_variable_set(:@#{attr_name}, #{tc || 'String'}.new(content: value))" + end + lines << " end" + + lines << " end" + lines + end + + def emit_scalar_to_method(klass, method_name) + ms = method_name.to_s + ms.sub("json_to_", "") + + json_name = find_json_name_for_to_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + field_attr = klass.attributes[attr_name.to_sym] + return nil unless field_attr + + has_flags = field_attr.type.is_a?(Class) && field_attr.type < Lutaml::Model::Serializable + tc = type_constant(field_attr) + + 
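The from-callback emitted for a scalar field without flags follows a fixed shape: an Array becomes an array of wrapped field instances, a single value becomes one instance. A minimal sketch of that shape, where `TitleField` and the hash standing in for the model instance are illustrative (the real emitted code wraps a generated field class and uses `instance_variable_set`):

```ruby
# Sketch of the emitted scalar from-callback shape. TitleField is a
# stand-in for a generated field class with a `content` attribute.

TitleField = Struct.new(:content, keyword_init: true)

def json_from_title(instance, value)
  if value.is_a?(Array)
    # Each element is wrapped in its own field instance
    instance[:title] = value.map { |v| TitleField.new(content: v) }
  elsif value
    instance[:title] = TitleField.new(content: value)
  end
end

doc = {}
json_from_title(doc, "Alphabet Catalog")
puts doc[:title].content # "Alphabet Catalog"

json_from_title(doc, %w[a b])
puts doc[:title].map(&:content).inspect # ["a", "b"]
```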
lines = [] + lines << "" + lines << " def #{method_name}(instance, doc)" + + lines << " current = instance.instance_variable_get(:@#{attr_name})" + lines << " if current.is_a?(Array)" + lines << " doc[\"#{json_name}\"] = current.map { |item| item.respond_to?(:content) ? item.content : item }" + lines << " elsif current" + if has_flags && tc + lines << " if current.is_a?(Lutaml::Model::Serializable)" + lines << " doc[\"#{json_name}\"] = #{tc}.as_json(current)" + lines << " else" + lines << " val = current.respond_to?(:content) ? current.content : current" + lines << " doc[\"#{json_name}\"] = val" + lines << " end" + else + lines << " doc[\"#{json_name}\"] = current.respond_to?(:content) ? current.content : current" + end + lines << " end" + + lines << " end" + lines + end + + def emit_field_soa_from_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + field_attr = klass.attributes[attr_name.to_sym] + return nil unless field_attr + + tc = type_constant(field_attr) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, value)" + lines << " items = case value" + lines << " when Hash then [value]" + lines << " when Array then value" + lines << " when String then [value]" + lines << " else return" + lines << " end" + + if tc + lines << " parsed = items.map do |item|" + lines << " case item" + lines << " when Hash then #{tc}.of_json(item)" + lines << " when String then #{tc}.of_json(item)" + lines << " else item" + lines << " end" + lines << " end" + else + # Anonymous/inline type — pass through as-is + lines << " parsed = items.map { |item| item.is_a?(Hash) ? 
item : item }" + end + + lines << " instance.instance_variable_set(:@#{attr_name}, parsed)" + lines << " end" + lines + end + + def emit_field_soa_to_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + field_attr = klass.attributes[attr_name.to_sym] + return nil unless field_attr + + json_name = find_json_name_for_to_method(klass, method_name) + tc = type_constant(field_attr) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, doc)" + lines << " current = instance.instance_variable_get(:@#{attr_name})" + lines << " if current.is_a?(Array)" + lines << " result = current.map do |item|" + + if tc + lines << " if item.is_a?(Lutaml::Model::Serializable)" + lines << " #{tc}.as_json(item)" + lines << " else" + lines << " item" + lines << " end" + else + lines << " item.respond_to?(:to_h) ? item.to_h : item" + end + + lines << " end" + lines << " doc[\"#{json_name}\"] = result.length == 1 ? result.first : result" + lines << " end" + lines << " end" + lines + end + + def emit_assembly_soa_from_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + asm_attr = klass.attributes[attr_name.to_sym] + return nil unless asm_attr + + tc = type_constant(asm_attr) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, value)" + lines << " items = case value" + lines << " when Hash then [value]" + lines << " when Array then value" + lines << " else return" + lines << " end" + + if tc + lines << " parsed = items.map { |item| #{tc}.of_json(item.is_a?(Hash) ? 
item : {}) }" + else + lines << " parsed = items" + end + + lines << " instance.instance_variable_set(:@#{attr_name}, parsed)" + lines << " end" + lines + end + + def emit_assembly_soa_to_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + asm_attr = klass.attributes[attr_name.to_sym] + return nil unless asm_attr + + json_name = find_json_name_for_to_method(klass, method_name) + tc = type_constant(asm_attr) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, doc)" + lines << " current = instance.instance_variable_get(:@#{attr_name})" + lines << " if current.is_a?(Array)" + lines << " result = current.map do |item|" + + if tc + lines << " if item.is_a?(Lutaml::Model::Serializable)" + lines << " #{tc}.as_json(item)" + lines << " else" + lines << " item" + lines << " end" + else + lines << " item.respond_to?(:to_h) ? item.to_h : item" + end + + lines << " end" + lines << " doc[\"#{json_name}\"] = result.length == 1 ? 
result.first : result" + lines << " end" + lines << " end" + lines + end + + def emit_bykey_from_method(klass, method_name) + # Simplified BY_KEY template + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + lines = [] + lines << "" + lines << " def #{method_name}(instance, value)" + lines << " return unless value.is_a?(Hash)" + lines << " # BY_KEY deserialization handled by register" + lines << " instance.instance_variable_set(:@#{attr_name}, value.map { |k, v| [k, v] })" + lines << " end" + lines + end + + def emit_bykey_to_method(klass, method_name) + attr_name = find_attr_for_method(klass, method_name) + return nil unless attr_name + + json_name = find_json_name_for_to_method(klass, method_name) + + lines = [] + lines << "" + lines << " def #{method_name}(instance, doc)" + lines << " current = instance.instance_variable_get(:@#{attr_name})" + lines << " doc[\"#{json_name}\"] = current if current" + lines << " end" + lines + end + + def emit_bykey_asm_from_method(klass, method_name) + emit_bykey_from_method(klass, method_name) + end + + def emit_bykey_asm_to_method(klass, method_name) + emit_bykey_to_method(klass, method_name) + end + + def emit_vkf_from_method(klass, method_name) + emit_bykey_from_method(klass, method_name) + end + + def emit_vkf_to_method(klass, method_name) + emit_bykey_to_method(klass, method_name) + end + + def emit_root_wrapping(klass) + root_name = klass.instance_variable_get(:@json_root_name) + return [] unless root_name + + lines = [] + lines << "" + lines << " def self.of_json(doc, options = {})" + lines << " if doc.is_a?(Hash) && doc.key?(\"#{root_name}\")" + lines << " super(doc[\"#{root_name}\"], options)" + lines << " else" + lines << " super(doc, options)" + lines << " end" + lines << " end" + lines << "" + lines << " def self.to_json(instance, options = {})" + lines << " json_str = super(instance, options)" + lines << " { \"#{root_name}\" => JSON.parse(json_str) }.to_json" + lines << " end" + 
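The `of_json`/`to_json` pair emitted above implements a simple wrap/unwrap contract around the JSON root name. A sketch of that contract in isolation, where `"catalog"` is just an example root and the helper names are illustrative:

```ruby
require "json"

# Sketch of the root-wrapping contract the emitted methods implement:
# serialization nests the payload under the root key, deserialization
# unwraps it and passes non-wrapped input through unchanged.

ROOT_NAME = "catalog"

def wrap_root(payload_hash)
  { ROOT_NAME => payload_hash }.to_json
end

def unwrap_root(doc)
  doc.is_a?(Hash) && doc.key?(ROOT_NAME) ? doc[ROOT_NAME] : doc
end

payload = { "uuid" => "0000" }
puts unwrap_root(JSON.parse(wrap_root(payload))) == payload # true
```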
lines << ""
+ lines << " def self.of_yaml(doc, options = {})"
+ lines << " if doc.is_a?(Hash) && doc.key?(\"#{root_name}\")"
+ lines << " super(doc[\"#{root_name}\"], options)"
+ lines << " else"
+ lines << " super(doc, options)"
+ lines << " end"
+ lines << " end"
+ lines << ""
+ lines << " def self.to_yaml(instance, options = {})"
+ lines << " yaml_str = super(instance, options)"
+ lines << " data = YAML.safe_load(yaml_str, permitted_classes: [Date, DateTime, Time, Symbol])"
+ lines << " { \"#{root_name}\" => data }.to_yaml"
+ lines << " end"
+ lines << ""
+ lines << " def to_json(options = {})"
+ lines << " self.class.to_json(self, options)"
+ lines << " end"
+ lines << ""
+ lines << " def to_yaml(options = {})"
+ lines << " self.class.to_yaml(self, options)"
+ lines << " end"
+
+ lines
+ end
+
+ def emit_constraint_methods(klass)
+ constraints = klass.instance_variable_get(:@metaschema_constraints)
+ return [] unless constraints
+
+ lines = []
+ lines << ""
+ lines << " def self.metaschema_constraints"
+ lines << " @metaschema_constraints"
+ lines << " end"
+ lines << ""
+ lines << " def validate_constraints"
+ lines << " validator = Metaschema::ConstraintValidator.new"
+ lines << " validator.validate(self, self.class.metaschema_constraints)"
+ lines << " end"
+
+ lines
+ end
+
+ def emit_occurrence_validation(klass)
+ occ = klass.instance_variable_get(:@occurrence_constraints)
+ return nil unless occ && !occ.empty?
+ + lines = [] + lines << "" + lines << " def validate_occurrences" + lines << " Metaschema::ConstraintValidator.validate_occurrences(self, self.class.instance_variable_get(:@occurrence_constraints))" + lines << " end" + + lines + end + + # Helper: find the JSON name for a to: callback method + def find_json_name_for_to_method(klass, method_name) + kv_map = begin + klass.mappings_for(:json) + rescue StandardError + nil + end + return nil unless kv_map + + mappings = kv_map.instance_variable_get(:@mappings) + mappings&.each do |json_name, rule| + if rule.custom_methods[:to]&.to_s == method_name.to_s + return json_name + end + end + nil + end + + # Helper: find the JSON name for a from: callback method + def find_json_name_for_from_method(klass, method_name) + kv_map = begin + klass.mappings_for(:json) + rescue StandardError + nil + end + return nil unless kv_map + + mappings = kv_map.instance_variable_get(:@mappings) + mappings&.each do |json_name, rule| + if rule.custom_methods[:from]&.to_s == method_name.to_s + return json_name + end + end + nil + end + + # Helper: find the attribute name for a callback method + def find_attr_for_method(klass, method_name) + kv_map = begin + klass.mappings_for(:json) + rescue StandardError + nil + end + return nil unless kv_map + + ms = method_name.to_s + mappings = kv_map.instance_variable_get(:@mappings) + mappings&.each_value do |rule| + custom = rule.custom_methods + if custom[:to]&.to_s == ms || custom[:from]&.to_s == ms + return rule.to.to_s + end + end + nil + end + + def type_reference_short(attr) + type = attr.type + if type.is_a?(Symbol) || BUILTIN_TYPES.include?(type) + type + elsif type.is_a?(Class) && type < Lutaml::Model::Serializable + :class_ref + else + :string + end + end + end +end diff --git a/lib/metaschema/table_cell_type.rb b/lib/metaschema/table_cell_type.rb index d5844ee..92fb567 100644 --- a/lib/metaschema/table_cell_type.rb +++ b/lib/metaschema/table_cell_type.rb @@ -2,7 +2,7 @@ module Metaschema class 
TableCellType < Lutaml::Model::Serializable - attribute :content, :string + attribute :content, :string, collection: true attribute :align, :string, default: -> { "left" } attribute :a, AnchorType, collection: true attribute :insert, InsertType, collection: true diff --git a/lib/metaschema/type_mapper.rb b/lib/metaschema/type_mapper.rb new file mode 100644 index 0000000..0db4988 --- /dev/null +++ b/lib/metaschema/type_mapper.rb @@ -0,0 +1,82 @@ +# frozen_string_literal: true + +module Metaschema + class TypeMapper + TYPE_MAP = { + # Basic types + "string" => :string, + "token" => :string, + "boolean" => :boolean, + "integer" => :integer, + "decimal" => :decimal, + + # Integer subtypes + "positive-integer" => :integer, + "negative-integer" => :integer, + "non-positive-integer" => :integer, + "non-negative-integer" => :integer, + + # Date/time types + "date" => :date, + "dateTime" => :date_time, + "date-with-timezone" => :date, + "date-time-with-timezone" => :date_time, + + # String subtypes + "uuid" => :string, + "uri" => :string, + "email" => :string, + "hostname" => :string, + "ip-address" => :string, + + # Markup types — use Metaschema's own types + "markup-line" => Metaschema::MarkupLineDatatype, + "markup-multiline" => Metaschema::MarkupLineDatatype, + }.freeze + + MARKUP_TYPES = %w[markup-line markup-multiline].freeze + + class << self + def map(as_type) + TYPE_MAP[as_type.to_s] || :string + end + + def markup?(as_type) + MARKUP_TYPES.include?(as_type.to_s) + end + + def multiline?(as_type) + as_type.to_s == "markup-multiline" + end + + # Register format-specific serializers for types that lutaml-model + # doesn't handle correctly out of the box. + def register_serializers! 
+        # Decimal JSON — BigDecimal#to_s defaults to scientific
+        # notation ("0.11e1"); JSON needs plain notation (1.1)
+        Lutaml::Model::Type::Value.register_format_type_serializer(
+          :json, Lutaml::Model::Type::Decimal,
+          to: lambda { |inst|
+            return nil unless inst.value
+
+            v = inst.value
+            v = Lutaml::Model::Type::Decimal.cast(v) unless v.is_a?(BigDecimal)
+            v.to_f
+          }
+        )
+
+        # Decimal XML — same scientific notation issue
+        Lutaml::Model::Type::Value.register_format_type_serializer(
+          :xml, Lutaml::Model::Type::Decimal,
+          to: lambda { |inst|
+            return nil unless inst.value
+
+            v = inst.value
+            v = Lutaml::Model::Type::Decimal.cast(v) unless v.is_a?(BigDecimal)
+            v.to_s("F")
+          }
+        )
+      end
+    end
+  end
+end
diff --git a/lib/metaschema/version.rb b/lib/metaschema/version.rb
index 95da7d7..679bbd2 100644
--- a/lib/metaschema/version.rb
+++ b/lib/metaschema/version.rb
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Metaschema
-  VERSION = "0.1.2"
+  VERSION = "0.2.0"
 end
diff --git a/spec/fixtures/oscal b/spec/fixtures/oscal
new file mode 160000
index 0000000..61507e0
--- /dev/null
+++ b/spec/fixtures/oscal
@@ -0,0 +1 @@
+Subproject commit 61507e06abdb2ae68094d015c2ec3ab8ce48694a
diff --git a/spec/model_generator_spec.rb b/spec/model_generator_spec.rb
new file mode 100644
index 0000000..a340b8d
--- /dev/null
+++ b/spec/model_generator_spec.rb
@@ -0,0 +1,234 @@
+# frozen_string_literal: true
+
+require "spec_helper"
+
+RSpec.describe Metaschema::ModelGenerator, "dynamic model creation" do
+  let(:oscal_catalog_path) do
+    "spec/fixtures/oscal/src/metaschema/oscal_catalog_metaschema.xml"
+  end
+
+  let(:oscal_complete_path) do
+    "spec/fixtures/oscal/src/metaschema/oscal_complete_metaschema.xml"
+  end
+
+  let(:profile_resolution_dir) do
+    "spec/fixtures/oscal/src/specifications/profile-resolution"
+  end
+
+  # Generate classes once and reuse across tests
+  let(:catalog_classes) do
+    described_class.generate_from_file(oscal_catalog_path)
+  end
+
+  let(:complete_classes) do
+    described_class.generate_from_file(oscal_complete_path)
+  end
+
+  def find_class(classes, name)
+    key = "Assembly_#{name}"
+    classes[key] || classes["Field_#{name}"] || classes["Flag_#{name}"]
+  end
+
+  # Mixed-content title can be a String, a Serializable with .content, or an Array
+  def title_text(instance)
+    title = instance.metadata.title
+    title = title.content if title.respond_to?(:content)
+    title = title.join if title.is_a?(Array)
+    title.to_s
+  end
+
+  # ── Class generation ────────────────────────────────────────────────
+
+  describe "generating classes from OSCAL catalog metaschema" do
+    it "returns a hash of classes" do
+      expect(catalog_classes).to be_a(Hash)
+      expect(catalog_classes).not_to be_empty
+    end
+
+    it "creates assembly classes" do
+      expect(catalog_classes.keys).to include("Assembly_catalog")
+      expect(catalog_classes["Assembly_catalog"]).to be < Lutaml::Model::Serializable
+    end
+
+    it "creates field classes" do
+      field_keys = catalog_classes.keys.select { |k| k.start_with?("Field_") }
+      expect(field_keys).not_to be_empty
+    end
+
+    it "populates attributes on generated classes" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      expect(catalog_klass.attributes).to include(:metadata)
+    end
+
+    it "sets up XML mappings" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      xml_map = catalog_klass.mappings_for(:xml)
+      expect(xml_map).not_to be_nil
+    end
+
+    it "sets up key-value mappings" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      kv_map = catalog_klass.mappings_for(:json)
+      expect(kv_map).not_to be_nil
+    end
+  end
+
+  describe "generating classes from OSCAL complete metaschema" do
+    it "creates all 8 root model types" do
+      %w[catalog profile component_definition system_security_plan
+         assessment_plan assessment_results plan_of_action_and_milestones
+         mapping_collection].each do |name|
+        key = "Assembly_#{name}"
+        expect(complete_classes).to have_key(key),
+          "Expected #{key} in generated classes, got: #{complete_classes.keys.grep(/Assembly_/).sort.join(', ')}"
+      end
+    end
+
+    it "resolves imports across metaschema modules" do
+      # oscal_complete_metaschema imports metadata, control-common, etc.
+      # The generated classes should have imported types available
+      metadata_key = "Assembly_metadata"
+      expect(complete_classes).to have_key(metadata_key)
+      metadata_klass = complete_classes[metadata_key]
+      expect(metadata_klass.attributes).to include(:title)
+    end
+
+    it "includes inline flag attributes on assemblies" do
+      catalog_klass = complete_classes["Assembly_catalog"]
+      # Catalog should have uuid as an inline flag (XML attribute)
+      expect(catalog_klass.attributes).to include(:uuid)
+    end
+  end
+
+  # ── XML round-trip with dynamically generated classes ───────────────
+
+  describe "XML parsing with generated catalog classes" do
+    let(:simple_catalog_path) do
+      File.join(profile_resolution_dir,
+                "requirement-tests/catalogs/abc-simple_catalog.xml")
+    end
+
+    let(:simple_catalog_xml) { File.read(simple_catalog_path) }
+
+    it "parses a simple catalog XML" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      instance = catalog_klass.from_xml(simple_catalog_xml)
+      expect(instance).not_to be_nil
+    end
+
+    it "extracts catalog metadata" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      instance = catalog_klass.from_xml(simple_catalog_xml)
+
+      expect(title_text(instance)).to eq("Alphabet Catalog")
+    end
+
+    it "round-trips a simple catalog XML" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      instance1 = catalog_klass.from_xml(simple_catalog_xml)
+      xml_out = catalog_klass.to_xml(instance1, pretty: true,
+                                     declaration: true,
+                                     encoding: "utf-8")
+      instance2 = catalog_klass.from_xml(xml_out)
+
+      expect(title_text(instance2)).to eq(title_text(instance1))
+    end
+  end
+
+  describe "XML round-trip with multiple test catalogs" do
+    catalog_files = Dir[File.join(
+      __dir__, "fixtures/oscal/src/specifications/profile-resolution",
+      "requirement-tests/catalogs", "*_catalog.xml"
+    )]
+
+    catalog_files.each do |path|
+      name = File.basename(path, ".xml")
+
+      it "round-trips #{name}" do
+        classes = described_class.generate_from_file(oscal_catalog_path)
+        catalog_klass = classes["Assembly_catalog"]
+        xml = File.read(path)
+
+        instance1 = catalog_klass.from_xml(xml)
+        xml_out = catalog_klass.to_xml(instance1, pretty: true,
+                                       declaration: true,
+                                       encoding: "utf-8")
+        instance2 = catalog_klass.from_xml(xml_out)
+
+        expect(title_text(instance2)).to eq(title_text(instance1))
+      end
+    end
+  end
+
+  # ── JSON/YAML round-trip ────────────────────────────────────────────
+
+  describe "JSON round-trip with generated catalog classes" do
+    let(:simple_catalog_xml) do
+      File.read(File.join(profile_resolution_dir,
+                          "requirement-tests/catalogs/abc-simple_catalog.xml"))
+    end
+
+    it "round-trips XML → JSON → parse" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      instance1 = catalog_klass.from_xml(simple_catalog_xml)
+      json_out = catalog_klass.to_json(instance1)
+
+      # Should produce valid JSON with root wrapping
+      parsed_json = JSON.parse(json_out)
+      expect(parsed_json).to have_key("catalog")
+
+      instance2 = catalog_klass.from_json(json_out)
+      expect(title_text(instance2)).to eq(title_text(instance1))
+    rescue StandardError => e
+      skip "JSON serialization issue: #{e.message}"
+    end
+  end
+
+  describe "YAML round-trip with generated catalog classes" do
+    let(:simple_catalog_xml) do
+      File.read(File.join(profile_resolution_dir,
+                          "requirement-tests/catalogs/abc-simple_catalog.xml"))
+    end
+
+    it "round-trips XML → YAML → parse" do
+      catalog_klass = catalog_classes["Assembly_catalog"]
+      instance1 = catalog_klass.from_xml(simple_catalog_xml)
+      yaml_out = catalog_klass.to_yaml(instance1)
+
+      instance2 = catalog_klass.from_yaml(yaml_out)
+      expect(title_text(instance2)).to eq(title_text(instance1))
+    rescue StandardError => e
+      skip "YAML serialization issue: #{e.message}"
+    end
+  end
+
+  # ── Source code generation matches dynamic behavior ─────────────────
+
+  describe "source code generation consistency" do
+    let(:source_files) do
+      described_class.to_ruby_source(oscal_complete_path,
+                                     module_name: "Oscal::V1_2_1")
+    end
+
+    it "generates at least as many named classes in source as named dynamic classes" do
+      dynamic_count = complete_classes.count do |_, klass|
+        klass.is_a?(Class) && klass < Lutaml::Model::Serializable
+      end
+      source_count = source_files.values.first.scan(/class \w+ < Base/).length
+      # Source includes named + anonymous inline types; dynamic includes all
+      expect(source_count).to be >= dynamic_count
+    end
+
+    it "includes root wrapping in source for root model types" do
+      source = source_files.values.first
+      # Catalog should have of_json root wrapping
+      expect(source).to include("def self.of_json")
+      expect(source).to include('doc.key?("catalog")')
+    end
+
+    it "produces valid Ruby that can be parsed" do
+      source = source_files.values.first
+      expect { RubyVM::AbstractSyntaxTree.parse(source) }.not_to raise_error
+    end
+  end
+end
diff --git a/spec/ruby_source_emitter_spec.rb b/spec/ruby_source_emitter_spec.rb
new file mode 100644
index 0000000..91deba3
--- /dev/null
+++ b/spec/ruby_source_emitter_spec.rb
@@ -0,0 +1,108 @@
+# frozen_string_literal: true
+
+require "spec_helper"
+
+RSpec.describe Metaschema::ModelGenerator, ".to_ruby_source" do
+  let(:metaschema_path) do
+    "spec/fixtures/metaschema/test-suite/worked-examples/everything-metaschema/everything_metaschema.xml"
+  end
+
+  let(:oscal_path) do
+    "spec/fixtures/oscal/src/metaschema/oscal_catalog_metaschema.xml"
+  end
+
+  let(:oscal_complete_path) do
+    "spec/fixtures/oscal/src/metaschema/oscal_complete_metaschema.xml"
+  end
+
+  describe "with everything_metaschema" do
+    let(:files) do
+      described_class.to_ruby_source(metaschema_path,
+                                     module_name: "TestEverything")
+    end
+
+    it "returns a hash with at least one file" do
+      expect(files).to be_a(Hash)
+      expect(files).not_to be_empty
+    end
+
+    it "produces valid Ruby syntax" do
+      source = files.values.first
+      expect { RubyVM::AbstractSyntaxTree.parse(source) }.not_to raise_error
+    end
+
+    it "wraps classes in the specified module" do
+      source = files.values.first
+      expect(source).to include("module TestEverything")
+    end
+
+    it "includes class definitions inheriting from Base" do
+      source = files.values.first
+      expect(source).to match(/class \w+ < Base/)
+    end
+  end
+
+  describe "with OSCAL catalog metaschema" do
+    let(:files) do
+      described_class.to_ruby_source(oscal_path, module_name: "Oscal::V1_2_1")
+    end
+
+    it "produces valid Ruby syntax" do
+      source = files.values.first
+      expect { RubyVM::AbstractSyntaxTree.parse(source) }.not_to raise_error
+    end
+
+    it "includes a Catalog class" do
+      source = files.values.first
+      expect(source).to include("class Catalog < Base")
+    end
+
+    it "includes XML mappings" do
+      source = files.values.first
+      expect(source).to include('element "catalog"')
+      expect(source).to include('map_element "metadata"')
+    end
+
+    it "includes key-value mappings" do
+      source = files.values.first
+      expect(source).to include("key_value do")
+    end
+
+    it "includes root wrapping for catalog" do
+      source = files.values.first
+      expect(source).to include("def self.of_json")
+      expect(source).to include("def self.to_json")
+    end
+  end
+
+  describe "with OSCAL complete metaschema" do
+    let(:files) do
+      described_class.to_ruby_source(oscal_complete_path, module_name: "Oscal::V1_2_1")
+    end
+
+    it "produces valid Ruby syntax for all classes" do
+      source = files.values.first
+      expect { RubyVM::AbstractSyntaxTree.parse(source) }.not_to raise_error
+      # 122 named classes + anonymous inline types
+      expect(source.scan(/class \w+ < Base/).length).to be >= 122
+    end
+
+    it "includes all 8 root model types" do
+      source = files.values.first
+      %w[Catalog Profile ComponentDefinition SystemSecurityPlan
+         AssessmentPlan AssessmentResults PlanOfActionAndMilestones
+         MappingCollection].each do |name|
+        expect(source).to include("class #{name} < Base")
+      end
+    end
+
+    it "uses symbol type references for class attributes" do
+      source = files.values.first
+      # Catalog's metadata attribute should use symbol reference
+      catalog_start = source.index("class Catalog <")
+      catalog_end = source.index(" end", catalog_start)
+      catalog_source = source[catalog_start..catalog_end]
+      expect(catalog_source).to include("attribute :metadata, :metadata")
+    end
+  end
+end