
Bug Report: XML Parsing is not working correctly #2364

@tlfish1510


I work on the NOAA CMR program. We maintain our own repository, https://git.services.nesdis.noaa.gov/dissemination/catalog/cmr-repos/dev-cmr-base, and periodically migrate changes from this repository into it so that ours picks up the changes you make.

I recently migrated the changes in this repository up to tag CMR-1.287.0-r25.4.2 (the previous migration point was CMR-1.277.0-r25.2.5). Before the migration I could ingest our sample NOAA ISO file into CMR without issue. With the latest code migrated, I can no longer ingest that same ISO file.

At first it appeared that the warnings I had been seeing had simply become errors. As it turns out, though, the UMM-C object derived by the post-migration code differs from the one derived by the pre-migration code.

I am attaching the ISO file I ingested along with the UMM-C objects that were generated from it:

  • GOES_ABI_L1B_RAD.xml - the ingested ISO file
  • prior-UMM-C.json - the object generated by the pre-migration code
  • after-UMM-C.json - the object generated by the post-migration code

The differences are in the RelatedUrls, ArchiveAndDistributionInformation, and Platforms fields of the UMM-C object. In the first two fields, the difference is in the Format field. In the last field, the pre-migration object had 3 platforms while the post-migration object has none.
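For anyone who wants to reproduce the comparison, here is a minimal sketch in Python that reports which of the three fields differ between two UMM-C objects. The helper names `diff_top_level` and `load` are hypothetical, and the sketch only compares whole top-level values; it does not assume anything about how the fields are structured internally.

```python
import json

# Top-level UMM-C fields named in this report.
FIELDS = ["RelatedUrls", "ArchiveAndDistributionInformation", "Platforms"]


def diff_top_level(prior, after, fields=FIELDS):
    """Return {field: (prior_value, after_value)} for each field that differs.

    A field missing from one object is reported with None on that side.
    """
    changed = {}
    for field in fields:
        before, now = prior.get(field), after.get(field)
        if before != now:
            changed[field] = (before, now)
    return changed


def load(path):
    """Read one JSON document from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

With the attachments saved locally, `diff_top_level(load("prior-UMM-C.json"), load("after-UMM-C.json"))` should list only the fields whose values changed.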

The xpaths used in our code to obtain the Format field are:

  • /gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorFormat/gmd:MD_Format/gmd:name/gco:CharacterString
  • /gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributionFormat/gmd:MD_Format/gmd:name/gco:CharacterString
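To check independently of CMR which of these two XPaths actually matches in the ISO file, a rough sketch using Python's standard `xml.etree.ElementTree` follows. This is not how CMR evaluates the paths. The function name `format_names` is hypothetical, and the namespace URIs are the usual ISO 19115 ones, which I am assuming match the declarations in GOES_ABI_L1B_RAD.xml.

```python
import xml.etree.ElementTree as ET

# Assumed namespace bindings; verify against the xmlns declarations in the file.
NS = {
    "gmi": "http://www.isotc211.org/2005/gmi",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# ElementTree rejects absolute paths on an element, so these are the two
# XPaths above written relative to the gmi:MI_Metadata root element.
FORMAT_PATHS = [
    "gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor"
    "/gmd:distributorFormat/gmd:MD_Format/gmd:name/gco:CharacterString",
    "gmd:distributionInfo/gmd:MD_Distribution/gmd:distributionFormat"
    "/gmd:MD_Format/gmd:name/gco:CharacterString",
]


def format_names(root):
    """Map each path to the list of format names it selects under root."""
    expected = "{%s}MI_Metadata" % NS["gmi"]
    if root.tag != expected:
        raise ValueError("unexpected root element: %s" % root.tag)
    return {
        path: [(node.text or "").strip() for node in root.findall(path, NS)]
        for path in FORMAT_PATHS
    }
```

Calling `format_names(ET.parse("GOES_ABI_L1B_RAD.xml").getroot())` returns an empty list for any path that matches nothing, which makes it easy to see whether the source document or the parsing code is responsible for a missing Format.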

The xpaths used in our code to obtain the Platforms field are:

  • /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords
  • /gmi:MI_Metadata/gmi:acquisitionInformation/gmi:MI_AcquisitionInformation/gmi:platform/gmi:MI_Platform, and within this xpath: gmi:instrument/gmi:MI_Instrument
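Similarly, the platform-related paths can be counted directly in the ISO file to confirm that the source elements are present. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `count_platform_nodes` is hypothetical, and the namespace URIs are assumed to be the usual ISO 19115 ones. It only counts matching elements and does not try to reproduce how CMR turns them into UMM-C Platforms.

```python
import xml.etree.ElementTree as ET

# Assumed namespace bindings; verify against the xmlns declarations in the file.
NS = {
    "gmi": "http://www.isotc211.org/2005/gmi",
    "gmd": "http://www.isotc211.org/2005/gmd",
}

# Paths are relative to the gmi:MI_Metadata root, because ElementTree
# rejects absolute paths on an element.
KEYWORDS_PATH = (
    "gmd:identificationInfo/gmd:MD_DataIdentification"
    "/gmd:descriptiveKeywords/gmd:MD_Keywords"
)
PLATFORM_PATH = (
    "gmi:acquisitionInformation/gmi:MI_AcquisitionInformation"
    "/gmi:platform/gmi:MI_Platform"
)
INSTRUMENT_PATH = "gmi:instrument/gmi:MI_Instrument"  # relative to a platform


def count_platform_nodes(root):
    """Count the elements each platform-related path selects under root."""
    platforms = root.findall(PLATFORM_PATH, NS)
    return {
        "keyword_blocks": len(root.findall(KEYWORDS_PATH, NS)),
        "platforms": len(platforms),
        "instruments_per_platform": [
            len(platform.findall(INSTRUMENT_PATH, NS)) for platform in platforms
        ],
    }
```

If `count_platform_nodes(ET.parse("GOES_ABI_L1B_RAD.xml").getroot())` reports a non-zero platform count while the generated UMM-C has no Platforms, that points at the parsing code rather than the document.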

