
Conversation

@mgharamti
Contributor

Description:

This PR adds two new in-situ ocean converters (ARVOR profiling floats and SVP surface drifters). It also introduces a reusable CSV parsing utility in parse_args_mod. Both converters make use of this CSV interface, which simplifies the code. Documentation has also been added for both converters.

The CSV parsing utilities build on the existing parsing infrastructure (essentially wrapping it). The functionality mimics our NetCDF handling in the sense that a file is opened, and data is accessed with a single call before the file is closed. A few helper functions have also been added; these can be used to access the header, check whether a field exists, find the dimensions, etc.
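
As a rough illustration, here is how the calling pattern might look. The module and routine names below appear later in this thread; the argument lists are assumptions, not the merged interface:

program csv_usage_sketch

! sketch only: routine names are from this PR, but the argument
! lists here are assumed, not the actual interface
use read_csv_mod

implicit none

type(csv_file_type) :: cf
real, allocatable   :: temp(:)
integer             :: nobs

! open once; the header and dimensions are cached in the handle
call csv_open('arvor_profile.csv', cf, 'csv_usage_sketch')

! query the cached metadata before reading
nobs = csv_get_obs_num(cf)
if (.not. csv_field_exists(cf, 'TEMP')) stop 'no TEMP column'

! a single call retrieves a whole column, netcdf-style
allocate(temp(nobs))
call csv_get_field(cf, 'TEMP', temp)

call csv_close(cf)

end program csv_usage_sketch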

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

Tested both converters using actual raw ASCII data files.

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

ARVOR: /glade/derecho/scratch/gharamti/inacawo/DART/observations/obs_converters/ARVOR/work/obs_files.txt
SVP: /glade/derecho/scratch/gharamti/inacawo/DART/observations/obs_converters/SVP/work/obs_files.txt

@mgharamti mgharamti added the Enhancement (New feature or request) and obs_converters (converting observations to DART format) labels on Nov 25, 2025
@nancycollins
Collaborator

moha - i'm going to file a review on the code in just a bit, but up front i wanted to say that it's great to pull out the CSV parsing into a module so it can be reused, tested and updated independently of the calling code.

if you were willing to do a bit more work on this, i think that the CSV routines are self-contained enough to merit their own separate module. they can call code from the parse module, but i think they're different enough to stand alone. let me know what you think about this. i'll put other more specific comments into my review.

also - do you have any tests you used on this code that could be added to the repo?

Collaborator

@nancycollins nancycollins left a comment

the converters themselves are easy to read and understand, which is good. i had a few comments - the biggest one is probably moving the csv routines to their own module.

cf%delim = detect_delim(line)

call split_fields(line, cf%delim, cf%ncols, cf%fields)
call close_file(iunit)
Collaborator

i would leave the file open here in csv_open(), leave it open in all subsequent calls, and close it in csv_close(). you can add the iunit to the same structure and reuse it until close is called. you can call "rewind()" if you need to start reading at the beginning of the file in subsequent calls.
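
For illustration, a minimal sketch of the handle pattern being suggested here; field and routine names beyond those quoted in the thread are illustrative, and the bodies are stubs:

module csv_handle_sketch

! illustrative only: shows the open-once / rewind / close-at-the-end pattern
implicit none

type :: csv_file_type
   integer          :: iunit   = -1     ! unit stays open between calls
   integer          :: nrows   = 0
   integer          :: ncols   = 0
   character(len=1) :: delim   = ','
   logical          :: is_open = .false.
end type csv_file_type

contains

subroutine csv_open(fname, cf)
   character(len=*),    intent(in)    :: fname
   type(csv_file_type), intent(inout) :: cf
   open(newunit=cf%iunit, file=fname, status='old', action='read')
   cf%is_open = .true.
   ! ... read the header line; fill nrows, ncols, delim ...
end subroutine csv_open

subroutine csv_read_variable(cf)
   type(csv_file_type), intent(inout) :: cf
   rewind(cf%iunit)   ! back to the top of the file for each new variable
   ! ... skip the header, then read the requested column ...
end subroutine csv_read_variable

subroutine csv_close(cf)
   type(csv_file_type), intent(inout) :: cf
   close(cf%iunit)
   cf%is_open = .false.
   cf%iunit   = -1
end subroutine csv_close

end module csv_handle_sketch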

Contributor Author

Done

cf%ncols = 0
cf%delim = ','
cf%fields = ''
cf%is_open = .false.
Collaborator

if you add iunit to the cf structure, close cf%iunit here.

Contributor Author

Done

file_out = 'obs_seq.arvor',
obs_error_temp = 0.02, ! temperature error standard deviation (C)
obs_error_sal = 0.02, ! salinity error standard deviation (PSU)
avg_obs_per_file = 500000, ! pre-allocation hint
Collaborator

i'd say this is more than a 'hint' because i don't see anywhere that the converter can recover if there are more obs than were originally allocated for. maybe use 'limit' instead of 'hint'?
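
For context, a sketch of why the value acts as a hard limit rather than a hint; all names here are hypothetical, not the converter's:

program prealloc_limit_sketch

! illustrative only: demonstrates one-shot pre-allocation with no regrowth path
implicit none

integer, parameter :: num_input_files  = 4
integer, parameter :: avg_obs_per_file = 500000
integer            :: max_obs, num_obs
real, allocatable  :: lat(:), lon(:), vals(:)

! allocation happens once up front, so num_input_files * avg_obs_per_file
! caps the total number of obs the converter can hold
max_obs = num_input_files * avg_obs_per_file
allocate(lat(max_obs), lon(max_obs), vals(max_obs))

num_obs = 0
! ... inside the read loop, per accepted observation ...
if (num_obs + 1 > max_obs) stop 'too many obs: increase avg_obs_per_file'
num_obs = num_obs + 1

end program prealloc_limit_sketch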

Contributor Author

Done

* - ``avg_obs_per_file``
- integer
- ``500000``
- Estimate of valid obs per file.
Collaborator

Add a second sentence something like 'Used for pre-allocation. Number of files times this number must be larger than the total number of output observations.'

Contributor Author

Done


! Open csv file and get dims
call csv_open(filename, cf, routine)
nobs = cf%nrows
Collaborator

ditto the comment about an accessor function here. i think this is the only one missing.

Contributor Author

Done

@mgharamti
Contributor Author

Nancy, thanks for the review. I should be able to address all of the comments. I'll also move the routines to their own module as suggested. The data I used for testing can be found here:
For ARVOR:
  • /glade/work/gharamti/inacawo/data_snippets/arvorc/
  • /glade/work/gharamti/inacawo/data_snippets/arvori/
For SVP:
  • /glade/work/gharamti/inacawo/data_snippets/svp/20251006/
These are the same as those listed in obs_files.txt (in my PR description). Do you want me to add some of those ASCII files to the repo?

@nancycollins
Collaborator

hi moha - thanks. no, i don't think they need to be added to the repo. i just wanted to see some of the input files so maybe i could make a couple of simple test programs that mimic what the read routines are expected to parse.

@nancycollins
Collaborator

i made a small test program and pushed it to my fork of your code here:

https://github.com/nancycollins/moha/tree/insitu_ocean_converters/developer_tests/utilities

it's called csv_read_test.f90 (and a corresponding update to work/quickbuild.sh). i think it should be added to your pull request but i'm rusty with github so i left it there in my repo. it works fine in a couple test cases but the csv field read code doesn't cope correctly with embedded blanks in data fields (test 3 fails).

@nancycollins
Collaborator

i went back to the parse_args_mod and made a new routine get_csv_words_from_string() which must be passed a string and a delimiter character, and it returns a word count and word array. it handles embedded blanks and quoted fields so they can contain the delimiter character inside the field. i pushed this to my fork and also added a parse_csv_test.f90 and parse_csv_test.in test for this.
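
A sketch of how a call to the new routine might look, per the description above; the exact argument order is an assumption:

character(len=128) :: line
character(len=64)  :: words(32)
integer            :: nwords

line = '1234,"Boulder, CO",23.4'

! returns the word count and word array; quoting lets a field
! contain the delimiter character
call get_csv_words_from_string(line, ',', nwords, words)

! expected here: nwords = 3 and words(2) = 'Boulder, CO'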

@hkershaw-brown
Member

Nancy's pull request to Moha's pull request is here:
mgharamti#1

@mgharamti
Contributor Author

Ran the converters with the recent changes and everything worked as intended. I also added documentation of the new module.
Happy to address any remaining issues.

Member

@hkershaw-brown hkershaw-brown left a comment

Hi Moha,

Looking good. I put a comment in on reverting parse_args_mod, and a couple of comments in the docs. The main one is that people should know about the \escape.

I'll test the build and run Nancy's tests next.

Cheers,
Helen

Comment on lines 20 to 26
Other modules used
------------------

::

types_mod
utilities_mod
Member

Question on this, do you find it helpful for the docs to list the other modules used?

I feel like this gets added to documentation because people have added it to documentation previously. I'd remove it unless you think it does help people reading the documentation.
I never trust this to be up-to-date and would look at the code to check the module usage.

Contributor Author

No, not really very helpful. As you mentioned, I put it in to mimic documentation of other modules. I'll remove.

Comment on lines 94 to 95
rules. This routine is exposed primarily to support consistent parsing behavior
in other code.
Member

I think get_csv_words_from_string is exposed for the test programs included in this pull request

Suggested change:
- rules. This routine is exposed primarily to support consistent parsing behavior
- in other code.
+ rules.

convert_goes_ABI_L1b
MOD29E1D_to_obs
hf_to_obs

Member

Add new executables to .gitignore

Suggested change:
+ arvor_to_obs
+ svp_to_obs

@mgharamti mgharamti force-pushed the insitu_ocean_converters branch from b6a7683 to b03bb73 on January 7, 2026 at 20:07
@hkershaw-brown
Member

hkershaw-brown commented Jan 7, 2026 via email

@hkershaw-brown
Member

This is out of date with NCAR main, the info is at the bottom of the pull request:
[screenshot: GitHub notice that the branch is out of date with NCAR main, January 7, 2026]

This update introduces a new set of general-purpose CSV utilities
to `parse_args_mod` for use across DART observation converters and
other modules that ingest ASCII/tabular data.

New utilities added:
- `csv_file_type`: cached CSV handle storing filename, nrows, ncols, delimiter, and header fields.
- `csv_open`/`csv_close`: initialize/reset CSV handle and preload header/dimensions.
- `csv_get_field_char`, `csv_get_field_int`, `csv_get_field_real`: unified interface through `csv_get_field` for retrieving column strings, integers, or reals.
- Normalization of delimiters (`,` or `;`) with support for empty fields.
- `csv_get_obs_num`: count data rows (excluding header)
- `csv_find_field`: header lookup
- Other internal helpers such as `split_fields`, `detect_delim`, `normalize_delims`

These routines provide a reusable framework that is modeled after our existing
NetCDF utilities.
A new ocean converter that uses profiling floats.
The converter harvests temperature and salinity data
at different depths and times. Depths are converted
from pressure in dbar to height in meters.

The converter uses the csv parsing utilities
to read data from the raw input files.
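
The dbar-to-meters step mentioned above is roughly one-to-one. The converter's exact formula is not shown in this thread; a minimal hydrostatic sketch would be:

! sketch: a plain hydrostatic conversion, not necessarily the exact
! formula in arvor_to_obs; 1 dbar = 1.0e4 Pa, and a common DART
! convention is negative height (m) below the sea surface
real, parameter :: rho = 1025.0    ! mean seawater density (kg/m^3)
real, parameter :: g   = 9.80665   ! gravitational acceleration (m/s^2)
real :: p_dbar, height_m

p_dbar   = 1000.0                         ! example profile level
height_m = -(p_dbar * 1.0e4) / (rho * g)  ! approx -994.7 m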
This is an ocean converter that uses surface drifters.
It collects SST and surface current data. It uses
the csv parsing utilities to read the incoming ASCII files.
- `csv_get_field_index`: Get column index of a field
- `csv_field_exists`: Check if field exists in file
- `csv_print_header`: print the field names (my favorite)

Additional debugging statements in the converters
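
A sketch of the accessors listed above in use; whether each is a function or a subroutine is an assumption here:

! after csv_open(filename, cf, routine):
integer :: icol

call csv_print_header(cf)                  ! print all field names

if (csv_field_exists(cf, 'PSAL')) then     ! assumed logical function
   icol = csv_get_field_index(cf, 'PSAL')  ! column index of the field
endif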
nancycollins and others added 19 commits January 7, 2026 13:31
it must be told what the delimiter is (generally comma or semicolon)
and splits up the fields based on the delimiter.  it handles quotes
inside the fields to allow the delimiter to be part of the string.

added a test program and test input file.
Stripped all csv routines from the parse_args_mod
and added them into their own csv module.
Improved the opening and closing logic. Now, the
file is opened once and rewound for reading different
variables.
Content of the csv file structure is now private. Added
the necessary accessor functions to retrieve data.
Cleaned up parse_args_mod and slightly modified the
new converters' code to use the new read_csv_mod.
Also made small readme changes.
remove the routine that adds spaces and call the new parse routine
directly.  add an option on open to specify the delimiter which is
passed through to the detect routine.  make the test program use the
testeverything code.  it now handles fields with embedded spaces and
alternative delimiters.
moved the csv parse routine into the csv module.
added more tests and made them easier to understand
what was being tested.
re-enabled the 2 tests that provoke a (correct) fatal error.
added a set_term_level() routine to the utilities mod.
Also removed unused routines from the converter
@mgharamti mgharamti force-pushed the insitu_ocean_converters branch from b03bb73 to db2580e on January 7, 2026 at 20:31
@hkershaw-brown hkershaw-brown added the release! (bundle with next release) label on Jan 8, 2026
Member

@hkershaw-brown hkershaw-brown left a comment

Approved!
Nice work Moha and Nancy, awesome to have the csv utilities with this.

@hkershaw-brown hkershaw-brown merged commit c70e184 into NCAR:main Jan 8, 2026
4 checks passed