ARVOR Floats and SVP Drifters Ocean Converters #1009
Conversation
moha - i'm going to file a review on the code in just a bit, but up front i wanted to say that it's great to pull out the CSV parsing into a module so it can be reused, tested and updated independently of the calling code. if you were willing to do a bit more work on this, i think that the CSV routines are self-contained enough to merit their own separate module. they can call code from the parse module, but i think they're different enough to stand alone. let me know what you think about this. i'll put other more specific comments into my review. also - do you have any tests you used on this code that could be added to the repo?
nancycollins left a comment
the converters themselves are easy to read and understand, which is good. i had a few comments - the biggest one is probably moving the csv routines to their own module.
    cf%delim = detect_delim(line)
    ...
    call split_fields(line, cf%delim, cf%ncols, cf%fields)
    call close_file(iunit)
i would leave the file open here in csv_open(), leave it open in all subsequent calls, and close it in csv_close(). you can add the iunit to the same structure and reuse it until close is called. you can call "rewind()" if you need to start reading at the beginning of the file in subsequent calls.
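A minimal sketch of that open-once pattern, reusing the csv_file_type and routine names from this PR (the bodies here are illustrative, not the actual module code):

    type :: csv_file_type
       character(len=256) :: filename = ''
       integer            :: iunit    = -1       ! unit number, kept open between calls
       integer            :: nrows    = 0
       integer            :: ncols    = 0
       character(len=1)   :: delim    = ','
       logical            :: is_open  = .false.
    end type csv_file_type

    subroutine csv_open(filename, cf)
       character(len=*),    intent(in)    :: filename
       type(csv_file_type), intent(inout) :: cf
       cf%filename = filename
       open(newunit=cf%iunit, file=filename, action='read')   ! opened once, left open
       cf%is_open = .true.
       ! ... read header line, detect delimiter, count rows ...
    end subroutine csv_open

    ! any later read can reposition to the top of the file with:
    !    rewind(cf%iunit)

    subroutine csv_close(cf)
       type(csv_file_type), intent(inout) :: cf
       if (cf%is_open) close(cf%iunit)   ! the only place the unit is closed
       cf%is_open = .false.
    end subroutine csv_close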
Done
    cf%ncols = 0
    cf%delim = ','
    cf%fields = ''
    cf%is_open = .false.
if you add iunit to the cf structure, close cf%iunit here.
Done
    file_out = 'obs_seq.arvor',
    obs_error_temp = 0.02,      ! temperature error standard deviation (C)
    obs_error_sal = 0.02,       ! salinity error standard deviation (PSU)
    avg_obs_per_file = 500000,  ! pre-allocation hint
i'd say this is more than a 'hint' because i don't see anywhere that the converter can recover if there are more obs than were originally allocated for. maybe use 'limit' instead of 'hint'?
Done
    * - ``avg_obs_per_file``
      - integer
      - ``500000``
      - Estimate of valid obs per file.
Add a second sentence something like 'Used for pre-allocation. Number of files times this number must be larger than the total number of output observations.'
Done
    ! Open csv file and get dims
    call csv_open(filename, cf, routine)
    nobs = cf%nrows
ditto the comment about an accessor function here. i think this is the only one missing.
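A sketch of such an accessor (the commit log below mentions csv_get_obs_num, which likely filled this role; the body is illustrative):

    ! illustrative getter so callers never touch the structure contents directly
    function csv_get_obs_num(cf) result(nobs)
       type(csv_file_type), intent(in) :: cf
       integer :: nobs
       nobs = cf%nrows   ! data rows, header excluded
    end function csv_get_obs_num

    ! caller side:  nobs = csv_get_obs_num(cf)   instead of   nobs = cf%nrows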
Done
Nancy, thanks for the review. I should be able to address all of the comments. I'll also move the routines to their own module as suggested. The data I used for testing can be found here:
These are the same as those listed in obs_files.txt (in my PR description). Do you want me to add some of those ASCII files to the repo?
hi moha - thanks. no, i don't think they need to be added to the repo. i just wanted to see some of the input files so maybe i could make a couple of simple test programs that mimic what the read routines are expected to parse.
i made a small test program and pushed it to my fork of your code here: https://github.com/nancycollins/moha/tree/insitu_ocean_converters/developer_tests/utilities it's called csv_read_test.f90 (and a corresponding update to work/quickbuild.sh). i think it should be added to your pull request but i'm rusty with github so i left it there in my repo. it works fine in a couple test cases but the csv field read code doesn't cope correctly with embedded blanks in data fields (test 3 fails).
i went back to the parse_args_mod and made a new routine get_csv_words_from_string() which must be passed a string and a delimiter character, and it returns a word count and word array. it handles embedded blanks and quoted fields so they can contain the delimiter character inside the field. i pushed this to my fork and also added a parse_csv_test.f90 and parse_csv_test.in test for this.
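From that description, the interface presumably looks something like the sketch below (argument names and declarations are illustrative; only the string/delimiter inputs and the word count/word array outputs are stated above):

    subroutine get_csv_words_from_string(string, delim, nwords, words)
       character(len=*), intent(in)  :: string    ! one line of CSV input
       character(len=1), intent(in)  :: delim     ! e.g. ',' or ';'
       integer,          intent(out) :: nwords    ! number of fields found
       character(len=*), intent(out) :: words(:)  ! one entry per field
    end subroutine get_csv_words_from_string

    ! e.g. the input line:   42,"Dakar, Senegal",17.5
    ! should yield nwords = 3, with the quoted field kept intact
    ! even though it contains the delimiter and embedded blanks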
Nancy's pull request to Moha's pull request is here:
Ran the converters with the recent changes and everything worked as intended. I also added documentation of the new module. |
hkershaw-brown left a comment
Hi Moha,
Looking good. I put a comment in on reverting parse_args_mod, and a couple of comments in the doc. The main one is that people should know about the \escape.
I'll test the build and run Nancy's tests next.
Cheers,
Helen
    Other modules used
    ------------------

    ::

       types_mod
       utilities_mod
Question on this, do you find it helpful for the docs to list the other modules used?
I feel like this gets added to documentation because people have added it to documentation previously. I'd remove it unless you think it does help people reading the documentation.
I never trust this to be up-to-date and would look at the code to check the module usage.
No, not really very helpful. As you mentioned, I put it in to mimic documentation of other modules. I'll remove it.
    rules. This routine is exposed primarily to support consistent parsing behavior
    in other code.
I think get_csv_words_from_string is exposed for the test programs included in this pull request
Suggested change:

    - rules. This routine is exposed primarily to support consistent parsing behavior
    - in other code.
    + rules.
    convert_goes_ABI_L1b
    MOD29E1D_to_obs
    hf_to_obs
Add new executables to .gitignore
Suggested change:

    + arvor_to_obs
    + svp_to_obs
Force-pushed b6a7683 to b03bb73
yup revert it.
On Wed, Jan 7, 2026 at 3:11 PM, Moha Gharamti commented on this pull request (on assimilation_code/modules/utilities/parse_args_mod.f90):
Helen, I think I brought the PR up-to-date with main. Please let me know if that's not the case.
Regarding parse_args_mod, the only changes that I think might be useful are the comments on top:

    ! if you need to fix any bugs in this code, also look at the
    ! get_csv_from_string() routine in the read_csv_mod module.
    ! it is derived from these routines.

Do you think I should still revert it?
This update introduces a new set of general-purpose CSV utilities to `parse_args_mod` for use across DART observation converters and other modules that ingest ASCII/tabular data. New utilities added:

- `csv_file_type`: cached CSV handle storing filename, nrows, ncols, delimiter, and header fields.
- `csv_open`/`csv_close`: initialize/reset CSV handle and preload header/dimensions.
- `csv_get_field_char`, `csv_get_field_int`, `csv_get_field_real`: a unified interface through `csv_get_field` for retrieving column strings, integers, or reals.
- Normalization of delimiters (`,` or `;`) with support for empty fields.
- `csv_get_obs_num`: count data rows (excluding header).
- `csv_find_field`: header lookup.
- Other internal helpers such as `split_fields`, `detect_delim`, `normalize_delims`.

These routines provide a reusable framework that is modeled after our existing NetCDF utilities.
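Pieced together from the routine names above, the intended calling pattern presumably looks something like this (the file name, field name, and exact argument order are illustrative guesses, not the actual interface; r8 is DART's usual real kind):

    type(csv_file_type) :: cf
    integer  :: iobs, nobs
    real(r8) :: temp

    call csv_open('profile.csv', cf)          ! caches header, dims, delimiter
    nobs = csv_get_obs_num(cf)                ! data rows, excluding the header

    do iobs = 1, nobs
       ! unified csv_get_field interface resolves to csv_get_field_real here
       call csv_get_field(cf, 'TEMP', iobs, temp)
    end do

    call csv_close(cf)                        ! close the unit, reset the handle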
A new ocean converter that uses profiling floats. The converter harvests temperature and salinity data at different depths and times. Depths are converted from pressure in dbar to height in meters. The converter uses the csv parsing utilities to read data from the raw input files.
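The commit does not show the conversion itself; a minimal sketch, assuming the common first-order approximation that 1 dbar of sea pressure corresponds to about 1 m of depth, and assuming heights are negative below the surface (the converter may use a more accurate relation):

    ! illustrative only: pressure (dbar) -> height (m), negative below the surface
    height_m = -1.0_r8 * pressure_dbar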
This is an ocean converter that uses surface drifters. It collects SST and surface current data. It uses the csv parsing utilities to read the incoming ASCII files.
- `csv_get_field_index`: Get column index of a field
- `csv_field_exists`: Check if field exists in file
- `csv_print_header`: print the field names (my favorite)

Additional debugging statements in the converters.
it must be told what the delimiter is (generally comma or semicolon) and splits up the fields based on the delimiter. it handles quotes inside the fields to allow the delimiter to be part of the string. added a test program and test input file.
Stripped all csv routines from the parse_args_mod and added them into their own csv module. Improved the opening and closing logic. Now, the file is opened once and rewound for reading different variables. The contents of the csv file structure are now private. Added the necessary accessor functions to retrieve data.
Cleaned up parse_args_mod and slightly modified the new converter codes to use the new read_csv_mod. Also made small readme changes.
remove the routine that adds spaces and call the new parse routine directly. add an option on open to specify the delimiter which is passed through to the detect routine. make the test program use the testeverything code. it now handles fields with embedded spaces and alternative delimiters.
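That optional-delimiter argument might look like the sketch below (argument names are illustrative; only the pass-through to the detect routine is stated above):

    subroutine csv_open(filename, cf, delim)
       character(len=*),           intent(in)    :: filename
       type(csv_file_type),        intent(inout) :: cf
       character(len=1), optional, intent(in)    :: delim
       character(len=512)                        :: first_line
       ! ... open the file and read the first line ...
       ! the optional delimiter is passed through to the detect routine,
       ! which returns it unchanged when present instead of auto-detecting
       cf%delim = detect_delim(first_line, delim)
    end subroutine csv_open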
tests for the csv parsing routine
moved the csv parse routine into the csv module. added more tests and made it easier to understand what each one was testing.
re-enabled the 2 tests that provoke a (correct) fatal error. added a set_term_level() routine to the utilities mod.
Also removed unused routines from the converter
Force-pushed b03bb73 to db2580e
csv tests that changed term level removed from pull NCAR#1009
hkershaw-brown left a comment
Approved!
Nice work Moha and Nancy, awesome to have the csv utilities with this.

Description:
This PR adds two new in-situ ocean converters (ARVOR profiling floats and SVP surface drifters). It also introduces a reusable CSV parsing utility in parse_args_mod. Both converters make use of this CSV interface, which simplifies the code. In addition, documentation has been added for the converters. The CSV parsing utilities build on already existing parsing infrastructure (like a wrapper). The functionality mimics our netCDF handling in the sense that a file is opened and data is accessed with a single call before closing the file. A few helper functions have also been added. These can be used to access the header, to inquire whether a field exists, to find the dimensions, etc.
Tests
Tested both converters using actual raw ASCII data files.
Testing Datasets
ARVOR:
/glade/derecho/scratch/gharamti/inacawo/DART/observations/obs_converters/ARVOR/work/obs_files.txt
SVP:
/glade/derecho/scratch/gharamti/inacawo/DART/observations/obs_converters/SVP/work/obs_files.txt