Skip to content

Ingest New Sources from Sanghi23#588

Merged
kelle merged 16 commits intoSIMPLE-AstroDB:mainfrom
canavarrete01:ingestsources
Apr 22, 2025
Merged

Ingest New Sources from Sanghi23#588
kelle merged 16 commits intoSIMPLE-AstroDB:mainfrom
canavarrete01:ingestsources

Conversation

@canavarrete01
Copy link
Contributor

Short description: Include what type of data being ingested and appropriate references.

Link to relevant issue: Closes #

For data ingests:
[x ] includes script used for ingest
[ ] includes modified JSON files
[ ] Add new tests
[ ] Update the Versions table

@canavarrete01 canavarrete01 marked this pull request as ready for review April 14, 2025 19:54
@canavarrete01 canavarrete01 changed the title Ingest New Sources from Sanghi23 (In Progress) Ingest New Sources from Sanghi23 Apr 14, 2025
@canavarrete01
Copy link
Contributor Author

Added a new script to ingest all alternative names from the original Ultracool sheet

@canavarrete01 canavarrete01 requested review from Copilot and kelle and removed request for Copilot April 15, 2025 15:33
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Files not reviewed (1)
  • scripts/ingests/sanghi23/NewSources-23.csv: Language not supported
Comments suppressed due to low confidence (1)

scripts/ingests/sanghi23/IngestAllNames.py:39

  • Using the string 'Null' for null checks might be unreliable; consider using a standard null check (e.g., against None) or verifying that the string is non-empty.
if name != db_source[0] and name != "Null":

Comment on lines +53 to +55
print(f"Total sources add: {n_added}/43")
print(f"Total sources skipped: {n_skipped}/43")

Copy link

Copilot AI Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing 'Total sources add:' to 'Total sources added:' for correct grammar.

Suggested change
print(f"Total sources add: {n_added}/43")
print(f"Total sources skipped: {n_skipped}/43")
print(f"Total sources added: {n_added}/43")

Copilot uses AI. Check for mistakes.
# # Ingest Alternative Sources ---
print(f"Ingesting alternative names for {n_added} sources.")
for _, source in newsources.iterrows():
if source['name_simbadable'] != "Null":
Copy link

Copilot AI Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the string literal 'Null' for null checks may lead to errors if the data representation changes. Consider checking against None or an empty string for robustness.

Suggested change
if source['name_simbadable'] != "Null":
if source['name_simbadable'] is not None and source['name_simbadable'] != "":

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@kelle kelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a PR to your branch to add some counters. As I said there, if your numbers agree with what I got -- 'Added 243 names to the database, 559 names were already present.' -- then go ahead and commit the 243 JSON files. Let's get the alt names merged and then tackle the new sources in their own PR.

@kelle
Copy link
Collaborator

kelle commented Apr 15, 2025

  • merge the latest changes to main to your branch.
  • once you generate the JSON files, run pytest
  • modify the necessary tests
  • push!

@canavarrete01
Copy link
Contributor Author

  • merge the latest changes to main to your branch.
  • once you generate the JSON files, run pytest
  • modify the necessary tests
  • push!

These have all been resolved now! @kelle

@kelle
Copy link
Collaborator

kelle commented Apr 17, 2025

I don't see the JSON files.

Copy link
Collaborator

@kelle kelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script should use the simbad-able name when available. E.g., the YU names are not simbad-able and the 2MASS names should be used as the source name instead. (The YU names should be added as OtherNames.)

Copy link
Collaborator

@kelle kelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMAZING !

@kelle kelle merged commit a13b8a4 into SIMPLE-AstroDB:main Apr 22, 2025
3 checks passed
lesliech1004 pushed a commit to lesliech1004/SIMPLE-db that referenced this pull request Jul 23, 2025
* Ingesting Sources

* Minor updates to Ingest script

* Ingest Sources and Sources' Alt Names

* add counters

* Updates to tests

* Adding json files

* Updating YU2 Sources

* Deleted old YU files

* Update test

* version update

---------

Co-authored-by: kelle <kellecruz@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants