Optimize rdf histogram #5104
Hi @rhowardstone, thank you for your PR #5128 (where I commented). I feel you're owed a slightly longer response, and the discussions are a good place for that. We are currently deliberating how to handle PRs that are primarily AI-generated. In a way, your PR and communication are a good example of what a "good" (non-slop) contribution can look like: you're communicating clearly how the code was generated, but you're not outsourcing the communication to an LLM.
My personal opinion at the moment: you're identifying the "understanding" part as a limitation, and this gets at one part of the problem. If we as open source developers take time to review and bring in our personal expertise, then we hope to also cultivate a relationship with other developers who may eventually contribute more to the project; keeping an open source project going is a constant struggle. We are really not interested in providing prompts for someone else to feed to an LLM.
Another concern is that LLMs may generate code from sources that are license-incompatible with the target; they may also violate licenses by not properly attributing. For instance, if your generated code really comes from a GPL-licensed code base, then we could not include it in MDAnalysis because it would immediately enforce the GPL on our LGPL code.
We're not the only open source project facing these questions; see, for instance, a Draft SPEC on options for use of AI in project contributions, scientific-python/summit-2025#35. Comments welcome :-).
Hi all,
Totally new to this community, and not very experienced on GitHub either, so I apologize if I'm making some faux pas here, but I am having some astounding results "nanny coding" (I take a much more structured, verifiable approach than one could reasonably deem 'vibes') with the Claude Code CLI tool (only ever Opus 4.1; I have Max).
I heard your histogramming function was quite slow (#3435), so I ('we'?) spent a few hours trying to understand and improve it. I'm not quite sure it entirely 'fixes' that issue, but it does appear to help! My PR #5103 says the "codecov" check is failing: by my understanding, this is simply because Numba isn't enabled in the test environment?
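To illustrate what I mean by the coverage guess, here is a minimal sketch of the optional-Numba pattern. This is not the code from the PR: the names `rdf_histogram`, `_histogram_loop`, and `HAVE_NUMBA` are made up for illustration, and the binning is a plain half-open `[r_min, r_max)` histogram. If Numba is missing from the test environment, the lines guarded by `HAVE_NUMBA` never execute, so a coverage tool would report them as uncovered.

```python
import numpy as np

try:
    from numba import njit  # optional accelerator; may be absent in CI
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False


def _histogram_loop(distances, r_min, r_max, nbins):
    # Count distances into nbins equal-width bins over [r_min, r_max).
    counts = np.zeros(nbins, dtype=np.int64)
    width = (r_max - r_min) / nbins
    for d in distances:
        if r_min <= d < r_max:
            idx = int((d - r_min) / width)
            if idx == nbins:  # guard against floating-point round-up
                idx = nbins - 1
            counts[idx] += 1
    return counts


if HAVE_NUMBA:
    # Runs only when Numba is installed; without it, this line is never hit.
    _histogram_fast = njit(_histogram_loop)


def rdf_histogram(distances, r_min, r_max, nbins):
    # Hypothetical entry point: pick the compiled loop when available.
    distances = np.asarray(distances, dtype=np.float64)
    if HAVE_NUMBA:
        return _histogram_fast(distances, r_min, r_max, nbins)
    return _histogram_loop(distances, r_min, r_max, nbins)


counts = rdf_histogram([0.5, 1.5, 1.6, 2.5, 9.0], 0.0, 3.0, 3)
print(counts.tolist())
```

Either way, a single environment exercises only one of the two branches in `rdf_histogram`, which may be what the coverage report is picking up.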
I'm happy to respond to any feedback you are kind enough to take the time to provide, given your knowledge of the codebase (code changes are... I won't say "easy", but... they're not the bottleneck anymore -- understanding is)! Or, on the other hand, to "beat it" -- if you don't take kindly to AI-assisted PRs in this community! My apologies, if so!
If you're curious, here is a template that (when combined with post-agent verification, general watchfulness, skepticism, and a red-flag approach to the phrase, 'simpler solution') has been able to produce some rather impressive results, rather quickly. A version of this was used for my PR:
CLAUDE.md
(you are welcome to use this however you like -- or, to laugh at me for trying!)
Thank you for your time,
-Rye