-
Notifications
You must be signed in to change notification settings - Fork 556
Allow explicitly passing the frequency to predict_df #449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
abdulfatir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @shchur! Left some comments. Let's chat about this when you have time.
src/chronos/df_utils.py
Outdated
| ) | ||
| # Use provided freq if available, otherwise use inferred freq | ||
| if freq is None: | ||
| freq = inferred_freq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If freq is not None, should we verify that inferred_freq matches it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand, the whole point is to skip running the frequency inference when freq is provided. Should we still do it?
| prediction_timestamps_flat = pd.DatetimeIndex(future_df[timestamp_column]) | ||
| for col in future_df.columns.drop([id_column, timestamp_column]): | ||
| future_covariates_dict[col] = future_df[col].to_numpy() | ||
| if validate_inputs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about removing this. This was nice check for incorrect slicing of future data. I am wondering if we should only allow freq with validate_inputs=False? I feel like freq in intended really for cases where you know what you are doing. For general use, automatic inference is probably the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I've updated the PR to only allow freq: str when validate_inputs=False. The logic should also be much simpler in this case
| regular frequency, and item IDs match between past and future data. Setting to False disables these checks. | ||
| [ADVANCED] When True (default), validates dataframes before prediction. Setting to False removes the | ||
| validation overhead, but may silently lead to wrong predictions if data is misformatted. When False, you | ||
| must ensure: (1) all dataframes are sorted by (id_column, timestamp_column); (2) future_df (if provided) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also say that items must be regularly spaced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed in all places in the code.
| Name of column containing timestamps | ||
| validate_inputs | ||
| When True, the dataframe(s) will be validated be conversion | ||
| [ADVANCED] When True (default), validates dataframes before prediction. Setting to False removes the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Add a comment about regularly-spaced series.
Issue #, if available: #425
Description of changes:
freq: str | Noneparameter topredict_dfmethods. This can only be set in combination withvalidate_inputs=False. If specified, the user-providedfreqwill be used instead of the tryin to infer thefreqfrom the data.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.