Skip to content

add special dates to 02_holiday#654

Open
khuyentran1401 wants to merge 4 commits intomainfrom
add-date-feature-to-tutorial
Open

add special dates to 02_holiday#654
khuyentran1401 wants to merge 4 commits intomainfrom
add-date-feature-to-tutorial

Conversation

@khuyentran1401
Copy link
Contributor

No description provided.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@elessar-for-github
Copy link

Changes made by this pull request:

@github-actions
Copy link
Contributor

github-actions bot commented Jun 18, 2025

Experiment Results

Experiment 1: air-passengers

Description:

variable experiment
h 12
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 12.6793 11.0623 47.8333 76
mape 0.027 0.0232 0.0999 0.1425
mse 213.936 199.132 2571.33 10604.2
total_time 0.7329 0.5385 0.0043 0.0033

Plot:

Experiment 2: air-passengers

Description:

variable experiment
h 24
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 58.1031 58.4587 71.25 115.25
mape 0.1257 0.1267 0.1552 0.2358
mse 4040.21 4110.79 5928.17 18859.2
total_time 0.4993 0.5608 0.0037 0.0035

Plot:

Experiment 3: electricity-multiple-series

Description:

variable experiment
h 24
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 178.293 268.13 269.23 1331.02
mape 0.0234 0.0311 0.0304 0.1692
mse 121589 219485 213677 4.68961e+06
total_time 0.4772 0.5966 0.0046 0.0044

Plot:

Experiment 4: electricity-multiple-series

Description:

variable experiment
h 168
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 465.497 346.972 398.956 1119.26
mape 0.062 0.0436 0.0512 0.1583
mse 835021 403760 656723 3.17316e+06
total_time 0.5848 0.5116 0.005 0.0045

Plot:

Experiment 5: electricity-multiple-series

Description:

variable experiment
h 336
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 558.673 459.757 602.926 1340.95
mape 0.0697 0.0565 0.0787 0.17
mse 1.22723e+06 739114 1.61572e+06 6.04619e+06
total_time 0.6848 0.6481 0.0051 0.0046

Plot:

@khuyentran1401 khuyentran1401 requested a review from Copilot June 18, 2025 23:45
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.

@khuyentran1401 khuyentran1401 requested a review from ngupta23 June 18, 2025 23:46
@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #7.            start=df.iloc[0][time_col], end=df.iloc[-1][time_col], freq="D"

This assumes data is sorted which it might not be. maybe use min() and max() if the time column (assuming it is datetime format).

Also, just take the frequency as an argument and use it here. This will make the function generic so users can use it for other frequencies also. If you do this, rename the variable monthly_features to just features


Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngupta23 I noticed that the original tutorial uses "D" for dates_range but "M" for resample, so should I keep freq="D"?

def add_date_features_to_dataframe(df, date_extractor, time_col="month", freq="M"):
    # Create a copy of the dataframe
    df = df.copy()

    # Create date range
    dates_range = pd.date_range(
        start=df[time_col].min(), end=df[time_col].max(), freq="D"
    )

    # Get date feature indicators and resample to specified frequency
    features_df = date_extractor(dates_range)
    features = features_df.resample(freq).max()
    features = features.reset_index(names=time_col)

    # Merge with input dataframe
    result_df = df.merge(features)

    return result_df

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. I think it kept D for date range since the holidays will not be perfectly aligned with W and M freq (could occur on any day of the week/month. I guess your changes will work with resampling to freq, but keep date_range to use D.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change "current" to "historical"?


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #1.    fcst_df_holidays = nixtla_client.forecast(

Change to long horizon model to remove warning?


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #4.        freq="M",

Change M to ME to remove the warning from the output.


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #1.    nixtla_client.weights_x.plot.barh(x='features', y='weights', figsize=(10, 10))

At some point we had discussed to streamline this and not use the weights_x attribute, instead standardizing on the use of shap which is also what is shown in the interpretibility tutorial. Can you check with @marcopeix and @elephaint if we want to continue to show this?


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove "also"


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #4.        freq="M",

Same comments as above - change to ME and maybe use the long-horizon model?


Reply via ReviewNB

@@ -2,7 +2,7 @@
"cells": [
Copy link
Member

@ngupta23 ngupta23 Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #1.    nixtla_client.weights_x.plot.barh(x="features", y="weights", figsize=(10, 6))

Same comment as above - maybe use Shap?


Reply via ReviewNB

@khuyentran1401 khuyentran1401 enabled auto-merge (squash) June 19, 2025 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants