Skip to content

Fix TA-Lib integration and clean up indicator library selection#359

Merged
BruceYanghy merged 3 commits intoAI4Finance-Foundation:masterfrom
LRAbbade:data
Mar 25, 2026
Merged

Fix TA-Lib integration and clean up indicator library selection#359
BruceYanghy merged 3 commits intoAI4Finance-Foundation:masterfrom
LRAbbade:data

Conversation

@LRAbbade
Copy link
Copy Markdown
Contributor

@LRAbbade LRAbbade commented Mar 7, 2026

Addressing a few issues with the TA-Lib path (select_stockstats_talib=1):

  • It didn't use the user's tech_indicator_list — MACD/RSI/CCI/DX were always hardcoded
  • Output columns used rsi, cci, dx rather than the stockstats-style rsi_30, cci_30, dx_30, so switching libraries would break downstream code
  • No NaN handling before TA-Lib calls, which doesn't tolerate gaps well
  • Common indicators like Bollinger Bands and SMAs weren't implemented

This is a follow-up to FinRL PR #1344 (already ported in #353), which fixed the auto_adjust / OHL adjustment in YahooDownloader. That addressed the data ingestion side — this one covers the indicator computation layer.

Further improvements:

  • Replaced the int flag with an IndicatorLib enum — IndicatorLib.STOCKSTATS / IndicatorLib.TALIB instead of 0/1.
  • Split add_technical_indicator into separate methods per backend instead of one large if/else block.
  • stockstats path now operates on a copy to avoid StockDataFrame.retype() mutating the original DataFrame.
  • Exposed drop_na_timesteps through the public API in data_processor.py.

LRAbbade and others added 3 commits March 7, 2026 17:57
- improve technical indicator lib selection
- unify technical indicators processing design
- fix some bugs with the data processing
@LRAbbade
Copy link
Copy Markdown
Contributor Author

LRAbbade commented Mar 22, 2026

Ok I understand this would cause backwards compatiblity issues, but I do think the legibility is significantly improved, so why don't we meet in the middle? Did a small update so it can still be called with 0,1, but the enum would also work.

from meta.data_processor import DataProcessor
from meta.data_processors._base import DataSource, IndicatorLib
from finrl.config import INDICATORS

p = DataProcessor(
    data_source=DataSource.yahoofinance,
    start_date='2025-01-01',
    end_date='2026-01-01',
    time_interval="1d",
)
p.download_data(ticker_list=['NVDA', 'MSFT', 'META'])
p.clean_data()
p.add_technical_indicator(INDICATORS, select_stockstats_talib=0)
p.dataframe

Example of old code using stockstats lib:

image

Example of old code using TA-Lib lib:

image

Example of new code using stockstats lib:

image

Example of new code using TA-Lib lib:

image

Note the difference in length of the TA-Lib result is because of dropping nans, which is why I exposed drop_na_timesteps in DataProcessor.add_technical_indicator

@BruceYanghy
Copy link
Copy Markdown
Member

Thanks for the fixes — this addresses several real issues in the TA‑Lib path.

Key wins:

  • TA‑Lib now respects the user-supplied tech_indicator_list instead of hardcoding a few indicators.
  • Output column names align with stockstats conventions (rsi_30/cci_30/dx_30), which avoids downstream breakage when switching libs.
  • NaN handling and common indicators (BBANDS, SMA) are added.
  • Cleaner separation between stockstats/talib paths, with legacy 0/1 selection still supported.

I’m in favor of merging once CI is green. The only behavior change to be mindful of is the default drop_na_timesteps and the TA‑Lib indicator whitelist (e.g., rsi vs rsi_30). If CI passes, this looks good to go.

@BruceYanghy BruceYanghy merged commit 133de6d into AI4Finance-Foundation:master Mar 25, 2026
0 of 2 checks passed
@LRAbbade LRAbbade deleted the data branch March 29, 2026 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants