The recent changes to LANGUAGE_STANDARDS and the ExcelReader class have broken some existing tests in test_extract.py and test_language_parser.py. The goal is to update these tests to match the new desired behaviors.
- `test_extract.py`: The `ExcelReader` was refactored significantly to remove global forward filling (`ffill()`) for merged cells. That responsibility was intentionally moved to `DataCleaner`. Therefore, `test_read_merged_cells` is no longer a valid test for `ExcelReader` and will be updated to expect `NaN` for merged cells, or removed.
- `test_language_parser.py`: The regional codes for Chinese and Japanese changed from `B1~B3` to `CN_B1~CN_B3` and from `C1~C2` to `JP_C1~JP_C2`. The assertions in `test_chinese_codes_exist`, `test_japanese_codes_exist`, and `test_parse_b1_chinese` still expect the old keys. These will be updated.
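The split of responsibility described for merged cells can be illustrated with a minimal, standard-library-only sketch. It does not use the project's real `ExcelReader` or `DataCleaner`; `raw_rows`, `is_missing`, and `forward_fill` are hypothetical stand-ins, and the sample values are made up. It shows that the reader-level expectation is a preserved gap, while filling happens in a separate downstream step.

```python
import math

# Hypothetical stand-in for rows as a reader might return them: the merged
# "nation" cell only carries a value in its first row (sample data is made up).
raw_rows = [
    {"nation": "China", "level": "CN_B1"},
    {"nation": float("nan"), "level": "CN_B2"},
]


def is_missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def forward_fill(rows, column):
    """Return copies of rows with gaps in `column` filled from the row above."""
    filled, last = [], None
    for row in rows:
        row = dict(row)  # copy so the raw rows stay untouched
        if is_missing(row[column]):
            row[column] = last
        else:
            last = row[column]
        filled.append(row)
    return filled


# Reader-level expectation: the gap is preserved, not filled.
assert is_missing(raw_rows[1]["nation"])

# Cleaner-level expectation: the downstream step fills the gap.
cleaned = forward_fill(raw_rows, "nation")
assert cleaned[1]["nation"] == "China"
```

Because `NaN` never compares equal to itself, an assertion such as `value == float("nan")` always fails; a check like `math.isnan` is needed instead.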
- Modify `test_read_merged_cells` so it correctly asserts that the second row's `nation` value for a merged cell is `nan` (or technically `None`/`float('nan')`), demonstrating that `ExcelReader` is no longer incorrectly forward-filling data, which is now handled downstream.
- Modify `test_chinese_codes_exist` to loop over `["CN_B1", "CN_B2", "CN_B3"]`.
- Modify `test_japanese_codes_exist` to loop over `["JP_C1", "JP_C2"]`.
- Modify `test_parse_b1_chinese` to assert `level_code == "CN_B1"` instead of `"B1"`.
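The updated language-parser assertions might look roughly like the sketch below. The layout of `LANGUAGE_STANDARDS` and the `parse_level` helper are assumptions made only for illustration, since neither the real mapping's structure nor the parser's API is shown in this plan. Only the test names and the expected keys come from the list above.

```python
# Hypothetical stand-in for the real LANGUAGE_STANDARDS mapping; its actual
# structure is not shown in this plan and may differ.
LANGUAGE_STANDARDS = {
    "chinese": ["CN_B1", "CN_B2", "CN_B3"],
    "japanese": ["JP_C1", "JP_C2"],
}

# Hypothetical parser stub: prefixes a bare level with its regional code.
REGION_PREFIX = {"chinese": "CN_", "japanese": "JP_"}


def parse_level(language, raw_level):
    level_code = REGION_PREFIX[language] + raw_level
    if level_code not in LANGUAGE_STANDARDS[language]:
        raise ValueError(f"unknown level: {level_code}")
    return level_code


def test_chinese_codes_exist():
    for code in ["CN_B1", "CN_B2", "CN_B3"]:
        assert code in LANGUAGE_STANDARDS["chinese"]


def test_japanese_codes_exist():
    for code in ["JP_C1", "JP_C2"]:
        assert code in LANGUAGE_STANDARDS["japanese"]


def test_parse_b1_chinese():
    level_code = parse_level("chinese", "B1")
    assert level_code == "CN_B1"  # previously asserted "B1"
```

In the real test files, the stand-in mapping and stub would be replaced by imports from the project's own modules.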
- Automated Tests:
- Run `python -m pytest tests/ -v`.
- All tests should pass without any `AssertionError`.
- Run