Open
Description
I have a problem identifying missing values in a dataset. A sample follows:
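import numpy as np
import pandas as pd
import tableone as tb  # alias assumed from the tb.TableOne call below

# Sample data: Income and Satisfaction contain NaN; Education contains an empty string
data = {
    'Age': [35, 42, 30, 29, 51, 38],
    'Gender': [1, 0, 1, 0, 0, 1],
    'Income': [45000, np.nan, 65000, 48000, 70000, np.nan],
    'Education': ['2', '1', '1', '', '1', '2'],
    'Satisfaction': [4.5, 3.2, np.nan, 4.8, 3.9, 4.1]
}
data = pd.DataFrame(data)
cat_col = ['Gender', 'Education']
group = 'Gender'
# Summary table grouped by Gender, with p-values and test names
data_table1 = tb.TableOne(data, categorical=cat_col,
                          groupby=group, pval=True, htest_name=True, decimals=3)
print(data_table1)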
In the output, the missing value in Education (the empty string '') is shown as a separate category. If I convert the column to str, then 'None' becomes a separate category instead.
Please advise how to deal with this. Thanks.
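One workaround I could imagine, sketched here with pandas only and not checked against tableone itself, is to turn the empty strings into NaN before building the table, so that pandas treats them as missing instead of as a category level. The sample frame is cut down to two columns for brevity, and the names missing_counts and education_levels are only illustrative. Whether TableOne then counts the value as missing rather than as a category is an assumption I have not confirmed.

```python
import numpy as np
import pandas as pd

# Reduced version of the sample data above (two columns only).
data = pd.DataFrame({
    'Gender': [1, 0, 1, 0, 0, 1],
    'Education': ['2', '1', '1', '', '1', '2'],
})

# Replace empty strings with NaN so pandas recognises them as missing.
data['Education'] = data['Education'].replace('', np.nan)

# Number of missing values per column.
missing_counts = data.isna().sum()
print(missing_counts)

# Category counts; value_counts() drops NaN by default,
# so the former empty string no longer shows up as a level.
education_levels = data['Education'].value_counts()
print(education_levels)
```

After this step the column should stay free of a '' level as long as it is not cast with astype(str), which would turn the NaN back into a string.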