pandas categorical lightgbm

늘근이 2018. 8. 15. 00:00
df[col] = df[col].astype('category')

https://www.kaggle.com/c/home-credit-default-risk/discussion/59873



If you are using the sklearn API of LightGBM, your life is even easier: set the dtype of the categorical features in a pd.DataFrame to 'category', and they will automatically be label-encoded and passed to LightGBM's internal categorical handling. In practice, this behaviour is triggered by categorical_feature='auto', which is the default. Note that such features are treated as unordered categoricals. If the values have a natural order (e.g. ticket classes 'economy', 'business', 'first'), you might still be better off label-encoding them yourself and not telling LightGBM that the feature is categorical.
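A minimal sketch of the two cases described above, assuming a toy DataFrame whose column names (city, ticket_class, fare) and values are made up for illustration. The unordered column gets the 'category' dtype, while the ordered one is mapped to integers by hand and left numeric. The helper fit_model is hypothetical; it imports lightgbm lazily so the data preparation can be run without it.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Seoul", "Busan", "Seoul", "Daegu", "Busan", "Seoul"],
    "ticket_class": ["economy", "first", "business", "economy", "business", "first"],
    "fare": [100.0, 900.0, 400.0, 120.0, 380.0, 950.0],
})
y = pd.Series([0, 1, 1, 0, 1, 1], name="target")

# Unordered feature: the 'category' dtype is what categorical_feature='auto' picks up.
df["city"] = df["city"].astype("category")

# Ordered feature: encode the order by hand and keep the column numeric,
# so the model sees it as an ordinary ordered number.
class_order = {"economy": 0, "business": 1, "first": 2}
df["ticket_class"] = df["ticket_class"].map(class_order)


def fit_model(X, y):
    """Fit an LGBMClassifier on X, y (hypothetical helper, tiny settings for toy data)."""
    import lightgbm as lgb  # imported here so the data prep above runs without lightgbm

    model = lgb.LGBMClassifier(n_estimators=10, min_child_samples=1)
    model.fit(X, y)  # categorical_feature is left at its default, 'auto'
    return model
```

Calling fit_model(df, y) should then treat city as categorical and ticket_class as numeric, with no extra arguments needed.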


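for col in categorical_vars:
    # Pass the raw values, not df[col].cat.codes+1: integer codes never match
    # the string categories below, so every entry would become NaN.
    df[col] = pd.Categorical(df[col], categories=['A', 'B', 'C', ... ])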
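A small sketch of what an explicit category list does, assuming two hypothetical frames (train, test) and a made-up three-letter category list. Fixing the list keeps the integer codes identical across both frames, and any value outside the list becomes NaN with code -1. Adding 1 to the codes, as one might do to reserve 0 for missing values, is shown only as an assumption about intent.

```python
import pandas as pd

# Hypothetical data: 'D' appears in test but not in the fixed category list.
train = pd.DataFrame({"grade": ["A", "B", "C", "A"]})
test = pd.DataFrame({"grade": ["C", "B", "D"]})

categories = ["A", "B", "C"]  # illustrative fixed category list

# Build the categoricals from the raw values so both frames share one encoding.
train["grade"] = pd.Categorical(train["grade"], categories=categories)
test["grade"] = pd.Categorical(test["grade"], categories=categories)

train_codes = train["grade"].cat.codes.tolist()
test_codes = test["grade"].cat.codes.tolist()
print(train_codes)  # [0, 1, 2, 0]
print(test_codes)   # [2, 1, -1]  ('D' is not in the list, so it is NaN / code -1)

# Shifting by one moves the missing marker from -1 to 0.
shifted = (test["grade"].cat.codes + 1).tolist()
print(shifted)      # [3, 2, 0]
```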