- Pandas Data Type
(1) Inquiry: df.dtypes
(2) Setup: df.astype
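import pandas as pd

data = pd.DataFrame([3, 4])
print(data.dtypes)  # dtype of each column before conversion
new_data = data.astype('category')  # returns a new DataFrame; data is unchanged
print(new_data.dtypes)  # dtype of each column after conversion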
The first print reports int64 and the second reports category.
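astype also accepts a dict that maps column names to target types, which is handy when only some columns need converting. The sketch below is illustrative only: the frame and its column names (score, grade) are made up for this example and are not part of the notes above.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
frame = pd.DataFrame({'score': [3, 4], 'grade': ['b', 'a']})

# Convert each column to its own target type in a single call.
converted = frame.astype({'score': 'float64', 'grade': 'category'})

print(converted.dtypes)
```

Columns not named in the dict keep their existing dtype.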
1.1 Category Data Type
The category data type in pandas is a hybrid data type. It looks and behaves like a string in many instances but internally is represented by an array of integers. This allows the data to be sorted in a custom order and stored more efficiently.
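The integer representation can be inspected through the .cat accessor. The following is a minimal sketch with made-up sample labels (Gold, Silver, Bronze repeated many times); the exact byte counts depend on the pandas version and platform, so only the relative sizes are compared.

```python
import pandas as pd

# Hypothetical sample data: a few distinct labels repeated many times.
labels = pd.Series(['Gold', 'Silver', 'Gold', 'Bronze'] * 1000)
as_category = labels.astype('category')

# The distinct labels are stored once (sorted, since no order was given)...
categories = list(as_category.cat.categories)
# ...and each row holds a small integer code that points at one of them.
first_codes = as_category.cat.codes.head(4).tolist()

# Compare the memory footprint of the two representations.
string_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(categories)
print(first_codes)
print(string_bytes, category_bytes)
```

The saving grows with the number of rows per distinct label; a column where almost every value is unique gains little from the conversion.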
The following example sets up an ordered category column:
import pandas as pd
from pandas.api.types import CategoricalDtype

sales_1 = [{'account': 'Jones LLC', 'Status': 'Gold', 'Jan': 150, 'Feb': 200, 'Mar': 140},
           {'account': 'Alpha Co', 'Status': 'Gold', 'Jan': 200, 'Feb': 210, 'Mar': 215},
           {'account': 'Blue Inc', 'Status': 'Silver', 'Jan': 50, 'Feb': 90, 'Mar': 95}]
df_1 = pd.DataFrame(sales_1)

# Declare the allowed values and their order: Silver ranks below Gold.
status_type = CategoricalDtype(categories=['Silver', 'Gold'], ordered=True)
df_1['Status'] = df_1['Status'].astype(status_type)
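Once a column has an ordered category type, sorting and comparisons follow the declared order rather than alphabetical order. The sketch below rebuilds a reduced version of the frame above (only the account and Status columns) so that it runs on its own; the variable names are chosen for this illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({
    'account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
    'Status': ['Gold', 'Gold', 'Silver'],
})
status_type = CategoricalDtype(categories=['Silver', 'Gold'], ordered=True)
df['Status'] = df['Status'].astype(status_type)

# Sorting uses the category order (Silver < Gold), not alphabetical order.
# kind='stable' keeps tied rows in their original relative order.
sorted_accounts = df.sort_values('Status', kind='stable')['account'].tolist()

# Ordered categories can be compared against one of their labels.
gold_or_better = df[df['Status'] >= 'Gold']['account'].tolist()

# min and max are defined because the categories are ordered.
lowest = df['Status'].min()

print(sorted_accounts)
print(gold_or_better)
print(lowest)
```

An alphabetical sort on plain strings would place Gold before Silver; here Blue Inc comes first because Silver was declared as the lowest category.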