Pandas Note (1): Data Type

ifeelfree
1 min read · Nov 20, 2020

1. Pandas Data Type

(1) Inquiry: df.dtypes

(2) Setup: df.astype

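import pandas as pd

data = pd.DataFrame([3, 4])
print(data.dtypes)  # column 0 is int64
new_data = data.astype('category')  # convert every column to category
print(new_data.dtypes)  # column 0 is now category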

The first print shows int64 for the column, and the second shows category.

1.1 Category Data Type

The category data type in pandas is a hybrid data type. It looks and behaves like a string in many instances, but internally it is represented by an array of integers. This allows the data to be sorted in a custom order and to be stored more efficiently, particularly when a column holds a few distinct values repeated many times.
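To make the integer-backed representation concrete, here is a minimal sketch. The labels series is made-up sample data, not something from a real dataset; the point is only that each distinct label is stored once and every row holds a small integer code.

```python
import pandas as pd

# Hypothetical sample data: three distinct labels repeated many times.
labels = pd.Series(['Gold', 'Silver', 'Gold', 'Bronze'] * 1000)
as_category = labels.astype('category')

# The distinct labels are stored once (sorted, since no order was given).
print(as_category.cat.categories.tolist())  # ['Bronze', 'Gold', 'Silver']

# Each row holds an integer code pointing into that list of labels.
print(as_category.cat.codes.head(4).tolist())  # [1, 2, 1, 0]

# Compare the memory footprint of the two representations.
string_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(string_bytes, category_bytes)
```

The exact byte counts depend on the pandas version and platform, but for a column like this the category version should come out noticeably smaller.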

In the following example, we show how a category column with a custom order is set up:

import pandas as pd
from pandas.api.types import CategoricalDtype

sales_1 = [
    {'account': 'Jones LLC', 'Status': 'Gold', 'Jan': 150, 'Feb': 200, 'Mar': 140},
    {'account': 'Alpha Co', 'Status': 'Gold', 'Jan': 200, 'Feb': 210, 'Mar': 215},
    {'account': 'Blue Inc', 'Status': 'Silver', 'Jan': 50, 'Feb': 90, 'Mar': 95},
]
df_1 = pd.DataFrame(sales_1)
# Declare the allowed values and their order: Silver < Gold
status_type = CategoricalDtype(categories=['Silver', 'Gold'], ordered=True)
df_1['Status'] = df_1['Status'].astype(status_type)
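
With the ordered dtype in place, sorting and comparisons should follow the declared order rather than alphabetical order. The sketch below rebuilds a trimmed copy of the table above (the name df and the dropped month columns are just for illustration) so that it runs on its own.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Trimmed copy of the sales table so this block is self-contained.
df = pd.DataFrame({
    'account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
    'Status': ['Gold', 'Gold', 'Silver'],
})
status_type = CategoricalDtype(categories=['Silver', 'Gold'], ordered=True)
df['Status'] = df['Status'].astype(status_type)

# Sorting follows Silver < Gold; a stable sort keeps ties in original order.
sorted_accounts = df.sort_values('Status', kind='mergesort')['account'].tolist()
print(sorted_accounts)  # ['Blue Inc', 'Jones LLC', 'Alpha Co']

# Ordered categories can be compared against one of their labels.
gold_or_better = df[df['Status'] >= 'Gold']['account'].tolist()
print(gold_or_better)  # ['Jones LLC', 'Alpha Co']

# min() and max() also respect the declared order.
print(df['Status'].max())  # Gold
```

A plain alphabetical sort on strings would put Gold before Silver, so the position of Blue Inc in the sorted output is what shows the custom order at work.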
