[Python] Seaborn

SangRok Jung 2022. 10. 1. 18:33

Seaborn

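Seaborn is a higher-level Python visualization library that extends the features and styles of Matplotlib.
Its interface is relatively simple, so it is approachable even for beginners.
It is installed together with Anaconda.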

▶ Import

import matplotlib.pyplot as plt  # the examples below also use plt
import seaborn as sns

Scatter plot with a regression line

A scatter plot with a fitted regression line drawn on top.

▷ Function

sns.regplot()

▶ Load the data.

import seaborn as sns
titanic = sns.load_dataset("titanic")

▶ Create the plot.

fig = plt.figure(figsize=(15,5))
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)

# order=2 : fit a second-degree polynomial regression
sns.regplot(x='age', y='fare', data=titanic, ax=ax1, order=2)
# fit_reg=False : hide the regression line
sns.regplot(x='age', y='fare', data=titanic, ax=ax2, fit_reg=False)

ax1.set_ylim(0, 550)
ax2.set_ylim(0, 550)
ax1.set_xlim(0, 80)
ax2.set_xlim(0, 80)

plt.show()
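
To make the order=2 option more concrete, here is a minimal sketch that fits the same kind of second-degree polynomial by hand with np.polyfit and overlays it on the regplot() curve. The toy DataFrame is synthetic (it is not the Titanic data), and the Agg backend line is only an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the age/fare columns (not the real Titanic data).
rng = np.random.default_rng(0)
toy = pd.DataFrame({"age": rng.uniform(0, 80, 200)})
toy["fare"] = 0.02 * (toy["age"] - 40) ** 2 + rng.normal(0, 3, 200)

# Second-degree least-squares fit: fare ~ a*age^2 + b*age + c
a, b, c = np.polyfit(toy["age"], toy["fare"], deg=2)

fig, ax = plt.subplots(figsize=(7, 5))
# ci=None skips the confidence band so only the fitted curve is drawn.
sns.regplot(x="age", y="fare", data=toy, order=2, ci=None, ax=ax)

# Overlay the NumPy fit as a dashed line; it should sit on top of seaborn's curve.
grid = np.linspace(toy["age"].min(), toy["age"].max(), 100)
ax.plot(grid, np.polyval([a, b, c], grid), linestyle="--", color="black")
```

If the two curves coincide, order=2 can be read as "fit a quadratic in x" rather than a straight line.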

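Histogram and kernel density

Used to inspect the distribution of a single variable.
By default it draws both a histogram and a kernel density estimate.
Note: distplot() has been deprecated since seaborn 0.11; histplot(), kdeplot() and displot() are the recommended replacements.

▶ Function

sns.distplot()

▶ Create three plots with different settings.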

fig = plt.figure(figsize=(15,5))
ax1 = fig.add_subplot(1,3,1)
ax2 = fig.add_subplot(1,3,2)
ax3 = fig.add_subplot(1,3,3)

sns.distplot(titanic["fare"], ax=ax1)
sns.distplot(titanic.fare, ax=ax2, hist=False)
sns.distplot(titanic.fare, ax=ax3, kde=False)

ax1.set_title("fare kde/hist")
ax2.set_title("fare kde")
ax3.set_title("fare hist")

Heatmap

One of the most widely used visualization tools.

It places two categorical variables on the x and y axes and arranges the data as a matrix.

▷ Function

sns.heatmap()

▶ Create the table.

table = titanic.pivot_table(index=['sex'], columns=['class'], aggfunc='size')

▶ Create the plot.

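sns.heatmap(table,
           annot=True,        # write the value in each cell
           fmt='d',           # format the values as integers
           cmap='coolwarm',
           linewidths=1,      # width of the lines between cells
           cbar=False)        # hide the color bar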
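
To see what kind of table the heatmap receives, here is a small sketch on a hand-made DataFrame. It uses pd.crosstab, which is a swapped-in alternative to the pivot_table call above that should give the same kind of count matrix. The five toy rows are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hand-made toy rows (not the Titanic data).
toy = pd.DataFrame({
    "sex":   ["male", "female", "male", "female", "male"],
    "class": ["First", "Third", "Third", "First", "Third"],
})

# Rows = sex, columns = class, cells = number of passengers in that combination.
table = pd.crosstab(toy["sex"], toy["class"])

fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(table, annot=True, fmt="d", cmap="coolwarm",
            linewidths=1, cbar=False, ax=ax)
```

Each annotated cell is simply one entry of the count table, colored by its value.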

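Scatter plots for categorical data

Show how the data is distributed within each category of a categorical variable.

stripplot : plots the points of each category in a narrow strip, so points with similar values can overlap.
swarmplot : takes the spread of the data into account and shifts points sideways so they do not overlap, which gives a fuller picture of how spread out the data is.

▶ Function

sns.stripplot()
sns.swarmplot()

▷ Load the data.

import seaborn as sns
titanic = sns.load_dataset("titanic")

▷ Create the plot.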

fig = plt.figure(figsize=(15,5))

ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)

sns.stripplot(x="class", y="age", data=titanic, ax=ax1)
sns.swarmplot(x="class", y="age", data=titanic, ax=ax2)

ax1.set_title('stripplot')
ax2.set_title('swarmplot')
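
The difference between the two is easiest to see with a single category and many repeated values. The sketch below uses synthetic ages and passes jitter=False to stripplot() (my choice, to switch off its default random sideways jitter) so that the overlap is obvious.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# One category with many repeated (rounded) ages; synthetic data.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "class": ["First"] * 30,
    "age": np.round(rng.normal(30, 2, 30)),
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# jitter=False: every point is drawn exactly at the category position.
sns.stripplot(x="class", y="age", data=toy, jitter=False, ax=ax1)
# swarmplot: points with the same age are pushed sideways instead of stacking.
sns.swarmplot(x="class", y="age", data=toy, ax=ax2)

ax1.set_title("stripplot (jitter=False)")
ax2.set_title("swarmplot")

# x-coordinates of the strip points: all on the first category position.
strip_x = ax1.collections[0].get_offsets()[:, 0]
```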

Bar plot

Plots the mean of each category rather than the number of observations.
The black line drawn on top of each bar is the 95% confidence interval.

▷ Function

sns.barplot()

▶ Load the data.

import seaborn as sns
titanic = sns.load_dataset("titanic")

▶ Create the plot.

fig = plt.figure(figsize=(15, 5))

ax1 = fig.add_subplot(1, 3, 1)
ax2 = fig.add_subplot(1, 3, 2)
ax3 = fig.add_subplot(1, 3, 3)

sns.barplot(x="sex", y="survived", data=titanic, ax=ax1)
sns.barplot(x="sex", y="survived", data=titanic, ax=ax2, hue="class")
# dodge=False : draw the hue groups on top of each other instead of side by side
sns.barplot(x="sex", y="survived", data=titanic, ax=ax3, hue="class", dodge=False)

ax1.set_title("barplot")
ax2.set_title("barplot, hue=class")
ax3.set_title("barplot, hue=class, dodge=False")
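
As a quick check that the bar height really is the mean and not the count, the sketch below compares barplot() against a plain groupby mean. The eight toy rows are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hand-made toy rows: 1 of 4 men and 3 of 4 women survived.
toy = pd.DataFrame({
    "sex": ["male"] * 4 + ["female"] * 4,
    "survived": [0, 0, 1, 0, 1, 1, 0, 1],
})

# Mean of `survived` per category, computed directly with pandas.
means = toy.groupby("sex")["survived"].mean()

fig, ax = plt.subplots(figsize=(4, 3))
sns.barplot(x="sex", y="survived", data=toy, ax=ax)

# Bar heights as drawn by seaborn, for comparison with `means`.
heights = sorted(p.get_height() for p in ax.patches)
```

Here both bars are built from four rows each, yet their heights differ because they show survival rates.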

▶ countplot()

Shows the number of observations in each category as a bar.

▷ Load the data.

import seaborn as sns
titanic = sns.load_dataset("titanic")

▶ Create the plot.

fig = plt.figure(figsize=(15, 5))

ax1 = fig.add_subplot(1, 3, 1)
ax2 = fig.add_subplot(1, 3, 2)
ax3 = fig.add_subplot(1, 3, 3)

sns.countplot(x="class", palette="Set3", data=titanic, ax=ax1)
sns.countplot(x="class", palette="Set3", data=titanic, ax=ax2, hue="who")
sns.countplot(x="class", palette="Set3", data=titanic, ax=ax3, hue="who", dodge=False)

ax1.set_title("titanic")
ax2.set_title("titanic, hue=who")
ax3.set_title("titanic, hue=who, dodge=False")
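
countplot() can be thought of as a bar chart of value_counts(). A minimal sketch on six made-up rows:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hand-made toy column (not the Titanic data).
toy = pd.DataFrame({
    "class": ["First", "Third", "Third", "Second", "Third", "First"],
})

# Number of rows per category, computed directly with pandas.
counts = toy["class"].value_counts()

fig, ax = plt.subplots(figsize=(4, 3))
sns.countplot(x="class", data=toy, ax=ax)

# Bar heights as drawn by seaborn, for comparison with `counts`.
heights = sorted(int(p.get_height()) for p in ax.patches)
```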

boxplot(), violinplot()

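boxplot() shows the distribution of the data within each category together with key summary statistics.
On its own, however, a box plot makes it hard to see how the data is spread out.
violinplot() adds a kernel density curve along the y-axis to the box plot.

▶ Create the plot.

Loading the data is the same as above.
split : draws the two hue groups as the two halves of one violin so they can be compared directly.
palette : selects a predefined color set.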

fig = plt.figure(figsize=(15, 15))

ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)

sns.boxplot(x="alive", y="age", palette="Set3", data=titanic, ax=ax1)
sns.boxplot(x="alive", y="age", palette="Set3", data=titanic, ax=ax2, hue="sex")
sns.violinplot(x="alive", y="age", palette="Set3", data=titanic, ax=ax3)
sns.violinplot(x="alive", y="age", palette="Set3", data=titanic, ax=ax4, hue='sex', split=True)

ax1.set_title("boxplot")
ax2.set_title("boxplot, hue=sex")
ax3.set_title("violinplot")
ax4.set_title("violinplot, hue=sex")
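
The "key summary statistics" in a box plot are the quartiles: the box runs from the first to the third quartile and the line inside it marks the median. The sketch below computes them with pandas on five made-up ages and draws both plot types for comparison.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line when working interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Five made-up ages so the quartiles are easy to verify by hand.
ages = pd.Series([10, 20, 30, 40, 50], name="age")

# Lower edge of the box, line inside the box, upper edge of the box.
q1, median, q3 = ages.quantile([0.25, 0.5, 0.75])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)
sns.boxplot(y=ages, ax=ax1)      # quartiles and whiskers only
sns.violinplot(y=ages, ax=ax2)   # same data with a mirrored density curve
ax1.set_title("boxplot")
ax2.set_title("violinplot")
```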

jointplot()

Shows a scatter plot by default, with a histogram of each variable drawn along the x and y axes.
It is useful for seeing at a glance both the relationship between two variables and how spread out the data is.

# jointplot() is a figure-level function: each call creates its own figure,
# so no plt.figure() / add_subplot() setup is needed.
g1 = sns.jointplot(x="fare", y="age", color='green', data=titanic)
g2 = sns.jointplot(x="fare", y="age", color="MediumBlue", data=titanic, kind='reg')
g3 = sns.jointplot(x="fare", y="age", color="Indigo", data=titanic, kind='hex')
g4 = sns.jointplot(x="fare", y="age", color="Coral", data=titanic, kind='kde')

g1.fig.suptitle("jointplot")
g2.fig.suptitle("jointplot, kind=reg")
g3.fig.suptitle("jointplot, kind=hex")
g4.fig.suptitle("jointplot, kind=kde")

FacetGrid()

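Creates a grid of subplots by applying a different condition along the rows and along the columns.
The kind of plot to draw in each subplot is passed to the grid object through its map() method.

▶ Create the plot.

Loading the data is the same as above.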

g = sns.FacetGrid(data=titanic, row='survived', col='who')
g.map(plt.hist, 'age')

plt.show()

pairplot()

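Draws every pairwise combination of the columns (variables) of the DataFrame passed as the argument.

The figure is divided into a grid with one cell per pair, and the diagonal cells show the distribution of each variable on its own.

▶ Create the plot.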

pair = titanic[['age', 'pclass', 'fare', 'survived']]
sns.pairplot(pair)
