
Matplotlib/Seaborn #

Seaborn is built on top of Matplotlib, so its plots can be customised with Matplotlib calls.

(Maybe) Useful resources #

General #

Doc: Matplotlib Application Interfaces (APIs) — Matplotlib 3.7.2 documentation

Start plotting:

import matplotlib.pyplot as plt
import seaborn as sns

plt.clf()  # clear the current figure
fig, ax = plt.subplots()

# For several subplots (here 1 row, 3 columns; ax becomes an array of Axes):
fig, ax = plt.subplots(1, 3)

End plotting: (credit)

# Show legend:
ax.legend()

# If there are multiple subplots:
fig.tight_layout()

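# Save as png
fig.savefig("fig-1.png")

# Save as pdf (keeps the text selectable)
fig.savefig("fig-1.pdf")

# Show on screen (in a plain script, plt.show() is the safer choice because it blocks until the window is closed)
fig.show()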
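
Put together, the start and end steps above might look like the following minimal sketch. The data, the output location, and the use of the non-interactive Agg backend (so that it runs without a display) are assumptions for illustration only.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Made-up data, labelled so that ax.legend() has something to show
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="squares")
ax.legend()
fig.tight_layout()

# Hypothetical output location; a PDF keeps the text selectable
out_path = os.path.join(tempfile.mkdtemp(), "fig-1.pdf")
fig.savefig(out_path)
```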

Add Straight Lines #

Vertical line at given position:

# Add dashed vertical line at x=2190:
ax.axvline(2190, linestyle="--", color="lightgrey")

Horizontal line at given position:

# Add dotted horizontal line at y=20:
ax.axhline(20, linestyle=":", color="lightgrey")

Line connecting any two points:

# Draw a line between (x1,y1) and (x2,y2):
ax.plot([x1, x2], [y1, y2])

Colours #

Any colormap can be reversed by appending _r to its name, e.g. cmap="RdBu_r" maps small values to blue and large values to red.
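
A quick way to see what the _r suffix does is to compare the two ends of a colormap with those of its reversed version. This is only a sketch; the variable names are made up.

```python
import matplotlib.pyplot as plt
import numpy as np

rdbu = plt.get_cmap("RdBu")
rdbu_r = plt.get_cmap("RdBu_r")

# A colormap maps a number in [0, 1] to an RGBA tuple.
low, high = rdbu(0.0), rdbu(1.0)          # RdBu: low end is red, high end is blue
low_r, high_r = rdbu_r(0.0), rdbu_r(1.0)  # RdBu_r: low end is blue, high end is red

# The reversed map's low end matches the original's high end, and vice versa.
ends_swapped = np.allclose(low_r, high) and np.allclose(high_r, low)
```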

Title #

Seaborn (axes-level functions, which return a Matplotlib Axes):

sns.someplot(...).set_title("This is a title")

Axis #

Set the x-axis and y-axis to same scale:

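fig, ax = plt.subplots()

ax.plot(...)
ax.axis('equal')  # set on the axes being drawn; calling plt.axis('equal') before plt.subplots() acts on a different figure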

Set limit of each axis:

fig, ax = plt.subplots()

# x-axis
ax.set_xlim(left=0, right=5)

# y-axis
ax.set_ylim(bottom=0, top=5)

Set axis to only use integer ticks: (credit)

ax.yaxis.get_major_locator().set_params(integer=True)
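
The one-liner above relies on the axis's current major locator accepting an integer parameter, which holds for Matplotlib's default locator. An alternative that does not depend on the current locator is to install a MaxNLocator explicitly. A minimal sketch with made-up data, assuming the Agg backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0.2, 1.5, 2.4, 3.7])  # made-up data

# Install a locator that only places ticks on integers
ax.yaxis.set_major_locator(MaxNLocator(integer=True))

yticks = ax.get_yticks()
```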

Ticks #

Adjust label formats:

(In this example, the x-axis labels get thousands separators and 4 decimal places; the y-axis labels get thousands separators and no decimals.)

import matplotlib as mpl

fig, ax = plt.subplots()

ax.xaxis.set_major_formatter(mpl.ticker.StrMethodFormatter("{x:,.4f}"))
ax.yaxis.set_major_formatter(mpl.ticker.StrMethodFormatter("{x:,.0f}"))
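
To check what a format string produces before attaching it to an axis, the formatter can be called directly with a tick value. A small sketch; the values are made up.

```python
from matplotlib.ticker import StrMethodFormatter

four_decimals = StrMethodFormatter("{x:,.4f}")
no_decimals = StrMethodFormatter("{x:,.0f}")

# A formatter is called with the tick value (and optionally its position)
label_a = four_decimals(1234.5)    # "1,234.5000"
label_b = no_decimals(1234567.0)   # "1,234,567"
```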

Manually specify tick position & corresponding labels:

fig, ax = plt.subplots()

pos = [1000, 2000, 2190, 3000, 4000]
lab = ["1,000", "2,000", r"$X^*$", "3,000", "4,000"]
ax.set_xticks(pos)
ax.set_xticklabels(lab)

Set label that includes LaTeX maths:

fig, ax = plt.subplots()

ax.set_xlabel(r"$x_1$")
# For a line break, put "\n" in a normal string outside the raw string:
ax.set_xlabel("Something\n" + r"$x_1$")

Erase all tick labels of the y-axis:

ax.yaxis.set_ticklabels([])

Adjust position of labels:

# Shift the second x-axis tick label slightly to the left:
ax.xaxis.get_majorticklabels()[1].set_horizontalalignment("right")

Annotation #

Docs:

Usage: (credit)

fig, ax = plt.subplots()
sns.boxplot(ax=ax, data=df, order=["Good", "Ok", "Bad"])

ax.annotate(
    r"$N=" + str(len(df[df["Rate"]=="Good"])) + "$",
    xy=(0,0.75),    # Position in data coordinates (x=0 is the first box)
    ha='center',    # Short for horizontalalignment (one of the text **kwargs)
    va='bottom',    # Short for verticalalignment (one of the text **kwargs)
    fontsize=12,
)

ax.annotate(
    r"$N=" + str(len(df[df["Rate"]=="Ok"])) + "$",
    xy=(1,0.75), ha='center', va='bottom', fontsize=12,
)

ax.annotate(
    r"$N=" + str(len(df[df["Rate"]=="Bad"])) + "$",
    xy=(2,0.75), ha='center', va='bottom', fontsize=12,
)

Plot With Subplots #

The methods above generally work on subplots as well: index into the array of axes first, e.g. ax[0,1].set_xlabel(r"$x_1$").
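
How the axes are indexed depends on the layout passed to plt.subplots. The sketch below (layouts chosen arbitrarily, Agg backend assumed so it runs without a display) shows the common cases.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt

# One row: the axes come back as a 1-D array, indexed with a single number
fig_a, ax_a = plt.subplots(1, 3)
ax_a[1].set_xlabel(r"$x_1$")

# Several rows and columns: a 2-D array, indexed with [row, column]
fig_b, ax_b = plt.subplots(2, 2)
ax_b[0, 1].set_xlabel(r"$x_1$")

# squeeze=False always gives a 2-D array, whatever the layout
fig_c, ax_c = plt.subplots(1, 3, squeeze=False)
ax_c[0, 1].set_xlabel(r"$x_1$")

# .flat visits every subplot regardless of the array's shape
for a in ax_b.flat:
    a.set_ylabel("y")
```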

Plot into subplots with Seaborn #

Doc: Overview of seaborn plotting functions — seaborn 0.12.2 documentation

Also works for single plots:

fig, ax = plt.subplots()
# draws on the given axes, on top of whatever is already there
sns.histplot(ax=ax, ...)

# without ax=, seaborn draws on the current axes (plt.gca()) and returns it
ax = sns.histplot(...)

Ref: ds-micro-tutorials/data-analysis/subplotting.ipynb

import os

fig1, axes1 = plt.subplots(2, 2, figsize=(10, 10))
fig1.suptitle("Big title")

sns.histplot(ax=axes1[0,0], data=df1, x="Rating", binwidth=0.1)
axes1[0,0].set_title("Subtitle 1-1")

sns.histplot(ax=axes1[0,1], data=df2, x="Rating", binwidth=0.1)
axes1[0,1].set_title("Subtitle 1-2")

sns.histplot(ax=axes1[1,0], data=df3, x="Rating", binwidth=0.1)
axes1[1,0].set_title("Subtitle 2-1")

sns.histplot(ax=axes1[1,1], data=df4, x="Rating", binwidth=0.1)
axes1[1,1].set_title("Subtitle 2-2")

fig1.tight_layout()
fig1.savefig(os.path.join("plot", "fig1.pdf"))
fig1.show() # or if in Jupyter: fig1

Title #

Main title: (credit)

fig, ax = plt.subplots(2,2)

fig.suptitle("This is a big title")
fig.subplots_adjust(top=0.9)  # leave room for the title

Sub titles: ax[0, 0].set_title('Axis [0, 0]')

fig, ax = plt.subplots(2,2)

ax[1, 0].set_title(r"$X_2$ vs $Y_1$")

Total size #

If there are three subplots:

fig, ax = plt.subplots(1, 3, figsize=(12, 4))  # figsize is (width, height) in inches

Seaborn #

Use **kwargs #

Doc: Frequently asked questions — seaborn 0.12.2 documentation

For a list of all options see the **kwargs portion of: matplotlib.axes.Axes.plot — Matplotlib 3.7.2 documentation

Plot With jointplot #

Make the figure:

import matplotlib.pyplot as plt
import seaborn as sns

g = sns.jointplot(x=x[1], y=y, kind="hex")

Show the figure (e.g. as the last line of a Jupyter cell):

g.fig  # newer seaborn versions prefer g.figure

Axis #

g.set_axis_labels(r"$x_1$", "y")

Ticks #

Set formatter:

import matplotlib as mpl

g.ax_joint.get_yaxis().set_major_formatter(mpl.ticker.StrMethodFormatter('{x:,.0f}'))
g.ax_joint.get_xaxis().set_major_formatter(mpl.ticker.StrMethodFormatter('{x:,.4f}'))

Order categorical axis #

Visualizing categorical data — seaborn 0.12.2 documentation

Pass the order parameter to the plotting function, e.g. order=["Good", "Ok", "Bad"] (this also works when the column holds plain strings).
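
A minimal sketch of the effect, using a made-up data frame (the column names "Rate" and "Score" are hypothetical) and the Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data; the column names "Rate" and "Score" are hypothetical
df = pd.DataFrame({
    "Rate": ["Bad", "Good", "Ok", "Good", "Bad", "Ok"],
    "Score": [1.0, 4.5, 3.0, 5.0, 2.0, 3.5],
})

fig, ax = plt.subplots()
sns.boxplot(ax=ax, data=df, x="Rate", y="Score", order=["Good", "Ok", "Bad"])

fig.canvas.draw()  # make sure the tick labels are populated
labels = [t.get_text() for t in ax.get_xticklabels()]
```

The categories appear on the x-axis in the order given, not in the order they occur in the data.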

Heatmap #

Ref: Plotting a diagonal correlation matrix — seaborn 0.12.2 documentation

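import numpy as np

corr = df.corr()
# Boolean mask that hides the upper triangle (including the diagonal)
mask = np.triu(np.ones_like(corr, dtype=bool))
# These two are equivalent; keep only one:
cmap = sns.color_palette("light:#4c72b0", as_cmap=True)
cmap = sns.color_palette("blend:#FFF,#4c72b0", as_cmap=True)
# To preview the colormap (e.g. in Jupyter):
cmap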

fig1, ax1 = plt.subplots(figsize=(11, 9))
sns.heatmap(
    corr,
    ax=ax1,
    mask=mask,
    cmap=cmap,
    vmax=1, vmin=0,     # leave unset on the first pass to see the actual range
    square=True,
    linewidths=.5,
    cbar_kws={"shrink": .5},
)

Pair plot #

Docs:

No regression:

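# pairplot is a figure-level function: it creates its own figure and takes no ax argument
g = sns.pairplot(
    data=df,
    # kind="kde",                           # optional: KDE plots off the diagonal
    # diag_kind="kde",                      # optional: KDE plots on the diagonal
    diag_kws=dict(linewidth=0),             # passed to the diagonal plots (histplot by default)
    plot_kws=dict(linewidth=0, alpha=0.3),  # passed to the off-diagonal plots (scatterplot by default)
)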

Use regression: (credit)

# pairplot creates its own figure, so there is no ax argument
g = sns.pairplot(
    data=df,
    kind="reg",
    diag_kws=dict(linewidth=0),
    plot_kws=dict(
        line_kws=dict(linewidth=2, color="black"),  # for the regression line
        scatter_kws=dict(linewidth=0, alpha=0.3),   # for the scatter points
    ),
)

Scatter plot #

Doc: seaborn.scatterplot — seaborn 0.12.2 documentation

fig, ax = plt.subplots()

sns.scatterplot(
    ax=ax,
    data=df,
    x="X",
    y="Y",
    hue="Type",     # For marker colours
    style="Type",   # For marker shapes
    linewidth=0,
    alpha=0.3,
)

Violin plot #

Ref: alpha does not work with violinplot · Issue #622 · mwaskom/seaborn

fig4, ax4 = plt.subplots()

sns.violinplot(
    ax=ax4,
    data=data_df, x="Categories", y="Numbers",
    inner=None,          # Hide miniature boxplots inside
    order=["Good", "Ok", "Bad"],
    linewidth=0,         # Cannot be used with inners other than None
    palette="viridis",
    # color="#069AF3",   # All the same colour
    bw=0.05,             # Reduce smoothing
    # These are options of matplotlib's Axes.violinplot, not of seaborn's violinplot, so they do not work here:
    # showmeans=False,
    # showmedians=True,
)

# violinplot has no alpha parameter, so set the transparency on the drawn violins:
plt.setp(ax4.collections, alpha=.5)

Boxplot; Controlling zorder of plot layers #

Slightly modified from: python - How to display boxplot in front of violinplot in seaborn - seaborn zorder? - Stack Overflow

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 2)).melt(var_name='group')

ax = sns.violinplot(
    data=df, x='group', y='value',
    color="#af52f4",
    inner=None,
    linewidth=0,
)

sns.boxplot(
    ax=ax,
    data=df, x='group', y='value',
    saturation=0.5,                 # Saturation of the fill colour (1 = as specified)
    width=0.4,                      # Width of each box
    palette='rocket',
    # flierprops={"marker": "o"},   # Set outliers' markers
    showfliers = False,             # Hide outliers
    linewidth=2,
    boxprops={"facecolor": "none", "zorder": 2},  # Make the boxes completely transparent
)

plt.show()

Problem with plots #

Data to Viz | A collection of graphic pitfalls

Boxplot #

Problems with boxplots (see: Hidden Data Under Boxplot; The Boxplot and its pitfalls) can be mitigated as follows:

  • Show jitter dots
  • Show number of observations
  • Use Violin plot instead
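
The first two remedies can be combined in one figure. The following is only a sketch: the data are randomly generated, the column names "Rate" and "Score" are hypothetical, and the Agg backend is assumed so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data; the column names "Rate" and "Score" are hypothetical
rng = np.random.default_rng(0)
order = ["Good", "Ok", "Bad"]
df = pd.DataFrame({
    "Rate": np.repeat(order, [30, 12, 5]),
    "Score": rng.normal(size=47),
})

fig, ax = plt.subplots()
sns.boxplot(ax=ax, data=df, x="Rate", y="Score", order=order,
            color="lightgrey", showfliers=False)
# Jittered dots reveal the individual observations behind each box
sns.stripplot(ax=ax, data=df, x="Rate", y="Score", order=order,
              jitter=True, color="black", alpha=0.5)

# Write the number of observations above each box
counts = df["Rate"].value_counts()
for i, name in enumerate(order):
    ax.annotate(f"$N={counts[name]}$", xy=(i, df["Score"].max()),
                ha="center", va="bottom")
```

With very different group sizes, as here, the dots and the counts make it obvious that the rightmost box summarises only a handful of points.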