part of Course 133 Navigating Matplotlib



Make your points expressive

Being able to show individual data points is a powerful way to communicate. Being able to change their appearance can make the story they tell much richer.

A note on terminology: In Matplotlib, plotted points are called "markers". In plotting, "points" already refers to a unit of measure, so calling data points "markers" disambiguates them. Also, as we'll see, markers can be far richer than a dot, which earns them a more expressive name.

We'll get all set up and create a few data points to work with. If any part of this is confusing, take a quick look at why it's here.

import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)

fig = plt.figure()
ax = fig.gca()

Change the size

ax.scatter(x, y, s=80)

Using the s argument, you can set the size of your markers, in points squared. If you want a marker 10 points high, choose s=100.

scatterplot using large circular markers
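To make the points-squared relationship concrete, here is a small sketch. The helper name size_for_diameter is made up for this example and is not part of Matplotlib; it just squares the width you want.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np


def size_for_diameter(diameter_pts):
    """Hypothetical helper: turn a marker width in points into an s value."""
    return diameter_pts ** 2


x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)

fig = plt.figure()
ax = fig.gca()

# s is measured in points squared, so a marker 10 points across needs s=100.
small = ax.scatter(x, y, s=size_for_diameter(5))
large = ax.scatter(x, y + 3, s=size_for_diameter(10))
```

Doubling the width of a marker quadruples its s value, which is worth keeping in mind when you pick sizes by eye.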

Make every marker a different size

The real power of the scatter() function comes out when we want to modify markers individually.

sizes = (np.random.sample(size=x.size) * 10) ** 2
ax.scatter(x, y, s=sizes)

Here we created an array of sizes, one for each marker.

scatterplot using circular markers of differing sizes
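Random sizes make the point, but in practice you will probably want size to carry information. The sketch below ties marker size to a third variable. The weights array and the 2 to 12 point range are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)
# Hypothetical third variable, one value per data point.
weights = np.abs(y)

# Rescale the weights to widths between 2 and 12 points, then square
# them, because s is given in points squared.
span = weights.max() - weights.min()
diameters = 2 + 10 * (weights - weights.min()) / span
sizes = diameters ** 2

fig = plt.figure()
ax = fig.gca()
points = ax.scatter(x, y, s=sizes)
```

Setting a minimum width keeps the smallest values from vanishing off the plot entirely.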

Change the marker style

Sometimes a circle just doesn't set the right tone. Luckily, Matplotlib has you covered. There are dozens of options, plus the ability to create custom shapes of any type.

ax.scatter(x, y, marker="v")

Using the marker argument and the right character code, you can choose whichever style you like. Here are a few of the common ones.

scatterplot using triangular markers
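As a quick way to compare styles, this sketch draws one row of markers for each of a handful of character codes. The selection and the descriptions in the dictionary are my own shorthand, not an official list.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

# A handful of Matplotlib's marker codes and what they draw.
common_markers = {
    "o": "circle",
    "s": "square",
    "^": "triangle pointing up",
    "v": "triangle pointing down",
    "D": "diamond",
    "*": "star",
    "+": "plus",
    "x": "x",
    ".": "point",
}

fig = plt.figure()
ax = fig.gca()
x = np.linspace(-1, 1, num=10)

# Draw one row of markers per style, labeled for the legend.
for row, (code, name) in enumerate(common_markers.items()):
    ax.scatter(x, np.full_like(x, row), marker=code, label=name)
ax.legend()
```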

Make multiple marker types

Having differently shaped markers is a great way to distinguish between different groups of data points. If your control group is all circles and your experimental group is all X's, the difference pops out, even to colorblind viewers.

N = x.size // 3
ax.scatter(x[:N], y[:N], marker="o")
ax.scatter(x[N: 2 * N], y[N: 2 * N], marker="x")
ax.scatter(x[2 * N:], y[2 * N:], marker="s")

There's no way to specify multiple marker styles in a single scatter() call, but we can separate our data out into groups and plot each marker style separately. Here we chopped our data up into three roughly equal groups. With the default 50 points, the first two groups get 16 points each and the last gets 18.

scatterplot using a variety of markers
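Slicing works when the groups are contiguous. When each point instead carries a group label, a boolean mask per group does the same job. In this sketch the groups array, the label names, and the marker_for mapping are all invented for illustration.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)
# Hypothetical group labels, one per data point.
groups = np.random.choice(["control", "low dose", "high dose"], size=x.size)
marker_for = {"control": "o", "low dose": "x", "high dose": "s"}

fig = plt.figure()
ax = fig.gca()
for name, marker in marker_for.items():
    mask = groups == name  # True where a point belongs to this group
    ax.scatter(x[mask], y[mask], marker=marker, label=name)
ax.legend()
```

Passing label to each call lets the legend explain which shape belongs to which group.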

Change the color

Another great way to make your markers express your data story is by changing their color.

ax.scatter(x, y, c="orange")

The c argument, together with any of the color names (as described in the post on lines), lets you change your markers to whatever shade of the rainbow you like.

scatterplot using orange markers

Change the color of each marker

If you want to get extra fancy, you can control the color of each point individually. This is what makes scatter() special.

ax.scatter(x, y, c=x-y)

One way to go about this is to specify a set of numerical values for the color, one for each data point. Matplotlib automatically takes them and translates them to a nice color scale.

scatterplot using colorful markers
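If you'd like to choose the color scale yourself and show readers what the colors mean, the cmap argument and a colorbar are one way to go. This is a sketch; the choice of "viridis" and the label text are just examples.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)
values = x - y  # one number per data point

fig = plt.figure()
ax = fig.gca()
# cmap names the color scale that the numbers get mapped onto.
points = ax.scatter(x, y, c=values, cmap="viridis")
# The colorbar shows which color corresponds to which value.
colorbar = fig.colorbar(points, ax=ax)
colorbar.set_label("x - y")
```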

Make markers transparent

This is particularly useful when you have lots of overlapping markers and you would like to get a sense of their density.

x = np.linspace(-1, 1, num=int(1e5))
y = x + np.random.normal(size=x.size)

To illustrate this, we first create a lot of data points.

ax.scatter(x, y, marker=".", alpha=.05, edgecolors="none")

Then, by setting the alpha argument to something small, each individual point contributes only a small amount of digital ink to the picture. Only in places where lots of points overlap is the result a solid color. alpha=1 represents no transparency and is the default.

The edgecolors="none" is necessary to remove the marker outlines. For some marker types at least, the alpha argument doesn't apply to the outlines, only the solid fill.

scatterplot using transparent markers

and even more...

If you are curious and want to explore all the other crazy things you can do with markers and scatterplots, check out the API.

We've only scratched the surface. Want to see what else you can change in your plot? Come take a look at the full set of tutorials.