part of Course 133 Navigating Matplotlib
Make your points expressive
Being able to show individual data points is a powerful way to communicate. Being able to change their appearance can make the story they tell much richer.
A note on terminology: In Matplotlib, plotted points are called "markers". In plotting, "points" already refers to a unit of measure, so calling data points "markers" disambiguates them. Also, as we'll see, markers can be far richer than a dot, which earns them a more expressive name.
We'll get all set up and create a few data points to work with. If any part of this is confusing, take a quick look at why it's here.
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1, 1)
y = x + np.random.normal(size=x.size)
fig = plt.figure()
ax = fig.gca()
Change the size
ax.scatter(x, y, s=80)
Using the s argument, you can set the size of your markers, in points squared. If you want a marker 10 points high, choose s=100.
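To make the points-squared rule concrete, here is a small sketch that converts a desired marker height into an s value. The helper name size_for_height is made up for this illustration and is not part of Matplotlib.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np


def size_for_height(height_pts):
    # Hypothetical helper: s is given in points squared,
    # so a marker about height_pts points high needs height_pts ** 2.
    return height_pts ** 2


heights = np.array([5, 10, 20])
sizes = size_for_height(heights)

fig = plt.figure()
ax = fig.gca()
# Three markers, each twice as high as the one before it.
ax.scatter([0, 1, 2], [0, 0, 0], s=sizes)
```

Doubling the height quadruples s, which is worth keeping in mind when you scale marker sizes from data.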
Make every marker a different size
The real power of the scatter() function comes out when we want to modify markers individually.
sizes = (np.random.sample(size=x.size) * 10) ** 2
ax.scatter(x, y, s=sizes)
Here we created an array of sizes, one for each marker.
Change the marker style
Sometimes a circle just doesn't set the right tone. Luckily, Matplotlib has you covered. There are dozens of options, plus the ability to create custom shapes of any type.
ax.scatter(x, y, marker="v")
Using the marker argument and the right character code, you can choose whichever style you like. Here are a few of the common ones.
- ".": point
- "o": circle
- "s": square
- "^": triangle
- "v": upside down triangle
- "+": plus
- "x": X
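If you'd like to see these side by side, one possible approach is to loop over the codes and draw one marker for each. This is only a sketch: the variable names are our own, and the text label above each marker is an optional extra.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt

# The character codes from the list above.
marker_codes = [".", "o", "s", "^", "v", "+", "x"]

fig = plt.figure()
ax = fig.gca()
handles = []
for i, code in enumerate(marker_codes):
    # One marker per code, spaced out along the x axis.
    handles.append(ax.scatter(i, 0, marker=code, s=100))
    # Label each marker with its code, 15 points above it.
    ax.annotate(
        repr(code),
        (i, 0),
        xytext=(0, 15),
        textcoords="offset points",
        ha="center",
    )
```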
Make multiple marker types
Having differently shaped markers is a great way to distinguish between different groups of data points. If your control group is all circles and your experimental group is all X's, the difference pops out, even to colorblind viewers.
N = x.size // 3
ax.scatter(x[:N], y[:N], marker="o")
ax.scatter(x[N: 2 * N], y[N: 2 * N], marker="x")
ax.scatter(x[2 * N:], y[2 * N:], marker="s")
There's no way to specify multiple marker styles in a single scatter() call, but we can separate our data out into groups and plot each marker style separately. Here we chopped our data up into three roughly equal groups.
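When the groups come from labels rather than position in the array, a boolean mask per group does the same job. The sketch below assumes a made-up labels array and a marker_for dictionary, both invented for illustration, and adds a legend so each shape is explained.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1)
y = x + rng.normal(size=x.size)

# Hypothetical group labels, one per data point.
labels = rng.choice(["control", "treatment"], size=x.size)
# Hypothetical mapping from group name to marker code.
marker_for = {"control": "o", "treatment": "x"}

fig = plt.figure()
ax = fig.gca()
for name, code in marker_for.items():
    mask = labels == name  # True for the points in this group
    ax.scatter(x[mask], y[mask], marker=code, label=name)
ax.legend()
```

Each group still gets its own scatter() call. The mask just saves you from sorting the data first.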
Change the color
Another great way to make your markers express your data story is by changing their color.
ax.scatter(x, y, c="orange")
The c argument, together with any of the color names (as described in the post on lines) lets you change your markers to whatever shade of the rainbow you like.
Change the color of each marker
If you want to get extra fancy, you can control the color of each point individually. This is what makes scatter() special.
ax.scatter(x, y, c=x-y)
One way to go about this is to specify a set of numerical values for the color, one for each data point. Matplotlib automatically takes them and translates them to a nice color scale.
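If you want to choose the color scale yourself and show readers what the colors mean, one option is to pass a cmap and attach a colorbar. This is a sketch rather than the only way to do it, and "viridis" is just one of Matplotlib's built-in colormaps picked as an example.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1)
y = x + rng.normal(size=x.size)

fig = plt.figure()
ax = fig.gca()
# c supplies one number per marker; cmap picks the color scale.
points = ax.scatter(x, y, c=x - y, cmap="viridis")
# The colorbar shows which values map to which colors.
colorbar = fig.colorbar(points, ax=ax)
colorbar.set_label("x - y")
```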
Make markers transparent
This is particularly useful when you have lots of overlapping markers and you would like to get a sense of their density.
x = np.linspace(-1, 1, num=int(1e5))
y = x + np.random.normal(size=x.size)
To illustrate this, we first create a lot of data points.
ax.scatter(x, y, marker=".", alpha=.05, edgecolors="none")
Then, by setting the alpha argument to something small, each individual point only contributes a small amount of digital ink to the picture. Only in places where lots of points overlap is the result a solid color. alpha=1 represents no transparency and is the default.
The edgecolors="none" is necessary to remove the marker outlines. For some marker types at least, the alpha argument doesn't apply to the outlines, only the solid fill.
And even more...
If you are curious and want to explore all the other crazy things you can do with markers and scatterplots, check out the API.
We've only scratched the surface. Want to see what else you can change in your plot? Come take a look at the full set of tutorials.