import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')

Line Plot

One of the simplest plot types in Matplotlib is the line plot, which draws a line through points defined by a function, such as y = f(x).

Properties of Line plot

  • x,y: Data points, horizontal and vertical. Typically, these parameters are 1D arrays.
  • fmt: A format string, e.g. ‘ro’ for red circles. We will later discuss this parameter in greater depth. fmt = '[marker][line][color]'
  • label: A string, which can be used in the legend.
  • linestyle: The style of the line, which can be '-', '--', '-.', or ':'.
  • linewidth: The width of the line.
  • marker: Marker style
  • color: Line color. Markers use the same color unless it is overridden.

Line Styles

  1. '-': solid line style
  2. '--': dashed line style
  3. '-.': dash-dot line style
  4. ':': dotted line style

Color

  1. ‘b’ : blue
  2. ‘g’ : green
  3. ‘r’ : red
  4. ‘c’ : cyan
  5. ‘m’ : magenta
  6. ‘y’ : yellow
  7. ‘k’ : black
  8. ‘w’ : white

Marker

  1. ‘.’: point marker
  2. ‘o’: circle marker
  3. ‘v’: triangle_down marker
  4. ‘^’: triangle_up marker
  5. ‘*’: star marker
  6. ‘s’: square marker
  7. ‘+’: plus marker
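
The format string described above can be compared with its keyword equivalent. The following is a minimal sketch, not part of the original notebook; the variable names short_form and long_form are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-4, 4, 20)
fig, ax = plt.subplots()

# Format string: marker 'o', line style '--', color 'r'
(short_form,) = ax.plot(x, np.sin(x), "o--r")

# The same styling spelled out with keyword arguments
(long_form,) = ax.plot(x, np.cos(x), marker="o", linestyle="--", color="r")
```

Both calls should produce red dashed lines with circle markers; the keyword form is longer but easier to read when many properties are set.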
x = np.linspace(-4,4,200)
fig,ax = plt.subplots()
ax.plot(x,np.sin(x))
[<matplotlib.lines.Line2D at 0x7f4053096eb8>]
fig,ax = plt.subplots()
ax.plot(x,np.sin(x),x,np.cos(x))
[<matplotlib.lines.Line2D at 0x7f405276cba8>,
 <matplotlib.lines.Line2D at 0x7f4052795160>]
fig, axe = plt.subplots(dpi=100)
axe.plot(x, x + 2, linestyle='-', color='r', marker='x', label="line1")
axe.plot(x, x + 3, linestyle='-', color='c', marker='s', label="line2")
axe.plot(x, x + 4, linestyle='-', color='m', marker='|', label="line3")
axe.plot(x, x + 5, linestyle='--', color='b', label="line4")
axe.plot(x, x + 6, linestyle='-.', color='y', label="line5")
axe.plot(x, x + 7, linestyle=':', color='b', label="line6")
axe.legend()
<matplotlib.legend.Legend at 0x7f4052296390>

Scatter Plot

A scatter plot is used to plot data points on a figure based on the horizontal and vertical axes. Scatter plots can be used to show the relationship between two variables. They can also show how data clusters in a dataset. We can use scatter to show the relationship between variables, or to show the distribution of data.

Properties of Scatter plot

  • x,y: Data points, horizontal and vertical. Commonly, these parameters are 1D arrays.
  • s: A scalar or array, used to set the size of the marker.
  • c: A single color string or a sequence of colors, used to set the color of the marker.
  • marker: Sets the shape of the marker. See the marker list in the Line Plot section above.
  • cmap: The colormap used to map values to colors. It only takes effect when c is an array of floats.
  • alpha: The blending value, which we can set between 0 (transparent) and 1 (opaque).
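
As a small illustration of how c, s, cmap and alpha work together, here is a hedged sketch. The data, the seed and the size range are arbitrary assumptions, and the colorbar is added only to make the color mapping visible.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(32)      # seed chosen arbitrarily
x = rng.standard_normal(50)
y = rng.standard_normal(50)
values = rng.random(50)              # floats in [0, 1), mapped to colors via cmap
sizes = 20 + 300 * rng.random(50)    # marker areas in points**2, always positive

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=values, s=sizes, cmap="viridis", alpha=0.6)
fig.colorbar(points, ax=ax, label="value")  # legend for the color mapping
```

Note that s is an area, so it must not be negative.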
rng = np.random.RandomState(32)
x = rng.randn(50)
y = rng.randn(50)

fig,ax = plt.subplots(dpi=200)

colors = rng.randn(50)
# Marker sizes must be non-negative, so take the absolute value
size = np.abs(rng.randn(50))*500

# ax.grid()
ax.set_xlabel('Your X label')
ax.set_ylabel('Your Y label')
ax.set_title('Title of Graph')

ax.scatter(x=x,y=y,c=colors,s=size,alpha=0.5)
<matplotlib.collections.PathCollection at 0x7f40521b0a20>

Bar Chart

Bar charts are used to display values associated with categorical data.

Properties of Bar chart

  • x : A sequence of scalars giving the x coordinates of the bars.
  • height : A sequence of scalars giving the heights of the bars. In short, this parameter carries the value(s) of our data.
  • width : Sets the width of the bar. The default value is 0.8.
  • bottom : Sets the y coordinate(s) of the bases of the bars. The default value is 0.
  • color : Sets the color of the bar faces.
  • orientation : Determines the orientation of the bars, {‘vertical’, ‘horizontal’}. The default value is ‘vertical’.
fig, ax= plt.subplots(dpi=200)

label = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
values = [100, 200, 300, 150, 440, 700, 350, 505]
ax.bar(label, values)
<BarContainer object of 8 artists>

More than One bar in a chart

Many times, multiple sets of data are bound to the same variable. In these cases, we need to show the data together on the same chart for comparison. On a bar chart, we can do this by drawing several sets of bars side by side, shifting each set along the x axis by the bar width.

label = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
values1 = [100, 200, 300, 150, 440, 700, 350, 505]
values2 = [200, 250, 360, 180, 640, 780, 520, 580]
values3 = [100, 200, 300, 150, 440, 700, 350, 505]

index = np.arange(len(label))

fig,ax = plt.subplots(dpi=200)

ax.bar(index,values1,width=0.3)
ax.bar(index+0.3,values2,width=0.3)
ax.bar(index+0.6,values3,width=0.3)

ax.set_xticks(index+0.3)
ax.set_xticklabels(label)
[Text(0, 0, 'Jan'),
 Text(0, 0, 'Feb'),
 Text(0, 0, 'Mar'),
 Text(0, 0, 'Apr'),
 Text(0, 0, 'May'),
 Text(0, 0, 'Jun'),
 Text(0, 0, 'Jul'),
 Text(0, 0, 'Aug')]
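
The x offsets in the grouped chart above were written by hand. For any number of series they can be computed instead. This is only a sketch: the total group width of 0.9 is an assumption chosen to match the three bars of width 0.3, and the groups are centred on each index rather than starting at it.

```python
import numpy as np

n_series = 3
group_width = 0.9                     # total width of one group (assumed value)
bar_width = group_width / n_series    # 0.3 per bar
index = np.arange(8)                  # one slot per category

# Offset of each series relative to the centre of its group
offsets = (np.arange(n_series) - (n_series - 1) / 2) * bar_width
positions = [index + off for off in offsets]
```

With this centring, each series k would be drawn with ax.bar(positions[k], values_k, width=bar_width), and the ticks can stay at index.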

Stacked Bar Chart

Stacking places multiple bars on top of each other. We do this with the bottom parameter, which sets the height at which each bar starts.

label = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
values1 = np.array([100, 200, 300, 150, 440, 700, 350, 505])
values2 = np.array([200, 250, 360, 180, 640, 780, 520, 580])
values3 = np.array([100, 200, 300, 150, 440, 700, 350, 505])

index = np.arange(len(label))

fig,ax = plt.subplots(dpi=100)

ax.bar(index,values1)
ax.bar(index,values2,bottom=values1)
ax.bar(index,values3,bottom=values1+values2)

ax.set_xticks(index)
ax.set_xticklabels(label)
[Text(0, 0, 'Jan'),
 Text(0, 0, 'Feb'),
 Text(0, 0, 'Mar'),
 Text(0, 0, 'Apr'),
 Text(0, 0, 'May'),
 Text(0, 0, 'Jun'),
 Text(0, 0, 'Jul'),
 Text(0, 0, 'Aug')]
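
In the stacked chart above, each bottom is the sum of all layers below it. With more layers, a running sum saves writing values1+values2+... by hand. A minimal sketch, using a shortened copy of the data:

```python
import numpy as np

series = np.array([
    [100, 200, 300, 150],
    [200, 250, 360, 180],
    [100, 200, 300, 150],
])

# Running total down the rows, shifted by one layer: the first layer starts at 0
bottoms = np.vstack([np.zeros(series.shape[1]), np.cumsum(series, axis=0)[:-1]])

# Each layer would then be drawn with:
#   for values, bottom in zip(series, bottoms): ax.bar(index, values, bottom=bottom)
```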

Horizontal Bar Chart

When you have many categories, it is better to draw the bar chart horizontally. For stacking, barh() uses left where bar() uses bottom.

fig, axe = plt.subplots(dpi=100)

label = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
index = np.arange(len(label))
values1 = [100, 200, 300, 150, 440, 700, 350, 505]
values2 = [200, 250, 360, 180, 640, 780, 520, 580]
axe.barh(index, values1)
axe.barh(index, values2, left=values1)
axe.set_yticks(index)
axe.set_yticklabels(label)
[Text(0, 0, 'Jan'),
 Text(0, 0, 'Feb'),
 Text(0, 0, 'Mar'),
 Text(0, 0, 'Apr'),
 Text(0, 0, 'May'),
 Text(0, 0, 'Jun'),
 Text(0, 0, 'Jul'),
 Text(0, 0, 'Aug')]

Error Bar

Error bars are graphical representations of the variability of data. They are used to indicate the error or uncertainty in a reported measurement. Error bars give a general idea of how precise a measurement is. For example, say we want to give a prediction about car sales for the next 12 months, but we are not 100% sure about our prediction. In order to indicate this uncertainty, we would provide a relative possible error. Error bars often represent one standard deviation of uncertainty, one standard error, or a particular confidence interval (e.g., a 95% confidence interval).

Properties of Error Bar

  • x,y: Defines the data location.
  • xerr,yerr: Sets the corresponding error to x, y.
  • fmt: Sets the format for the data points/data lines.
  • ecolor: Sets the color of the error line.
  • elinewidth: Sets the line width of the error bar.
  • uplims,lolims: Each can be True or False. The default value is False. These arguments indicate that a value gives only an upper/lower limit.
  • capsize: Sets the length of the error bar caps in points.
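
As mentioned above, error bars often represent one standard error. The sketch below shows one way to compute it from raw measurements and pass it as yerr. The samples are randomly generated placeholder data, not real measurements.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical raw data: 30 repeated measurements for each of 4 groups
samples = rng.normal(loc=[10, 8, 12, 7], scale=2.0, size=(30, 4))

means = samples.mean(axis=0)
# Standard error of the mean: sample standard deviation / sqrt(n)
sem = samples.std(axis=0, ddof=1) / np.sqrt(samples.shape[0])

fig, ax = plt.subplots()
ax.errorbar(np.arange(4), means, yerr=sem, fmt="o", capsize=4)
```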
x = np.linspace(1, 10, num=10)
y = 2 * np.sin(x/20 * np.pi)
# Error sizes must be non-negative, so take the absolute value
yerr = np.abs(np.random.normal(0, 0.3, 10))

fig, axe = plt.subplots(dpi=100)

axe.errorbar(x, y, yerr=yerr)
axe.errorbar(x, y+1, yerr=yerr)
axe.errorbar(x, y+2, yerr=yerr, fmt="-o")
<ErrorbarContainer object of 3 artists>
labels = ["dog", "fish", "cat", "bird", "sheep", "horse"]
values = [10, 8, 12, 7, 5, 9]
yerr = [1, 2, 3, 1, 2, 3]

fig, axe = plt.subplots(1,2,dpi=100,figsize=(20,10))
axe[0].bar(np.arange(0, len(labels)), values, label=labels,
        yerr=yerr, alpha=0.7, ecolor='r', capsize=8)
axe[0].set_xticks(np.arange(0, len(labels)))
axe[0].set_xticklabels(labels)

axe[1].errorbar(labels,values,yerr=yerr)
<ErrorbarContainer object of 3 artists>

Histogram

The histogram is an important graph in statistics and data analysis. It can be used to help people quickly understand the distribution of data. In order to draw a histogram, we follow the steps outlined below:

Step 1: Choose the range of your data and the number of bins.
Step 2: Divide the entire range of values into a series of consecutive, non-overlapping bins.
Step 3: Count how many values fall into each bin.
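
The three steps can be reproduced without plotting by using np.histogram. This is a small sketch on made-up data, with four equal-width bins chosen for illustration.

```python
import numpy as np

data = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0])

# Steps 1 and 2: split [data.min(), data.max()] into 4 equal-width bins
edges = np.linspace(data.min(), data.max(), 5)   # 5 edges -> 4 bins

# Step 3: count the values in each bin
counts, _ = np.histogram(data, bins=edges)
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.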



Properties of Histogram

  • x : Our input values, either a single list/array or multiple sequences of arrays.
  • bins : If bins is set with an integer, it will define the number of equal-width bins within a range. If bins is set with a sequence, it will define the bin edges, including the left edge of the first bin and the right edge of the last bin.
  • histtype : Sets the style of the histogram. The default value is bar. step generates a line plot that is unfilled by default. stepfilled generates a line plot that is filled by default.
  • density : Sets True or False. The default is set to False. If True, the histogram will be normalized to form a probability density.
  • cumulative : Set to True or -1. If True, a histogram is computed where each bin gives the count in that bin plus all bins for smaller values. If -1, the direction of accumulation is reversed.
data = np.random.randn(2000)

fig,ax = plt.subplots(dpi=100)
ax.hist(data,bins=10)
(array([  9.,  80., 240., 487., 533., 416., 180.,  47.,   7.,   1.]),
 array([-3.1873996 , -2.46062252, -1.73384544, -1.00706837, -0.28029129,
         0.44648578,  1.17326286,  1.90003993,  2.62681701,  3.35359409,
         4.08037116]),
 <a list of 10 Patch objects>)
fig, axe = plt.subplots(nrows=2, ncols=2, dpi=200)
plt.tight_layout()
axe[0][0].hist(data, bins=30)
axe[0][0].set_title("set bins=30")
axe[0][1].hist(data, density=True)
axe[0][1].set_title("normalized")
axe[1][0].hist(data, color="r")
axe[1][0].set_title("set color as red")
axe[1][1].hist(data, histtype='step')
axe[1][1].set_title("step")
Text(0.5, 1.0, 'step')
data1 = np.random.normal(0, 1, 3000)
data2 = np.random.normal(-2.6, 1.8, 3000)
data3 = np.random.normal(2.4, 1.5, 3000)

fig, axe = plt.subplots(dpi=800)
axe.hist(data1, bins=40, density=True, histtype='stepfilled', alpha=0.3, label="mu=0,std=1")
axe.hist(data2, bins=40, density=True, histtype='stepfilled', alpha=0.3, label="mu=-2.6,std=1.8")
axe.hist(data3, bins=40, density=True, histtype='stepfilled', alpha=0.3, label="mu=2.4,std=1.5")
axe.legend()
<matplotlib.legend.Legend at 0x7f404dad5588>

Box Plot

A boxplot is a standardized way of displaying a dataset based on a five-number summary: the minimum, the maximum, the sample median, and the first and third quartiles. In statistics, the boxplot is a method for graphically depicting groups of numerical data through their quartiles.
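
The five-number summary can be computed directly with NumPy. A minimal sketch on a tiny made-up data set:

```python
import numpy as np

data = np.arange(1, 10)   # 1, 2, ..., 9
q1, median, q3 = np.percentile(data, [25, 50, 75])
summary = (data.min(), q1, median, q3, data.max())
```

Keep in mind that, by default, Matplotlib's whiskers extend only to the furthest data point within 1.5 times the interquartile range of the box, and points beyond that are drawn individually. The whisker ends therefore match the minimum and maximum only when there are no such outliers.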

Properties of Box plot

  • x : Array or a sequence of vectors. The input data.
  • vert : Set as True or False. The default value is True, which displays the boxes vertically.
  • labels : Sets the labels for each dataset.
  • notch : Set as True or False. The default value is False. If True, the parameter will produce a notched box plot.
  • widths : Sets the width of the box.
  • patch_artist : Set as True or False. The default value is False. If False, the parameter will produce boxes with the Line2D artist. Otherwise, the boxes will be drawn with Patch artists.
labels = ["Sun", "Moon", "Jupiter", "Venus"]
values = []
values.append(np.random.normal(100, 10, 200))
values.append(np.random.normal(90, 20, 200))
values.append(np.random.normal(120, 25, 200))
values.append(np.random.normal(130, 30, 200))

fig, axe = plt.subplots(dpi=100)
axe.boxplot(values, labels=labels)
plt.show()

Customizing Box plot

  • boxprops: We can specify the style of the box.
  • whiskerprops: We can specify the style of the whisker, which is the line that connects the quartiles to the minimum/maximum.
  • medianprops: We can specify the style of the median (the line that indicates the median).
labels = ["Sun", "Moon", "Jupiter", "Venus"]
values = []
values.append(np.random.normal(100, 10, 200))
values.append(np.random.normal(90, 20, 200))
values.append(np.random.normal(120, 25, 200))
values.append(np.random.normal(130, 30, 200))

fig, axe = plt.subplots(dpi=100)
axe.boxplot(values, labels=labels,patch_artist=True,
            boxprops=dict(facecolor='teal', color='r'))
plt.show()
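
The example above only uses boxprops. The sketch below also sets whiskerprops and medianprops. The colors, the random data and the group names "A" and "B" are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = [rng.normal(100, 10, 200), rng.normal(90, 20, 200)]

fig, ax = plt.subplots()
parts = ax.boxplot(
    data,
    patch_artist=True,  # needed so that facecolor applies to the boxes
    boxprops=dict(facecolor="lightblue", edgecolor="navy"),
    whiskerprops=dict(color="navy", linestyle="--"),
    medianprops=dict(color="red", linewidth=2),
)
ax.set_xticklabels(["A", "B"])  # hypothetical group names
```

boxplot() returns a dictionary of the artists it created, which is handy for further styling after the call.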

Heat Maps

A heatmap is a useful chart for showing the relationship between two variables. It gives a general view of numerical data rather than highlighting specific data points: the individual values contained in a matrix are represented as colors. Heatmaps can also be used to visualize missing values in data.

In order to create a heatmap, we can pass a 2-D array to imshow(). As we can see in the code below, passing the values to imshow is the core operation of the plot.

xlabels = ["dog", "cat", "bird", "fish", "horse"]
ylabels = ["red", "blue", "yellow", "pink", "green"]

values = np.array([[0.8, 1.2, 0.3, 0.9, 2.2],
                   [2.5, 0.1, 0.6, 1.6, 0.7],
                   [1.1, 1.3, 2.8, 0.5, 1.7],
                   [0.2, 1.2, 1.7, 2.2, 0.5],
                   [1.4, 0.7, 0.3, 1.8, 1.0]])

fig, axe = plt.subplots(dpi=100)
axe.set_xticks(np.arange(len(xlabels)))
axe.set_yticks(np.arange(len(ylabels)))
axe.set_xticklabels(xlabels)
axe.set_yticklabels(ylabels)
im = axe.imshow(values)
plt.show()
xlabels = ["dog", "cat", "bird", "fish", "horse"]
ylabels = ["red", "blue", "yellow", "pink", "green"]

values = np.array([[0.8, 1.2, 0.3, 0.9, 2.2],
                   [2.5, 0.1, 0.6, 1.6, 0.7],
                   [1.1, 1.3, 2.8, 0.5, 1.7],
                   [0.2, 1.2, 1.7, 2.2, 0.5],
                   [1.4, 0.7, 0.3, 1.8, 1.0]])

fig, axe = plt.subplots(dpi=300)
axe.set_xticks(np.arange(len(xlabels)))
axe.set_yticks(np.arange(len(ylabels)))
axe.set_xticklabels(xlabels)
axe.set_yticklabels(ylabels)
im = axe.imshow(values)

# imshow draws values[row, col] at x=col, y=row
for row in range(len(ylabels)):
    for col in range(len(xlabels)):
        text = axe.text(col, row, values[row, col],
                       horizontalalignment="center", verticalalignment="center", color="w")

Add a color bar to the heatmap

Although we have added some text to the chart, we still want to include a special legend to show the relationship between a color and its value. Colorbar is what we need. As we can see in the example code below, all we have to do is call axes.figure.colorbar()

xlabels = ["dog", "cat", "bird", "fish", "horse"]
ylabels = ["red", "blue", "yellow", "pink", "green"]

values = np.array([[0.8, 1.2, 0.3, 0.9, 2.2],
                   [2.5, 0.1, 0.6, 1.6, 0.7],
                   [1.1, 1.3, 2.8, 0.5, 1.7],
                   [0.2, 1.2, 1.7, 2.2, 0.5],
                   [1.4, 0.7, 0.3, 1.8, 1.0]])

fig, axe = plt.subplots(dpi=100)
axe.set_xticks(np.arange(len(xlabels)))
axe.set_yticks(np.arange(len(ylabels)))
axe.set_xticklabels(xlabels)
axe.set_yticklabels(ylabels)
im = axe.imshow(values)

# imshow draws values[row, col] at x=col, y=row
for row in range(len(ylabels)):
    for col in range(len(xlabels)):
        text = axe.text(col, row, values[row, col],
                       horizontalalignment="center", verticalalignment="center", color="w")
axe.figure.colorbar(im, ax=axe)
<matplotlib.colorbar.Colorbar at 0x7f404d802320>

Drawing Spider chart

The spider chart (also called a radar chart) is useful in showing relative values for a single data point, or for comparing two or more items as they relate to various categories. Unlike many of the other plot types we’ve learned about, Matplotlib doesn’t provide a radar function. In order to draw a radar chart, we need to write the code ourselves.

Steps to Radar Chart

  1. Calculate the angle for each category. We assign each category an angle; the angles are spaced evenly around the circle, in the order in which the categories occur.
  2. Append the first item to the end, so that the outline closes.
  3. Set the coordinate system of the figure to polar.
  4. Use the plot() function to plot, in the same way that we used it in the Line Plot section.
  5. Fill the area. This step is optional.
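
The first two steps can also be written with NumPy instead of a list comprehension. This is an alternative sketch, not the code used in the cells of this section; the names closed_angles and closed_values are made up.

```python
import numpy as np

values = [10, 8, 4, 5, 2, 7]
n = len(values)

# Step 1: n equally spaced angles in [0, 2*pi)
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)

# Step 2: repeat the first point at the end so the outline closes
closed_angles = np.append(angles, angles[0])
closed_values = np.append(values, values[0])
```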
import math
labels = ["Sun", "Moon", "Jupiter", "Venus", "Mars", "Mercury"]
values=[10,8,4,5,2,7]
angles = [n/float(len(labels)) * 2 *math.pi for n in range(len(labels))]

# Append first value to last to create circular connection
values.append(values[0])
angles.append(angles[0])

fig,ax = plt.subplots(subplot_kw=dict(polar=True),dpi=200)

ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)

ax.plot(angles,values)
ax.fill(angles,values,'skyblue',alpha=0.4)



plt.show()
labels = ["Sun", "Moon", "Jupiter", "Venus", "Mars", "Mercury"]
values = [10, 8, 4, 5, 2, 7]
values2 = [3, 7, 2, 8, 5, 9]
values += values[:1]
values2 += values2[:1]
angles = [n / float(len(labels)) * 2 * math.pi for n in range(len(labels))]
angles += angles[:1]

fig, axe = plt.subplots(subplot_kw=dict(polar=True), dpi=200)
axe.set_xticks(angles[:-1])
axe.set_xticklabels(labels, color='r')
axe.plot(angles, values)
axe.fill(angles, values, 'skyblue', alpha=0.4)

axe.plot(angles, values2)
axe.fill(angles, values2, 'teal', alpha=0.4)
plt.show()

Drawing color bars

The simplest way to draw a color bar is by calling colorbar().

Notice: In the other sections, most of the functions are called from the axes object. Here, colorbar() is called from the figure object. Let’s see some of the parameters that colorbar takes.

`mappable`: The matplotlib.cm.ScalarMappable described by the colorbar.

`cax`: The axes object onto which the color bar will be drawn.

What is matplotlib.cm.ScalarMappable?

In short, ScalarMappable is a class to map scalar data to RGBA. This class requires two parameters:

`norm`: A normalize class, which typically maps a range number to the interval [0, 1].

`cmap`: The colormap is used to map normalized data values to RGBA colors. Matplotlib has already defined many colormaps, which we can use by calling `plt.get_cmap()` with a name as the parameter.
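
To make the two stages concrete, the sketch below maps a single number to a color, first by hand (norm, then cmap) and then through a ScalarMappable. The range 0 to 10 and the value 5.0 are arbitrary.

```python
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

norm = Normalize(vmin=0, vmax=10)
cmap = plt.get_cmap("viridis")

fraction = norm(5.0)            # 5.0 lies halfway between vmin and vmax
rgba_direct = cmap(fraction)    # (r, g, b, a) tuple

mapper = ScalarMappable(norm=norm, cmap=cmap)
rgba = mapper.to_rgba(5.0)      # both stages in one call
```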
import matplotlib.pyplot as plt
import matplotlib.colors

fig, axe = plt.subplots(dpi=800, figsize=(12, 2))
fig.subplots_adjust(bottom=0.5)
cmap = plt.get_cmap("viridis", 5)
norm = matplotlib.colors.Normalize(vmin=0, vmax=1)
cmapper = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
cmapper.set_array([])
fig.colorbar(cmapper,
             cax=axe,
             orientation='horizontal',
             label="viridis colormap")
<matplotlib.colorbar.Colorbar at 0x7f404d6ff898>
import matplotlib.pyplot as plt
import matplotlib.colors

fig, axe = plt.subplots(dpi=800, figsize=(10, 2))
fig.subplots_adjust(bottom=0.5)
cmap = plt.get_cmap("viridis")
norm = matplotlib.colors.Normalize(vmin=0, vmax=1)
cmapper = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
cmapper.set_array([])
fig.colorbar(cmapper,
             cax=axe, orientation='horizontal', label="viridis colormap")
<matplotlib.colorbar.Colorbar at 0x7f4050872e80>

3D Plots

Matplotlib provides many functions for drawing 3D plots. However, we will only focus on one of them in this course: the surface plot. A three-dimensional graph is essentially a plot of points in three dimensions, with data points that are triples (x, y, z). A surface plot is like a wireframe plot, but each face of the wireframe is a filled polygon. We can also add a colormap to the surface plot, which can aid in the perception of the topology of the surface.

The basic function to plot a surface is plot_surface(). Below are some of its important parameters:

  • X, Y, Z: 2D arrays, the data values.
  • rcount, ccount: The maximum number of samples used in each direction. If the input data is large, it will downsample the data.
  • cmap: Sets the colormap of the surface patches.
  • color: Sets the color of the surface patches.
  • norm: Sets the normalization for the colormap.

Notice:

  • In order to draw a 3D plot, we import Axes3D from mpl_toolkits.mplot3d, as on the first line of the example code below. This import registers the 3D projection, but is otherwise unused.
  • We need to set projection='3d' for our axes when we create them. Otherwise, plot_surface() is not available and the code will fail.
  • meshgrid generates all (x, y) pairs from our x and y ranges. In short, the function creates the Cartesian grid on which we evaluate cos(sqrt(x^2 + y^2)).
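
What meshgrid returns is easiest to see on a very small grid. A minimal sketch with three x values and two y values:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([0, 1])

X, Y = np.meshgrid(x, y)           # both have shape (len(y), len(x)) = (2, 3)
Z = np.cos(np.sqrt(X**2 + Y**2))   # evaluated elementwise at every (x, y) pair
```

Here X repeats x along the rows and Y repeats y along the columns, so (X[i, j], Y[i, j]) runs over every combination.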
from mpl_toolkits.mplot3d import Axes3D  # required
X = np.linspace(-5,5,200)
Y = np.linspace(-5,5,200)

X,Y = np.meshgrid(X,Y)
Z = np.cos(np.sqrt(X**2+Y**2))

# Create the axes with the 3D projection directly
fig,ax = plt.subplots(dpi=200, subplot_kw=dict(projection='3d'))
ax.plot_surface(X,Y,Z)
plt.show()

Adding color bar to surface

from mpl_toolkits.mplot3d import Axes3D  # required
X = np.linspace(-5,5,200)
Y = np.linspace(-5,5,200)

X,Y = np.meshgrid(X,Y)
Z = np.cos(np.sqrt(X**2+Y**2))

# Create the axes with the 3D projection directly
fig,ax = plt.subplots(dpi=200, subplot_kw=dict(projection='3d'))
surface = ax.plot_surface(X,Y,Z,cmap=plt.get_cmap('plasma'))
fig.colorbar(surface, ax=ax)
plt.show()

Fill between curves

The basic function for filling the area between two curves is fill_between(). It is also the function we use later to draw a confidence region. Below are some of the important parameters required by the function:

  • x : The x coordinates of the nodes defining the curves.
  • y1 : The y coordinates of the nodes defining the first curve.
  • y2 : The y coordinates of the nodes defining the second curve. The default value is 0.
  • where : Array of bool. Restricts the fill to some horizontal regions: the area between x[i] and x[i+1] is filled only if both where[i] and where[i+1] are True.
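
When y2 is given, the fill lies between the two curves rather than between one curve and zero. The sketch below is an illustration with sin and cos; the interpolate=True argument is an optional extra that extends the fill up to the exact crossing points.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y1 = np.sin(x)
y2 = np.cos(x)

fig, ax = plt.subplots()
ax.plot(x, y1, label="sin")
ax.plot(x, y2, label="cos")
# Green where sin is on top, red where cos is on top
above = ax.fill_between(x, y1, y2, where=(y1 >= y2),
                        color="green", alpha=0.3, interpolate=True)
below = ax.fill_between(x, y1, y2, where=(y1 < y2),
                        color="red", alpha=0.3, interpolate=True)
ax.legend()
```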
x = np.linspace(-5,5,200)
y1 = np.sin(x)

fig,ax = plt.subplots(dpi=100)
ax.plot(x,y1)
ax.fill_between(x,y1,color='green',alpha=0.4)
<matplotlib.collections.PolyCollection at 0x7f404d20cef0>
x = np.linspace(-5,5,200)
y1 = np.sin(x)

fig,ax = plt.subplots(dpi=100)
ax.plot(x,y1,color='black')
ax.fill_between(x,y1,where=(y1>0),color='green',alpha=0.4)
ax.fill_between(x,y1,where=(y1<0),color='red',alpha=0.4)
<matplotlib.collections.PolyCollection at 0x7f404d3bb1d0>

How to draw a confidence band

We will be able to draw a confidence band if we follow the steps below:

Step 1: Prepare the data that we need. In the example code below, our data is x and y, defined on the first two lines.

Step 2: Fit our data. NumPy provides the useful function polyfit. On our linear data set, we can fit a slope and an intercept, which we’ve done with a and b.

Notice: polyfit is a least squares polynomial fit. It fits a polynomial $p(x) = p[0] x^{n} + p[1] x^{n-1} + \dots + p[n]$ of degree $n$ to the points (x, y) and returns the vector of coefficients p that minimizes the squared error, highest power first. More details can be found in the official NumPy documentation.

Step 3: Get an estimation curve based on a and b, which we’ve done with y_est.

Step 4: Get the error, which we’ve done with y_err.

Step 5: Draw the plot by using fill_between(). The band lies between $y\_est - y\_err$ and $y\_est + y\_err$.
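
The order of the coefficients returned by polyfit is easy to get wrong, so here is a quick check on data that lies exactly on a known line. This is a sketch with made-up data; np.polyval is used only to show an equivalent way of evaluating the fit.

```python
import numpy as np

x = np.arange(5.0)
y = 2.0 * x + 1.0               # exact line, so the fit should recover it

a, b = np.polyfit(x, y, 1)      # highest power first: a is the slope, b the intercept
y_est = np.polyval([a, b], x)   # same as a * x + b
```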

x = np.linspace(0, 10, 11)
y = [3.9, 4.4, 10.8, 10.3, 11.2, 13.1, 14.1,  9.9, 13.9, 15.1, 12.5]

# step 2
a,b = np.polyfit(x,y,1)

# step 3
y_est = a * x + b

# step 4
y_err = x.std() * np.sqrt((x - x.mean())**2 / np.sum((x - x.mean())**2) + (1/len(x)))

# step 5
fig,ax = plt.subplots(dpi=100)
ax.plot(x,y_est,'-')
ax.fill_between(x, y_est - y_err, y_est + y_err, alpha=0.2)
ax.scatter(x,y)
plt.show()

Stacked Plot

The idea of stack plots is to show “parts to a whole” over time; basically, it's like a pie-chart, only over time. Stack plots are mainly used to see various trends in variables over a specific period of time.

Properties

stackplot() is the basic function that Matplotlib provides to create a stack plot. Below are some of the important parameters required by the function:

x: 1d array of dimension $N$.

y: 2d array of dimension $(M \times N)$, or a sequence of $M$ arrays, each of length $N$.

stackplot(x, y)
stackplot(x, y1, y2, y3)

baseline: The method used to calculate the baseline.

  • 'zero': a constant zero baseline, which is the default setting.
  • 'sym': sets the plot symmetrically around zero.
  • 'wiggle': minimizes the sum of the squared slopes.

colors: Provides a list of colors for each data set.
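
With the default zero baseline, the upper edge of each layer is simply the running sum of the series up to and including that layer. The sketch below computes those edges with NumPy on made-up data; it illustrates the idea and does not reproduce Matplotlib's internal code.

```python
import numpy as np

y1 = np.array([1, 2, 4, 8, 16])
y2 = np.array([2, 3, 5, 9, 17])
y3 = np.array([3, 1, 2, 2, 4])

layers = np.vstack([y1, y2, y3])   # shape (M, N) = (3, 5)
tops = np.cumsum(layers, axis=0)   # upper edge of each layer
```

The last row of tops is the total at each x, which is why a stack plot shows both the parts and the whole.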

x = [1, 2, 3, 4, 5]
y = [1, 2, 4, 8, 16]
y1 = y+np.random.randint(1,5,5)
y2 = y+ np.random.randint(1,5,5)
y3 = y+np.random.randint(1,5,5)
y4 = y+np.random.randint(1,5,5)
y5 = y+np.random.randint(1,5,5)
y6 = y+np.random.randint(1,5,5)

labels = ["Jan", "Feb", "Mar", "Apr", "May"]

fig,ax = plt.subplots(dpi=200)
ax.stackplot(x,y,y1,y2,y3,y4,y5,y6,labels=["A", "B", "C", "D", "E", "F", "G"])
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend(loc='upper left')
<matplotlib.legend.Legend at 0x7f404d0e19e8>

Different baseline

In the example above, the baseline we used is zero, which is the default option. In the example below, we will learn about two other options, sym and wiggle. In order to change the position of the baseline, we pass the corresponding string to the parameter baseline.

np.random.seed(42)
x = [1, 2, 3, 4, 5]
y = [1, 2, 4, 8, 16]
y1 = y+np.random.randint(1,5,5)
y2 = y+ np.random.randint(1,5,5)
y3 = y+np.random.randint(1,5,5)
y4 = y+np.random.randint(1,25,5)
y5 = y+np.random.randint(1,15,5)
y6 = y+np.random.randint(1,20,5)

labels = ["Jan", "Feb", "Mar", "Apr", "May"]

fig, axe = plt.subplots(nrows=2, dpi=200)
plt.tight_layout()

axe[0].stackplot(x, y, y1, y2, y3, y4, y5, y6,
                baseline="sym")
axe[0].set_xticks(x)
axe[0].set_xticklabels(labels)
axe[0].set_title("symmetric")

axe[1].stackplot(x, y, y1, y2, y3, y4, y5, y6,
                baseline="wiggle")
axe[1].set_xticks(x)
axe[1].set_xticklabels(labels)
axe[1].set_title("wiggle")
Text(0.5, 1.0, 'wiggle')