Knowledge Check 1¶
Gestalt Principles of Perception¶
a set of psychological rules that explain how the human brain organizes sensory information, particularly visual information, into meaningful patterns and wholes¶
proximity - objects close together are perceived as a group
similarity - objects that share similar attributes are perceived as a group
enclosure - objects enclosed by a shared boundary are perceived as a group
closure - the mind fills in gaps, so an incomplete shape is perceived as a whole
continuity - elements arranged along a line or curve are perceived as following one continuous path
connection - if objects are connected we perceive them as unified
Cairo's Principles¶
- truthful - objective
- functional - purpose
- beautiful - aesthetic
- insightful - meaning, insights
- enlightening - story
- ethically responsible - no manipulation
Scale Lesson¶
Categorical¶
- nominal - name categories, no order
- ordinal - ordered
Quantitative¶
- Interval - equidistant, no "true" zero
- Ratio - equidistant, true zero
Interval vs Ratio¶
The presence of a "true" or "absolute" zero is the difference between ratio scales (true zero) and interval scales (no true zero)
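To make the interval/ratio distinction concrete, here is a minimal sketch using temperature. The values are made up for illustration; Celsius stands in for an interval scale and Kelvin for a ratio scale.

```python
# Temperatures for the same two days, expressed on two scales (illustrative values).
celsius = [10.0, 20.0]                     # interval scale: 0 °C is an arbitrary reference point
kelvin = [c + 273.15 for c in celsius]     # ratio scale: 0 K is a true zero

# Differences are meaningful on both scales and agree with each other.
diff_celsius = celsius[1] - celsius[0]
diff_kelvin = kelvin[1] - kelvin[0]

# Ratios are only meaningful on the ratio scale.
ratio_celsius = celsius[1] / celsius[0]    # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_kelvin = kelvin[1] / kelvin[0]       # about 1.035, the physically meaningful ratio

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

The two differences match, while the two ratios do not, which is why "twice as much" statements only make sense for ratio data.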
- categorical
- ordinal
- quantitative
- sequential
- diverging
- cycle
Formats¶
- tables
- networks
- fields
- text
- multidimensional
- spatial
- trees
Line chart¶
- y - vertical
- x - horizontal
- line - connects visual elements
- used to show the relationship between 2 numerical variables
Bar¶
- y - data value
- x - category
- bars - separate graphical elements
Map¶
- geographical patterns
- dot - higher density, higher frequency
- choropleth - geographical boundaries filled with color
- can be combined with bar/line chart
pip install nbconvert
Knowledge Check 4¶
Matplotlib¶
Versatile
Customizable
High Quality
Extensible
Cross-Platform
Interactive
A figure is the top-level container that holds all the elements of a plot, representing the entire window or page where the plot is drawn
Disadvantages¶
- Verbose syntax
- Default aesthetics
- Limited interactivity and 3D plotting
- Performance issues with large data
- Dependency on external libraries
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
%reload_ext autoreload
%autoreload 2
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]
plt.plot(x, y)
plt.show()
import matplotlib.pyplot as plt
x = [0, 2, 4, 6, 8]
y = [0, 4, 16, 36, 64]
fig, ax = plt.subplots()
ax.plot(x, y, marker='o', label="Data Points")
ax.set_title("Basic Components of Matplotlib Figure")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
plt.show()
Different Types of Plots in Matplotlib¶
Matplotlib offers a wide range of plot types to suit various data visualization needs. Here are some of the most commonly used types of plots in Matplotlib:
- Line Graph
- Bar Chart
- Histogram
- Scatter Plot
- Pie Chart
- 3D Plot
The parts of a Matplotlib figure include (as shown in the figure above):¶
- Figure: The overarching container that holds all plot elements, acting as the canvas for visualizations.
- Axes: The areas within the figure where data is plotted; each figure can contain multiple axes.
- Axis: Represents the x-axis and y-axis, defining limits, tick locations, and labels for data interpretation.
- Lines and Markers: Lines connect data points to show trends, while markers denote individual data points in plots like scatter plots.
- Title and Labels: The title provides context for the plot, while axis labels describe what data is being represented on each axis.
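The parts listed above can also be inspected as Python objects. The following is a small sketch with made-up data; the variable names are only for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()                  # Figure (canvas) plus one Axes (plot area)
(line,) = ax.plot([0, 1, 2], [0, 1, 4], marker="o", label="Data Points")
ax.set_title("Anatomy of a figure")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")

print(type(fig).__name__)         # the Figure object
print(len(fig.axes))              # number of Axes held by the figure
print(type(ax.xaxis).__name__)    # the x Axis object, which manages ticks and limits
print(ax.get_xlim())              # current x-axis limits
print(line.get_marker())          # marker style of the plotted line
print(ax.get_title(), "|", ax.get_xlabel(), "|", ax.get_ylabel())
```

In a notebook with `%matplotlib inline` the figure renders when the cell finishes; in a plain script, a final `plt.show()` would be needed.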
Key Features of Matplotlib¶
- Versatile Plotting: Create a wide variety of visualizations, including line plots, scatter plots, bar charts, and histograms.
- Extensive Customization: Control every aspect of your plots, from colors and markers to labels and annotations.
- Seamless Integration with NumPy: Effortlessly plot data arrays directly, enhancing data manipulation capabilities.
- High-Quality Graphics: Generate publication-ready plots with precise control over aesthetics.
- Cross-Platform Compatibility: Use Matplotlib on Windows, macOS, and Linux without issues.
- Interactive Visualizations: Engage with your data dynamically through interactive plotting features.
What is Matplotlib Used For?¶
Matplotlib is a Python library for data visualization, primarily used to create static, animated, and interactive plots. It provides a wide range of plotting functions to visualize data effectively.
Key Uses of Matplotlib:¶
- Basic Plots: Line plots, bar charts, histograms, scatter plots, etc.
- Statistical Visualization: Box plots, error bars, and density plots.
- Customization: Control over colors, labels, gridlines, and styles.
- Subplots & Layouts: Create multiple plots in a single figure.
- 3D Plotting: Surface plots and 3D scatter plots using mpl_toolkits.mplot3d.
- Animations & Interactive Plots: Dynamic visualizations with FuncAnimation.
- Integration: Works well with Pandas, NumPy and Jupyter Notebooks.
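As a rough sketch of the statistical visualization use case, the cell below draws error bars and a box plot side by side. The three sample groups are randomly generated and purely hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)     # fixed seed so the sketch is repeatable
# Three made-up samples with different means.
groups = [rng.normal(loc=m, scale=1.0, size=100) for m in (0, 1, 2)]

fig, (ax_err, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# Error bars: group means with one standard deviation as the error.
means = [g.mean() for g in groups]
stds = [g.std() for g in groups]
ax_err.errorbar([1, 2, 3], means, yerr=stds, fmt="o", capsize=4)
ax_err.set_title("Mean with error bars")

# Box plot: median, quartiles, and outliers of each group.
boxes = ax_box.boxplot(groups)
ax_box.set_title("Box plot")

fig.tight_layout()
```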
Creating a Simple Plot¶
x_1 = np.linspace(0,5)
y_1 = x_1**2
plt.plot(x_1, y_1)
plt.title('Days squared chart')
plt.xlabel('Days')
plt.ylabel('Days Squared')
plt.show()
Printing Multiple Plots¶
plt.subplot(1,2,1)
plt.plot(x_1, y_1, 'r')
plt.subplot(1,2,2)
plt.plot(x_1, y_1, 'b')
plt.show()
Using Figure Object¶
fig_1 = plt.figure(figsize=(5,4))
axes_1 = fig_1.add_axes([0.1,0.1,0.9,0.9])
axes_1.set_xlabel('Days')
axes_1.set_ylabel('Days Squared')
axes_1.set_title('Days Squared Chart')
axes_1.plot(x_1, y_1, label='x/x2')
axes_1.plot(y_1, x_1, label = 'x2/x')
axes_1.legend(loc=0)
plt.show()
Creating Bar Chart¶
# Data
categories = ['A', 'B', 'C', 'D']
values = [3,7,2,5]
plt.bar(categories, values)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Simple Bar Chart')
plt.show()
plt.bar(x, height = 2, width=0.8, bottom=None, align = 'center')
plt.show()
- x - positions on the x-axis where the bars are placed
- height - height of the bars
- width - width of the bars
- bottom - the y-coordinate of the bottom of the bars
- align - alignment of the bars relative to x ('center' or 'edge')
categories = ['A', 'B', 'C', 'D']
values = [3,7,2,5]
plt.bar(categories, values, width=0.8, bottom=None, align = 'center')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Simple Bar Chart')
plt.show()
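One common use of the `bottom` parameter is stacking one series on top of another. The sketch below assumes a second series, `extra`, which is made up for illustration.

```python
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]
values = [3, 7, 2, 5]
extra = [1, 2, 4, 1]      # hypothetical second series, made up for illustration

fig, ax = plt.subplots()
lower = ax.bar(categories, values, width=0.6, label="values")
# bottom=values starts each upper bar where the lower bar ends.
upper = ax.bar(categories, extra, width=0.6, bottom=values, label="extra")
ax.set_xlabel("Categories")
ax.set_ylabel("Values")
ax.set_title("Stacked Bar Chart")
ax.legend()
```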
Setting linewidths, colors, linetypes¶
- default is 'b-', which is a solid blue line
- 'ro' plots red circles instead of a blue line
- 'r--', 'bs', and 'g^' plot red dashes, blue squares, and green triangles
plt.plot([1,2,3,4], [1,4,9,16], 'ro')
plt.show()
# evenly sampled time at 200ms intervals
t = np.arange(0., 5., 0.2)
# red dashes, blue squares, and green triangles
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()
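Format strings such as 'r--' are shorthand. The sketch below spells out the same styles with keyword arguments, which is also where the line width can be set; the specific width and marker size are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0.0, 5.0, 0.2)

fig, ax = plt.subplots()
# Equivalent to the 'r--' format string, plus an explicit line width.
(dashed,) = ax.plot(t, t, color="red", linestyle="--", linewidth=2.5)
# Equivalent to 'bs': blue square markers with no connecting line.
(squares,) = ax.plot(t, t**2, color="blue", marker="s", linestyle="None", markersize=5)
# Equivalent to 'g^': green triangle markers with no connecting line.
(triangles,) = ax.plot(t, t**3, color="green", marker="^", linestyle="None")
```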
Plotting with keyword strings¶
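# 'b' is plotted against 'a'; 'd' sets the marker sizes, which must be non-negative
data = {'a': np.arange(50),
        'c': np.random.randint(0, 50, 50),
        'd': np.random.randn(50)}
data['b'] = data['a'] + 10 * np.random.randn(50)
data['d'] = np.abs(data['d']) * 100
plt.scatter('a', 'b', c='c', s='d', data=data)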
plt.xlabel('entry a')
plt.ylabel('entry b')
plt.show()
Plotting Images¶
- PIL: A library used for opening, manipulating, and saving many different image file formats
- Pillow: A more modern, actively maintained fork of PIL that provides the same functionality with additional improvements and compatibility with newer versions of Python
Each pixel value in the array represents the image's color information, and the shape of the array depends on the image type:
- Grayscale image: A 2D array with dimensions (height, width)
- Color image (RGB): A 3D array with dimensions (height, width, 3), where the third dimension holds the red, green, and blue channels
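The array shapes described above can be checked without loading a file. This sketch builds small synthetic arrays; the 4 x 6 size is arbitrary, and the RGBA case is included on the assumption that some images carry a fourth (alpha) channel.

```python
import numpy as np

height, width = 4, 6                                   # arbitrary small size for illustration

gray = np.zeros((height, width), dtype=np.uint8)       # grayscale: one value per pixel
rgb = np.zeros((height, width, 3), dtype=np.uint8)     # RGB: three values per pixel
rgba = np.zeros((height, width, 4), dtype=np.uint8)    # RGBA adds an alpha channel

rgb[0, 0] = [255, 0, 0]                                # make the top-left pixel pure red

print(gray.shape, rgb.shape, rgba.shape)
print(rgb[0, 0])
```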
# import the Image class from Pillow
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
# converts image to a numpy array
img = np.asarray(Image.open('SCI.jpg'))
# print array
print(repr(img))
# display image
imgplot = plt.imshow(img)  # keep a handle to the image for later changes (optional)
plt.axis('off')  # optional: hide the axes
plt.show()
array([[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], ..., [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], ..., [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]], dtype=uint8)
Manipulating the image¶
Applying pseudocolor schemes
- enhancing contrast in visualizing data
- pseudocolor is only relevant to single-channel (grayscale, luminosity) images
- an RGB image can be converted to a 2D grayscale-like image by extracting only the red channel
img[:, :, 0] uses NumPy slicing to extract the first channel of the array across all rows and columns
- : means "select all" rows and columns
- 0 specifies the first channel, which corresponds to the red channel in an RGB image
lum_img = img[:, :, 0]
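To see what this slicing does, here is a sketch on a tiny synthetic array, so the result can be read by eye. The name `img_demo` and its values are made up for illustration.

```python
import numpy as np

# Synthetic 2x3 RGB image: each pixel stores (red, green, blue).
img_demo = np.array(
    [[[10, 20, 30], [40, 50, 60], [70, 80, 90]],
     [[11, 21, 31], [41, 51, 61], [71, 81, 91]]],
    dtype=np.uint8,
)

red = img_demo[:, :, 0]      # all rows, all columns, channel 0
green = img_demo[:, :, 1]    # all rows, all columns, channel 1

print(img_demo.shape, red.shape)
print(red)
```

The result drops the channel dimension, so a (height, width, 3) array becomes a (height, width) array.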
Change colormaps on existing plot objects using set_cmap()
- A colormap maps numerical data to colors
- set_cmap() allows you to specify how values should be translated into colors for visual representation
Common colormaps¶
- 'gray'- grayscale ranging from black to white
- 'viridis' - A perceptually uniform colormap from dark purple to yellow
- 'hot' - black to red to yellow to white
- 'jet' - blue to green to red
imgplot = plt.imshow(lum_img)  # keep a handle so the colormap can be changed later
plt.axis('off')  # optional: hide the axes
plt.show()
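The `set_cmap()` call described earlier could be applied as in the sketch below. It uses a synthetic gradient, `lum_demo`, rather than the image file, so it runs on its own; the colorbar is an optional extra.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic single-channel image: a horizontal gradient from 0 to 255.
lum_demo = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

fig, ax = plt.subplots()
imgplot = ax.imshow(lum_demo)     # drawn with the default colormap
imgplot.set_cmap("hot")           # switch the existing image to the 'hot' colormap
fig.colorbar(imgplot, ax=ax)      # colorbar shows how values map to colors
ax.axis("off")
```

Swapping "hot" for "gray", "viridis", or "jet" changes only how the same values are colored, not the underlying data.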