
Data Visualization in Python Using Matplotlib and Seaborn [2023]

Data visualization, an essential aspect of data science, allows us to uncover patterns, relationships, and insights in complex datasets. Python, a popular language for data analysis, offers a variety of powerful libraries for creating informative and visually appealing visualizations.

Matplotlib and Seaborn are powerful Python libraries for data visualization. Matplotlib provides a versatile base for creating static, interactive, and animated plots, while Seaborn simplifies the process with a high-level interface, enhancing the aesthetics of visualizations. Together, they enable users to effectively communicate insights and patterns from complex datasets.

In this tutorial, we’ll explore how to create stunning data visualizations using Matplotlib and Seaborn.

What is the Main Purpose of Data Visualization?

Data visualization, the representation of data in a graphical or pictorial format, plays a crucial role in data analysis for several reasons:

  • Pattern Discovery: Visualizations help us identify trends and patterns in data that may not be evident through numerical analysis alone.
  • Simplification: Complex datasets can be simplified and made more understandable through visuals.
  • Storytelling: Visualizations are powerful tools for conveying insights and telling a data-driven story to technical and non-technical audiences.
  • Decision-Making: Well-crafted visualizations assist decision-makers in making informed choices based on data.

Introduction to Matplotlib

Matplotlib is a versatile data visualization library for Python. It provides various options for creating static, animated, or interactive plots. Here’s how to get started with Matplotlib:

Installation

You can install Matplotlib with pip by running the following command:

  •  pip install matplotlib

Basic Example

Let’s create a simple line plot using Matplotlib to visualize a sample dataset:

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]

# Create a line plot
plt.plot(x, y)

# Add labels and a title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Line Plot')

# Show the plot
plt.show()

This code will generate a line plot displaying the relationship between x and y.

Introduction to Seaborn

Seaborn, which is a Python data visualization library based on Matplotlib, simplifies everyday visualization tasks and provides a high-level interface to create informative and aesthetically pleasing statistical graphics. Seaborn is particularly useful for visualizing complex datasets and statistical relationships.

Installation

You can install Seaborn using pip:

  •  pip install seaborn
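
Once both packages are installed, a quick way to confirm that Python can import them is to print their version numbers. This is only a sanity check, not part of any plotting workflow. Note that the histplot function used in the next example was added in Seaborn 0.11, so an older installation would need upgrading.

```python
import matplotlib
import seaborn as sns

# Print the installed versions to confirm both libraries import cleanly
print("Matplotlib:", matplotlib.__version__)
print("Seaborn:", sns.__version__)
```

If either import fails with ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.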

Basic Example 

Here’s an example of creating a histogram using Seaborn to visualize the distribution of a dataset:

import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
data = [0.2, 0.5, 0.7, 1.0, 1.2, 1.6]

# Create a histogram with a kernel density estimate overlay
sns.histplot(data, kde=True)

# Add labels and a title
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram with KDE')

# Show the plot
plt.show()

This code will generate a histogram with a kernel density estimate (KDE) overlay to represent the data’s distribution.

How to Create Visualizations with Matplotlib?

Matplotlib is a powerful library for creating various visualizations, from basic line plots to complex heatmaps and 3D plots. Here are some common types of plots and how to make them using Matplotlib:

Line Plot

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]

# Connect the points with a line
plt.plot(x, y)

plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Line Plot')
plt.show()

Bar Chart

import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
values = [25, 40, 30, 45]

# Draw one bar per category
plt.bar(categories, values)

plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart')
plt.show()

Scatter Plot

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]

# Draw one marker per (x, y) pair
plt.scatter(x, y)

plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()

Histogram

import matplotlib.pyplot as plt
import numpy as np

# 1,000 random samples from a standard normal distribution
data = np.random.randn(1000)

# Group the samples into 20 bins
plt.hist(data, bins=20)

plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()

Pie Chart

import matplotlib.pyplot as plt

labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]

# autopct prints each slice's share as a percentage with one decimal place
plt.pie(sizes, labels=labels, autopct='%1.1f%%')

plt.title('Pie Chart')
plt.show()

These examples demonstrate the flexibility of Matplotlib in creating various types of data visualizations. You can customize each plot further by adjusting colors, styles, and other parameters to match your needs.
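
This section's introduction mentions heatmaps, which the examples above do not show. The following is a minimal sketch of one using Matplotlib's imshow. The 5x5 grid of random values, the function name simple_heatmap, and the "viridis" colormap are all illustrative assumptions, not part of the original tutorial.

```python
import matplotlib.pyplot as plt
import numpy as np


def simple_heatmap(seed=0):
    """Draw a heatmap of made-up values (hypothetical helper for illustration)."""
    rng = np.random.default_rng(seed)
    grid = rng.random((5, 5))  # illustrative 5x5 values in [0, 1)

    fig, ax = plt.subplots(figsize=(5, 4))
    image = ax.imshow(grid, cmap="viridis")    # map each cell value to a color
    fig.colorbar(image, ax=ax, label="Value")  # color scale beside the plot

    ax.set_xlabel("Column")
    ax.set_ylabel("Row")
    ax.set_title("Heatmap")
    return fig, ax, grid


fig, ax, grid = simple_heatmap()
```

Call plt.show() afterwards to display the figure. In practice, the random grid would be replaced by a real two-dimensional array, such as a table of measurements.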

How to Create Visualizations with Seaborn?

Seaborn builds on Matplotlib’s foundation and simplifies the process of creating statistical visualizations. It provides high-level functions for many common plot types and is especially useful for exploring relationships between data variables. Here are a few examples of how to create different types of visualizations using Seaborn:

Scatter Plot with Regression Line

import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]

# Create a scatter plot with a regression line
# (recent Seaborn versions require x and y as keyword arguments)
sns.regplot(x=x, y=y)

plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot with Regression Line')
plt.show()

Box Plot

import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
data = [65, 75, 80, 85, 90, 100]

# Create a horizontal box plot so the values run along the x-axis
sns.boxplot(x=data)

plt.xlabel('Value')
plt.title('Box Plot')
plt.show()

Pair Plot (for exploring relationships in a dataset)

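import seaborn as sns
import matplotlib.pyplot as plt

# Sample dataset (downloaded on first use, so this needs an internet connection)
iris = sns.load_dataset('iris')

# Create a pair plot, coloring points by species
g = sns.pairplot(iris, hue='species')

# Title the whole grid; plt.title would only label the last subplot
g.figure.suptitle('Pair Plot of Iris Dataset', y=1.02)
plt.show()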

These Seaborn examples demonstrate how to create statistical visualizations and explore relationships between variables. Seaborn also offers features for styling and customization, making it a valuable tool for data scientists.
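
Another common way to explore relationships between several variables at once is a correlation heatmap. The sketch below is an illustration under stated assumptions: the data is synthetic, the column names a, b, and c are made up, and correlation_heatmap is a hypothetical helper. It relies on pandas, which is installed alongside Seaborn as a dependency.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def correlation_heatmap(seed=0):
    """Plot pairwise correlations of a small synthetic dataset (illustrative only)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=200)
    df = pd.DataFrame({
        "a": a,
        "b": 2 * a + rng.normal(scale=0.5, size=200),  # built to track a closely
        "c": rng.normal(size=200),                     # independent noise
    })

    corr = df.corr()  # pairwise Pearson correlation coefficients

    fig, ax = plt.subplots(figsize=(5, 4))
    # annot=True writes each coefficient into its cell
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation Heatmap")
    return fig, ax, corr


fig, ax, corr = correlation_heatmap()
```

Call plt.show() to display the result. Fixing vmin and vmax at -1 and 1 keeps the color scale centered on zero, so positive and negative correlations are easy to tell apart.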

How to Customize Visualizations in Python?

Both Matplotlib and Seaborn allow you to customize your visualizations to make them more informative and visually appealing. Here are a few customization options:

Matplotlib Customization

  • Colors: You can specify colors using the color parameter in plotting functions.
  • Line Styles: Adjust the line style using the linestyle parameter.
  • Legends: Add legends to distinguish multiple data series.
  • Titles and Labels: Customize titles, axis labels, and other text elements.
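
The options above can be combined in a single plot. The following is a minimal sketch, assuming two illustrative data series. The second series, the color names, and the styled_line_plot helper are made up for demonstration.

```python
import matplotlib.pyplot as plt


def styled_line_plot():
    """Draw two series with explicit colors, line styles, and a legend."""
    x = [1, 2, 3, 4, 5]
    series_a = [2, 4, 1, 3, 5]  # sample values used earlier in the tutorial
    series_b = [1, 3, 2, 5, 4]  # made-up second series so the legend has two entries

    fig, ax = plt.subplots(figsize=(6, 4))

    # color and linestyle control appearance; label feeds the legend
    ax.plot(x, series_a, color="tab:blue", linestyle="-", marker="o", label="Series A")
    ax.plot(x, series_b, color="tab:orange", linestyle="--", marker="s", label="Series B")

    # Titles and labels
    ax.set_xlabel("X-axis")
    ax.set_ylabel("Y-axis")
    ax.set_title("Customized Line Plot", fontsize=14)

    # The legend distinguishes the two series
    ax.legend(loc="upper left")
    return fig, ax


fig, ax = styled_line_plot()
```

Call plt.show() to display the figure. This sketch uses the Axes methods (ax.plot, ax.set_xlabel) rather than the plt.plot and plt.xlabel functions from the earlier examples. Both styles draw the same plot, but working with an explicit Axes tends to scale better once a figure contains more than one subplot.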

Seaborn Customization

  • Color Palettes: Seaborn provides different color palettes for plots.
  • Styling Themes: You can change the style of your plots using Seaborn’s themes.
  • Axes Labels: Seaborn's plotting functions return Matplotlib Axes objects, so you can adjust axis labels and titles with methods like set_xlabel() and set_title().
  • Legends: Customize legends to improve plot readability.
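
As a rough illustration of these options, the sketch below applies a theme, a color palette, axis labels, and a legend title to a scatter plot. The data values, the group names, the "Set2" palette, and the styled_scatter helper are assumptions chosen for demonstration.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def styled_scatter():
    """Scatter plot with a theme, a palette, and a titled legend (illustrative)."""
    # Made-up sample values split into two groups
    x = [1, 2, 3, 4, 5, 6]
    y = [2, 4, 1, 3, 5, 4]
    group = ["A", "A", "A", "B", "B", "B"]

    # Styling theme: note that this changes the defaults for all later plots
    sns.set_theme(style="whitegrid")

    fig, ax = plt.subplots(figsize=(6, 4))

    # Color palette: hue assigns one palette color per group
    sns.scatterplot(x=x, y=y, hue=group, palette="Set2", s=80, ax=ax)

    # Axis labels and title are set on the Matplotlib Axes
    ax.set_xlabel("X-axis")
    ax.set_ylabel("Y-axis")
    ax.set_title("Styled Scatter Plot")

    # Legend: give the automatically created legend a title
    ax.get_legend().set_title("Group")
    return fig, ax


fig, ax = styled_scatter()
```

Call plt.show() to display the figure. Because set_theme is global, it is usually called once near the top of a script or notebook rather than inside each plotting function.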

Remember that customization helps tailor your visualizations to your specific data and audience.

Use Matplotlib and Seaborn for Data Visualization in Python

Data visualization is a critical part of the data science workflow. It allows data scientists and analysts to explore data, identify patterns, and communicate findings effectively. Matplotlib and Seaborn, two powerful Python libraries, offer many tools for creating stunning and informative data visualizations. 

By mastering these libraries, you can unlock the potential of your data and make it come to life. Whether you are a beginner or an experienced data scientist, these libraries will serve you well in your data visualization journey. So go ahead, explore your data, and tell its story through captivating visualizations.

Inferenz is a top data solution consulting company that extensively uses data visualization as a part of its client servicing. Our data solution includes data design, architecture, and engineering. Get in touch with us for a more in-depth review and analysis of your needs! 
