A chart is a graphical representation of data that helps businesses and other organizations make informed decisions based on their data. In this Answer, we will discuss some charts in R, along with examples.
A pie chart is a circular chart that displays categorical data as slices, where the size of each slice is proportional to its category's share of the total. It gives a high-level overview of the data, but it is ineffective when you need to read precise values or compare slices of similar size. Consider the nature of your dataset before choosing a pie chart. Below is the code for a simple pie chart.
```r
# Example data
categories <- c("A", "B", "C", "D")
values <- c(20, 30, 10, 40)

# Create a pie chart
pie(values, labels = categories)
```
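Because slice sizes are hard to compare by eye, it can help to print each category's share in its label. The following is a minimal sketch that reuses the example data above; the names `percentages` and `slice_labels` and the chart title are introduced here only for illustration.

```r
# Example data (same values as in the pie chart example)
categories <- c("A", "B", "C", "D")
values <- c(20, 30, 10, 40)

# Share of each category as a percentage of the total
percentages <- round(100 * values / sum(values), 1)

# Build labels of the form "A (20%)"
slice_labels <- paste0(categories, " (", percentages, "%)")

# Draw the pie chart with the percentage labels
pie(values, labels = slice_labels, main = "Share by category")
```

With the percentages printed, the reader no longer has to estimate values from the slice angles alone.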
A Gantt chart illustrates a project's schedule and tasks over a specific time interval. It visually represents each task's start and end dates and can also show the dependencies between tasks. It is widely used in project management to track the progress of a project. The code below draws a Gantt chart with ggplot2.
```r
library(ggplot2)

# Create a data frame with activity details
activities <- data.frame(
  Task = c("Task A", "Task B", "Task C"),
  StartDate = as.Date(c("2023-07-01", "2023-07-05", "2023-07-10")),
  EndDate = as.Date(c("2023-07-03", "2023-07-08", "2023-07-15"))
)

# Create the Gantt chart using ggplot2
ggplot(activities) +
  geom_segment(aes(x = StartDate, xend = EndDate, y = Task, yend = Task), linewidth = 10) +
  labs(title = "Project Timeline") +
  ylab("Task") +
  scale_x_date(date_labels = "%Y-%m-%d", date_breaks = "1 day") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```
Each task in the Gantt chart is represented by a horizontal bar. The length of the bar indicates the duration of the task.
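Since the bar length encodes duration, the same number can be computed directly from the dates. This is a minimal sketch using the same three example tasks; the `DurationDays` column is an assumption added here for illustration and counts the days between the start and end dates.

```r
# Same example tasks as in the Gantt chart code
activities <- data.frame(
  Task = c("Task A", "Task B", "Task C"),
  StartDate = as.Date(c("2023-07-01", "2023-07-05", "2023-07-10")),
  EndDate = as.Date(c("2023-07-03", "2023-07-08", "2023-07-15"))
)

# Subtracting two Date vectors gives a difftime; convert it to a number of days
activities$DurationDays <- as.numeric(
  activities$EndDate - activities$StartDate,
  units = "days"
)

# Show each task next to its duration
print(activities[, c("Task", "DurationDays")])
```

Sorting the data frame by `DurationDays` is one way to spot the longest tasks before plotting.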
A histogram is a visual representation of the distribution of a single numeric variable, continuous or discrete. The range of the variable is divided into a set of intervals (bins), usually of equal width, and the height of each bar represents the number of observations that fall within that interval. Here's a code example of how to make a histogram.
```r
# Generate a large amount of random data
set.seed(123)
data <- rnorm(10000, mean = 50, sd = 10)

# Create a colored histogram
hist(data, breaks = 30, col = "skyblue", border = "white",
     main = "Colored Histogram", xlab = "Values", ylab = "Frequency")
```
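To see the intervals and frequencies behind the bars, `hist()` can be called with `plot = FALSE`, which returns the bin boundaries and counts without drawing anything. The sketch below assumes the same simulated data as above; the names `h`, `bin_widths`, and `bin_table` are introduced here for illustration.

```r
# Same simulated data as in the histogram example
set.seed(123)
data <- rnorm(10000, mean = 50, sd = 10)

# Compute the histogram without drawing it
h <- hist(data, breaks = 30, plot = FALSE)

# Width of each interval (all equal for a numeric `breaks` suggestion)
bin_widths <- diff(h$breaks)

# One row per bin: lower edge, upper edge, and number of observations
bin_table <- data.frame(
  from = head(h$breaks, -1),
  to = h$breaks[-1],
  count = h$counts
)
print(head(bin_table))
```

Note that `breaks = 30` is only a suggestion to `hist()`, so the actual number of bins may differ slightly from 30.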
A waterfall chart, also called a bridge chart, visually represents how an initial value changes over a series of positive and negative contributions. It is often used in finance, business analysis, and project management to illustrate the cumulative effect of various factors on a total value, showing which factors contribute positively or negatively to the overall outcome.
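One possible way to sketch a waterfall chart in base R is to compute the running total with `cumsum()` and draw one rectangle per step. The figures, step names, and colors below are hypothetical and chosen only for illustration; this is not the only way to build such a chart.

```r
# Hypothetical figures: a starting value followed by positive and negative changes
steps <- c(Start = 100, Sales = 30, Refunds = -20, Services = 15, Costs = -10)

# Each bar ends at the running total and starts where the previous bar ended
bar_end <- cumsum(steps)
bar_start <- c(0, head(bar_end, -1))

# Append a final bar that shows the resulting total
bar_labels <- c(names(steps), "Total")
bar_start <- c(bar_start, 0)
bar_end <- c(bar_end, tail(bar_end, 1))

# Grey for the start, green for increases, red for decreases, blue for the total
bar_col <- c("grey50", ifelse(steps[-1] >= 0, "seagreen3", "tomato"), "steelblue")

# Draw an empty plot, then one rectangle per step
n <- length(bar_labels)
plot(NULL, xlim = c(0.5, n + 0.5), ylim = c(0, max(bar_end) * 1.1),
     xaxt = "n", xlab = "", ylab = "Value", main = "Waterfall Chart")
axis(1, at = seq_len(n), labels = bar_labels)
rect(seq_len(n) - 0.4, bar_start, seq_len(n) + 0.4, bar_end, col = bar_col)
```

Each intermediate bar floats between the previous running total and the new one, which is what makes the individual contributions visible.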
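A contour plot represents a three-dimensional surface in two dimensions by drawing lines of constant z value. The code below plots the distance from the origin, so the contour lines form concentric circles.

```r
# Grid of x and y coordinates
x <- -20:20
y <- -20:20

# z is the distance of each grid point from the origin: sqrt(x^2 + y^2)
z <- sqrt(outer(x ^ 2, y ^ 2, "+"))

# Draw the contour lines
contour(x, y, z)
```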
A scatter plot matrix arranges multiple scatter plots in a grid, one for each pair of variables, and is used to examine the correlations between the variables in a dataset. It is particularly useful when analyzing multivariate datasets. Scatter plot matrices can be enhanced with additional features, such as color coding for categorical variables or regression lines that indicate the overall trend between variables. The code below creates a scatter plot matrix using the iris dataset.
```r
# Load the built-in iris dataset
data(iris)

# Create a scatterplot matrix
pairs(iris[, 1:4], pch = 19, col = iris$Species)

# Add a legend
legend("topright", legend = unique(iris$Species), col = unique(iris$Species), pch = 19)
```
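The regression-line enhancement mentioned above can be sketched with a custom panel function passed to `pairs()`. The function name `panel_with_fit` is hypothetical, and the correlation matrix is printed only as a numeric companion to the plots.

```r
# Load the built-in iris dataset
data(iris)

# Pairwise Pearson correlations of the four numeric columns
cor_matrix <- cor(iris[, 1:4])
print(round(cor_matrix, 2))

# Custom panel: draw the points, then overlay a least-squares regression line
panel_with_fit <- function(x, y, ...) {
  points(x, y, ...)
  abline(lm(y ~ x), col = "black", lwd = 2)
}

# Scatter plot matrix with species colors and a fitted line in every panel
pairs(iris[, 1:4], panel = panel_with_fit, pch = 19, col = iris$Species)
```

Panels whose line is steep and whose points hug it closely correspond to the larger absolute values in `cor_matrix`.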
Charts provide useful visualizations for analyzing data and observing trends that inform business, education, and development decisions. They enable researchers and analysts to extract meaningful insights from their data.