Final
final rmd
2023-04-24
library(datasets)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.1     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
##     last_plot
##
## The following object is masked from 'package:stats':
##
##     filter
##
## The following object is masked from 'package:graphics':
##
##     layout
Problem description: The goal of this analysis is to
evaluate the impact of four different diets on the growth of chicks over time.
The hypothesis is that different diets will have varying effects on chick
weight gain, with some diets potentially leading to faster growth.
Related work: I have decided to refer to Abby Krinalodi’s (Fall 2017) work in
relation to my own. Both their work on SAT scores and my project on chick
weight analyze a dataset and use visualization techniques to uncover trends and
patterns. However, the two projects differ in their datasets, methodologies,
and visualization methods. Theirs focuses on SAT scores and employs
multivariate analysis, while my project explores chick weight data using
time series analysis, distribution analysis, and correlation. The context and
questions addressed in each project are also distinct: theirs examines SAT
scores in relation to high school graduates, while mine compares the effects of
different diets on chick growth over time.
Solution: The approach uses several visualizations to examine and contrast the
weight gain of chicks on different diets over time. The methodology consists of
the following three analyses:
Time series analysis: A faceted ggplot shows the growth of the chicks on each
diet as time progresses, enabling the identification of growth rate trends and
an assessment of each diet’s effectiveness.
Distribution analysis: By employing violin plots, it is
possible to visualize the distribution of chick weights for each diet over
time. This offers insights into the variability of weight gain in chicks on
various diets and helps to detect any potential outliers or patterns in the
data.
Part to whole analysis: The average weight with standard
error over time for each diet is plotted to analyze the average growth rates
and variability of chicks on different diets. This visualization enables a
direct comparison of the diets’ performance in terms of average weight gain.
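To make the "standard error" in the last item concrete, here is a minimal sketch of the calculation for a single diet and day. It assumes the standard error of the mean, sd / sqrt(n), which is the same formula the summary code further down uses; the variable names w, mean_w, and sem_w are illustrative and not part of the original analysis.

```r
library(datasets)

# Weights of all Diet 1 chicks on the first measurement day (Time == 0)
w <- ChickWeight$weight[ChickWeight$Diet == 1 & ChickWeight$Time == 0]

mean_w <- mean(w)                   # average weight for this diet and day
sem_w  <- sd(w) / sqrt(length(w))   # standard error of the mean

# Error bar drawn as mean minus/plus one SEM
c(lower = mean_w - sem_w, mean = mean_w, upper = mean_w + sem_w)
```

Because the SEM shrinks as the number of chicks grows, short error bars can reflect a larger group as well as more uniform weights.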
# Split ChickWeight into one data frame per diet
split_datasets <- split(ChickWeight, ChickWeight$Diet)

# Access the dataset for Diet 1 and keep a reproducible random sample of 12 chicks
diet1_data <- split_datasets[[1]]
set.seed(11)
unique_chicks <- unique(diet1_data$Chick)
random_chicks <- sample(unique_chicks, 12)
diet1_data1 <- diet1_data %>%
  filter(Chick %in% random_chicks)

# Access the dataset for Diet 2
diet2_data <- split_datasets[[2]]

# Access the dataset for Diet 3
diet3_data <- split_datasets[[3]]

# Access the dataset for Diet 4
diet4_data <- split_datasets[[4]]
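The report does not say why only Diet 1 is subsampled. One plausible reason (an assumption, not stated by the author) is that Diet 1 has more chicks than the other diets. A quick count, sketched here with the illustrative name chicks_per_diet, shows the group sizes:

```r
library(datasets)

# Number of distinct chicks in each diet group
chicks_per_diet <- tapply(ChickWeight$Chick, ChickWeight$Diet,
                          function(x) length(unique(x)))
chicks_per_diet
```

Note that sampling 12 of the Diet 1 chicks means every later plot and summary for Diet 1 describes that sample, not the full group.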
# Diet 1
ggplot(diet1_data1, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 1",
       x = "Time",
       y = "Weight") +
  theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
# Diet 2
ggplot(diet2_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 2",
       x = "Time",
       y = "Weight") +
  theme_minimal()
# Diet 3
ggplot(diet3_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 3",
       x = "Time",
       y = "Weight") +
  theme_minimal()
# Diet 4
ggplot(diet4_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 4",
       x = "Time",
       y = "Weight") +
  theme_minimal()
# Combine datasets (Diet 1 uses the 12 sampled chicks)
combined_data <- rbind(
  diet1_data1 %>% mutate(Diet = "Diet 1"),
  diet2_data %>% mutate(Diet = "Diet 2"),
  diet3_data %>% mutate(Diet = "Diet 3"),
  diet4_data %>% mutate(Diet = "Diet 4")
)
# Facet ggplot
ggplot(combined_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick",
       x = "Time",
       y = "Weight") +
  theme_minimal() +
  facet_wrap(~Diet, ncol = 2)
# Calculate mean weight and standard error of the mean (SEM) for each diet and time
mean_weight_data <- combined_data %>%
  group_by(Diet, Time) %>%
  summarise(MeanWeight = mean(weight),
            SEM = sd(weight) / sqrt(n()),
            .groups = "drop") %>%
  mutate(ErrorBarLength = 2 * SEM)
# Violin plot
ggplot(combined_data, aes(x = Time, y = weight, fill = Diet)) +
  geom_violin() +
  labs(title = "Violin Plot of Chick Weights Over Time for Each Diet",
       x = "Time",
       y = "Weight") +
  theme_minimal() +
  scale_fill_discrete(name = "Diet")
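In the call above, Time is a continuous variable, so ggplot2 groups the violins by Diet only and does not draw a separate violin for each measurement day. If one violin per day is wanted, a possible variant is sketched below. It is not the plot the report's observations are based on: it assumes Time is converted to a factor, uses the full ChickWeight data instead of combined_data so that it runs on its own, and the name p_violin_by_day is illustrative.

```r
library(datasets)
library(ggplot2)

# Assumption: treating Time as a factor yields one violin per measurement day
p_violin_by_day <- ggplot(ChickWeight,
                          aes(x = factor(Time), y = weight, fill = Diet)) +
  geom_violin() +
  facet_wrap(~Diet, ncol = 2, labeller = label_both) +
  labs(title = "Chick Weight Distribution by Day for Each Diet",
       x = "Time",
       y = "Weight") +
  theme_minimal() +
  scale_fill_discrete(name = "Diet")

p_violin_by_day
```

With only about ten chicks per diet and day, each violin is estimated from few points, so the shapes should be read cautiously.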
# Create the ggplot object
p <- ggplot(mean_weight_data, aes(x = Time, y = MeanWeight, color = Diet, group = Diet)) +
  geom_line(size = 1) +
  geom_errorbar(aes(ymin = MeanWeight - SEM,
                    ymax = MeanWeight + SEM,
                    text = paste("Error Bar Length:", round(ErrorBarLength, 2))),
                width = 0.2) +
  labs(title = "Mean Weight with Standard Error Over Time for Each Diet",
       x = "Time",
       y = "Mean Weight") +
  theme_minimal() +
  scale_color_discrete(name = "Diet")
## Warning in geom_errorbar(aes(ymin = MeanWeight - SEM, ymax = MeanWeight + :
## Ignoring unknown aesthetics: text
# Convert the ggplot object to a plotly object
p <- ggplotly(p, tooltip = "text")
p
Facet ggplot (Time series analysis): The facet ggplot presents the growth of
chicks on each of the four diets over time. By plotting individual growth
trajectories for each chick, we can observe how the diets affect the growth
patterns. The following observations can be made from the facet ggplot:
Diet 1: Chicks on Diet 1 exhibit the most inconsistent growth patterns, with a
wide range of growth rates among the chicks. Some chicks experience rapid
growth, while others display slower, steadier increases in weight.
Diet 2: Chicks on Diet 2 show more consistency in their growth rates,
particularly for the majority of chicks in the middle. However, one chick in
Diet 2 exhibits no growth, and another chick shows abnormal growth, which adds
some variability to the overall growth pattern for this diet.
Diet 3: Chicks on Diet 3 demonstrate consistent growth for all individuals, but
there is noticeable variation between the chicks. The growth rates are more
uniform than in Diet 1 and Diet 2, but the distinction between the chicks is
apparent.
Diet 4: The growth pattern for chicks on Diet 4 is the most consistent of all
the diets, with the least variation between the chicks. The lines in the graph
are closest to each other, indicating more uniform growth rates and minimal
differences among the chicks on this diet.
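When reading individual trajectories, it helps to know which chicks were not measured on every day, since a line that stops early can look like stalled growth. The following is a small sketch of that check on the full ChickWeight data; the names obs_per_chick and incomplete are illustrative, and the check is an addition, not part of the original analysis.

```r
library(datasets)

# Number of recorded measurements for each chick
obs_per_chick <- table(ChickWeight$Chick)

# Chicks with fewer measurements than the maximum number of measurement days
incomplete <- obs_per_chick[obs_per_chick < max(obs_per_chick)]
incomplete
```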
Violin plot (Distribution analysis): The violin plot displays the distribution
of chick weights over time for each diet, allowing us to visualize the
variability in weight gain. From the violin plot, we can observe:
Diet 1: The distribution of chick weights on Diet 1 is the most variable among
the four diets, with more fluctuations in the shape of the violin plot. This
indicates that there is a wider range of weight gains for chicks on Diet 1.
Diet 2: The weight distribution for chicks on Diet 2 is more consistent than
Diet 1, with less fluctuation in the shape of the violin plot. This suggests
that chicks on Diet 2 have a relatively more uniform growth rate compared to
those on Diet 1.
Diet 3: The distribution of chick weights on Diet 3 appears to be the most
consistent of all diets, with the least fluctuation in the violin plot’s shape.
This indicates that the growth rate of chicks on Diet 3 is the most uniform
among the diets.
Diet 4: The weight distribution for chicks on Diet 4 is more variable than
Diet 3 but less variable than Diet 1. This suggests that chicks on Diet 4 have
a moderately variable growth rate, with some chicks experiencing better growth
rates than others.
Mean weight plot (Part to whole analysis): The mean weight plot displays the
average growth rates and variability for each diet, allowing for a direct
comparison of the performance of each diet in terms of average weight gain.
Observations from the mean weight plot include:
Diet 4 seems to provide the most consistent growth among chicks, with the least
variation in weight over time. This may indicate that Diet 4 provides a more
stable and predictable pattern of growth for chicks, which could be desirable
for researchers or farmers seeking a more uniform flock.
Diet 2 shows the highest variation in weight among the chicks, which might
suggest that the diet’s effectiveness is more inconsistent. This could be due
to differences in individual chick responses to the diet or other factors that
may cause varying growth patterns. Further investigation would be needed to
determine the cause of this inconsistency.
Diets 1 and 3 show intermediate variations in weight over time, with Diet 1
having slightly lower variation than Diet 3. This suggests that these diets may
provide relatively stable growth patterns for chicks, but not as consistent as
Diet 4.
It’s important to note that this analysis only focuses on the consistency of
growth patterns and not on the overall growth rate or final weight of the
chicks.
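The consistency comparison above is read off the error bars, which show the SEM and therefore depend on group size as well as on spread. As a rough numeric cross-check, the sketch below computes the standard deviation and coefficient of variation of weight per diet on the last measurement day. It uses the full ChickWeight data, so Diet 1 includes all of its chicks and not only the 12 sampled ones; the column names N, SD, and CV are illustrative.

```r
library(datasets)

# Restrict to the last measurement day
is_last <- ChickWeight$Time == max(ChickWeight$Time)
w <- ChickWeight$weight[is_last]
d <- ChickWeight$Diet[is_last]

# Spread of final weights per diet: standard deviation and SD relative to the mean
spread <- data.frame(
  Diet = levels(d),
  N    = as.integer(table(d)),
  SD   = as.numeric(tapply(w, d, sd)),
  CV   = as.numeric(tapply(w, d, sd) / tapply(w, d, mean))
)
spread
```

The CV column is useful here because diets with heavier chicks tend to have larger absolute spread.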
From the created datasets, the average final weight and the growth rate (final
weight / initial weight) of each diet can also be determined:
Diet 1: Final weight: 189.11 Growth rate: 189.11 / 41.17 = 4.59
Diet 2: Final weight: 214.70 Growth rate: 214.70 / 40.70 = 5.27
Diet 3: Final weight: 270.30 Growth rate: 270.30 / 40.80 = 6.62
Diet 4: Final weight: 238.56 Growth rate: 238.56 / 41.00 = 5.82
In conclusion, combining the analysis above with these values, Diet 3 appears
to be the most effective diet for promoting chick growth, with the highest
average final weight and growth rate. Diet 4, despite having the most
consistent growth, ranks second in terms of overall growth rate and final
weight. Diets 1 and 2 are less effective in promoting chick growth, with Diet 1
being the least effective among the four diets.
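The report does not show how the final weights and growth rates were computed. One way to derive such figures is sketched below. It assumes "initial" and "final" mean the average weight on the first and last measurement days, and it uses the full ChickWeight data, so the Diet 1 values will differ from those quoted above, which appear to come from the 12 sampled chicks. The name growth_summary and its column names are illustrative.

```r
library(datasets)

first_day <- ChickWeight$Time == min(ChickWeight$Time)
last_day  <- ChickWeight$Time == max(ChickWeight$Time)

# Mean weight per diet on the first and last measurement days
initial <- tapply(ChickWeight$weight[first_day], ChickWeight$Diet[first_day], mean)
final   <- tapply(ChickWeight$weight[last_day],  ChickWeight$Diet[last_day],  mean)

growth_summary <- data.frame(
  Diet          = names(final),
  InitialWeight = as.numeric(initial),
  FinalWeight   = as.numeric(final),
  GrowthRatio   = as.numeric(final / initial)   # final weight / initial weight
)
growth_summary
```

The final-day averages only include chicks that were still being measured on that day, which is worth keeping in mind when comparing diets.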