Final



final rmd

2023-04-24

library(datasets)
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.1     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1    
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(plotly)

##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
##     last_plot
##
## The following object is masked from 'package:stats':
##
##     filter
##
## The following object is masked from 'package:graphics':
##
##     layout

Problem description: The goal of this analysis is to evaluate the impact of four different diets on the growth of chicks over time. The hypothesis is that different diets will have varying effects on chick weight gain, with some diets potentially leading to faster growth.

Related work: I have decided to refer to Abby Krinalodi’s (Fall 2017) work in relation to my own. Both their work on SAT scores and my project on chick weight analyze a dataset and use visualization techniques to uncover trends and patterns. However, the two differ in their specific datasets, methodologies, and visualization methods. Theirs focuses on SAT scores and employs multivariate analysis, while my project explores chick weight data using methods such as time series analysis, distribution, and correlation. The context and questions addressed are also distinct: theirs examines SAT scores in relation to high school graduates, while mine compares the effects of different diets on chick growth over time.

Solution: The approach uses several visualizations to examine and contrast the weight increase of chicks on different diets over time. The methodology consists of the following methods:

Time series analysis: A faceted ggplot shows the growth of chicks on each diet as time progresses, making it possible to identify growth rate trends and assess each diet’s effectiveness.

Distribution analysis: By employing violin plots, it is possible to visualize the distribution of chick weights for each diet over time. This offers insights into the variability of weight gain in chicks on various diets and helps to detect any potential outliers or patterns in the data.

Part to whole analysis: The average weight with standard error over time for each diet is plotted to analyze the average growth rates and variability of chicks on different diets. This visualization enables a direct comparison of the diets’ performance in terms of average weight gain.
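
As a minimal sketch of the quantity behind the part to whole plot: the standard error of the mean for one diet at one time point is the standard deviation of the weights divided by the square root of the number of chicks measured. The example uses base R only and picks Diet 4 at day 21 purely for illustration; the variable names are hypothetical and not taken from the analysis code.

```r
# Illustrative only: standard error of the mean for a single diet/time cell.
library(datasets)

cw <- as.data.frame(ChickWeight)

# Weights of Diet 4 chicks at the last measurement day (Time == 21)
w <- cw$weight[cw$Diet == 4 & cw$Time == 21]

n_chicks    <- length(w)
mean_weight <- mean(w)
sem         <- sd(w) / sqrt(n_chicks)   # standard error of the mean

# An error bar for this cell would span mean_weight - sem to mean_weight + sem
c(n = n_chicks, mean = mean_weight, sem = sem)
```

Repeating this for every diet and time point gives the values that the error bars are drawn from.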

df = ChickWeight
split_datasets <- split(ChickWeight, ChickWeight$Diet)

set.seed(11)

# Access the dataset for Diet 1
diet1_data <- split_datasets[[1]]

# Keep a random sample of 12 Diet 1 chicks (seeded for reproducibility)
set.seed(11)
unique_chicks <- unique(diet1_data$Chick)
random_chicks <- sample(unique_chicks, 12)

diet1_data1 = diet1_data %>%
  filter(Chick %in% random_chicks)

# Access the dataset for Diet 2
diet2_data <- split_datasets[[2]]

# Access the dataset for Diet 3
diet3_data <- split_datasets[[3]]

# Access the dataset for Diet 4
diet4_data <- split_datasets[[4]]

#DIET1
ggplot(diet1_data1, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 1",
       x = "Time",
       y = "Weight")

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.


#DIET2
ggplot(diet2_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 2",
       x = "Time",
       y = "Weight")


#DIET3
ggplot(diet3_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 3",
       x = "Time",
       y = "Weight")


#DIET4
ggplot(diet4_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick in Diet 4",
       x = "Time",
       y = "Weight")


# Combine datasets
combined_data <- rbind(
  diet1_data1 %>% mutate(Diet = "Diet 1"),
  diet2_data %>% mutate(Diet = "Diet 2"),
  diet3_data %>% mutate(Diet = "Diet 3"),
  diet4_data %>% mutate(Diet = "Diet 4")
)

# Facet ggplot
ggplot(combined_data, aes(x = Time, y = weight, group = Chick, color = Chick)) +
  geom_line(size = 1) +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 2))) +
  labs(title = "Time vs Weight for Each Chick",
       x = "Time",
       y = "Weight") +
  facet_wrap(~Diet, ncol = 2)


# Calculate mean weight for each diet and time
mean_weight_data <- combined_data %>%
  group_by(Diet, Time) %>%
  summarise(MeanWeight = mean(weight),
            SEM = sd(weight) / sqrt(n()),
            .groups = "drop") %>%
  mutate(ErrorBarLength = 2 * SEM)


#violin plot
ggplot(combined_data, aes(x = Time, y = weight, fill = Diet)) +
  geom_violin() +
  labs(title = "Violin Plot of Chick Weights Over Time for Each Diet",
       x = "Time",
       y = "Weight") +
  theme_minimal() +
  scale_fill_discrete(name = "Diet")


# Create the ggplot object
p <- ggplot(mean_weight_data, aes(x = Time, y = MeanWeight, color = Diet, group = Diet)) +
  geom_line(size = 1) +
  geom_errorbar(aes(ymin = MeanWeight - SEM,
                    ymax = MeanWeight + SEM,
                    text = paste("Error Bar Length:", round(ErrorBarLength, 2))),
                width = 0.2) +
  labs(title = "Mean Weight with Standard Error Over Time for Each Diet",
       x = "Time",
       y = "Mean Weight") +
  theme_minimal() +
  scale_color_discrete(name = "Diet")

## Warning in geom_errorbar(aes(ymin = MeanWeight - SEM, ymax = MeanWeight + :
## Ignoring unknown aesthetics: text

# Convert the ggplot object to a plotly object
p <- ggplotly(p, tooltip = "text")
p



Facet ggplot (Time series analysis): The facet ggplot presents the growth of chicks on each of the four diets over time. By plotting individual growth trajectories for each chick, we can observe how the diets affect the growth patterns. The following observations can be made from the facet ggplot:

Diet 1: Chicks on Diet 1 exhibit the most inconsistent growth patterns, with a wide range of growth rates among the chicks. Some chicks experience rapid growth, while others display slower, steadier increases in weight.

Diet 2: Chicks on Diet 2 show more consistency in their growth rates, particularly for the majority of chicks in the middle. However, one chick in Diet 2 exhibits no growth, and another chick shows abnormal growth, which adds some variability to the overall growth pattern for this diet.

Diet 3: Chicks on Diet 3 demonstrate consistent growth for all individuals, but there is noticeable variation between the chicks. The growth rates are more uniform than in Diet 1 and Diet 2, but the distinction between the chicks is apparent.

Diet 4: The growth pattern for chicks on Diet 4 is the most consistent of all the diets, with the least variation between the chicks. The lines in the graph are closest to each other, indicating more uniform growth rates and minimal differences among the chicks on this diet.
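
Visual impressions such as the slow-growing chick in Diet 2 can be checked by tabulating the weight gain of each chick. The following is a sketch in base R; the name `gain` and the choice of "last recorded weight minus first recorded weight" are illustrative assumptions, not part of the analysis code.

```r
# Illustrative only: per-chick weight gain within Diet 2.
library(datasets)

cw <- as.data.frame(ChickWeight)
diet2 <- cw[cw$Diet == 2, ]
diet2 <- diet2[order(diet2$Chick, diet2$Time), ]  # rows in time order per chick

# Gain = last recorded weight minus first recorded weight, per chick
gain <- tapply(diet2$weight, droplevels(diet2$Chick),
               function(w) w[length(w)] - w[1])

sort(gain)  # smallest gains first; flat trajectories appear at the top
```

The same calculation can be run for the other diets by changing the value compared against `Diet`.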

 

Violin plot (Distribution analysis): The violin plot displays the distribution of chick weights over time for each diet, allowing us to visualize the variability in weight gain. From the violin plot, we can observe:

Diet 1: The distribution of chick weights on Diet 1 is the most variable among the four diets, with more fluctuations in the shape of the violin plot. This indicates that there is a wider range of weight gains for chicks on Diet 1.

Diet 2: The weight distribution for chicks on Diet 2 is more consistent than Diet 1, with less fluctuation in the shape of the violin plot. This suggests that chicks on Diet 2 have a relatively more uniform growth rate compared to those on Diet 1.

Diet 3: The distribution of chick weights on Diet 3 appears to be the most consistent of all diets, with the least fluctuation in the violin plot’s shape. This indicates that the growth rate of chicks on Diet 3 is the most uniform among the diets.

Diet 4: The weight distribution for chicks on Diet 4 is more variable than Diet 3 but less variable than Diet 1. This suggests that chicks on Diet 4 have a moderately variable growth rate, with some chicks experiencing better growth rates than others.

 

Mean weight plot (Part to whole analysis): The mean weight plot displays the average growth rates and variability for each diet, allowing for a direct comparison of the performance of each diet in terms of average weight gain. Observations from the mean weight plot include:

Diet 4 seems to provide the most consistent growth among chicks, with the least variation in weight over time. This may indicate that Diet 4 provides a more stable and predictable pattern of growth for chicks, which could be desirable for researchers or farmers seeking a more uniform flock.

Diet 2 shows the highest variation in weight among the chicks, which might suggest that the diet’s effectiveness is more inconsistent. This could be due to differences in individual chick responses to the diet or other factors that may cause varying growth patterns. Further investigation would be needed to determine the cause of this inconsistency.

Diets 1 and 3 show intermediate variations in weight over time, with Diet 1 having slightly lower variation than Diet 3. This suggests that these diets may provide relatively stable growth patterns for chicks, but not as consistent as Diet 4.

It’s important to note that this analysis only focuses on the consistency of growth patterns and not on the overall growth rate or final weight of the chicks.
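
The consistency comparison above can also be checked numerically. The sketch below, in base R, computes the standard deviation of weight within each diet on the final measurement day. It is an illustration under one assumption worth noting: it uses the full ChickWeight data rather than the 12-chick Diet 1 sample, so the Diet 1 figure may differ from what the plot shows.

```r
# Illustrative only: spread of final-day weights within each diet.
library(datasets)

cw <- as.data.frame(ChickWeight)
final_day <- cw[cw$Time == max(cw$Time), ]

# Larger standard deviation = less consistent final weights
sd_final <- tapply(final_day$weight, final_day$Diet, sd)

round(sort(sd_final), 2)
```

Sorting puts the most consistent diet first and the most variable diet last.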

 

The average final weight and growth rate for each diet can also be determined from the datasets created above, where the growth rate is the ratio of final weight to initial weight:

Diet 1: Final weight: 189.11. Growth rate: 189.11 / 41.17 = 4.59

Diet 2: Final weight: 214.70. Growth rate: 214.70 / 40.70 = 5.27

Diet 3: Final weight: 270.30. Growth rate: 270.30 / 40.80 = 6.62

Diet 4: Final weight: 238.56. Growth rate: 238.56 / 41.00 = 5.82

In conclusion, combining these values with the analysis above, Diet 3 appears to be the most effective diet for promoting chick growth, with the highest average final weight and growth rate. Diet 4, despite having the most consistent growth, ranks second in terms of overall growth rate and final weight. Diets 1 and 2 are less effective in promoting chick growth, with Diet 1 being the least effective among the four diets.
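
The final weights and growth rates quoted above could be derived along the following lines. This is a sketch in base R with hypothetical variable names, and it uses every chick in ChickWeight. Because the analysis sampled 12 chicks for Diet 1, the Diet 1 row may not match the figures in the text, while Diets 2 to 4 should.

```r
# Illustrative only: mean initial/final weight and their ratio, per diet.
library(datasets)

cw <- as.data.frame(ChickWeight)

# Mean weight per diet on the first and last measurement days
initial <- tapply(cw$weight[cw$Time == 0],  cw$Diet[cw$Time == 0],  mean)
final   <- tapply(cw$weight[cw$Time == 21], cw$Diet[cw$Time == 21], mean)

# Growth rate as used in the text: final weight / initial weight
growth <- data.frame(
  Diet    = names(final),
  Initial = round(as.numeric(initial), 2),
  Final   = round(as.numeric(final), 2),
  Ratio   = round(as.numeric(final / initial), 2)
)
growth
```

Note that chicks that dropped out before day 21 contribute to the initial mean but not to the final mean, so the ratio is a comparison of group averages rather than a per-chick growth figure.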

