Every year, the US releases to the public a large data set containing information on births recorded in the country. This data set has been of interest to medical researchers who are studying the relation between habits and practices of expectant mothers and the birth of their children. This is a random sample of 1,000 cases from the data set released in 2014.
A data frame with 1,000 observations on the following 13 variables.
Father's age in years.
Mother's age in years.
Maturity status of mother.
Length of pregnancy in weeks.
Whether the birth was classified as premature (premie) or full-term.
Number of hospital visits during pregnancy.
Weight gained by mother during pregnancy in pounds.
Weight of the baby at birth in pounds.
Whether baby was classified as low birthweight (low) or not (not low).
Sex of the baby, female or male.
Status of the mother as a nonsmoker or a smoker.
Whether mother is married or not married at birth.
Whether mom is white or not white.
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. Natality Detail File, 2014 United States. Inter-university Consortium for Political and Social Research, 2016-10-07. doi:10.3886/ICPSR36461.v1.
library(ggplot2)
library(openintro)  # provides the births14 data frame

# Birth weight by smoking status of the mother
ggplot(births14, aes(x = habit, y = weight)) +
  geom_boxplot() +
  labs(x = "Smoking status of mother", y = "Birth weight of baby (in lbs)")

# Number of visits during pregnancy by mother's race
ggplot(births14, aes(x = whitemom, y = visits)) +
  geom_boxplot() +
  labs(x = "Mother's race", y = "Number of doctor visits during pregnancy")
#> Warning: Removed 56 rows containing non-finite values (stat_boxplot).

# Weight gained during pregnancy by mother's age category
ggplot(births14, aes(x = mature, y = gained)) +
  geom_boxplot() +
  labs(x = "Mother's age category", y = "Weight gained during pregnancy")
#> Warning: Removed 42 rows containing non-finite values (stat_boxplot).