Exercise 1
A multicenter study measures fine particulate matter levels in four cities (Milan, Rome, Turin, Naples). In each city, 10 measurements are taken at various times of the year (Naples has no summer measurements). The data are as follows:
Milan (Spring: 60.5, 65.5, 56.2), (Summer: 52, 49, 47), (Winter: 70, 81, 91, 96)
Rome (Spring: 61.5, 62.5, 53.2), (Summer: 50, 43, 42), (Winter: 71, 82, 93, 92)
Turin (Spring: 64.5, 64.5, 54.2), (Summer: 54, 44, 44), (Winter: 70, 84, 90, 96)
Naples (Spring: 60.5, 65.5, 56.2), (Winter: 70, 81, 91, 96, 100, 92, 95)
On the same dates, the number of hospital admissions for respiratory problems is recorded, giving the following data:
Milan (Spring: 10, 11, 10), (Summer: 12, 10, 11), (Winter: 15, 16, 18, 19)
Rome (Spring: 9, 12, 11), (Summer: 10, 14, 12), (Winter: 15, 18, 19, 19)
Turin (Spring: 10, 11, 9), (Summer: 9, 12, 13), (Winter: 15, 14, 18, 19)
Naples (Spring: 10, 10, 10), (Winter: 19, 16, 14, 15, 19, 18, 18)
Is there a relationship between smog and lung problems? If so, is there a city with higher pollution?
Let's create the data vectors right away.
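# Particulate measurements per city, ordered spring, summer, winter.
# Season labels: P = spring (primavera), E = summer (estate), I = winter (inverno).
MILANO <- c(60.5, 65.5, 56.2, 52, 49, 47, 70, 81, 91, 96)
STAGM <- c(rep("P", 3), rep("E", 3), rep("I", 4))
Roma <- c(61.5, 62.5, 53.2, 50, 43, 42, 71, 82, 93, 92)
STAGR <- c(rep("P", 3), rep("E", 3), rep("I", 4))
Torino <- c(64.5, 64.5, 54.2, 54, 44, 44, 70, 84, 90, 96)
STAGT <- c(rep("P", 3), rep("E", 3), rep("I", 4))
# Naples has 3 spring and 7 winter measurements, and no summer ones.
Napoli <- c(60.5, 65.5, 56.2, 70, 81, 91, 96, 100, 92, 95)
STAGN <- c(rep("P", 3), rep("I", 7))
# Hospital admissions, in the same order as the measurements above.
MILANOric <- c(10, 11, 10, 12, 10, 11, 15, 16, 18, 19)
Romaric <- c(9, 12, 11, 10, 14, 12, 15, 18, 19, 19)
Torinoric <- c(10, 11, 9, 9, 12, 13, 15, 14, 18, 19)
Napoliric <- c(10, 10, 10, 19, 16, 14, 15, 19, 18, 18)
# Stack the four cities into long vectors and build the data frame.
RILEV <- c(MILANO, Roma, Torino, Napoli)
STAG <- c(STAGM, STAGR, STAGT, STAGN)
RICOV <- c(MILANOric, Romaric, Torinoric, Napoliric)
Citta <- c(rep("MILANO", 10), rep("ROMA", 10), rep("TORINO", 10), rep("NAPOLI", 10))
DF <- data.frame(RILEV, STAG, RICOV, Citta)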
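Before fitting anything, it can help to check that the data frame has the expected shape and that the design is unbalanced in the way the raw data suggest. The following is a minimal sketch: it rebuilds the data in compact form so that it runs on its own, and the conversion to factors with spring as the first level is an assumption made here for readability, not part of the original solution.

```r
# Rebuild the data from the exercise (same values, written compactly).
RILEV <- c(60.5, 65.5, 56.2, 52, 49, 47, 70, 81, 91, 96,
           61.5, 62.5, 53.2, 50, 43, 42, 71, 82, 93, 92,
           64.5, 64.5, 54.2, 54, 44, 44, 70, 84, 90, 96,
           60.5, 65.5, 56.2, 70, 81, 91, 96, 100, 92, 95)
RICOV <- c(10, 11, 10, 12, 10, 11, 15, 16, 18, 19,
           9, 12, 11, 10, 14, 12, 15, 18, 19, 19,
           10, 11, 9, 9, 12, 13, 15, 14, 18, 19,
           10, 10, 10, 19, 16, 14, 15, 19, 18, 18)
# Milan, Rome and Turin share the 3/3/4 season pattern; Naples is 3 spring + 7 winter.
STAG <- c(rep(c(rep("P", 3), rep("E", 3), rep("I", 4)), 3),
          rep("P", 3), rep("I", 7))
Citta <- rep(c("MILANO", "ROMA", "TORINO", "NAPOLI"), each = 10)
DF <- data.frame(RILEV, STAG, RICOV, Citta)

# Assumption: treat season and city as factors, with spring as the first level.
DF$STAG <- factor(DF$STAG, levels = c("P", "E", "I"))
DF$Citta <- factor(DF$Citta)

str(DF)                          # 40 rows, 4 columns
season_by_city <- table(DF$Citta, DF$STAG)
print(season_by_city)            # Naples has an empty summer cell

# Mean particulate level and mean admissions per season.
print(aggregate(cbind(RILEV, RICOV) ~ STAG, data = DF, FUN = mean))
```

The cross-tabulation makes the missing summer data for Naples explicit, which matters when comparing raw city averages.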
Once the dataset is created, I assess whether there is a relationship between pollution levels and hospital admissions, while accounting for other variables like city and season. I use a multiple linear regression model.
summary(glm(RICOV ~ RILEV + Citta + STAG, family = "gaussian", data = DF))
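A Gaussian glm with the default identity link is the same model as ordinary least squares, so lm() should give the same estimates. The sketch below rebuilds the data so that it runs on its own, checks that equivalence, and uses drop1() to test each term as a whole instead of one dummy coefficient at a time. The object names fit_glm and fit_lm are illustrative and do not appear in the original.

```r
# Rebuild the data from the exercise (same values, written compactly).
RILEV <- c(60.5, 65.5, 56.2, 52, 49, 47, 70, 81, 91, 96,
           61.5, 62.5, 53.2, 50, 43, 42, 71, 82, 93, 92,
           64.5, 64.5, 54.2, 54, 44, 44, 70, 84, 90, 96,
           60.5, 65.5, 56.2, 70, 81, 91, 96, 100, 92, 95)
RICOV <- c(10, 11, 10, 12, 10, 11, 15, 16, 18, 19,
           9, 12, 11, 10, 14, 12, 15, 18, 19, 19,
           10, 11, 9, 9, 12, 13, 15, 14, 18, 19,
           10, 10, 10, 19, 16, 14, 15, 19, 18, 18)
STAG <- c(rep(c(rep("P", 3), rep("E", 3), rep("I", 4)), 3),
          rep("P", 3), rep("I", 7))
Citta <- rep(c("MILANO", "ROMA", "TORINO", "NAPOLI"), each = 10)
DF <- data.frame(RILEV, STAG, RICOV, Citta)

# Same model fitted two ways.
fit_glm <- glm(RICOV ~ RILEV + Citta + STAG, family = "gaussian", data = DF)
fit_lm <- lm(RICOV ~ RILEV + Citta + STAG, data = DF)

# The coefficient estimates should agree up to numerical tolerance.
same_coef <- isTRUE(all.equal(coef(fit_glm), coef(fit_lm)))
print(same_coef)

# F test for dropping each term (RILEV, Citta, STAG) from the full model.
print(drop1(fit_lm, test = "F"))
```

Testing whole terms is useful here because Citta and STAG each enter the model as several dummy variables, and the individual coefficient tests in summary() depend on which level is taken as the reference.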
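The regression model suggests that the number of hospital admissions is significantly associated with the season but not with the city or the level of fine particulate matter. Note, however, that particulate levels are themselves highest in winter (70 to 100, against 42 to 65.5 in spring and summer), so season and pollution are strongly confounded in this data set. The lack of a significant RILEV effect after adjusting for season therefore does not by itself rule out a relationship between smog and admissions.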
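The second question in the exercise, whether one city has higher pollution, is not addressed by the admissions model. One possible way to approach it, sketched here under the assumption that an additive linear model of RILEV on season and city is adequate, is to compare cities after adjusting for season. The adjustment matters because Naples has no summer measurements and seven winter ones. The names city_means and fit_poll are illustrative.

```r
# Rebuild the pollution data from the exercise (same values, written compactly).
RILEV <- c(60.5, 65.5, 56.2, 52, 49, 47, 70, 81, 91, 96,
           61.5, 62.5, 53.2, 50, 43, 42, 71, 82, 93, 92,
           64.5, 64.5, 54.2, 54, 44, 44, 70, 84, 90, 96,
           60.5, 65.5, 56.2, 70, 81, 91, 96, 100, 92, 95)
STAG <- c(rep(c(rep("P", 3), rep("E", 3), rep("I", 4)), 3),
          rep("P", 3), rep("I", 7))
Citta <- rep(c("MILANO", "ROMA", "TORINO", "NAPOLI"), each = 10)
DF <- data.frame(RILEV, STAG, Citta)

# Raw city means: Naples comes out highest, but its sample is mostly winter.
city_means <- tapply(DF$RILEV, DF$Citta, mean)
print(city_means)

# Assumption: additive model, so the city effect is compared within season.
fit_poll <- lm(RILEV ~ STAG + Citta, data = DF)

# F test for the city term after accounting for season.
print(drop1(fit_poll, test = "F"))
```

If the Citta term is not significant once season is in the model, the higher raw average for Naples is more plausibly explained by its sampling dates than by a genuine difference between cities.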