Conditional Statistics and Subsetting Data
The following examples use the data set soc510hw2.csv, which contains the variables union, female, married, and wage. Union members are coded 1 and non-members 0; females are coded 1 and males 0; married respondents are coded 1 and unmarried respondents 0. The commands refer to variables by bare name (wage, union, and so on), so they assume the data frame has already been attached, for example with attach(mydata).
Conditional Statistics
-
Mean and standard deviation of wage by union membership.
> mean(wage[union==0])
> mean(wage[union==1])
> sd(wage[union==0])
> sd(wage[union==1])
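The four commands above repeat the same calculation once per group. As a rough sketch of an alternative, tapply() computes a statistic within every level of a grouping variable in one call. The data frame "toy" below is made up for illustration and is not the soc510hw2.csv data.

```r
# Toy data standing in for the real file (values are invented for illustration)
toy <- data.frame(
  wage  = c(10, 12, 20, 22, 14, 24),
  union = c(0, 0, 1, 1, 0, 1)
)

# Mean and standard deviation of wage within each level of union
means <- tapply(toy$wage, toy$union, mean)
sds   <- tapply(toy$wage, toy$union, sd)

print(means)
print(sds)
```

The result is named by group ("0" and "1"), so the non-member and member values appear side by side.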
-
Mean of wage for male union members.
> mean(wage[union==1 & female==0])
-
Mean of wage for workers who are male or union members (i.e., anyone who meets at least one of the two conditions: male union members, male non-union members, and female union members).
> mean(wage[union==1 | female==0])
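To see how "&" (and) and "|" (or) select different cases, it can help to store each condition as a logical vector and count the TRUE values. This is a small sketch on an invented data frame "toy", not on the course data.

```r
# Toy data (invented values): six workers
toy <- data.frame(
  wage   = c(10, 12, 20, 22, 14, 24),
  union  = c(0, 0, 1, 1, 0, 1),
  female = c(0, 1, 0, 1, 1, 0)
)

# "&" keeps cases that meet both conditions: male union members only
both <- toy$union == 1 & toy$female == 0

# "|" keeps cases that meet at least one condition: male, union member, or both
either <- toy$union == 1 | toy$female == 0

n_both   <- sum(both)     # number of cases selected by "&"
n_either <- sum(either)   # number of cases selected by "|"

mean_both   <- mean(toy$wage[both])
mean_either <- mean(toy$wage[either])
```

In this toy example "&" selects 2 workers and "|" selects 4, so the two means are computed on different groups.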
-
Regression analysis using only male union members.
> lm(wage~edu+age, subset=(female==0 & union==1))
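If the data frame has not been attached, the same regression can be written with the "data" argument, in which case the "subset" expression is evaluated inside that data frame. The sketch below uses an invented data frame "toy" whose variable names mirror the ones above; the numbers carry no substantive meaning.

```r
# Toy data (invented): 40 workers with deterministic values
i <- 1:40
toy <- data.frame(
  female = rep(c(0, 1), 20),
  union  = rep(c(0, 0, 1, 1), 10),
  edu    = 8 + (i %% 11),
  age    = 20 + ((i * 7) %% 41)
)
toy$wage <- 2 + 1.5 * toy$edu + 0.1 * toy$age + sin(i)

# Fit the model on male union members only; variables are looked up in toy
fit <- lm(wage ~ edu + age, data = toy, subset = (female == 0 & union == 1))

n_used <- nobs(fit)   # number of cases actually used in the fit
print(coef(fit))
```

Checking nobs(fit) is a quick way to confirm that the subset picked up the expected number of cases.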
Subsetting Data
-
Creating a subset, "datafemale", from the pre-loaded data frame "mydata" using the "subset" command.
> datafemale <- subset(mydata, female==1)
-
Another method: the "which" command, which returns the row numbers where the condition is true.
> datafemale <- mydata[which(mydata$female==1), ]
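One reason to prefer "subset" or "which" over plain logical indexing is the treatment of missing values. The sketch below, on an invented data frame "toy" with one missing value in female, compares the three approaches.

```r
# Toy data (invented): the third case has a missing value for female
toy <- data.frame(id = 1:5, female = c(1, 0, NA, 1, 0))

# Plain logical indexing: the NA comparison yields an extra row filled with NA
by_logical <- toy[toy$female == 1, ]

# which() returns only the positions that are TRUE, so the NA case is dropped
by_which <- toy[which(toy$female == 1), ]

# subset() also drops cases where the condition is NA
by_subset <- subset(toy, female == 1)

print(nrow(by_logical))
print(nrow(by_which))
print(nrow(by_subset))
```

Here the plain logical index returns three rows (one of them all NA), while "which" and "subset" both return the two cases actually coded 1.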
-
Creating a subset, "dataless", that contains workers in either of two education categories: LTHS (less than high school) or HSG (high school graduate).
> dataless <- subset(mydata, educ==1 | educ==2)
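When several categories are selected, chaining "|" conditions gets long. As a sketch of an equivalent form, the %in% operator tests membership in a set of values. The data frame "toy" and its educ codes are invented for illustration.

```r
# Toy data (invented): educ takes values 1 to 4
toy <- data.frame(id = 1:6, educ = c(1, 2, 3, 4, 2, 1))

# Two ways of selecting cases with educ equal to 1 or 2
with_or <- subset(toy, educ == 1 | educ == 2)
with_in <- subset(toy, educ %in% c(1, 2))

# Both approaches select the same cases
same_ids <- identical(with_or$id, with_in$id)
print(same_ids)
```

Unlike "==", %in% never returns NA, so cases with a missing educ value are simply treated as not matching.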
-
Creating a subset, "fmprvt", that contains only female workers employed in the private sector.
> fmprvt <- subset(mydata, female==1 & pubst==1)