
Objects, Functions, and What the Heck Are Vectors Anyway?

The first thing I did was import the data with the code:

acs <- read.csv(url("http://stat511.cwick.co.nz/homeworks/acs_or.csv"))

This imported all the data into an object named acs. I thought it was quite neat to be able to pull the data directly from the web, and it was easier than importing from a text file. As I understand it, data sets like this are one kind of object in R (read.csv returns a data frame). Importing this one went smoothly. Where I had issues, and the main reason this post is late, is vectors. For whatever reason, they aren’t clicking for me. I know vectors in physics and vectors in graphics, but vectors in R aren’t meshing with me. The assignment had me access the data as a vector with

acs[1,3]

and it returns 62, but that number doesn’t mean anything to me. I guess it’s because the data doesn’t exist as a table the way I visualize it in my mind. Anyway, next up is the subset function. I was able to make a subset of acs containing only the rows where the husband is older than the wife with

a <- subset(acs , age_husband > age_wife)
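To make the indexing concrete, here is a minimal sketch on a tiny made-up data frame rather than the real acs data. The name toy, the household column, and all the values are assumptions for illustration; only age_husband and age_wife are column names taken from the code above. It shows that [row, column] picks one cell, that leaving the row blank gives a whole column as a vector, and that subset is doing roughly the same thing as indexing with a logical vector.

```r
# Toy stand-in for acs; every value here is made up for illustration.
toy <- data.frame(
  household   = c(1, 2, 3),     # hypothetical column
  age_husband = c(52, 30, 47),
  age_wife    = c(50, 33, 47)
)

toy[1, 3]        # row 1, column 3: one value (50 in this toy data)
toy[, 3]         # all rows of column 3: a vector of ages
toy$age_wife     # the same column, picked by name instead of position

# Comparing two columns gives a logical vector, one TRUE/FALSE per row
older <- toy$age_husband > toy$age_wife

# Using that vector as the row index keeps only the TRUE rows,
# which selects the same rows as subset(toy, age_husband > age_wife)
toy[older, ]
```

Seen this way, each column of the table is a vector, and a single cell is just one element of one of those vectors.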

Functions in R are the tools used to manipulate data objects. Some, like subset, are built into the language, but we can also create our own. Continuing with subset, I used

w <- subset(acs , income_wife > income_husband)

and

h <- subset(acs, income_husband > income_wife)

to create subsets of the data for when the wife makes more than the husband and for the reverse. I then used the mean function to determine the mean number of children for each case.
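As a rough sketch of that step, the snippet below runs the same subset and mean calls on a small made-up data frame. The values and the name toy are assumptions, not the real acs data, and mean_children is a hypothetical helper added only to show what writing our own function looks like.

```r
# Made-up rows standing in for acs; only the column names come from the post.
toy <- data.frame(
  income_husband  = c(40000, 25000, 60000, 30000),
  income_wife     = c(30000, 45000, 20000, 52000),
  number_children = c(2, 0, 3, 1)
)

w <- subset(toy, income_wife > income_husband)   # wife earns more
h <- subset(toy, income_husband > income_wife)   # husband earns more

mean(w$number_children)   # 0.5 for this toy data
mean(h$number_children)   # 2.5 for this toy data

# Hypothetical helper: a user-defined function wrapping the same calculation.
# na.rm = TRUE skips missing values if there are any.
mean_children <- function(df) {
  mean(df$number_children, na.rm = TRUE)
}

mean_children(w)
mean_children(h)
```

Couples with equal incomes land in neither subset, since both comparisons are strict.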


I tried to quickly graph the difference between the two, but soon realized it would be significantly more involved than simply calling the plot function. I was able to make histograms to compare the two, though.

The formatting for w$number_children is weird for some reason.  I don’t see why there should be empty gaps between the x values, but I’ll chalk that up to the simplicity of the hist function for now.
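One possible explanation, which I have not checked against the real data, is that number_children only takes whole-number values while hist picks its own bin edges, so some bins can fall between whole numbers and stay empty. Here is a minimal sketch on made-up counts (the children vector is an assumption) that puts exactly one bin around each whole number using the breaks argument, with barplot(table(...)) as an alternative for count data.

```r
# Made-up child counts for illustration; not taken from acs.
children <- c(0, 0, 1, 1, 1, 2, 2, 3, 0, 1, 2, 4)

# One bin per whole number: edges at -0.5, 0.5, 1.5, ...
breaks <- seq(min(children) - 0.5, max(children) + 0.5, by = 1)

res <- hist(children, breaks = breaks,
            main = "Number of children", xlab = "Children")
res$counts   # how many observations fell in each bin

# Alternative for whole-number data: count each value, then draw bars
barplot(table(children), xlab = "Children", ylab = "Frequency")
```

With the edges placed halfway between whole numbers, every bar lines up with one value and there are no in-between bins left to show up as gaps.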

 

Edit: It looks like the underscore in the URL doesn’t display properly. I’ll try to look into it later because I’m sure it’ll be a recurring issue. Also, I think I just got vectors working in my head. I had my acs table sorted by decreasing household values.