2 Loading data
Let’s begin by reading in the dataset on State Patrol traffic stops released by the Stanford Open Policing Project. The data can be downloaded from this website, but has also been made available to you as part of the workshop materials as a CSV file.
Once the traffic patrol data has been downloaded into your working directory, pass the name of the file to the read_csv() function and assign the result to an object. Here, we’ll assign the traffic patrol data to an object named co_traffic_stops. Note that the name of an object is arbitrary, but ideally it should meaningfully describe the data assigned to it.
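Before calling read_csv(), it can help to confirm that the file really is in the working directory. The lines below are a minimal sketch using base R only; data_file is a hypothetical variable name introduced for illustration, and the file name is the one used in this section.

```r
# Show the current working directory, where R looks for relative file names
getwd()

# File name used in this section; adjust it if your copy is named differently
data_file <- "co_statewide_2020_04_01.csv"

# TRUE if the file is in the working directory, FALSE otherwise
file.exists(data_file)

# If the result is FALSE, list the CSV files R can see in the working directory
list.files(pattern = "\\.csv$")
```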
# Read in Stanford police data for Colorado and assign to object named
# "co_traffic_stops"
co_traffic_stops <- read_csv("co_statewide_2020_04_01.csv")
── Column specification ────────────────────────────────────────────────────────────────────────────────────
cols(
  .default = col_character(),
  date = col_date(format = ""),
  time = col_logical(),
  subject_age = col_double(),
  arrest_made = col_logical(),
  citation_issued = col_logical(),
  warning_issued = col_logical(),
  contraband_found = col_logical(),
  search_conducted = col_logical()
)
ℹ Use `spec()` for the full column specifications.
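The specification printed above is readr's guess at the type of each column. Note that time was read as logical, which is the type readr falls back to when every value it inspects in a column is missing. If a guess is not what you want, column types can be declared up front with the col_types argument. The following is a minimal sketch on a tiny made-up file; the file contents and the names demo_file and stops_demo are hypothetical and are not taken from the real dataset.

```r
library(readr)

# Write a tiny, made-up CSV that mimics four columns of the real file
demo_file <- tempfile(fileext = ".csv")
writeLines(
  c(
    "date,time,subject_age,arrest_made",
    "2013-06-19,,26,FALSE",
    "2012-08-24,14:05:00,NA,TRUE"
  ),
  demo_file
)

# Declare the type of every column instead of letting readr guess
stops_demo <- read_csv(
  demo_file,
  col_types = cols(
    date = col_date(),
    time = col_time(),
    subject_age = col_double(),
    arrest_made = col_logical()
  )
)

# Inspect the specification that was actually used
spec(stops_demo)
```

The same col_types argument could be supplied when reading the Colorado file, should you want a column parsed differently from the guess shown above.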
Once the traffic patrol data has been read into R and assigned to an object, we can print the contents of the dataset by typing the name of that object into the RStudio console (note that only the first few records will be printed):
# Print the contents of "co_traffic_stops" (i.e. the CO traffic patrol data)
# to the console; the first few records of the dataset will print
co_traffic_stops
# A tibble: 3,112,853 × 20
raw_row_number date time location county_name subject_age subject_race subject_sex officer_id_hash
<chr> <date> <lgl> <chr> <chr> <dbl> <chr> <chr> <chr>
1 1947986|19479… 2013-06-19 NA 19, I70… Mesa County 26 hispanic male b942632983
2 1537576 2012-08-24 NA 254, H2… Jefferson … NA <NA> <NA> f3d4f46927
3 1581594 2012-09-23 NA 115, I7… Logan Coun… 52 white male 6e49e2fbc8
4 1009205 2011-08-25 NA 197, H8… Douglas Co… 32 white female eaea851669
5 1932619 2013-06-08 NA 107, H2… Kiowa Coun… 33 hispanic male d18e34d749
6 1179436 2011-12-23 NA 48, 384… Boulder Co… NA <NA> <NA> b84c696aed
7 1326795 2012-04-07 NA 0, R250… Boulder Co… 39 white male 4c0279748e
8 1786795 2013-03-03 NA 19, E47… Arapahoe C… 44 white female e6b5b9bb98
9 1552164 2012-09-02 NA 224, H2… Park County NA <NA> <NA> 43f1f150d3
10 1004281|10042… 2011-08-21 NA R2000, … Adams Coun… 32 hispanic male dd2f10b6f8
# … with 3,112,843 more rows, and 11 more variables: officer_sex <chr>, type <chr>, violation <chr>,
# arrest_made <lgl>, citation_issued <lgl>, warning_issued <lgl>, outcome <chr>, contraband_found <lgl>,
# search_conducted <lgl>, search_basis <chr>, raw_Ethnicity <chr>
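Printing is not the only quick way to look at a tibble. As a sketch on a small made-up tibble (stops_demo and its values are hypothetical), the lines below show a few common alternatives; each can be applied to co_traffic_stops in the same way.

```r
library(tibble)

# A small, made-up tibble standing in for the real dataset
stops_demo <- tibble(
  county_name = c("Mesa County", "Logan County", "Park County"),
  subject_age = c(26, 52, NA)
)

# Number of rows and number of columns
dim(stops_demo)

# Print a chosen number of rows
print(stops_demo, n = 2)

# One line per column, showing its type and first few values
glimpse(stops_demo)
```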
We can also view the co_traffic_stops object (or, for that matter, any dataset in RStudio) within the RStudio data viewer by passing the name of the relevant object to the View() function:
# Inspect co_traffic_stops in the RStudio data viewer
View(co_traffic_stops)
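With more than three million rows, it may be more convenient to open only part of the dataset in the viewer. The following is a minimal sketch on a small stand-in data frame; stops_demo and preview are hypothetical names, and View() is only called when R is running interactively.

```r
# A small, made-up data frame standing in for the real dataset
stops_demo <- data.frame(
  county_name = c("Mesa County", "Logan County", "Park County"),
  subject_age = c(26, 52, NA)
)

# Keep only the first two rows; with the real data a larger n would be typical
preview <- head(stops_demo, n = 2)

# Open the preview in the data viewer, but only in an interactive session
if (interactive()) {
  View(preview)
}
```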