2 Loading data

Let’s begin by reading in the dataset on State Patrol traffic stops released by the Stanford Open Policing Project. The data can be downloaded from the project's data page, and has also been made available to you as part of the workshop materials. The data is stored as a CSV file.

Once the traffic patrol data has been downloaded into your working directory, pass the name of the file (including its extension) as a quoted string to the read_csv() function from the readr package, and assign the result to an object. Here, we’ll assign the traffic patrol data to an object named co_traffic_stops. The name of an object is arbitrary, but ideally it should meaningfully describe the data assigned to it.
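Before reading the file, it can help to confirm that R is looking in the directory where the file was saved. The following is a minimal sketch using base R only; the object name data_file is a hypothetical helper introduced here for illustration, and the sketch assumes the file name used in this workshop.

```r
# Hypothetical helper object holding the file name used in this workshop
data_file <- "co_statewide_2020_04_01.csv"

# Show the current working directory
getwd()

# Check whether the file is present in the working directory
if (file.exists(data_file)) {
  message("Found ", data_file)
} else {
  message("Could not find ", data_file, " in ", getwd())
}

# List any CSV files in the working directory
list.files(pattern = "\\.csv$")
```

If the file is not found, either move it into the directory reported by getwd() or change the working directory with setwd().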

# Read in Stanford police data for Colorado and assign to object named
# "co_traffic_stops"
co_traffic_stops <- read_csv("co_statewide_2020_04_01.csv")

── Column specification ────────────────────────────────────────────────────────────────────────────────────
cols(
  .default = col_character(),
  date = col_date(format = ""),
  time = col_logical(),
  subject_age = col_double(),
  arrest_made = col_logical(),
  citation_issued = col_logical(),
  warning_issued = col_logical(),
  contraband_found = col_logical(),
  search_conducted = col_logical()
)
ℹ Use `spec()` for the full column specifications.
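The column specification above lists the type that readr guessed for each column from the file contents; note, for instance, that time was read as col_logical(). If a guess is not the one you want, the col_types argument of read_csv() lets you declare column types yourself. The sketch below illustrates this on a small made-up file written to a temporary location (the file contents and the object name demo_stops are assumptions for illustration, not part of the workshop data).

```r
library(readr)

# Write a tiny, made-up CSV to a temporary file (not the real dataset)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "date,time,subject_age,arrest_made",
  "2013-06-19,,26,FALSE",
  "2012-08-24,14:05:00,NA,TRUE"
), tmp)

# Declare the type of each column explicitly instead of relying on guessing
demo_stops <- read_csv(
  tmp,
  col_types = cols(
    date = col_date(),
    time = col_time(),
    subject_age = col_double(),
    arrest_made = col_logical()
  )
)

# Inspect the column specification that was actually used
spec(demo_stops)
```

The same col_types pattern could be applied when reading the full file, listing only the columns whose guessed types you want to override and leaving the rest to a .default entry.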

Once the traffic patrol data has been read into RStudio and assigned to an object, we can print the contents of the dataset to the console by typing the name of that object into the RStudio console, or by typing the name of the object in a script and running that line. Note that only the first few records will be printed to the console:

# Print the contents of "co_traffic_stops" (i.e. the CO traffic patrol data) 
# to the console; the first few records of the dataset will print
co_traffic_stops

# A tibble: 3,112,853 × 20
   raw_row_number      date       time  location county_name subject_age subject_race subject_sex officer_id_hash
   <chr>               <date>     <lgl> <chr>    <chr>             <dbl> <chr>        <chr>       <chr>          
 1 1947986|1947987     2013-06-19 NA    19, I70… Mesa County          26 hispanic     male        b942632983     
 2 1537576             2012-08-24 NA    254, H2… Jefferson …          NA <NA>         <NA>        f3d4f46927     
 3 1581594             2012-09-23 NA    115, I7… Logan Coun…          52 white        male        6e49e2fbc8     
 4 1009205             2011-08-25 NA    197, H8… Douglas Co…          32 white        female      eaea851669     
 5 1932619             2013-06-08 NA    107, H2… Kiowa Coun…          33 hispanic     male        d18e34d749     
 6 1179436             2011-12-23 NA    48, 384… Boulder Co…          NA <NA>         <NA>        b84c696aed     
 7 1326795             2012-04-07 NA    0, R250… Boulder Co…          39 white        male        4c0279748e     
 8 1786795             2013-03-03 NA    19, E47… Arapahoe C…          44 white        female      e6b5b9bb98     
 9 1552164             2012-09-02 NA    224, H2… Park County          NA <NA>         <NA>        43f1f150d3     
10 1004281|1004282|10… 2011-08-21 NA    R2000, … Adams Coun…          32 hispanic     male        dd2f10b6f8     
# … with 3,112,843 more rows, and 11 more variables: officer_sex <chr>, type <chr>, violation <chr>,
#   arrest_made <lgl>, citation_issued <lgl>, warning_issued <lgl>, outcome <chr>, contraband_found <lgl>,
#   search_conducted <lgl>, search_basis <chr>, raw_Ethnicity <chr>
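Because printing shows only the first ten rows and as many columns as fit in the console, a few other functions are useful for getting an overview of a dataset. The sketch below uses a small made-up tibble (demo_stops is a hypothetical object, with values loosely modeled on the output above) so that it runs without the full file; the same calls can be applied to co_traffic_stops.

```r
library(tibble)

# Small made-up tibble standing in for the full dataset
demo_stops <- tibble(
  county_name = c("Mesa County", "Logan County", "Park County"),
  subject_age = c(26, 52, NA),
  subject_sex = c("male", "male", NA)
)

# Number of rows and columns
dim(demo_stops)

# Print a chosen number of rows rather than the default
print(demo_stops, n = 2)

# Show every column, one per line, with its type and first few values
glimpse(demo_stops)

# Return only the first rows as a new tibble
head(demo_stops, 2)
```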

We can also view the co_traffic_stops object (or, for that matter, any dataset in RStudio) in the RStudio data viewer by passing the name of the relevant object to the View() function:

# Inspect co_traffic_stops in the RStudio data viewer
View(co_traffic_stops)