NYC Open Data: Part I
A look at the Brooklyn Public Library’s open data catalog would give you … well, roughly zero idea of what’s on the shelf.
Let me clarify. And let me be clear. (Those are almost synonymous.) I love the public library. I’m there every few weeks, and I used to go practically every week. I taught myself the lion’s share of what I know about R in the commons room.
So I downloaded a .csv of the library’s catalog data set from the City of New York’s open data portal. (This one: https://opendata.cityofnewyork.us/)
It looks like this:
library(rgdal) # reads spatial files
## Loading required package: methods
## Loading required package: sp
## rgdal: version: 1.2-7, (SVN revision 660)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.1.3, released 2017/20/01
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/gdal
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/proj
## Linking to sp version: 1.2-4
library(ggplot2) # graphics
bpl = read.csv(url("https://data.cityofnewyork.us/api/views/ym2h-u9dt/rows.csv?accessType=DOWNLOAD"))
head(bpl, 5)
## CALL...BIBLIO. AUTHOR
## 1 974.71 N
## 2 RUS q947.072 E
## 3 RUS qB PUSHKIN T
## 4 NEIG 0149";"NEIG 2713
## 5 NEIG 0150
## TITLE
## 1 09/11, 8:48 AM : documenting America's greatest tragedy / edited by BlueEar.com: Global Writing Worth Reading with the faculty and students of the New York University Department of Journalism foreward by Jay Rosen introduction by Ethan Casey.
## 2 1812-1814 : sekretna︠i︡a perepiska generala P.I. Bagrationa, lichnye pisʹma generala N.N. Raevskogo, zapiski generala M.S. Voron︠t︡sova, dnevniki ofi︠t︡serov russkǒi armii : iz sobrani︠i︡a Gosudarstvennogo Istoricheskogo muze︠i︡a / [sostaviteli, avtory vstupitelʹnykh statěi i kommentariev A.K. Afanasʹev ... et al.].
## 3 1799-1837 : Pushkin i ego vrem︠i︡a / [redaktory S.G. Blinov, M.D. Filin].
## 4 177-179 Columbia Heights [picture]
## 5 132-134 Columbia Heights [picture]
## EDITION PUB.INFO
## 1 [Charleston, SC] : BookSurge.com, 2001.
## 2 Moskva : Terra , c1992.
## 3 Moskva : TERRA , 1997.
## 4 1920.
## 5 1921.
## STANDARD..
## 1 1591090113 (pbk.) : $22.00
## 2 5852551538
## 3 5300002216
## 4
## 5
Well then. Quite a mess.
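Two of the problems visible in that printout are the mangled column names (CALL...BIBLIO., STANDARD..) and the call numbers that arrive glued together with ";". Here is a minimal sketch of one way to tidy both, using a two-row toy data frame typed in to mimic the printout rather than the real catalog. The new column name call_biblio and the assumption that ";" (with the quotation marks) is always the separator are mine, not the library's.

```r
# Toy rows that mimic the printout above; this is not the real catalog.
toy = data.frame(
  call = c("974.71 N", "NEIG 0149\";\"NEIG 2713"),
  title = c("09/11, 8:48 AM", "177-179 Columbia Heights [picture]"),
  stringsAsFactors = FALSE
)
names(toy) = c("CALL...BIBLIO.", "TITLE") # the names read.csv produced

# Lower-case the names, collapse runs of dots, drop a trailing underscore.
clean = tolower(names(toy))
clean = gsub("\\.+", "_", clean)
clean = sub("_$", "", clean)
names(toy) = clean

# Split glued call numbers on the literal separator ";" (quotes included).
# Assumption: that three-character string never appears inside a call number.
calls = strsplit(toy$call_biblio, "\";\"", fixed = TRUE)
lengths(calls) # how many call numbers each record carries
```

The same three renaming lines could be applied to names(bpl) directly. Whether the split is safe on the full catalog is something I would check before trusting it.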
So here’s another public data set. It’s a tally of the City’s universal pre-kindergarten program, which is now three years old. The mayor’s 2013 campaign included a few major pledges, and the one he arguably delivered most effectively was free, full-day pre-kindergarten for any 4-year-old. His staff invested considerable resources in finding space for locations. Enrollment started at 19,000 children in the first year and grew to 68,000 a couple of years later.
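# read the pre-k site list straight from the open data portal
upk = read.csv(url("https://data.cityofnewyork.us/api/views/kiyv-ks3f/rows.csv?accessType=DOWNLOAD"))
names(upk) = tolower(names(upk)) # lower-case column names are easier to type
boxplot(upk$seats) # distribution of seats per site
table(upk$seats, upk$borough) # count of sites at each seat total, by borough
aggregate(upk$seats, by = list(Category = upk$borough), FUN = sum) # total seats per borough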
I can use ggplot2 to plot the new pre-k locations. It takes downloading and unpacking some spatial files first.
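The download-and-unpack step can be sketched roughly as follows, sticking with the rgdal and sp stack loaded above. Everything named here is a placeholder of mine: read_outline, plot_sites, the zip_url and layer arguments, and the longitude and latitude column names. The sketch also assumes the shapefile sits at the top level of the zip and is already in longitude/latitude; if it is in a projected system, it would need sp::spTransform first.

```r
library(ggplot2)

# Hypothetical helper: download a zipped shapefile, unpack it, and return
# a plain data frame of polygon outlines that ggplot2 can draw.
# zip_url and layer are placeholders for whichever boundary file is used.
read_outline = function(zip_url, layer) {
  zip_path = tempfile(fileext = ".zip")
  out_dir = tempfile()
  download.file(zip_url, zip_path, mode = "wb") # "wb" keeps the binary intact
  unzip(zip_path, exdir = out_dir)
  shp = rgdal::readOGR(dsn = out_dir, layer = layer)
  fortify(shp) # data frame with long, lat and group columns
}

# Draw the outlines, then overlay one point per pre-k site.
# Assumes sites has longitude and latitude columns, which is a guess.
plot_sites = function(outline, sites) {
  ggplot() +
    geom_polygon(data = outline, aes(x = long, y = lat, group = group),
                 fill = "grey90", colour = "grey50") +
    geom_point(data = sites, aes(x = longitude, y = latitude)) +
    coord_quickmap()
}
```

With real inputs the call would look something like plot_sites(read_outline(zip_url, layer), upk), once the URL, the layer name, and the coordinate columns are filled in.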
I can also plot the same locations scaled by the number of spots (“seats”).
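For the scaled version, a minimal sketch is to map seats to point size. The data frame below is a made-up stand-in for upk, and its longitude and latitude column names are assumptions; only seats comes from the data set as used above.

```r
library(ggplot2)

# Toy stand-in for upk; the coordinates and seat counts are made up.
sites = data.frame(
  longitude = c(-73.99, -73.95, -73.87),
  latitude = c(40.69, 40.65, 40.74),
  seats = c(18, 54, 108)
)

p = ggplot(sites, aes(x = longitude, y = latitude, size = seats)) +
  geom_point(alpha = 0.5) +        # some transparency helps where sites overlap
  scale_size_area(max_size = 8) +  # point area, not radius, tracks seats
  coord_quickmap()                 # keeps the map from looking stretched
print(p)
```

I chose scale_size_area over the default size scale so that a site with twice the seats gets a point with twice the area, and a site with zero seats would shrink to nothing.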