Rise and Fall of the NYC Subway System

As New York City mass transit costs increase for the third time in six years, it's interesting to note that MTA ridership is still a shadow of what it once was.  I'm a proud passenger of NYC public transit, but it's hard to imagine that annual ridership on the NYC subway system today is well below the passenger totals of the 1930s and '40s, even though New York had a million or two fewer people back then.

     -via DadaVis

City vs. Suburban Walking

Since I moved to New York City about six years ago, I've always felt that I walk more than I did at a small-town Indiana college.  Even growing up in rural Pennsylvania, where I trod a half mile to and from the school bus each day, doesn't compare with my daily perambulation in the city.

And here is why: the sheer volume of businesses and services within walking distance in a city dwarfs what is accessible in the suburbs. The urban consumer is much more likely to walk to work or the grocery store, because they're so much closer to home.  Suburban commuters are inclined to drive, because they need a vehicle to get most places they want to go.

I think anyone in an urban area understands this implicitly, but the team at the Sightline Institute has illustrated it by actually graphing the area a commuter can cover in a mile's walk. A mile simply takes a person farther in the city, based on the geometry of urban planning, whereas a suburban stroll leaves you navigating neighborhood cul-de-sacs and side roads.
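To make the geometry concrete, here is a minimal sketch that counts how many intersections a walker can reach within a fixed number of blocks on two toy street layouts. It is not the Sightline Institute's method. The layouts, the function names (`grid_edges`, `comb_edges`, `reachable`) and the figure of 20 blocks to a mile are all assumptions made for illustration.

```python
from collections import deque


def grid_edges(size):
    """Urban-style grid: every intersection links to its neighbours in all four directions."""
    edges = set()
    for x in range(size):
        for y in range(size):
            if x + 1 < size:
                edges.add(((x, y), (x + 1, y)))
            if y + 1 < size:
                edges.add(((x, y), (x, y + 1)))
    return edges


def comb_edges(size, spine_x):
    """Cul-de-sac style layout (hypothetical): east-west streets that
    connect to one another only through a single north-south collector road."""
    edges = set()
    for y in range(size):
        for x in range(size - 1):
            edges.add(((x, y), (x + 1, y)))
        if y + 1 < size:
            edges.add(((spine_x, y), (spine_x, y + 1)))
    return edges


def reachable(edges, start, max_blocks):
    """Breadth-first search: count intersections within max_blocks of start."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_blocks:
            continue  # walking budget used up on this branch
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return len(distance)


if __name__ == "__main__":
    home = (20, 20)
    blocks_per_mile = 20  # assumed figure, for illustration only
    print("grid:", reachable(grid_edges(41), home, blocks_per_mile))
    print("cul-de-sac:", reachable(comb_edges(41, 10), home, blocks_per_mile))
```

In this toy setup the same walking budget reaches 841 intersections on the grid but only 241 on the cul-de-sac layout, because every trip to another street has to detour through the collector road.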

Both the density and proximity of desirable destinations make a city a more effective place to walk, and urban residents simply walk... more.

Radial Cartography of New York City

Here are several amazing infographics of NYC, reflecting building height in relation to land value and mapping Manhattan's development index according to price per square foot and zoning regulations in relation to property values.  These come from a fantastic data visualization blog with quite a few fascinating cartographic representations.

Two-dimensional representation of (three-dimensional) building heights.

Bill Rankin, 2006

Manhattan's land value represented by color density.  Similar to building height above, the highest valued land is clustered around midtown and NYC's Financial District / Battery Park City.

Bill Rankin, 2006

"Tax assessments are a tricky data source, since they do not measure market value — indeed, there are even tax-assessed "values" on public buildings and parks. (Here Central Park is "valued" at $1.9 billion.) But they do give a rough sense of relative values within the city: the pocket of wealth up near the cloisters, and the relative sparseness of the lower east side.

Note: even though this map shows building footprints, the land value shown for each building is per square foot of lot size."


a theoretical market 
Bill Rankin, 2006

In a hypothetical "perfect market," this map would be a consistent yellow-green. Every lot would be built up precisely in accord with its land value. Lots which were underdeveloped would be torn down and built up again with higher densities. Overdeveloped sites would lower the price of nearby lots, again establishing equilibrium. But government distorts the market (often in very good ways), by building low-density public buildings and monuments in central areas (e.g., the Public Library or Lincoln Center), or building public housing that the market would not otherwise provide (see the lower east side). Zoning also restricts development for the sake of light, air, and congestion.

Notice the difference between Midtown and Wall St.: Downtown looks very close to a "perfect market," while Midtown seems restrained by zoning. Notice also underdevelopment along Madison Ave. and Broadway, and the (thankfully) "irrational" presence of churches throughout the city.

Note: See also methodological notes about tax-assessment data on the Land Value map.

relation to zoning
Bill Rankin, 2006

In response to my original map of "underdevelopment" in Manhattan, urban planner Josh Jackson emailed me from New York with some ideas for a map which would show development in relation to FAR. This is my attempt at such a map. Thanks to Josh for helping me with these ideas; our email exchange is below.

New York is the Center of the World

From Radial Cartography

MTA Transit Calculator: Travel Time Heatmap for any Subway in NYC by WNYC

WNYC just created a fantastic transit calculator map for travel time from borough to neighborhood to zip code in New York City. Just type in a desired address, and you can see average travel times to anywhere in the five boroughs. This is my average travel time from East Broadway on the Lower East Side.  Red indicates a commute of less than ten minutes, ranging to purple for travel times of over an hour and a half.

via WNYC.org

This is a great tool if you're looking for a new apartment in a new neighborhood, but aren't sure about how long you'll spend in transit during your work commute.  

Annual Electric Usage By Block for New York City

The map represents the total annual building energy consumption at the block level (zoom levels 11-15) and at the taxlot level (zoom levels 16-18) for New York City, and is expressed in kilowatt hours (kWh) per square meter of land area. The data comes from a mathematical model based on statistics, not private information from utilities, to estimate the annual energy consumption values of buildings throughout the five boroughs. To see the breakdown of the type of energy being used, for which purpose and in what quantity, hover over or click on a block or taxlot.

 -via columbia.edu

All Of NYC's Affordable Housing through the Furman Center's Data Search Tool

Search all of New York City's affordable housing by name, owner, year built, location, financing or physical information (for example, by number of building violations in 2010).  Or you can research all sorts of demographic information, from crime to education to employment to health to housing to property tax to population, ethnic demographics and transportation.

Online Marketing Group Affordable Housing

The Furman Center for Real Estate and Urban Policy collects a broad array of data on demographics, neighborhood conditions, transportation, housing stock and other aspects of the New York City real estate market. We make our data directly available to the public through our new Data Search Tool, and publish comprehensive analyses of these data in our periodic reports.

The Data Search Tool is a new online application that provides direct access to New York City data collected by the Furman Center. Users can select from a range of variables to create customized maps, download tables, and track trends over time. Users are able to overlay never-before-available information on privately-owned, publicly-subsidized housing programs collected through the Furman Center’s Subsidized Housing Information Project (SHIP). Information about how to use the Data Search Tool is available in our online guide.

From the Furman Center

When the Visualization Eclipses The Data...

In his 2003 novel Pattern Recognition, William Gibson created a character named Cayce Pollard with an unusual psychosomatic affliction: She was allergic to brands. Even the logos on clothing were enough to make her skin crawl, but her worst reactions were triggered by the Michelin Tire mascot, Bibendum.

Although it’s mildly satirical, I can relate to this condition, since I have a similar visceral reaction to word clouds, especially those produced as data visualization for stories.

If you are fortunate enough to have no idea what a word cloud is, here is some background. A word cloud represents word usage in a document by resizing individual words in said document proportionally to how frequently they are used, and then jumbling them into some vaguely artistic arrangement. This technique originated online in the 1990s as tag clouds (famously described as “the mullets of the Internet”), which were used to display the popularity of keywords in bookmarks.
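The sizing rule described above fits in a few lines. The following is a minimal sketch, not how Wordle or any particular tool works: the function name `word_sizes`, the point-size range and the simple linear scaling are all assumptions.

```python
import re
from collections import Counter


def word_sizes(text, min_pt=10, max_pt=72):
    """Map each word to a font size proportional to how often it appears.

    The most frequent word gets max_pt; rarer words scale down linearly,
    but never below min_pt. Both limits are illustrative defaults.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {
        word: max(min_pt, round(max_pt * n / top))
        for word, n in counts.items()
    }


if __name__ == "__main__":
    # Made-up input, purely to show the scaling.
    print(word_sizes("bomb car bomb blast bomb car"))
```

Everything a word cloud knows about the text is in that dictionary of sizes; the layout step that follows only decides where to scatter the words.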

More recently, a site named Wordle has made it radically simpler to generate such word clouds, ensuring their accelerated use as filler visualization, much to my personal pain.

So what’s so wrong with word clouds, anyway? To understand that, it helps to understand the principles we strive for in data journalism. At The New York Times, we strongly believe that visualization is reporting, with many of the same elements that would make a traditional story effective: a narrative that pares away extraneous information to find a story in the data; context to help the reader understand the basics of the subject; interviewing the data to find its flaws and be sure of our conclusions. Prettiness is a bonus; if it obliterates the ability to read the story of the visualization, it’s not worth adding some wild new visualization style or strange interface.

Of course, word clouds throw all these principles out the window. Here’s an example to illustrate. About six months ago, I had the privilege of giving a talk about how we visualized civilian deaths in the WikiLeaks War Logs at a meeting of the New York City Hacks/Hackers. I wanted my talk to be more than “look what I did!” but also to touch on some key principles of good data journalism. What better way to illustrate these principles than with a foil, a Goofus to my Gallant?

And I found one: the word cloud. Please compare these two visualizations — derived from the same data set — and the differences should be apparent:

I’m sorry to harp on Fast Company in particular here, since I’ve seen this pattern across many news organizations: reporters sidestepping their limited knowledge of the subject material by peering for patterns in a word cloud — like reading tea leaves at the bottom of a cup. What you’re left with is a shoddy visualization that fails all the principles I hold dear.

For starters, word clouds support only the crudest sorts of textual analysis, much like figuring out a protein by getting a count only of its amino acids. This can be wildly misleading; I created a word cloud of Tea Party feelings about Obama, and the two largest words were implausibly “like” and “policy,” mainly because the important word “don’t” was automatically excluded. (Fair enough: Such stopwords would otherwise dominate the word clouds.) A phrase or thematic analysis would reach more accurate conclusions. When looking at the word cloud of the War Logs, does the equal sizing of the words “car” and “blast” indicate a large number of reports about car bombs or just many reports about cars or explosions? How do I compare the relative frequency of lesser-used words? Also, doesn’t focusing on the occurrence of specific words instead of concepts or themes miss the fact that different reports about truck bombs might use the words “truck,” “vehicle,” or even “bongo” (since the Kia Bongo is very popular in Iraq)?
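The stopword problem is easy to reproduce. The sketch below uses two invented sentences and a hypothetical stopword list (neither comes from the data discussed here) to compare single-word counts after stopword removal with counts of adjacent word pairs, which is one simple form of phrase analysis.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"i", "we", "his", "the", "a", "don't"}


def tokens(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def unigram_counts(text, stopwords=STOPWORDS):
    """What a word cloud sees: single words with the stopwords dropped."""
    return Counter(t for t in tokens(text) if t not in stopwords)


def bigram_counts(text):
    """Counts of adjacent word pairs, with no words dropped.

    Pairs are formed across sentence boundaries too, which a more
    careful analysis would avoid.
    """
    toks = tokens(text)
    return Counter(zip(toks, toks[1:]))


if __name__ == "__main__":
    sample = "I don't like his policy. We don't like the policy."  # invented text
    print(unigram_counts(sample).most_common(2))
    print(bigram_counts(sample).most_common(1))
```

On the invented sample, the single-word view is left with only "like" and "policy," while the most common pair is "don't like," which says the opposite.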

Of course, the biggest problem with word clouds is that they are often applied to situations where textual analysis is not appropriate. One could argue that word clouds make sense when the point is to specifically analyze word usage (though I’d still suggest alternatives), but it’s ludicrous to make sense of a complex topic like the Iraq War by looking only at the words used to describe the events. Don’t confuse signifiers with what they signify.

And what about the readers? Word clouds leave them to figure out the context of the data by themselves. How is the reader to know from this word cloud that LN is a “Local National” or COP is “Combat Outpost” (and not a police officer)? Most interesting data requires some form of translation or explanation to bring the reader quickly up to speed; word clouds provide nothing in that regard.

Furthermore, where is the narrative? For our visualization, we chose to focus on one narrative out of the many within the Iraq War Logs, and we displayed the data to make that clear. Word clouds, on the other hand, require the reader to squint at them like stereograms until a narrative pops into place. In this case, you can figure out that the Iraq occupation involved a lot of IEDs and explosions. Which is likely news to nobody.

As an example of how this might lead the reader astray, we initially thought we saw a surprising and dramatic rise in sectarian violence after the Surge, because the word “sect” was appearing in many more reports. We soon figured out that what we were seeing had less to do with violence levels and more to do with bureaucracy: the adoption of new Army requirements requiring the reporting of the sect of detainees. Of course, the horrific violence we visualized in Baghdad was sectarian, but this was not something indicated in the text of the reports at the time. If we had visualized the violence in Baghdad as a series of word clouds for each year, we might have thought that the violence was not sectarian at all.

In conclusion: Every time I see a word cloud presented as insight, I die a little inside. Hopefully, by now, you can understand why. But if you are still sadistically inclined enough to make a word cloud of this piece, don’t worry. I’ve got you covered.

This is an insightful and rather shrewd criticism of word clouds, and I think it applies to much of the infographic, data-visualization-obsessed tech culture we live in.

I find myself fascinated by many of the new and innovative ways to graphically represent data. Yet, as Jacob Harris points out, many of these sleek new techniques (if they don't miss the point entirely) strip supposedly core ideas from the very context that lends them meaning... and we are left with an aesthetically pleasing series of pretty graphs and pie charts that convey very little actual information (see my post on the Infographic Idiom).

And even though CNN, Fox and other news networks are now embracing new visualization tools, tag clouds are ultimately useless measures of political sentiment, because concepts themselves really cannot be reduced to their most elemental articulation: a single word.