New York City – Vacant Lots Suitable for Residential Development

GIS Analysis and Ranking Using a Multi-Criteria Approach

Based on reports published by the Department of City Planning, New York City’s population is projected to grow from 8.2 million in 2010 to 9 million in 2040.

For the same period, the number of housing units is projected to grow from 3,375,002 to 3,696,359.
New housing can be achieved by redevelopment of existing buildings or new development on vacant spaces and brownfields.


The focus of this study was to identify current vacant lots ("lots") available for residential development, calculate composite indexes based on several factors and rank them in order of suitability.

Housing distribution in units per lot. (Spring 2019)


My study proposes a framework that combines different quantifiable indicators into a composite index. This raises two questions: Which metrics are most relevant to an index that assesses urban density and residential development suitability? And what counts as good versus bad performance?

There is a solid body of literature on both Multi-Criteria Decision Making (MCDM) and Urban Residential Densification. Spatial multi-criteria decision analysis can be thought of as a process that combines and transforms geographical data (input) into a resultant decision (output) (Malczewski, 1999). The number of factors, and the relationships among them, make decision making difficult. GIS and MCDM techniques support decision makers in solving spatial decision problems.

Over the past twenty years a number of studies on urban development and densification have been carried out in different countries, one of the pioneers being Urban Governance, Social Inclusion and Sustainability (Europe, 200-2003). Reviewing the literature raised another question: What targets must a densification strategy pursue to be successful? Given New York City’s geography, densification has to bring more people to live in a given area. Successful densification also shortens residents’ trips to work, shopping and leisure, which reduces car use, commuting time and cost, and congestion. Densification therefore has to create mixed-use environments. Other targets are accessibility to public transportation, parks and open spaces, and schools. Ultimately, suitable densification results in a high-quality urban environment that residents do not perceive as overcrowded.

Keeping these targets in mind, I selected eight factors to compute relevant metrics. These are the key performance indicators that make up the composite assessment index:
1. Residential density
2. Housing density
3. Land use diversity
4. Proximity to subway
5. Proximity to schools
6. Proximity to parks
7. Real estate market
8. Proximity to buses

Current State of Vacant Lots in NYC

According to 2018 data from the New York City Department of City Planning there are currently 12,025 vacant lots available for residential development. For this study only the lots outside the flood areas were considered, totaling 7,936, with the following distribution among the five boroughs:

Staten Island – 2,686
Queens – 1,797
Bronx – 1,679
Brooklyn – 1,547
Manhattan – 227

We can immediately see that the number of lots in Staten Island is significantly higher than those in the other boroughs. In Manhattan the number is much smaller.

Distribution of vacant lots. (Fall 2018)

Study Areas

Buffers of 1,600 feet were derived around the lot centroids and clipped to the shoreline so that the density calculations are consistent. This provides a study area of approximately 0.29 square miles around each lot, which is appropriate for computing densities and distances. Very few of the 7,936 lots are larger than their centroid buffer, and I judged this not to affect the study. Depending on location, the number of schools, transit stops, parks and other features within each buffer varies.
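
The 0.29 square mile figure follows from the area of a circle with a 1,600-foot radius. The short sketch below reproduces that arithmetic; the function name is only illustrative, and the result is the area before any clipping to the shoreline, which would make coastal study areas smaller.

```python
import math

FEET_PER_MILE = 5280


def buffer_area_sq_miles(radius_ft: float) -> float:
    """Area of an unclipped circular buffer, converted to square miles."""
    area_sq_ft = math.pi * radius_ft ** 2
    return area_sq_ft / FEET_PER_MILE ** 2


area = buffer_area_sq_miles(1600)
print(round(area, 2))  # 0.29
```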

Study areas around vacant lots.

NYC Street & Transit Network

A transit network was created from GTFS data for all proximity and travel time calculations. It includes the street, subway, bus, express bus and ferry systems. Details regarding the network creation and analysis can be seen here.

Vacant lots location relative to the subway network.

Composite Index

After identifying the pertinent lots, deriving study areas around each and setting up the transportation network, the next step was to compute the performance indicators. Each lot was assigned a score which later will be used to calculate the composite index.

1. Residential density

It is measured in number of residents per square mile. The calculations include all the land, not just land occupied by residential uses. To have a reference point, the average densities (using 2017 estimates) by boroughs are:
Staten Island – 8,112
Queens – 21,460
Bronx – 34,653
Manhattan – 72,033
Brooklyn – 37,137

I used population data from the American Community Survey, 5-Year Data, 2016. Since the goal is to increase density, less dense areas score higher in our study.
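
As a rough illustration of "less dense scores higher", the sketch below computes a density per study area and turns it into an inverted score. The inversion formula (1 minus the density divided by the highest density) is an assumption made for this example, not necessarily the scoring used in the study, and the population counts are made up. The same pattern would apply to housing units.

```python
def density_per_sq_mile(count: float, area_sq_miles: float) -> float:
    """Residents (or housing units) per square mile of study area."""
    return count / area_sq_miles


def inverted_scores(densities: list[float]) -> list[float]:
    """Assumed scoring: 0 for the densest study area, approaching 1 for the least dense."""
    highest = max(densities)
    return [1 - d / highest for d in densities]


# Hypothetical populations of three study areas of 0.29 square miles each.
populations = [2900, 11600, 5800]
densities = [density_per_sq_mile(p, 0.29) for p in populations]
scores = inverted_scores(densities)

for density, score in zip(densities, scores):
    print(f"{density:,.0f} residents per sq mi -> score {score:.2f}")
```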

Population density by study areas. Darker scores higher.

2. Housing density

It is measured in number of housing units per square mile. It is no surprise that the housing density is very similar to the population density. For the scope of the present study I did not include the possibility of a future change in the FAR designation.

I used housing data from the American Community Survey, 5-Year Data, 2016. Since we want to increase density, for our study less dense areas are better.

Housing density by study areas. Darker scores higher.

3. Land use diversity

This factor was calculated by dividing the area used for office, commercial and industrial (non-residential) purposes by the residential area. In the case of mixed use, the area was included in both categories.

PLUTO data was used to calculate this indicator. Since we are adding residential use, for our study more diverse areas are better.
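
The ratio described above can be sketched as follows. The parcel records, the three land use labels and the function name are hypothetical simplifications for this example and do not reflect the actual PLUTO field names; the only rule taken from the text is that mixed-use area counts toward both sides of the ratio.

```python
def land_use_diversity(parcels: list[tuple[str, float]]):
    """Ratio of non-residential to residential area within one study area.

    Each parcel is (land_use, area), where land_use is 'residential',
    'non_residential' or 'mixed'. Mixed-use area is counted in both totals.
    Returns None when the study area has no residential area at all.
    """
    residential = sum(a for use, a in parcels if use in ("residential", "mixed"))
    non_residential = sum(a for use, a in parcels if use in ("non_residential", "mixed"))
    if residential == 0:
        return None
    return non_residential / residential


# Hypothetical study area: areas in square feet.
parcels = [
    ("residential", 6000.0),
    ("non_residential", 2000.0),
    ("mixed", 2000.0),
]
print(land_use_diversity(parcels))  # 0.5
```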

Land use diversity by study areas. Darker scores higher.

4. Proximity to subway

The fourth indicator included in the composite index accounts for the proximity to subway stations. Most of Staten Island, large areas of Queens and Brooklyn, and some areas of the Bronx are not within walking distance of a subway station. Many residents in Queens and Brooklyn use the LIRR service, but I included only the subway in this indicator and added proximity to buses as a separate indicator.

Proximity to subway. Darker scores higher.

5. Proximity to schools

To calculate this indicator I used data from the NYC Open Data Portal. The score for each study area was derived from the number of schools within a 10-minute walking distance “service area” and the walking distance to the nearest school. For our purpose, the higher the score, the better.

Proximity to schools. Darker scores higher.

6. Proximity to parks

New York City has a good number of parks, which I separated into three categories: Flagship Parks (Central Park, Prospect Park), playgrounds and everything in between. The first step was to create “service areas” around each park, using the street network, then aggregate the results. For the ranking process the higher the score, the better.

Proximity to parks. Darker scores higher.

7. Real estate market

Given the large number of transactions recorded for 2018 and their relatively uniform distribution throughout the city, I used the average transaction price per study area. This gave an acceptable basis for comparing and ranking the lots. The distribution is severely left skewed. Data is from the Newman Library, Baruch, CUNY. For the purpose of the study, more expensive areas are better.

Real estate market by study areas. Darker scores higher.

8. Proximity to buses

The last indicator included in the composite index accounts for the proximity to bus stops. We can see on the map that areas without subway routes are well served by buses. The score was calculated by combining the location within a bus “service area” and the walking distance to the nearest bus stop.

Proximity to buses. Darker scores higher.

Compilation of Composite Index

The indicators were defined and calculated, then weighted and aggregated into the composite index. To do so, the results were first normalized, then each indicator was assigned a weight. I obtained the normalized values by dividing each study area’s value by the highest value in the category, resulting in a 0 to 1 range. There are various ways to calculate weights. I originally adjusted the neutral (equal) weights to “add or reduce weight”. Subsequently I used rating and pairwise comparison methods to compile alternative indexes. The idea is that users and decision makers can decide on the importance of each factor for their own purpose. The results presented here show just one of many possible outcomes: the map was produced using the adjusted weights (the “Adjusted weights” column in the table below).

Factor                 Equal     Adjusted   Weights using rating          Weights using pairwise comparison
                       weights   weights    Classification   Normalized   Priority   Ranking
Residential density    0.125     0.200      80               0.15         0.03       8
Housing density        0.125     0.150      70               0.13         0.11       4
Land use diversity     0.125     0.150      40               0.07         0.14       2
Proximity to subway    0.125     0.150      100              0.19         0.46       1
Proximity to schools   0.125     0.075      60               0.17         0.12       3
Proximity to parks     0.125     0.050      50               0.09         0.05       6
Real estate market     0.125     0.100      70               0.13         0.05       5
Proximity to buses     0.125     0.100      40               0.07         0.03       7
Total                  1         1          510              1            1
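
The normalize, weight and aggregate steps can be sketched as a weighted sum. This is a minimal illustration rather than the workflow actually used: the weights are the adjusted weights from the table, the dictionary keys and lot values are hypothetical, and every indicator is assumed to be already oriented so that a higher value is better. The sketch divides by the sum of the weights so that it behaves the same whether or not the supplied weights add up to exactly 1.

```python
ADJUSTED_WEIGHTS = {
    "residential_density": 0.200,
    "housing_density": 0.150,
    "land_use_diversity": 0.150,
    "proximity_to_subway": 0.150,
    "proximity_to_schools": 0.075,
    "proximity_to_parks": 0.050,
    "real_estate_market": 0.100,
    "proximity_to_buses": 0.100,
}


def normalize_by_max(values):
    """Divide each value by the highest value in the category (0 to 1 range)."""
    highest = max(values)
    if highest == 0:
        return [0.0 for _ in values]
    return [v / highest for v in values]


def composite_index(raw_scores, weights):
    """Weighted sum of normalized indicator scores, one result per lot.

    raw_scores maps each factor name to a list with one value per lot.
    """
    total_weight = sum(weights.values())
    n_lots = len(next(iter(raw_scores.values())))
    index = [0.0] * n_lots
    for factor, weight in weights.items():
        normalized = normalize_by_max(raw_scores[factor])
        for i, value in enumerate(normalized):
            index[i] += weight * value / total_weight
    return index


def rank_lots(index):
    """Lot positions ordered from most to least suitable."""
    return sorted(range(len(index)), key=lambda i: index[i], reverse=True)


# Hypothetical raw scores for three lots; the values are made up.
raw_scores = {
    "residential_density": [0.9, 0.4, 0.6],
    "housing_density": [0.8, 0.5, 0.6],
    "land_use_diversity": [0.3, 0.9, 0.5],
    "proximity_to_subway": [0.2, 1.0, 0.7],
    "proximity_to_schools": [4, 6, 2],
    "proximity_to_parks": [3, 1, 2],
    "real_estate_market": [450, 900, 700],
    "proximity_to_buses": [0.9, 0.6, 0.8],
}

index = composite_index(raw_scores, ADJUSTED_WEIGHTS)
print(rank_lots(index))
```

Swapping in another column of the table, or a user-defined set of weights, only requires passing a different dictionary to composite_index.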

Results & Interpretation

The map shows the results of our analysis. The ranking is divided into seven classes, displayed here in quantile mode, from yellow to dark red. The histogram shows a right-skewed distribution.
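
Quantile classification places roughly the same number of lots in each class. The sketch below shows the idea with Python's standard library, using made-up index values; the break values produced by GIS software may differ slightly, since quantile interpolation methods vary.

```python
import bisect
import statistics


def quantile_classes(values, n_classes=7):
    """Assign each value a class from 1 (lowest) to n_classes (highest)."""
    # statistics.quantiles returns n_classes - 1 cut points.
    breaks = statistics.quantiles(values, n=n_classes)
    return [bisect.bisect_right(breaks, v) + 1 for v in values]


# Hypothetical composite index values for 70 lots.
index_values = [i / 100 for i in range(1, 71)]
classes = quantile_classes(index_values)

for c in range(1, 8):
    print(f"class {c}: {classes.count(c)} lots")
```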

Looking at the map, we see that all boroughs have a good mix of lots, with a higher concentration of the better-ranking ones in Brooklyn. In this particular case, proximity to the subway is clearly one of the factors that influences the results the most.

The composite index diagram shows the six top-ranked sites. All of them scored very low on the Real Estate factor, mixed to medium on Proximity to Schools and Parks, and very high on the others. It would be easy to implement other scenarios by changing the factors, their computation method or their weights.

This suitability model provides a flexible tool, which can be used for various scenarios by assigning different factors and weights in the index computation.