https://github.com/danachermesh/PUI2017_dcr346

Instructor: Dr. Federica Bianco

Abstract: This study analyzes the possible correlation between the number of residential permits issued and building violations, represented by building-related 311 complaints. Using several data sources, including the Census Bureau, NYC Open Data, 311, and NYC spatial data, a descriptive analysis and regression models were used to better understand the two urban factors. The results were not significant, which does not necessarily mean that no correlation exists between the two; rather, different or further methods might explain it better. A more meaningful negative correlation was detected between renter-occupied housing units and building violation (BV) complaints.

Introduction: New York City is a rapidly renewing urban area, with escalating demand for housing and increasing housing costs. My motivation for this research was to analyze how the number of residential permits issued correlates with 311 complaints related to building violations, if at all, and thereby, ideally, to identify areas that should get more attention regarding building codes and building use validation. I treat permit issuance as an indicator of urban renewal, because the majority of construction in New York City requires a Department of Buildings permit, which ensures that the plans comply with the building code. I assumed in advance that an area with a very low number of residential permits issued over a year (that is, an area that is developing or renewing less) would also show a relatively large number of building violations. I also guessed that there are highly renewing areas with both a large number of permits issued and a large number of building violation complaints. Additionally, I was interested in assessing the role of the renter/owner occupancy ratio in building violation complaints.

Data: This research relies on data from 2016 and focuses on residential information only.
Any personal information was excluded. The analysis was performed at the zip code level, which seemed a reasonable geographical unit for observing urban renewal trends. The study required several data sources.

First, permit issuance data were obtained from the DOB Permit Issuance open data and cleaned to include residential permits only, then filtered again to include only New Building (NB) and major Alteration (AL) permit types, ignoring plumbing, sign, equipment, and other permit types that are not relevant to this research. The permits data were normalized by the overall number of occupied housing units, obtained from the US Census Bureau's American FactFinder website, using the ACS 5-year estimates. Data for 2016 do not exist at the zip code level; for that reason I used data from 2013, assuming the change in the number of housing units is not meaningful. All data were grouped by zip code to count the number of permits issued in each zip code in 2016.

The Department of Buildings (DOB) violations data are divided into more than a hundred complaint categories, most of which are not relevant to urban renewal. It was hard to define the exact categories that would best contribute to this analysis. To avoid misinterpretation, 311 complaints data were used instead. The 311 data were filtered to include only building-related complaints in 2016. The 311 complaints are also divided by complaint descriptor; the descriptors included in this analysis are:

- Illegal Conversion Of Residential Building/Space
- Illegal Commercial Use In Resident Zone
- Zoning - Non-Conforming/Illegal Vehicle Storage
- No Certificate Of Occupancy/Illegal/Contrary To CO
- SRO - Illegal Work/No Permit/Change In Occupancy/Use
- ROOFING
- PORCH/BALCONY
- SKYLIGHT
- GUTTER/LEADER
- FENCING

311 is a relatively new citizen-city engagement system, which not all citizens are aware of or take advantage of.
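The filtering, grouping, and normalization steps described above can be sketched as follows. This is a minimal illustration and not the notebook's actual code: the record layout, the field names (`zip`, `job_type`, `residential`, `descriptor`), and all values are hypothetical, and only a subset of the descriptors is included.

```python
from collections import Counter

# Hypothetical permit records; field names are illustrative, not the DOB schema.
permits = [
    {"zip": "10001", "job_type": "NB", "residential": True},
    {"zip": "10001", "job_type": "PL", "residential": True},   # plumbing: dropped
    {"zip": "10002", "job_type": "AL", "residential": True},
    {"zip": "10002", "job_type": "NB", "residential": False},  # non-residential: dropped
    {"zip": "10002", "job_type": "NB", "residential": True},
]

# Hypothetical occupied-housing-unit counts per zip code (stand-in for ACS data).
occupied_units = {"10001": 1000, "10002": 4000}

KEPT_JOB_TYPES = {"NB", "AL"}  # New Building and Alteration, as in the text


def permits_per_unit(records, units):
    """Count residential NB/AL permits per zip code, divided by occupied housing units."""
    counts = Counter(
        r["zip"]
        for r in records
        if r["residential"] and r["job_type"] in KEPT_JOB_TYPES
    )
    # Skip zip codes with a missing or zero housing-unit count to avoid division by zero.
    return {z: n / units[z] for z, n in counts.items() if units.get(z)}


# Subset of the complaint descriptors listed above, for illustration only.
KEPT_DESCRIPTORS = {
    "Illegal Conversion Of Residential Building/Space",
    "ROOFING",
    "FENCING",
}

# Hypothetical 311 records.
complaints = [
    {"zip": "10001", "descriptor": "ROOFING"},
    {"zip": "10001", "descriptor": "Loud Music/Party"},  # not building related: dropped
    {"zip": "10002", "descriptor": "FENCING"},
    {"zip": "10002", "descriptor": "ROOFING"},
]


def building_complaints_by_zip(records):
    """Count complaints per zip code whose descriptor is in the kept set."""
    return Counter(r["zip"] for r in records if r["descriptor"] in KEPT_DESCRIPTORS)


permit_rate = permits_per_unit(permits, occupied_units)
bv_counts = building_complaints_by_zip(complaints)
```

On the full datasets, a pandas `groupby` would likely play the same role as `Counter` here; the sketch is only meant to show the order of operations: filter, count per zip code, then divide by the denominator.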
To overcome this bias, the 311 data were normalized by dividing each zip code's number of building-related complaints by its overall number of 311 complaints. Because of the large size of the 311 dataset, the overall number of complaints in 2016 was estimated from two months of that year only, January and June, as proxies for overall winter and summer complaints respectively (see the IPython notebook). The data were grouped by zip code to count the number of building violation complaints in each. The weakness of the 311 data is that, even after neutralizing the bias in citizens' use of the 311 system, it is harder to account for differences in citizens' level of involvement and engagement in the city and in their attitudes toward complaining.

Additionally, for the second part of the analysis, the numbers of renter-occupied and owner-occupied housing units were obtained, also from the American FactFinder website. Finally, New York City zip code shapefiles were included, also obtained from the NYC Open Data website.

Methodology: The first step of the analysis was to observe and describe the data for both primary variables, permit issuance and 311 building violation complaints. The distributions of the normalized variables were viewed to assess the possible similarity in their statistical behaviour: