Description
After reading through PolicyMap's documentation on their home sales dataset, I think we should impose a minimum sales threshold so that census tracts with very few recorded sales are excluded when calculating summary statistics like median sale price.
I think that census tracts with fewer than ten (10) sales in a year should be suppressed from the dataset.
Clarification: tracts with fewer than ten sales per year will be excluded from the calculation of summary metrics like median sale price; however, they will be included when calculating metrics like sale rate.
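A minimal sketch of how the proposed threshold could work, assuming sales arrive as `(tract_id, year, sale_price)` tuples. The function name `summarize_tracts`, the constant `MIN_SALES`, and the record layout are hypothetical and not taken from the existing codebase. The sale count is kept for every tract so that count-based metrics like sale rate can still be computed; only the median is suppressed.

```python
from collections import defaultdict
from statistics import median

# Proposed threshold from this issue (hypothetical constant name).
MIN_SALES = 10


def summarize_tracts(sales, min_sales=MIN_SALES):
    """Summarize sales per (tract, year).

    `sales` is an iterable of (tract_id, year, sale_price) tuples;
    this layout is an assumption made for illustration.
    """
    prices = defaultdict(list)
    for tract_id, year, price in sales:
        prices[(tract_id, year)].append(price)

    summary = {}
    for key, tract_prices in prices.items():
        n_sales = len(tract_prices)
        summary[key] = {
            # Always kept, so sale rate can still be derived for every tract.
            "n_sales": n_sales,
            # Suppressed (None) when the tract-year has fewer than min_sales sales.
            "median_sale_price": (
                median(tract_prices) if n_sales >= min_sales else None
            ),
        }
    return summary
```

A tract-year with exactly ten sales keeps its median under this sketch; whether the cutoff should be inclusive is a choice to confirm.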
Text from PolicyMap's Zillow data documentation
To ensure that only market based residential transactions were included, PolicyMap used the subset of residential sales that were at-arms-length transactions, over $5,000 in value, and did not involve vacant or unimproved land. Transactions that involved auctions, foreclosures, Real Estate Owned (REO) property sales, sheriff sales, or Planned Unit Developments (PUDs) were also excluded. Partial sales, property improvements, and transfers across multiple properties (also called bulk sales) were likewise removed. PolicyMap suppresses indicators with fewer than five sales in a given time period and geography as “insufficient data.” Counties with “limited data availability” have greater than 75% of sales with no reported sales price or a zero dollar sales price. Indicators may be unreliable in these areas.