The hidden cost of smoking: Rent premia in the housing market

In this article, we provide novel evidence on the additional costs associated with smoking. While it may not be surprising that smokers pay a rent premium, we are the first to quantify the size of this premium. Our approach is innovative in that we use text mining methods that extract implicit information on landlords’ attitudes to smoking directly from Zoopla UK rental listings. Applying hedonic, matching, and machine-learning methods to the text-mined data, we find a positive smoking rent premium of around 6%. This translates into £14.40 of indirect costs per week, in addition to the £40 of weekly spending on cigarettes estimated for an average smoker in the United Kingdom.


INTRODUCTION
Tobacco smoking is one of the world's biggest public health problems (Goodchild et al., 2018). The prevalence of smoking is highest amongst people who are unemployed or working in routine and manual jobs with low income, and without formal educational qualifications (Action on Smoking & Health [ASH], 2019). The unfavorable health and labor market consequences lock the most disadvantaged groups into an intergenerational cycle of poor physical/mental health and wellbeing, addiction and impoverishment (Action on Smoking & Health [ASH], 2018; Böckerman et al., 2015; Levine et al., 1997; Pichon-Riviere et al., 2020; van Ours, 2004; Weng et al., 2013).
In this article, we aim to move beyond the well-established health and labor market outcomes associated with smoking. Instead, we focus on the experience of smokers in the housing rental market. We provide novel evidence of additional "hidden" costs associated with smoking that have not been explored before.
There are a number of potential reasons why landlords may wish to charge smokers a rent premium. First, smoke can leave an odor that persists on surfaces. This could be a particular problem for furnished properties. Second, smoking can damage paintwork, which could be particularly undesirable in new or newly refurbished properties. This, in turn, can hinder future lets and create a financial burden for landlords. Third, there is an increased fire hazard associated with smoking.¹ Finally, landlords may perceive nonsmoking tenants as potentially more reliable. Marking a property as "nonsmoking," therefore, may help landlords identify tenants that will take better care of the property they are renting (over and above the concerns with smokers already listed). This relates to a point emphasized by Halket et al. (2021) that contracting frictions exist between landlords and tenants regarding maintenance.
As a result, landlords may charge a rental premium to compensate for the potential financial costs and the safety risk associated with smoking. Nevertheless, there is no prior evidence on whether such a premium exists and, if it exists, how large it is. Exploring these potential further costs associated with smoking is worthwhile as it helps uncover the true cost of smoking.
Using Zoopla Property data, which offer unique historical information on the UK real estate market, we present the first estimates of the rent premium associated with smoking in the long-term rental market. We do this using the implicit content in advertisements regarding landlords' attitudes toward smoking.
We use textual information available in the description section of the Zoopla property data to capture whether houses listed for rent include keywords or clauses such as "no smoking," "smoking not allowed," "smoking is not permitted" and so on. Using this information, we investigate the extent to which these houses command different prices when compared with houses that do not include clauses intended to exclude smokers. We measure the size of the smoking rental premium using both standard hedonic pricing models and matching methods (mainly coarsened exact matching) as well as machine learning techniques. While coarsened exact matching enables a comparison of houses that are almost identical in observed characteristics, machine learning techniques further help us to exploit the data more fully and calculate a probability of a rental property excluding smokers even when a listing does not include a smoking-related clause. Across all these estimations, our results point to a rental premium for properties where smoking is not prohibited. The estimated weekly rental premium is around 6%, translating into £14.40 of additional indirect costs of smoking in the form of a rental premium. This extra cost is substantial, especially when considering the direct weekly cost of £40 an average UK smoker spends on cigarettes. We also show that the premium is higher in more expensive regions of the United Kingdom. We attribute this finding to higher maintenance costs in more expensive regions, which give landlords a stronger incentive to seek out more reliable tenants.
Section 2 reviews the existing literature. Section 3 explains the details of Zoopla property data and the methods we employ in exploiting the textual information to determine the houses where smoking is not allowed. The various estimation methods we use to measure the smoking rent premium are also explained here. In Section 4, we present and interpret our results. Finally, Section 5 concludes.
¹ A UK Home Office report documents that smokers' materials were responsible for 8% of accidental dwelling fires and 34% of fire-related fatalities in 2018/19. See https://tinyurl.com/39m2drew, access date: November 5, 2021.

LITERATURE REVIEW
Previous research has investigated the rent premia associated with certain property and/or tenant characteristics. There is a long-standing interest in analyzing the factors that determine rental prices. Building on the simple hedonic pricing models which include standard property-specific characteristics (such as number of bedrooms, bathrooms, size etc.), studies document that rental prices are also influenced by other factors such as characteristics of the neighborhood, market conditions, location of the property as well as characteristics of the potential tenants (e.g., their ethnicity, gender and sexual orientation). For example, positive rent premia are noted for green buildings in the commercial real estate market (e.g., Fuerst et al., 2012; Robinson & Sanderson, 2016), for cancellable rental contracts (Yoshida et al., 2016) and for having trees on a rental property's lot (Baranzini et al., 2010). Additionally, it is widely acknowledged that rent premia exist based on ethnicity, sexual orientation and gender (see Flage, 2018, for a review of the international evidence). There is also literature that explores the link between housing quality and health outcomes (see, for example, Palacios et al., 2021). However, we are aware of only one previous paper that attempts to measure the link between smoking and rent. Using a hedonic model, Benjamin et al. (2001) find that vacationers to the Outer Banks of North Carolina are willing to pay a premium of 11.6% to rent properties that prohibit smoking during the peak rental season. Interestingly, this premium goes in the opposite direction to what we observe here. This finding may reflect differences between the long-term and holiday rental markets. In particular, owner-occupiers participate in the holiday rental market but not in the long-term rental market. Since owner-occupiers tend to be richer, their presence in the holiday rental market (where typically they are the majority) can create a demand for nonsmoking rentals that mostly does not exist in the long-term rental market.
Our use of textual information available in the description section of online Zoopla property data to detect the smoker preferences of landlords connects with a strand of the real estate literature that stresses the importance of textual data in improving the property price predictions in hedonic models. Exclusively focusing on the US sale/rental market and benefiting from the textual information available in the Multiple Listing Service (MLS) data, many argue that listing agents' descriptions and remarks about houses contain hidden characteristics that may well influence their prices (see, for example, Haag et al., 2000; Luchtenberg et al., 2019; Nowak & Smith, 2017). Accordingly, these studies either identify a set of keywords or phrases based on the remarks section of the MLS data in order to signal sellers' or landlords' motivation, amenities, interior design, and physical improvements, such as whether the property is recently refurbished, painted etc. (e.g., Soyeh et al., 2014), or they resort to machine/deep learning when mining textual information (e.g., Nowak et al., 2021; Zhou et al., 2019), methods that we also employ in our analysis.
These studies all prove the value of textual information and controlling for the additional property attributes in estimating property prices. They show that the words used in MLS listings affect sale price, time on market (Haag et al., 2000; Nowak & Smith, 2017; Pryce & Oates, 2008), the likelihood of visiting a property for sale (Luchtenberg et al., 2019), and rental prices (Zhou et al., 2019). Some focus on specific attributes hidden in the textual data; for example, Bond and Devine (2016) find that including phrases related to being green is associated with higher rental rates and Leadership in Energy and Environmental Design (LEED) certification commands the highest rent premium.
Our paper also connects with the literature on price-rent ratios (Bracke, 2015; Halket et al., 2021; Heston & Nakamura, 2009; Hill & Syed, 2016). Price-rent ratios are observed to be larger at the high end of the market: a result that is attributed partly to contracting frictions between owners and renters over maintenance. We find a similar dynamic at work with the smoking premium.

DATA AND METHODOLOGY

The Zoopla data set
Our empirical investigation is based on rental listings from Zoopla property data.² Zoopla is one of the UK's leading providers of historical property listings data, with more than 27 million residential property records in their archives. It contains records of more than 8 million properties advertised for sale and/or rent. In this article, we employ the data for the properties that were advertised for rent between the years 2012 and 2018. Zoopla property data include a rich set of historical information on advertised properties' physical characteristics such as the number of bedrooms, floors, bathrooms and property type (e.g., flat, terrace house etc.). The data provide a detailed level of geographical information, enabling the location of rental properties to be determined at the local authority level and at even smaller units, such as post town.
Most important to our analysis is that Zoopla property data contain landlords' or listing agents' written descriptions of the rental property. Using these descriptions, we identify and generate a list of keywords and phrases, which we then use as indicators of landlords' attitudes toward smoking in the rental property. We determine clauses pointing to a negative attitude toward smoking; for example, "smoking is not allowed," "smoking is not permitted," "no smoking," and so on. If the listing for the property contains any of these phrases, it is classified as a nonsmoking property. We also capture positive phrases such as "smoking is allowed" and "smoking is permitted" to identify properties as smoking friendly. The full list of these negative/positive indicator phrases is available in Table A1.
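The phrase-based classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patterns cover just the example phrases quoted in the text (the full lists are in Table A1), and the function name `classify_listing` and the three labels are hypothetical, not taken from the paper.

```python
import re

# Illustrative patterns only, built from the phrases quoted in the text.
# The complete negative/positive phrase lists are those of Table A1.
NEGATIVE_PATTERNS = [
    r"\bno smoking\b",
    r"\bsmoking (is )?not (allowed|permitted)\b",
]
POSITIVE_PATTERNS = [
    r"\bsmoking (is )?(allowed|permitted)\b",
]

def classify_listing(description):
    """Label one listing description (hypothetical helper).

    Returns 'nonsmoking', 'smoking_friendly', or 'silent'.
    """
    text = description.lower()
    # Negative phrases are checked first so that a prohibition is never
    # mistaken for permission.
    if any(re.search(p, text) for p in NEGATIVE_PATTERNS):
        return "nonsmoking"
    if any(re.search(p, text) for p in POSITIVE_PATTERNS):
        return "smoking_friendly"
    return "silent"
```

Listings labelled `silent` here correspond to the descriptions that contain no smoking-related keywords, which the paper later classifies probabilistically.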
In addition to determining whether smoking is permitted in the rental property or not, the description section of Zoopla property data assists us in identifying additional nonstandard attributes of the properties (such as whether the property is recently refurbished or furnished) that are not readily available in the data. The variable definitions and key descriptive statistics are presented in Table 1.

Hedonic estimation
In the first step, we employ a hedonic pricing model where rent³ is expressed as a linear combination of its key property attributes along with an indicator for smoking not being allowed in the rental property. Accordingly, the following OLS regression is estimated:

$$\ln(\text{Rent}_i) = \alpha + \beta\,\text{No Smoking}_i + X_i\gamma + D_i\delta + \varepsilon_i$$

where $\alpha$ is a constant and $\varepsilon_i$ is an error term. The dependent variable in the hedonic model is the logarithm of the rent of the $i$th rental property. No Smoking is our main variable of interest, which is a dummy variable that takes the value 1 if smoking is not allowed in the rental property and value 0 in the absence of any such restriction. The vector $X_i$ contains standard attributes of the rental houses, including the type of property (detached house, terraced house, flat, and other), number of bedrooms, number of bathrooms, and number of floors.⁴ It also includes additional characteristics which we identified using the textual information available in the data. These are dummy variables indicating whether the house is furnished (1 if furnished, 0 otherwise) and whether the house is refurbished (1 if refurbished, 0 otherwise). $D_i$ is a vector of dummy variables indicating the county-level location (the data set includes listings from 85 counties) and the year in which the listing enters the market. The vectors $\gamma$ and $\delta$ denote the shadow prices of the physical and location/year-of-listing characteristics of the properties.
In this model, the coefficient β captures the effect of the presence of a nonsmoking clause in a listing on the rental price. A negative coefficient indicates that landlords ask for a premium for allowing smoking in the property. Accordingly, the size of this coefficient reflects the rent premium (or penalty) of smoking in the housing market.
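To make the interpretation of β concrete, the following minimal sketch fits a log-rent regression with a No Smoking dummy and one control on toy data. Everything in it is an assumption for illustration: the listings, the generating coefficients, and the small `ols` helper (a plain normal-equations solver written with the standard library so the example stays self-contained) are hypothetical and do not reproduce the paper's estimation.

```python
import math

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Toy listings: (no_smoking dummy, bedrooms). Generated values, not real data.
listings = [(0, 1), (1, 1), (0, 2), (1, 2), (0, 3), (1, 3), (0, 4), (1, 2)]
TRUE_BETA = -0.06  # assumed coefficient used to generate the toy rents
log_rent = [5.0 + TRUE_BETA * ns + 0.2 * beds for ns, beds in listings]

# Columns: constant, No Smoking, number of bedrooms.
X = [[1.0, ns, beds] for ns, beds in listings]
alpha_hat, beta_hat, gamma_hat = ols(X, log_rent)

# Exact percentage difference implied by a coefficient in a log-linear model.
pct_effect = 100 * (math.exp(beta_hat) - 1)
```

Because the dependent variable is in logs, a small negative `beta_hat` reads approximately as a percentage discount for nonsmoking properties; `pct_effect` gives the exact conversion.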

Coarsened exact matching
Estimates of the smoking premium obtained from hedonic modeling may be misleading if landlords select nonsmokers into particular types of properties.⁵ In order to overcome such problems inherent in a standard hedonic pricing model that relies on OLS estimation, we employ matching methods to reduce the imbalance in the covariates between the treated and control groups and improve causal inferences. Our preferred matching technique is coarsened exact matching (CEM), which is a relatively new matching technique that employs monotonic imbalance bounding (Iacus et al., 2012). CEM is proven to be more efficient in terms of achieving lower levels of imbalance, model dependence and bias than other commonly used matching techniques such as propensity score matching (Blackwell et al., 2010; Iacus et al., 2012). More pragmatically, CEM is faster than other matching methods and works well with larger data sets like the one we use in our analysis (King & Nielsen, 2019).⁶
Following the notation in Blackwell et al. (2010), assume that $Y_i(1)$ is the potential outcome (here the rent) if unit $i$ receives the treatment and $Y_i(0)$ is the potential outcome if the same unit is in the control group. For each observed unit $i$ only one of these potential outcomes is observed, and we set $T_i = 1$ if $i$ is treated and $T_i = 0$ otherwise. The observed outcome can be shown as $Y_i = T_i Y_i(1) + (1 - T_i) Y_i(0)$. The treatment effect (in our case, the effect of the "no smoking restriction") on unit $i$ can then be expressed as $TE_i = Y_i(1) - Y_i(0)$, and the average treatment effect on the treated (ATT) is

$$\text{ATT} = \frac{1}{n_T} \sum_{i \in \mathcal{T}} TE_i,$$

where $n_T = \sum_{i=1}^{n} T_i$ and $\mathcal{T} = \{1 \le i \le n : T_i = 1\}$. CEM addresses the abovementioned observational data problem related to nonrandom assignment to treated and control groups and these groups not being identical before treatment. It does so by temporarily coarsening the data, applying exact matching to the coarsened data, identifying the matched and unmatched units and then keeping only the original (uncoarsened) values of the matched data.
To be more specific, CEM first coarsens the covariates (X) by recoding⁷ and generates a set of strata, which contain the same coarsened values of X. Following this, it assigns these strata to the original data. Finally, it retains any observation whose stratum contains at least one treated and one control unit and drops the others. Once this procedure is accomplished, these strata constitute the basis for calculating the treatment effect.
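The coarsen, stratify, and prune sequence just described can be sketched as follows. This is a simplified illustration, not the authors' implementation: the bedroom bins, the choice of matching variables, and the function names are hypothetical. The control weights follow the formula commonly documented for CEM, in which a control unit in stratum s receives (treated in s / controls in s) × (matched controls / matched treated) and treated units receive a weight of 1.

```python
from collections import defaultdict

def coarsen_bedrooms(n):
    """Hypothetical coarsening rule: group bedroom counts into three bins."""
    if n <= 1:
        return "0-1"
    if n <= 3:
        return "2-3"
    return "4+"

def cem_weights(units):
    """Return {index: weight} for matched units; unmatched units are omitted.

    units: list of dicts with keys 'treated', 'bedrooms', 'furnished', 'county'.
    """
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for idx, u in enumerate(units):
        key = (coarsen_bedrooms(u["bedrooms"]), u["furnished"], u["county"])
        strata[key]["treated" if u["treated"] else "control"].append(idx)

    # Keep only strata with at least one treated and one control unit.
    matched = [s for s in strata.values() if s["treated"] and s["control"]]
    n_treated = sum(len(s["treated"]) for s in matched)
    n_control = sum(len(s["control"]) for s in matched)

    weights = {}
    for s in matched:
        for idx in s["treated"]:
            weights[idx] = 1.0
        # Reweight controls so each stratum mirrors its share of treated units.
        w = (len(s["treated"]) / len(s["control"])) * (n_control / n_treated)
        for idx in s["control"]:
            weights[idx] = w
    return weights

# Toy listings (made-up values); 'treated' marks a nonsmoking listing.
units = [
    {"treated": True,  "bedrooms": 2, "furnished": 1, "county": "A"},
    {"treated": False, "bedrooms": 3, "furnished": 1, "county": "A"},
    {"treated": False, "bedrooms": 2, "furnished": 1, "county": "A"},
    {"treated": True,  "bedrooms": 1, "furnished": 0, "county": "B"},
    {"treated": False, "bedrooms": 1, "furnished": 0, "county": "B"},
    {"treated": False, "bedrooms": 5, "furnished": 0, "county": "B"},
]
weights = cem_weights(units)
```

In this toy example the last listing has no treated counterpart in its stratum and is pruned, while the remaining control weights sum to the number of matched controls. The resulting weights are the kind of quantity one would pass to a weighted regression.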
We match the rental houses which we identify as "nonsmoking" with those without a ban on smoking on the basis of the type of the rental property, number of bedrooms, number of bathrooms, whether they are furnished and whether they are refurbished. To control for any confounding effects, we also use the year of entering the market when matching the houses. Finally, to account for locational differences, we include county dummies in our CEM estimations. However, given that there are 85 counties, and this may reduce the quality of the matches, we also aggregate counties and match houses over larger regions. Here we use the 11 NUTS1 regions for Great Britain (we exclude Northern Ireland due to a lack of data).⁸
As with any other matching method, CEM is a data preprocessing technique (Blackwell et al., 2010). It ensures that the observations in the original data are pruned and the remaining data are better balanced in terms of treated and control groups. Hence, a more similar empirical distribution is achieved regarding the covariates. When the data are exactly balanced or when one-to-one exact matching is performed, a difference in mean outcome values between the treated and control groups produces an estimate of the causal effect. When the match is approximately balanced or the match is not exact, then it might be necessary to control for the covariates in the model and estimate parametric models. Even when this is the case, the derived estimates are less model-dependent and statistical bias is lower when compared to the estimations based on data that are not preprocessed by matching methods. We achieved good levels of balance in our matched data between the treated and control units when we used county when matching the properties (see Tables A2 and A3).⁹ Therefore, we calculated the average treatment effect of smoking by simply including CEM weights in our linear regression, as suggested by Blackwell et al. (2010).¹⁰
Additionally, we perform a CEM analysis separately for each county across the years using the same set of characteristics described above. We then compare the size of the smoking rent premium observed in each county with the average rental prices observed at county-level. Doing so enables us to explore any variation in the potential smoking rent premium across different parts of the rental market.
⁹ [...] improved the imbalance between the treated and control groups even more (see the $L_1$ statistic in Table A3), matching using county can control for more unobservables. Moreover, using regions translates into more coarsening, resulting in fewer strata and more diverse observations within them. Therefore, using regions may result in higher imbalance and model dependence. Nevertheless, for comparison purposes and robustness checks, the estimation results based on the data matched at regional level are also presented in the Results section (see column 2, Table 3).
¹⁰ Including property characteristics in the model has virtually no impact on the estimated smoking coefficient. Weights are included because, under matching with replacement, multiple properties in the treated group can be matched with the same property in the untreated group. Hence there will typically be more observations in the untreated group than in the treated group.
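The balance of matched data is summarized in the appendix tables by the L1 statistic of Blackwell et al. (2010). As a rough illustration of what that statistic measures, the sketch below computes half the sum of absolute differences between the treated and control relative frequencies over covariate cells. It is an unweighted simplification with made-up cells and a hypothetical function name; a post-matching calculation would also apply the CEM weights.

```python
from collections import Counter

def l1_imbalance(treated_cells, control_cells):
    """Multivariate L1 distance between two groups of covariate cells.

    Each element is a hashable cell label, e.g. (property_type, bedroom_bin).
    Returns 0.0 for identical distributions and 1.0 for complete separation.
    """
    f = Counter(treated_cells)
    g = Counter(control_cells)
    n_t, n_c = len(treated_cells), len(control_cells)
    cells = set(f) | set(g)
    return 0.5 * sum(abs(f[c] / n_t - g[c] / n_c) for c in cells)

# Toy cells (made-up values).
treated = [("flat", 1), ("flat", 1), ("house", 2)]
control = [("flat", 1), ("house", 2), ("house", 2)]
imbalance = l1_imbalance(treated, control)
```

Here two thirds of the treated units but only one third of the controls fall in the first cell, so the statistic equals one third.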

Probabilistic categorizing of listings that are silent on smoking using random forests
One problem with our data set is that the number of listings that explicitly allow smoking is relatively small. It is likely, therefore, that our text mining methods have not identified all nonsmoking properties. As a consequence, our empirical estimates of the smoking rent premium may be too low. To partially address this problem, we employ Random Forests, a machine learning approach, to categorize listings with unknown information into listings in which smoking is either allowed or not.¹¹ Our approach follows several steps:
(i) We extract data that explicitly indicate preferences for nonsmoking (Smoking not allowed = 1) and smoking (Smoking allowed = 0) tenants. The sample has 162,507 and 3,578 listings in each category, respectively. Given that the former group is much larger in size, we balance the data so that the numbers of observations in each group are similar. To do so, we create subcategories based on the intersection of the following attributes: location, number of bedrooms, number of bathrooms, and number of floors.
(ii) Within each subcategory, we match observations indicating tolerance to smokers with a same-sized, randomly selected sample of listings that do not allow smoking. If a particular intersection does not have any listings that allow smoking, it is dropped.
(iii) We divide our balanced data set of 6,970 observations into training and testing data sets with an 80:20 split. A binary variable indicating whether smoking is not allowed is the main output in this machine learning exercise. The vector of inputs (or features) is created from two sources, quantitative and textual data. The former (e.g., region or bathrooms) includes all available categorical information transformed into binary variables. The latter (listing descriptions) were processed by using word2vec to extract 100 features per word, which were later averaged over all words in each listing description. The minimum occurrence of a word is 20.
(iv) The random forest classifier from the sklearn Python library is chosen as our main approach. Parameters of the classifier were optimized using grid search over the depth of trees [30, 40, 50, 100], number of trees [50, 100, 200, 250, 300, 400, 500], and number of features (sqrt(features), log2(features), auto). The best accuracy rate (using testing data) is 84%, obtained with 500 trees, a depth of 30, and log2(features) for the maximum number of features. The primary drivers are text-based embeddings with feature importance values varying from 0.007 to 0.014. Having developed the model, we categorize listings that do not explicitly mention a smoking preference and estimate the probability of not allowing smoking.
(v) Once these probabilities have been estimated, we rerun the hedonic model from Section 3.2, this time using these probabilities in place of the No Smoking dummy as the key explanatory variable.
¹¹ It is possible that some rental property owners may not have a particular preference toward smoker or nonsmoker tenants, or they may choose not to disclose a preference at least at the listing stage but a lower rent or security deposit may be offered to nonsmokers at the negotiation stage. Alternatively, landlords may also consider the possibility of accepting a smoker at the negotiation stage (assuming that the potential tenant informs the landlord that (s)he is a smoker). While a dichotomous no smoking variable used in the previous models may not be able to capture these elements, the machine-learning approach addresses these possibilities by estimating a probability of being a nonsmoker rental unit even when the listing does not disclose a preference toward smoking.
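The text-feature construction in step (iii), word vectors averaged over each description after dropping rare words, can be illustrated without any external library. The sketch below is an assumption-laden toy: the three-dimensional vectors stand in for the 100 word2vec features, the minimum count is 2 rather than 20, the zero-vector fallback for descriptions with no known words is a choice made here, and all names are hypothetical.

```python
from collections import Counter

def build_vocabulary(tokenized_docs, min_count):
    """Keep tokens that occur at least min_count times across all documents."""
    counts = Counter(t for doc in tokenized_docs for t in doc)
    return {t for t, c in counts.items() if c >= min_count}

def average_embedding(tokens, vectors, dim):
    """Mean of the vectors of known tokens; zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]

# Toy tokenized descriptions and toy word vectors (made-up values).
docs = [
    ["no", "smoking", "no", "pets"],
    ["pets", "welcome"],
    ["no", "smoking", "garden"],
]
toy_vectors = {
    "no": [1.0, 0.0, 0.0],
    "smoking": [0.0, 1.0, 0.0],
    "pets": [0.0, 0.0, 1.0],
    "welcome": [1.0, 1.0, 1.0],
}

vocab = build_vocabulary(docs, min_count=2)
vectors = {t: v for t, v in toy_vectors.items() if t in vocab}
features = [average_embedding(doc, vectors, dim=3) for doc in docs]
```

Each row of `features` would then be concatenated with the binary property attributes before being passed to the classifier.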

Hedonic results
Table 2 reports the results from the first step of our analysis, where a hedonic pricing model is estimated by OLS. Specification 1 includes our key variable of interest indicating whether smoking is permitted or not along with the control dummies for the year and counties. The coefficient for "No Smoking" in this specification is −0.046, indicating that houses where smoking is not allowed are associated with 4.6% lower rental prices compared to those without any ban on smoking. When we control for size-related variables such as number of bedrooms, bathrooms and floors, the rental price difference goes down to 3% (Specification 2). Finally, in the last specification (Specification 3), we include additional variables to control for the type of the property and whether the house is refurbished and/or furnished. Controlling for these characteristics does not change the size of the rental premium of smoking dramatically. Even after controlling for all the property characteristics, the coefficient of "No Smoking" is still negative and statistically significant, showing a 2.6% lower weekly rent for the houses where smoking is not allowed. The control variables that we include in our model have the expected sign and magnitude. The rent is positively correlated with the number of bedrooms, bathrooms, and the property being refurbished or furnished.

Coarsened exact matching results
The standard hedonic pricing model points to a rent premium for houses where smoking is allowed. Nevertheless, this evidence may be distorted by the way landlords select houses to allow smoking or not. Therefore, in the second step of our analysis, we use CEM and attempt to compare the rental price of pairs of smoking and nonsmoking rental properties that are almost identical across all the covariates. Given our large sample size, we did not experience any sample size issues while performing matching; our control and treated groups are highly balanced (Table A2 in the Appendix presents the summary statistics and measures describing the quality of our matched data). The estimation results based on data preprocessed by CEM are less model-dependent and biased, enabling more credible causal inferences with regard to the effect of smoking on rental prices.

Table 3 shows the average treatment effect for houses where smoking is not allowed as −0.063. This is notably larger in magnitude than the figure produced by the OLS estimation that relied on the same covariates (last column in Table 2). It indicates that for houses where smoking is not allowed, the rent is 6% lower than for those where smoking is not banned.

Cross-section variation in the smoking rent premium
A county-level CEM analysis demonstrates that the amount of the premium differs along the rental price distribution. When we plot the amount of the smoker rent premium against the mean rental prices at county-level (see Figure 1), we observe an upward sloping curve indicating that the smoking rent premium is larger at the higher-end of the rental market.

FIGURE 1 Smoking premium and mean price. Note: Each point corresponds to a county at a given year. The red line represents the fitted values and the shaded area represents the 95% confidence interval. The smoking premium is calculated using CEM. Subsamples with fewer than 50 nonsmoking houses are dropped to ensure the quality of the matching process.
A parallel can be drawn here with the literature on housing market price-rent ratios (or rental yields, which are the reciprocals of price-rent ratios). A number of papers have observed an upward sloping price-rent ratio curve (for example, Bracke, 2015; Halket et al., 2021; Heston & Nakamura, 2009; Hill & Syed, 2016). Multiple factors may be contributing to this finding. However, Halket et al. (2021) stress the role played by contracting frictions regarding maintenance between landlords and tenants. These frictions become more important for more expensive properties.
As a result of contracting frictions, better quality properties with higher maintenance costs tend to be selected into the owner-occupied rather than rental market. Furthermore, some of the characteristics associated with higher maintenance costs are typically omitted from the data set (e.g., higher-quality kitchens, bathrooms, and finishes). It follows from this that owner-occupied properties at the higher end of the market are generally of better quality than rentals that are matched with them on the observable characteristics. This by itself can explain why an upward-sloping relationship is observed between price and the price-rent ratio.
Returning to our smoking context, the incentive to find a reliable tenant is therefore greater at the higher-end of the rental market, where contracting frictions are larger. To the extent that a tenant identifying as a nonsmoker is interpreted by the landlord as a signal of reliability, it follows that the smoker rent premium should be larger at the higher-end of the market. Our findings in Figure 1 are consistent with this view.

Results obtained by probabilistically classifying listings using random forests
Finally, we replicate the hedonic pricing model estimations, with a new variable to capture landlords' preferences toward smoking. As mentioned in Section 3, we use a machine learning method to compute the probability of not allowing smoking if a listing does not include any keywords related to smoking. Accordingly, we create a new variable, which takes the value of 1 if the description includes any of the negative phrases listed in Table A1 and 0 if the description includes any of the positive phrases. For the listings that do not include any smoking-related keywords, this variable takes the probability values computed by the machine-learning process. This new variable represents the probability of not accepting smoking instead of a dummy variable capturing a restriction on smoking. Table 4 shows the results from this estimation. While the size of the coefficient for the key variable is very large before controlling for property characteristics, it drops to −0.070 after the inclusion of the full set of covariates. Nevertheless, a smoking premium of 7% is larger than the results suggested by the previous model, which includes a dummy for the smoking restriction.¹² This is as expected since in the previous model it is probable that some nonsmoking properties were incorrectly identified as allowing smoking.
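The construction of this mixed variable is simple enough to state in code. The sketch below assumes each listing already carries a text-mined label and, for silent listings, a classifier probability; the label strings and the function name are hypothetical.

```python
def no_smoking_probability(label, predicted_probability=None):
    """Value of the 'probability of not accepting smokers' regressor.

    label: 'nonsmoking' (negative phrase found), 'smoking_friendly'
    (positive phrase found), or 'silent' (no smoking-related keyword).
    predicted_probability: classifier output, required for silent listings.
    """
    if label == "nonsmoking":
        return 1.0
    if label == "smoking_friendly":
        return 0.0
    if predicted_probability is None or not 0.0 <= predicted_probability <= 1.0:
        raise ValueError("silent listings need a probability in [0, 1]")
    return predicted_probability

# Toy listings (made-up labels and probabilities).
values = [
    no_smoking_probability("nonsmoking"),
    no_smoking_probability("smoking_friendly"),
    no_smoking_probability("silent", 0.83),
]
```

Explicit statements thus override the model, and only silent listings take fractional values.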
Our results show that regardless of the method we choose and the type of variable we use to capture landlords' preferences, we find a rent premium for the houses where smoking is not prohibited. Moreover, this premium is stronger when we employ CEM and when we use a machine learning approach to assign restriction probabilities to listings that do not explicitly express a preference. We believe both matching and machine learning methods contribute to more precise estimation of the smoking rent premium. These two approaches generate similar estimates of the smoking premium that range between 6% and 7%.

CONCLUSION
In this article, we investigate the hidden costs associated with smoking by focusing on the rental market. Using a unique historical property data set and benefiting from the textual information therein available, we distinguish between rental properties where smoking is restricted or not.
We then investigate the extent to which houses where smoking is not restricted command higher rental prices when compared to nonsmoking houses. We use various estimation methods, ranging from a standard hedonic pricing model to matching methods and more sophisticated machine-learning techniques. Results from all these approaches consistently show that houses where smoking is not restricted charge a rent premium. The extent of this premium is noteworthy. If we consider the results from the data matched by the CEM method, there is a 6% rent premium related to smoking. As the mean weekly rent is around £240 in our sample, this premium corresponds to an average of £14.40 per week. To put this number into context, the average number of cigarettes consumed by an adult smoker in England is 10 per day¹³ and the average retail price of 20 filtered cigarettes is £11.45.¹⁴ This translates into an average of £40 weekly spending as a direct cost of smoking. Accordingly, the £14.40 rental premium, as a hidden indirect cost paid by an average smoker, increases the total cost of smoking by a third. This is a significant additional financial cost, especially considering a recent report by Action on Smoking and Health, which points to the increased poverty rates in smoking households in the United Kingdom after accounting for the money spent on tobacco (ASH, 2019). It is documented that while the poverty rate across the general population is 19.2%, this figure is 28.4% for private tenants who smoke. Private tenants are shown to spend 5.8% of their disposable income on tobacco products. In this article, we show that the total cost would be noticeably higher if the smoking rent premium is taken into account.
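The arithmetic behind these figures can be checked directly. The short calculation below uses only the numbers quoted in the text; the variable names are illustrative.

```python
MEAN_WEEKLY_RENT = 240.0      # pounds, sample mean reported in the text
RENT_PREMIUM = 0.06           # CEM estimate, rounded as in the text
CIGARETTES_PER_DAY = 10       # average adult smoker in England
PRICE_PER_PACK_OF_20 = 11.45  # pounds, average retail price

# Indirect cost: the rent premium in pounds per week.
weekly_rent_premium = MEAN_WEEKLY_RENT * RENT_PREMIUM

# Direct cost: 70 cigarettes a week is 3.5 packs of 20.
weekly_cigarette_cost = CIGARETTES_PER_DAY * 7 / 20 * PRICE_PER_PACK_OF_20

# Rent premium relative to direct spending on cigarettes.
relative_increase = weekly_rent_premium / weekly_cigarette_cost
```

The premium comes to £14.40 a week and cigarette spending to about £40.08, so the rent premium adds roughly 36% to the direct cost, which the text rounds to a third.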
Further research is required into the causes of the smoking rent premium. Also, although it is worthwhile to explore further dynamics such as potential regional heterogeneities in future work, we find robust evidence of the presence of a rent premium associated with smoking. The social and economic costs tobacco smoking imposes on individuals and their families are substantial, and this article shows that, in fact, they might be even larger than we think.

Note: The $L_1$ statistic is a measure of imbalance across the treated and control groups: $L_1 = 0$ in the case of perfect global balance, and larger values point to a larger imbalance, with $L_1 = 1$ indicating complete separation (Blackwell et al., 2010). The second column reports the difference in means between the treated and control groups.
Multivariate $L_1$ distance: 0.000

Note: The $L_1$ statistic is a measure of imbalance across the treated and control groups: $L_1 = 0$ in the case of perfect global balance, and larger values point to a larger imbalance, with $L_1 = 1$ indicating complete separation (Blackwell et al., 2010). The second column reports the difference in means between the treated and control groups.
In Column (1), county is used to match the rental properties in terms of location. In Column (2), region is used to match rental properties. In total, we have 85 counties and 11 regions. In both specifications, number of bedrooms, bathrooms, floors, property type, and whether the property is furnished or refurbished are used when matching the properties. Standard errors are in parentheses. ***Statistically significant at 1%.
TABLE 2 Hedonic pricing model. OLS estimations
Note: The dependent variable log(rental price) is the natural logarithm of weekly rent. No smoking takes the value of 1 if the property description explicitly disallows smoking and 0 otherwise. Number of bedrooms, Number of floors, and Number of bathrooms are continuous variables. Refurbished is a dummy variable that takes the value of 1 if the property is refurbished and 0 otherwise. Furnished is a dummy variable that takes the value of 1 if the property is furnished and 0 otherwise. Property type categories are Detached house, Terraced house, Flat, and Other (each is a dummy variable taking the value of 1 if the property belongs to the relevant category and 0 otherwise). Other includes property types such as a bungalow, cottage, lodge, etc. All specifications include constant, county, and year dummies. Robust standard errors in parentheses. ***, **, and * correspond to 1%, 5%, and 10% statistical significance levels, respectively.

TABLE 4 Hedonic pricing model. OLS estimations using probability of accepting smokers
Note: The dependent variable log(rental price) is the natural logarithm of weekly rent. Probability of not accepting smokers is the implied probability of not accepting smokers. Number of bedrooms, Number of floors, and Number of bathrooms are continuous variables. Refurbished is a dummy variable that takes the value of 1 if the property is refurbished and 0 otherwise. Furnished is a dummy variable that takes the value of 1 if the property is furnished and 0 otherwise. Property type categories are Detached house, Terraced house, Flat, and Other (each is a dummy variable taking the value of 1 if the property belongs to the relevant category and 0 otherwise). Other includes property types such as a bungalow, cottage, lodge, etc. All specifications include constant, county, and year dummies. Robust standard errors in parentheses. ***, **, and * correspond to 1%, 5%, and 10% statistical significance levels, respectively.
TABLE A2 Coarsened exact matching: match statistics (based on county-level geographical data)

TABLE A3 Coarsened exact matching: match statistics (based on 11 NUTS1 regions-level geographical data)
Matching summary when region is used instead of county