Rows with a missing car column were dropped via dropna(). This eliminated ~2,400 rows where the scraper had returned incomplete records.

Invalid price values were converted to NaN. Missing prices were then imputed using the mean price per manufacturer + model + year group, preserving realistic price estimates.

The fuel_consumption column contained mixed types and missing values. It was converted to numeric, then missing values were filled using a cascade strategy: (1) mean by manufacturer + year + engine volume, (2) mean by manufacturer + year, (3) mean by manufacturer + motor type, and finally (4) the overall dataset mean. Electric cars were assigned a consumption of 0.

The test_date and on_street_date columns had partial missing values. A cross-fill strategy was applied: if one date was missing, it was inferred from the other. Remaining gaps were filled using the car's model year, and both columns were then converted to proper datetime objects.

Feature engineering added a z_price column (price z-score for outlier detection), an is_electric binary flag, and a price_category column using quantile-based binning into three tiers: cheap, medium, and expensive.

The cleaned dataset was saved as cleaned_full_cars_data_v4.csv in UTF-8 encoding and used as the input for all exploratory analysis.

This project demonstrates a complete data science workflow: from raw web-scraped data with mixed Hebrew/English text, through multi-step cleaning and imputation, to rich exploratory visualizations. It covers distribution analysis, manufacturer comparisons, price modelling, feature correlation, and pairwise relationship analysis, all implemented in Python using Pandas, Matplotlib, and Seaborn.
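The cascade imputation for fuel_consumption described above can be sketched in pandas as follows. The column names (manufacturer, year, engine_volume, motor_type, fuel_consumption) follow the description, but the toy data is invented for illustration; the actual cleaning code may differ.

```python
import pandas as pd
import numpy as np

# Toy data illustrating the cascade (values are invented).
df = pd.DataFrame({
    "manufacturer": ["Mazda", "Mazda", "Mazda", "Toyota", "Tesla"],
    "year": [2018, 2018, 2019, 2020, 2021],
    "engine_volume": [1600, 1600, 2000, 1800, np.nan],
    "motor_type": ["gasoline", "gasoline", "gasoline", "hybrid", "electric"],
    "fuel_consumption": [np.nan, 14.0, np.nan, 20.0, np.nan],
})

# Coerce the mixed-type column to numeric; unparseable entries become NaN.
df["fuel_consumption"] = pd.to_numeric(df["fuel_consumption"], errors="coerce")

# Cascade: each step fills only what the previous steps left missing.
for keys in (["manufacturer", "year", "engine_volume"],
             ["manufacturer", "year"],
             ["manufacturer", "motor_type"]):
    group_mean = df.groupby(keys)["fuel_consumption"].transform("mean")
    df["fuel_consumption"] = df["fuel_consumption"].fillna(group_mean)

# Final fallbacks: overall dataset mean, then 0 for electric cars.
df["fuel_consumption"] = df["fuel_consumption"].fillna(df["fuel_consumption"].mean())
df.loc[df["motor_type"] == "electric", "fuel_consumption"] = 0.0
```

Because groupby drops rows whose group keys are NaN, cars with a missing engine_volume simply fall through to the coarser groupings, which is what makes the cascade ordering matter.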
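The date cross-fill step described above might look like this in pandas. The column names test_date and on_street_date and the model-year fallback come from the description; the sample rows and the choice of January 1 as the fallback day are assumptions.

```python
import pandas as pd

# Toy rows illustrating the cross-fill (dates are invented).
df = pd.DataFrame({
    "year": [2019, 2020, 2021],
    "test_date": ["2022-05-01", None, None],
    "on_street_date": [None, "2021-03-15", None],
})

# Convert both columns to proper datetime objects (missing -> NaT).
df["test_date"] = pd.to_datetime(df["test_date"])
df["on_street_date"] = pd.to_datetime(df["on_street_date"])

# Cross-fill: if one date is missing, infer it from the other.
df["test_date"] = df["test_date"].fillna(df["on_street_date"])
df["on_street_date"] = df["on_street_date"].fillna(df["test_date"])

# Remaining gaps: fall back to the car's model year (assumed Jan 1 here).
fallback = pd.to_datetime(df["year"].astype(str) + "-01-01")
df["test_date"] = df["test_date"].fillna(fallback)
df["on_street_date"] = df["on_street_date"].fillna(fallback)
```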
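The three engineered features (z_price, is_electric, price_category) can be sketched as below. The prices and motor types are invented sample data; only the column names and the three-tier quantile scheme come from the description.

```python
import pandas as pd

# Invented sample prices to demonstrate the derived columns.
df = pd.DataFrame({
    "price": [30000, 45000, 60000, 90000, 150000, 210000],
    "motor_type": ["gasoline", "gasoline", "hybrid", "hybrid",
                   "electric", "electric"],
})

# Price z-score, used later for outlier detection.
df["z_price"] = (df["price"] - df["price"].mean()) / df["price"].std()

# Binary electric flag.
df["is_electric"] = (df["motor_type"] == "electric").astype(int)

# Quantile-based binning into three equal-sized tiers.
df["price_category"] = pd.qcut(df["price"], q=3,
                               labels=["cheap", "medium", "expensive"])
```

pd.qcut splits on the empirical quantiles rather than fixed price cut-offs, so each tier holds roughly a third of the cars regardless of how skewed the price distribution is.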