The Space team has made the following datasets publicly available.
This repository contains the full workflow, data, and code used to develop a machine-learning framework for accurate construction cost prediction that integrates micro-level project estimates with macro-level U.S. construction spending indicators. The project includes two source datasets: (1) construction_estimates.csv, containing project-specific material cost, labor cost, profit rate, discount/markup, and total cost, and (2) construction_spending.csv, containing national monthly spending across public, private, residential, and non-residential sectors. A third file, merged_construction_dataset.csv, combines both sources into a unified feature set used for feature selection (RFE, SelectKBest, correlation filtering) and supervised modeling (Linear Regression and Random Forest). All scripts for preprocessing, ARFF conversion, model training, evaluation, and figure generation are included, providing a reproducible pipeline for researchers and practitioners interested in improving construction cost forecasting.
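As a rough illustration of the feature-selection and modeling steps named above, the following Python sketch uses scikit-learn on synthetic data. It is not the repository's own script: the column names, the synthetic values, the 0.9 correlation threshold, the choice of k=3, and the use of a random forest inside RFE are all assumptions made for the example. In practice the DataFrame would be loaded from merged_construction_dataset.csv instead.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-in for the merged dataset; all column names are hypothetical.
df = pd.DataFrame({
    "material_cost": rng.uniform(1_000, 50_000, n),
    "labor_cost": rng.uniform(500, 30_000, n),
    "profit_rate": rng.uniform(0.05, 0.30, n),
    "markup": rng.uniform(-0.05, 0.10, n),
    "public_spending": rng.uniform(20_000, 40_000, n),
    "private_spending": rng.uniform(80_000, 120_000, n),
})
base = df["material_cost"] + df["labor_cost"]
df["total_cost"] = (
    base * (1 + df["profit_rate"]) * (1 + df["markup"]) + rng.normal(0, 500, n)
)

X = df.drop(columns="total_cost")
y = df["total_cost"]

# Correlation filtering: drop one feature from each pair with |r| > 0.9 (assumed threshold).
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X = X.drop(columns=to_drop)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SelectKBest: univariate F-test scores, keep the top 3 features.
kbest = SelectKBest(score_func=f_regression, k=3).fit(X_train, y_train)
kbest_features = list(X.columns[kbest.get_support()])

# RFE: recursively remove the least important feature until 3 remain.
# A random forest is used as the ranking estimator so the result does not depend on feature scale.
rfe = RFE(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    n_features_to_select=3,
).fit(X_train, y_train)
rfe_features = list(X.columns[rfe.get_support()])

# Fit both model types on the SelectKBest subset and compare held-out R^2.
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train[kbest_features], y_train)
    scores[name] = r2_score(y_test, model.predict(X_test[kbest_features]))

print("SelectKBest features:", kbest_features)
print("RFE features:", rfe_features)
print("Held-out R^2:", scores)
```

The selectors are fitted on the training split only, so the held-out scores are not influenced by the test rows during feature selection.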
Merged dataset combining the project-level cost estimates with the U.S. construction spending indicators.
This dataset contains historical U.S. construction spending data across multiple market segments. It includes monthly spending totals for public, private, and total construction activity. The dataset provides insight into macro-level construction trends over time and serves as an external economic context dataset supporting project-level cost estimation.
Raw CSV file of construction project cost estimates, sourced from Kaggle. It includes material, labor, profit, and total cost values.