Construction Spending ML Model Dataset and Documentation

Datasets and documentation for ML modeling of macro and traditional construction spending trends.

The Space team has made the following datasets and collections publicly available. You must be a logged-in member of the Space to access all the datasets and collections.

Datasets

Project Memo

This repository contains the full workflow, data, and code used to develop a machine-learning framework that predicts construction costs by integrating micro-level project estimates with macro-level U.S. construction spending indicators. The project includes two source datasets: (1) construction_estimates.csv, containing project-specific material cost, labor cost, profit rate, discount/markup, and total cost, and (2) construction_spending.csv, containing national monthly spending across public, private, residential, and non-residential sectors. A third file, merged_construction_dataset.csv, combines both sources into a unified feature set used for feature selection (RFE, SelectKBest, correlation filtering) and supervised modeling (Linear Regression and Random Forest). Scripts for preprocessing, ARFF conversion, model training, evaluation, and figure generation are all included, giving researchers and practitioners a reproducible pipeline for construction cost forecasting.
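
As a rough illustration of the correlation-filtering step named above, the sketch below keeps only the features whose absolute Pearson correlation with the target meets a threshold. It is not the repository's script: the column names (material_cost, labor_cost, profit_rate, total_cost), the toy values, and the 0.5 threshold are all assumptions made up for this example, and the function names pearson and correlation_filter are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def correlation_filter(columns, target, threshold=0.5):
    """Return feature names whose |r| with the target is at least threshold."""
    y = columns[target]
    scores = {
        name: pearson(values, y)
        for name, values in columns.items()
        if name != target
    }
    return [name for name, r in scores.items() if abs(r) >= threshold]


# Toy data with hypothetical column names; values are invented for illustration.
columns = {
    "material_cost": [100, 200, 300, 400, 500],
    "labor_cost": [50, 80, 120, 150, 210],
    "profit_rate": [0.10, 0.10, 0.10, 0.10, 0.10],  # constant, so filtered out
    "total_cost": [165, 308, 462, 605, 781],
}

selected = correlation_filter(columns, target="total_cost", threshold=0.5)
print(selected)
```

A filter like this only scores each feature against the target on its own; wrapper methods such as RFE instead rank features by refitting a model, which is why the two can disagree on the same data.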

Construction Spending Dataset (.csv and .arff)

This dataset contains historical U.S. construction spending data across multiple market segments. It includes monthly spending totals for public, private, and total construction activity. The dataset provides insight into macro-level construction trends over time and serves as an external economic context dataset supporting project-level cost estimation.

Construction Cost Datasets (.csv and .arff)

Raw CSV file of construction project cost estimates, sourced from Kaggle. It includes material, labor, profit, and total cost values.
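
Both datasets are offered as .csv and .arff. For readers unfamiliar with ARFF (the text format used by Weka), the following is a minimal sketch of a CSV-to-ARFF conversion using only the Python standard library. It is an illustration under stated assumptions, not the conversion script shipped with the project: the sample columns (month, sector, spending) and their values are invented, every column is typed as either numeric or string, and empty cells are written as the ARFF missing-value marker "?".

```python
import csv
import io


def is_number(text):
    """True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def quote(text):
    """Single-quote a value, escaping embedded backslashes and quotes."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def csv_to_arff(csv_text, relation):
    """Convert CSV text (with a header row) to ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # A column is numeric when every non-empty cell parses as a number.
    numeric = [
        all(is_number(row[i]) for row in data if row[i] != "")
        for i in range(len(header))
    ]
    lines = [f"@relation {quote(relation)}", ""]
    for name, num in zip(header, numeric):
        lines.append(f"@attribute {quote(name)} {'numeric' if num else 'string'}")
    lines += ["", "@data"]
    for row in data:
        cells = []
        for value, num in zip(row, numeric):
            if value == "":
                cells.append("?")  # ARFF marker for a missing value
            elif num:
                cells.append(value)
            else:
                cells.append(quote(value))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical sample; the real files may use different columns.
sample = "month,sector,spending\n2020-01,public,100.5\n2020-02,private,\n"
arff = csv_to_arff(sample, "construction_spending")
print(arff)
```

Categorical columns such as a sector label are often declared as nominal attributes in Weka rather than strings; that refinement is left out here to keep the sketch short.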

Collections

There are no collections associated with this Space.

Statistics

Collections: 0
Datasets: 4
Files: 5
Bytes: 2.4 MB
Users: 2

External Links

No External Links

Access

PUBLIC