The Space team has made the following datasets and collections publicly available. You must be a logged-in member of the Space to access all the datasets and collections.
Viewing most recent datasets
The memorandum and the appendix of my AI project
These are the datasets used for the project. Here is the original link: https://www.kaggle.com/datasets/taweilo/capital-bikeshare-dataset-202005202408
A Supervised and Clustered Dataset is included along with my combined datasets.
Attributes:
- date (observation date)
- pickup_counts (number of bikes rented per day)
- dropoff_counts (number of bikes dropped off per day)
- tempmax (maximum daily temperature, °F)
- tempmin (minimum daily temperature, °F)
- humidity (average daily humidity, %)
- precip (total daily precipitation, inches)
- windspeed (average daily wind speed, mph)
- weekday (numeric day of week, 1 = Monday through 7 = Sunday)
- month (numeric month, 1-12)
- holiday (1 = federal or major holiday, 0 = normal day)
- total_usage (total usage of a station)
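The attribute list above can be turned into a small type and range check when loading the file. The following is a minimal sketch, not the project's actual loading code: it assumes the CSV header uses exactly the attribute names listed, that dates are in YYYY-MM-DD format, and that count columns are stored as plain integers. The sample row contains made-up placeholder values, not real observations.

```python
import csv
import io
from datetime import date

# Column types inferred from the attribute list; these are assumptions.
INT_COLS = ("pickup_counts", "dropoff_counts", "weekday", "month",
            "holiday", "total_usage")
FLOAT_COLS = ("tempmax", "tempmin", "humidity", "precip", "windspeed")


def parse_row(row):
    """Convert one CSV row (dict of strings) to typed values and check ranges."""
    out = {"date": date.fromisoformat(row["date"])}  # assumes YYYY-MM-DD
    for col in INT_COLS:
        out[col] = int(row[col])
    for col in FLOAT_COLS:
        out[col] = float(row[col])
    # Ranges documented in the attribute list.
    if not 1 <= out["weekday"] <= 7:
        raise ValueError(f"weekday out of range: {out['weekday']}")
    if not 1 <= out["month"] <= 12:
        raise ValueError(f"month out of range: {out['month']}")
    if out["holiday"] not in (0, 1):
        raise ValueError(f"holiday must be 0 or 1: {out['holiday']}")
    if out["tempmin"] > out["tempmax"]:
        raise ValueError("tempmin exceeds tempmax")
    return out


# Placeholder row with invented values, used only to exercise the parser.
SAMPLE = (
    "date,pickup_counts,dropoff_counts,tempmax,tempmin,humidity,"
    "precip,windspeed,weekday,month,holiday,total_usage\n"
    "2024-01-01,100,98,45.0,30.0,60.0,0.0,8.5,1,1,1,198\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

To check a real file, replace `io.StringIO(SAMPLE)` with an open file handle; any row that violates a documented range raises a `ValueError`.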
https://www.dallasopendata.com/Archive/Dallas-Police-Public-Data-RMS-Incidents-with-GeoLo/4ea4-q4ui/about_data
https://www.dallasopendata.com/dataset/Geolocation-2016/2byq-ux7x/about_data
https://www.dallasopendata.com/Public-Safety/High-Crash-Rate-Intersections-in-Dallas/cyd9-x7py/about_data
Viewing most recent collections
Datasets used to predict vector-borne disease risk based on weather patterns and weekly disease reports.
This Collection contains the datasets used for the project, along with my memo and any appendices or extra data used.
Optimizing Campus Coffee Shop Operations: Reducing Waste and Improving Efficiency
Below are my datasets and their attributes (GPT-5 used for formatting):
Dataset Name: Construction Estimation Data
Source Link: https://www.kaggle.com/datasets/sasakitetsuya/construction-estimation-data
Description: A simulated dataset of 1,000 construction project cost estimates including cost components and pricing adjustments.
Attributes:
material_cost – Estimated material cost for each project (USD)
labor_cost – Estimated labor cost for each project (USD)
profit_rate – Contractor markup percentage (%)
discount_or_markup – Additional price adjustment applied (USD)
policy_reason – Text category describing the reason for markup/discount
total_estimate – Final estimated project cost (USD)
Instances: 1,000 projects
Units: USD & Percentage
Spatial Scope: National (synthetic)
Temporal Scope: Static cross-section
Purpose: Core supervised ML dataset used to train models predicting final construction cost based on cost inputs and pricing policy factors.
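Since the Purpose above describes predicting total_estimate from the cost inputs and the policy_reason category, the rows need to be converted into numeric features first. The following is a minimal sketch of one way to do that, assuming the column names match the attribute list; the `build_features` helper, the category labels `reason_a` and `reason_b`, and all numeric values are hypothetical placeholders, not taken from the dataset.

```python
NUMERIC = ("material_cost", "labor_cost", "profit_rate", "discount_or_markup")
TARGET = "total_estimate"


def build_features(rows):
    """Return (X, y, categories) with policy_reason one-hot encoded.

    rows is a list of dicts keyed by column name, e.g. from csv.DictReader.
    """
    # Sort the categories so the one-hot column order is reproducible.
    categories = sorted({r["policy_reason"] for r in rows})
    X, y = [], []
    for r in rows:
        numeric = [float(r[c]) for c in NUMERIC]
        one_hot = [1.0 if r["policy_reason"] == c else 0.0 for c in categories]
        X.append(numeric + one_hot)
        y.append(float(r[TARGET]))
    return X, y, categories


# Invented placeholder rows; the real category labels are in the Kaggle file.
rows = [
    {"material_cost": "1000", "labor_cost": "500", "profit_rate": "10",
     "discount_or_markup": "-50", "policy_reason": "reason_a",
     "total_estimate": "1600"},
    {"material_cost": "2000", "labor_cost": "800", "profit_rate": "12",
     "discount_or_markup": "100", "policy_reason": "reason_b",
     "total_estimate": "3236"},
]

X, y, categories = build_features(rows)
```

The resulting `X` and `y` lists can be passed to whichever regression library the project uses; one-hot encoding is only one option for the text category.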
__
Dataset Name: U.S. Construction Spending Dataset
Source Link: https://www.kaggle.com/datasets/shashwatwork/construction-spending-dataset/data
Description: This dataset contains historical U.S. construction spending data across multiple market segments. It includes monthly spending totals for public, private, and total construction activity. The dataset provides insight into macro-level construction trends over time and serves as an external economic context dataset supporting project-level cost estimation.
Attributes:
Date – Month and year of reported spending
Total Construction Spending – Aggregate U.S. construction spending (in USD millions)
Private Construction Spending – Spending on private-sector construction projects (USD millions)
Public Construction Spending – Spending on publicly funded construction projects (USD millions)
(Column names may vary slightly from the names above depending on the file; adjust them after loading.)
Instances: Monthly observations across multiple years
Units: U.S. Dollars (millions)
Spatial Scope: United States, national-level data
Temporal Scope: Monthly time series
Purpose: To incorporate real-world construction market activity trends into the modeling process by adding a macro-economic indicator that reflects industry spending levels and demand cycles.
Justification: The primary Kaggle construction cost dataset includes project-level variables such as material cost, labor cost, and profit factors, which directly influence individual project estimates. The Construction Spending dataset adds external industry context by providing monthly U.S. construction spending trends. Including this variable supports a more realistic modeling approach by aligning project-level cost estimates with broader construction market activity and economic conditions.
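Because the attribute names listed for the U.S. Construction Spending Dataset may not match the file's headers exactly, a quick header check after loading can flag mismatches. The following is a minimal sketch under that assumption; the `normalize` and `missing_columns` helpers and the snake_case convention are illustrative choices, not part of the dataset.

```python
import re

# Attribute names from the listing, in normalized snake_case form.
EXPECTED = {
    "date",
    "total_construction_spending",
    "private_construction_spending",
    "public_construction_spending",
}


def normalize(name):
    """Lowercase a header and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def missing_columns(header):
    """Return the expected attributes not found in the file's header row."""
    found = {normalize(h) for h in header}
    return sorted(EXPECTED - found)


# Example header with spacing and case differences, and one column absent.
example_header = ["Date", " total construction spending ",
                  "Private Construction Spending"]
missing = missing_columns(example_header)
```

Any names returned by `missing_columns` indicate headers that need to be renamed or mapped by hand before the attributes above can be used.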
The following datasets have been published through this Space and any affiliated Spaces.
| Collections | 6 |
| Datasets | 11 |
| Files | 15 |
| Bytes | 321.6 MB |
| Users | 8 |
No External Links
PUBLIC