Data can be used in HASH to instantiate agents in a simulation, to set the properties of agents, or to influence any part of the simulation.
HASH allows you to incorporate your own data into simulations, or to use datasets downloaded from hIndex.
To import your own datasets into HASH, click the 'Add Dataset' button above the list of files in hCore, and select a CSV or JSON dataset from your computer to upload:
Once uploaded, the dataset will be listed in your simulation's file list.
There are also third-party datasets published in hIndex that you can add to your project by searching in the "Add to Simulation" interface in the bottom left-hand corner of hCore.
Coming soon: data syncing from remote sources is currently only achievable through HASH Engine, but will soon be available in HASH Core as well.
Let us know which integrations you'd like to see natively supported by completing our data connectors survey.
HASH parses imported datasets and generates a new field in context.data() named after the file. This field contains the contents of the datasets associated with the simulation. At this time HASH supports datasets imported in CSV or JSON format. For example, a CSV file with name and age columns would be parsed into an array of rows, with the header as the first row and numeric strings converted into numbers:
[
["name", "age"],
["Bob", 32],
["Alice", 58],
];
To access a dataset, use its path as a key on context.data(). You can find the path by right-clicking the file in your files list and clicking 'copy path to clipboard'.
// Access a dataset in your simulation
context.data()["dataset-path.csv"];
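For instance, a behavior might read the parsed rows of a CSV and store them on the agent. The sketch below is illustrative only: the dataset path, column layout, and field names are assumptions, not taken from an existing project.

// Minimal sketch: read a parsed CSV inside a behavior.
// "people.csv" and its columns ("name", "age") are hypothetical.
const behavior = (state, context) => {
  const rows = context.data()["people.csv"];
  // The first row holds the headers, so skip it when building records
  state.people = rows.slice(1).map(([name, age]) => ({ name, age }));
};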
Coming soon: we will be streamlining this process shortly, providing more options for how datasets are parsed, and expanding the types of datasets HASH can ingest.
If you wish to explore the universe of data available in HASH outside of hCore, you can do so directly within hIndex. As with behaviors, we encourage you to tag data in hIndex with the type of 'Thing' it represents. This ensures that the data can subsequently be easily discovered and reused.
Initializing agents is one of the most common uses of data in HASH. In the city infection model you can see an example of using data to create agents with heterogeneous values.
The simulation contains a file, sf900homes100offices.csv, which, as the name suggests, contains listings of 900 homes and 100 offices. Each row describes a different building with its own lat/lng location:
   use_def                     neighborhood     lat                long
0  Single Family Residential   Sunset/Parkside  -122.502183895904  37.763653457648
1  Single Family Residential   Bernal Heights   -122.4170591352    37.747528129366
An accompanying behavior, gis_data_upload.js, imports the data, transforms it (e.g. cleaning the data and parsing the coordinates into floats), and then pushes the data as objects into an array.
// Load the parsed CSV from the simulation's data
let gis_data = context.data()["@b/property_data/sf900homes100offices.csv"];
// ...
// Map each selected row to an object, parsing the coordinates into floats
let json_data = selected.map(row => ({
  "use_def": row[0],
  "neighborhood": row[1],
  "lat": parseFloat(row[2]),
  "lon": parseFloat(row[3]),
  "type": transform_type(row[0]) // derive an agent type from the use_def column
}));
// ...
// Push each object into the agents array
json_data.forEach(e => agents.push(e));
A third behavior, create_agents.js, then iterates through the agents array and initializes the agents.
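The model's actual behavior may differ, but a minimal sketch of this pattern, assuming the previous behavior stored the building objects on state.agents and that new agents are created by messaging the reserved hash keyword, could look like:

// Minimal sketch of an agent-creation behavior (illustrative, not the
// model's exact code).
const behavior = (state, context) => {
  state.agents.forEach(agent => {
    // Ask the engine to create a new agent at the building's location
    state.addMessage("hash", "create_agent", {
      ...agent,
      position: [agent.lon, agent.lat]
    });
  });
};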
Now the simulation has a collection of agents with unique positions derived from real-world data.
This section on hydrating agents' properties with external data is coming soon.
Datasets can be used to calibrate a model, finding the parameters that best match the real world. Upload a dataset and create an optimization experiment that reduces the error between a simulation run and the dataset - the HASH optimization engine will automatically identify the parameter values that reduce the error the most.
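As a rough illustration, a behavior could compute the error between a simulated value and the corresponding row of an uploaded dataset, which an optimization experiment can then minimize. The dataset name, column layout, and field names below are assumptions:

// Sketch: compare a simulated value against an uploaded dataset.
// "observed.csv" and the infected_count field are hypothetical.
const behavior = (state, context) => {
  const observed = context.data()["observed.csv"];
  // Track the current step on the agent itself (row 0 of the CSV is the header)
  state.t = (state.t || 0) + 1;
  if (state.t < observed.length) {
    const target = parseFloat(observed[state.t][1]);
    // Squared error between the simulated and observed values
    state.error = (state.infected_count - target) ** 2;
  }
};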
Read more and see an example in the Complex Metrics section on Validating and Calibrating.