Collections

A collection stored in a single file can always be loaded with opencosmo.open(). Collections behave like read-only dictionaries: keys() returns the dataset names, values() or collection[key] returns the datasets, and items() iterates over (name, dataset) pairs.
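The dictionary-style access described above can be sketched with a small stand-in. This is only an illustration: ToyCollection is a hypothetical class built on Python's collections.abc.Mapping, not the OpenCosmo implementation, and the dataset names and contents are made up.

```python
from collections.abc import Mapping


class ToyCollection(Mapping):
    """Hypothetical stand-in for an opened collection (not the OpenCosmo class)."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def __getitem__(self, key):
        return self._datasets[key]

    def __iter__(self):
        return iter(self._datasets)

    def __len__(self):
        return len(self._datasets)


# Made-up dataset names; plain lists stand in for datasets.
collection = ToyCollection({"halo_properties": [1, 2, 3], "dm_particles": [4, 5]})

names = list(collection.keys())                # dataset names
first = collection["halo_properties"]          # access by key
sizes = {name: len(dataset) for name, dataset in collection.items()}

# Mapping defines no __setitem__, so assignment raises TypeError (read-only).
```

Because the stand-in only implements the Mapping interface, reading works through keys(), values(), items() and indexing, while item assignment is rejected.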

class opencosmo.SimulationCollection(datasets)

A collection of datasets of the same type from different simulations. In general this exposes the exact same API as the individual datasets, but maps the results across all of them.

Parameters:

datasets (Mapping[str, oc.Dataset | Collection])
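As a rough mental model of "mapping the results across all of them", the sketch below applies one method to every member and collects the results under the same names. ToyDataset, ToySimulationCollection, map() and take_first() are all hypothetical names invented for this illustration; they are not part of OpenCosmo.

```python
class ToyDataset:
    """Hypothetical dataset stand-in: just a list of rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def take_first(self, n):
        # Return a new dataset with at most the first n rows.
        return ToyDataset(self.rows[:n])

    def __len__(self):
        return len(self.rows)


class ToySimulationCollection(dict):
    """Hypothetical collection: simulation name -> ToyDataset."""

    def map(self, method_name, *args, **kwargs):
        # Call the same method on every member and keep the original keys.
        return ToySimulationCollection(
            {
                name: getattr(dataset, method_name)(*args, **kwargs)
                for name, dataset in self.items()
            }
        )


sims = ToySimulationCollection(
    {"sim_a": ToyDataset(range(10)), "sim_b": ToyDataset(range(4))}
)
subset = sims.map("take_first", 5)
lengths = {name: len(dataset) for name, dataset in subset.items()}
```

The result is a new collection with the same keys, which mirrors how the methods documented below return a new SimulationCollection rather than modifying the original.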

make_schema()
Return type:

DataSchema

property cosmology: dict[str, Cosmology]

Get the cosmologies of the simulations in the collection

Returns:

cosmologies

Return type:

dict[str, astropy.cosmology.Cosmology]

property redshift: dict[str, float | tuple[float, float]]

Get the redshift slices or ranges for the simulations in the collection

Returns:

redshifts

Return type:

dict[str, float | tuple[float,float]]

property simulation: dict[str, SimulationParameters]

Get the simulation parameters for the simulations in the collection

Returns:

simulation_parameters

Return type:

dict[str, opencosmo.parameters.SimulationParameters]

bound(region, select_by=None)

Restrict the datasets to some region. Note that the SimulationCollection does not do any checking to ensure its members have identical boxes. As a result this method can in principle fail for some of the simulations in the collection and not others. This should never happen when working with official OpenCosmo data products.

See Regions for details of how to construct regions.

Parameters:
  • region (opencosmo.spatial.Region) – The region to query

  • select_by (str | None)

Returns:

dataset – The portion of the dataset inside the selected region

Return type:

opencosmo.Dataset

filter(*masks, **kwargs)

Filter the datasets in the collection. This method behaves exactly like opencosmo.Dataset.filter() or opencosmo.StructureCollection.filter(), but it applies the filter to all the datasets or collections within this collection. The result is a new collection.

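Parameters:
  • masks – The filters to apply, passed to the underlying filter method.

  • kwargs – Keyword arguments passed to the underlying filter method.
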
Returns:

A new collection with the same datasets, but containing only the rows that pass the filter.

Return type:

SimulationCollection

select(*args, **kwargs)

Select a subset of the columns from the datasets in the collection. This method calls the underlying method in opencosmo.Dataset or opencosmo.Collection, depending on the context. As such, its behavior and arguments can vary depending on what this collection contains.

Parameters:
  • args – The arguments to pass to the select method. This is usually a list of column names to select.

  • kwargs – The keyword arguments to pass to the select method. This is usually a dictionary of column names to select.

Return type:

SimulationCollection

take(n, at='random')

Take a subset of rows from all datasets or collections in this collection. This method delegates to the underlying method in opencosmo.Dataset or opencosmo.StructureCollection, depending on the context. As such, behavior may vary depending on what this collection contains. See their documentation for more information.

Parameters:
  • n (int) – The number of rows to take

  • at (str, default = "random") – The method to use to take rows. Must be one of “start”, “end”, “random”.

Return type:

SimulationCollection

with_new_columns(*args, **kwargs)

Update the datasets within this collection with a set of new columns. This method simply calls opencosmo.Dataset.with_new_columns() or opencosmo.StructureCollection.with_new_columns(), as appropriate.

with_units(convention)

Transform all datasets or collections to use the given unit convention. This method behaves exactly like opencosmo.Dataset.with_units().

Parameters:

convention (str) – The unit convention to use. One of “unitless”, “scalefree”, “comoving”, or “physical”.

Return type:

SimulationCollection

class opencosmo.StructureCollection(source, header, datasets, links, *args, **kwargs)

A collection of datasets that contain both high-level properties and lower-level information (such as particles) for structures in the simulation. Currently these structures include halos and galaxies.

For now, these are always a combination of a properties dataset and several particle or profile datasets.

Parameters:
  • source (oc.Dataset)

  • header (oc.header.OpenCosmoHeader)

  • datasets (dict[str, oc.Dataset | StructureCollection])

  • links (dict[str, LinkedDatasetHandler])

property cosmology: Cosmology

The cosmology of the structure collection

property redshift: float | tuple[float, float]

For snapshots, return the redshift or redshift range this dataset was drawn from.

Returns:

redshift

Return type:

float | tuple[float, float]

property simulation: SimulationParameters

Get the parameters of the simulation this dataset is drawn from.

Returns:

parameters

Return type:

opencosmo.parameters.SimulationParameters

keys()

Return the names of the datasets in this collection.

Return type:

list[str]

values()

Return the datasets in this collection.

Return type:

list[Dataset | StructureCollection]

items()

Return the names and datasets as key-value pairs.

Return type:

Generator[tuple[str, Dataset | StructureCollection], None, None]

property region

bound(region, select_by=None)

Restrict this collection to only contain structures in the specified region. Querying will be done based on the halo or galaxy centers, meaning some particles may fall outside the given region.

See Regions for details of how to construct regions.
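The consequence of querying by centers can be illustrated with a toy one-dimensional example. Everything here is hypothetical (the bound_by_center helper, the dictionary layout, and the numbers); it only shows why a selected structure may carry particles that lie outside the region.

```python
# Toy structures: a center position plus the positions of linked particles.
halos = [
    {"id": 0, "center": 4.5, "particles": [3.9, 4.6, 5.2]},
    {"id": 1, "center": 7.0, "particles": [6.8, 7.1]},
]

# Toy 1D "region": the interval [region_min, region_max].
region_min, region_max = 0.0, 5.0


def bound_by_center(structures, lower, upper):
    """Keep structures whose center lies in [lower, upper]."""
    return [s for s in structures if lower <= s["center"] <= upper]


selected = bound_by_center(halos, region_min, region_max)
selected_ids = [s["id"] for s in selected]

# Particles of selected halos that nevertheless fall outside the region.
outside = [
    x
    for s in selected
    for x in s["particles"]
    if not region_min <= x <= region_max
]
```

Halo 0 is kept because its center is inside the interval, yet one of its particles (at 5.2) lies beyond the upper edge.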

Parameters:
  • region (opencosmo.spatial.Region)

  • select_by (str | None)

Returns:

dataset – The portion of the dataset inside the selected region

Return type:

opencosmo.Dataset

Raises:
  • ValueError – If the query region does not overlap with the region this dataset resides in

  • AttributeError – If the dataset does not contain a spatial index

filter(*masks, on_galaxies=False)

Apply a filter to the halo or galaxy properties. Filters are constructed with opencosmo.col() and behave exactly as they would in opencosmo.Dataset.filter().

If the collection contains both halos and galaxies, the filter can be applied to the galaxy properties dataset by setting on_galaxies=True. Note, however, that this keeps every halo that hosts at least one galaxy matching the filter. As a result, galaxies that do not match the filter will remain if another galaxy in their host halo does match.

See Querying In Collections for some examples.
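The on_galaxies behavior described above can be sketched with plain Python data. This is a toy model under stated assumptions: filter_on_galaxies, the field names, and the values are hypothetical and do not reflect how OpenCosmo stores or links its data.

```python
halos = [{"halo_id": 0}, {"halo_id": 1}]
galaxies = [
    {"gal_id": 10, "host": 0, "mass": 5.0},
    {"gal_id": 11, "host": 0, "mass": 0.5},
    {"gal_id": 12, "host": 1, "mass": 0.2},
]


def filter_on_galaxies(halos, galaxies, predicate):
    """Keep halos hosting at least one matching galaxy, with all their galaxies."""
    matching_hosts = {g["host"] for g in galaxies if predicate(g)}
    kept_halos = [h for h in halos if h["halo_id"] in matching_hosts]
    kept_galaxies = [g for g in galaxies if g["host"] in matching_hosts]
    return kept_halos, kept_galaxies


kept_halos, kept_galaxies = filter_on_galaxies(
    halos, galaxies, lambda g: g["mass"] > 1.0
)
kept_halo_ids = [h["halo_id"] for h in kept_halos]
kept_galaxy_ids = [g["gal_id"] for g in kept_galaxies]
```

Galaxy 11 does not pass the mass cut, but it survives because galaxy 10 in the same host halo does.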

Parameters:
  • *masks (Mask) – The filters to apply to the properties dataset, constructed with opencosmo.col().

  • on_galaxies (bool, optional) – If True, the filter is applied to the galaxy properties dataset.

Returns:

A new collection filtered by the given masks.

Return type:

StructureCollection

Raises:

ValueError – If on_galaxies is True but the collection does not contain a galaxy properties dataset.

select(columns, dataset=None)

Update the linked collection to only include the columns specified in the given dataset. If no dataset is specified, the properties dataset is used.

Parameters:
  • columns (str | Iterable[str]) – The columns to select from the dataset.

  • dataset (str, optional) – The dataset to select from. If None, the properties dataset is used.

Returns:

A new collection with only the selected columns for the specified dataset.

Return type:

StructureCollection

Raises:

ValueError – If the specified dataset is not found in the collection.

with_units(convention)

Apply the given unit convention to the collection. See opencosmo.Dataset.with_units()

Parameters:

convention (str) – The unit convention to apply. One of “unitless”, “scalefree”, “comoving”, or “physical”.

Returns:

A new collection with the unit convention applied.

Return type:

StructureCollection

take(n, at='random')

Take some number of structures from the collection. See opencosmo.Dataset.take().

Parameters:
  • n (int) – The number of structures to take from the collection.

  • at (str, optional) – The method to use to take the structures. One of “random”, “first”, or “last”. Default is “random”.

Returns:

A new collection with the structures taken from the original.

Return type:

StructureCollection

take_range(start, end)
Parameters:
  • start (int)

  • end (int)

with_new_columns(dataset, **new_columns)

Add new column(s) to one of the datasets in this collection. This behaves exactly like oc.Dataset.with_new_columns(), except that you must specify which dataset the columns should refer to.

pe = oc.col("phi") * oc.col("mass")
collection = collection.with_new_columns("dm_particles", pe=pe)

Structure collections can hold other structure collections. For example, a collection of halos may hold a structure collection that contains the galaxies of those halos. To update datasets within these collections, use dot syntax to specify a path:

pe = oc.col("phi") * oc.col("mass")
collection = collection.with_new_columns("galaxies.star_particles", pe=pe)

See Creating New Columns in Collections for examples.
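One way to picture the dot syntax is as a walk through nested mappings, one path segment at a time. The resolve() helper and the nested dictionary below are hypothetical and serve only to illustrate the idea; OpenCosmo's internal lookup may work differently.

```python
def resolve(collection, path):
    """Follow a dotted path such as "galaxies.star_particles" through nested dicts."""
    node = collection
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            raise ValueError(f"Dataset {path!r} not found in this collection")
        node = node[part]
    return node


# Toy nested collection: strings stand in for datasets.
toy_collection = {
    "dm_particles": "dataset: dm_particles",
    "galaxies": {
        "galaxy_properties": "dataset: galaxy_properties",
        "star_particles": "dataset: star_particles",
    },
}

top_level = resolve(toy_collection, "dm_particles")
nested = resolve(toy_collection, "galaxies.star_particles")
```

A path with an unknown segment raises ValueError in this sketch, matching the error documented for a dataset that is not found.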

Parameters:
  • dataset (str) – The name of the dataset to add columns to

  • new_columns (DerivedColumn) – The new columns, passed as keyword arguments

Returns:

new_collection – This collection with the additional columns added

Return type:

opencosmo.StructureCollection

Raises:

ValueError – If the dataset is not found in this collection

with_index(index)
Parameters:

index (DataIndex)

objects(data_types=None)

Iterate over the objects in this collection as pairs of (properties, datasets). For example, a halo collection could yield the properties of each halo together with the datasets for its associated particles.

If you don’t need all the datasets, you can specify a list of data types, for example:

for row, particles in collection.objects(
    data_types=["gas_particles", "star_particles"]
):
    # do work
    ...

At each iteration, “row” will be a dictionary of halo properties with associated units, and “particles” will be a dictionary of datasets with the same keys as the data types.
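The shape of what each iteration yields can be mimicked with a small generator. This is a toy model: iter_objects, the field names, and the data are hypothetical, and plain Python values stand in for the unit-carrying properties and linked datasets of the real library.

```python
def iter_objects(properties, linked, data_types=None):
    """Yield (row, datasets) pairs, optionally restricted to some data types."""
    names = list(linked) if data_types is None else list(data_types)
    for index, row in enumerate(properties):
        yield row, {name: linked[name][index] for name in names}


# Toy data: one property row per halo, one particle list per halo and type.
properties = [{"halo_id": 0, "mass": 1.0e13}, {"halo_id": 1, "mass": 4.0e12}]
linked = {
    "dm_particles": [[0.1, 0.2], [0.3]],
    "gas_particles": [[0.4], [0.5, 0.6]],
    "star_particles": [[], [0.7]],
}

counts = []
for row, particles in iter_objects(
    properties, linked, data_types=["gas_particles", "star_particles"]
):
    # "particles" has exactly the requested data types as keys.
    per_type = {name: len(values) for name, values in particles.items()}
    counts.append((row["halo_id"], per_type))
```

Restricting data_types keeps the "dm_particles" entry out of the per-object dictionary entirely.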

Parameters:

data_types (Iterable[str] | None)

Return type:

Iterable[dict[str, Any]]

halos(*args, **kwargs)

Alias for “objects” in the case that this StructureCollection contains halos.

galaxies(*args, **kwargs)

Alias for “objects” in the case that this StructureCollection contains galaxies.

make_schema()
Return type:

StructCollectionSchema