Collections

Collections within a single file can always be loaded with opencosmo.open(). Collections can be treated like read-only dictionaries. Dataset names can be retrieved with keys(), the datasets can be accessed with values() or Collection[key], and iteration can be done with items().
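To illustrate what "read-only dictionary" means in practice, here is a minimal sketch built on Python's collections.abc.Mapping. ToyCollection and the dataset names are hypothetical stand-ins, not part of opencosmo; the sketch only shows the access pattern described above, with key lookup, keys(), values() and items() available and item assignment rejected.

```python
from collections.abc import Mapping


class ToyCollection(Mapping):
    """Hypothetical stand-in for a collection; NOT the opencosmo class."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def __getitem__(self, key):
        return self._datasets[key]

    def __iter__(self):
        return iter(self._datasets)

    def __len__(self):
        return len(self._datasets)


collection = ToyCollection({"sim_a": [1, 2, 3], "sim_b": [4, 5]})

names = list(collection.keys())    # dataset names
first = collection["sim_a"]        # access a dataset by key
pairs = list(collection.items())   # (name, dataset) pairs

try:
    collection["sim_c"] = []       # Mapping defines no __setitem__
    assignment_allowed = True
except TypeError:
    assignment_allowed = False
```

Because Mapping supplies keys(), values() and items() from the three methods defined here, the toy class reads like a dictionary while offering no way to modify it.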

class opencosmo.SimulationCollection(datasets)

A collection of datasets of the same type from different simulations. In general this exposes the exact same API as the individual datasets, but maps the results across all of them.

Parameters:

datasets (Mapping[str, oc.Dataset | Collection])
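The idea of "mapping the results across all of them" can be sketched with plain Python objects. ToyDataset, ToySimulationCollection and take_first are hypothetical names used only for illustration and do not reflect how opencosmo is implemented. Properties come back as one value per simulation, and operations return a new collection with the operation applied to every member.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDataset:
    """Hypothetical stand-in for a dataset; NOT opencosmo.Dataset."""
    rows: tuple
    redshift: float

    def take_first(self, n):
        # Return a new dataset rather than modifying this one.
        return ToyDataset(self.rows[:n], self.redshift)


class ToySimulationCollection:
    """Hypothetical sketch: forwards each operation to every member."""

    def __init__(self, datasets):
        self.datasets = dict(datasets)

    @property
    def redshift(self):
        # One value per simulation, keyed by simulation name.
        return {name: ds.redshift for name, ds in self.datasets.items()}

    def take_first(self, n):
        # Apply the same operation to every member and wrap the results.
        return ToySimulationCollection(
            {name: ds.take_first(n) for name, ds in self.datasets.items()}
        )


sims = ToySimulationCollection({
    "sim_a": ToyDataset((1, 2, 3), 0.0),
    "sim_b": ToyDataset((4, 5, 6), 0.5),
})
subset = sims.take_first(2)
```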

make_schema()
Return type:

DataSchema

property cosmology: dict[str, Cosmology]

Get the cosmologies of the simulations in the collection

Returns:

cosmologies

Return type:

dict[str, astropy.cosmology.Cosmology]

property redshift: dict[str, float]

Get the redshift slices for the simulations in the collection

Returns:

redshifts

Return type:

dict[str, float]

property simulation: dict[str, SimulationParameters]

Get the simulation parameters for the simulations in the collection

Returns:

simulation_parameters

Return type:

dict[str, opencosmo.parameters.SimulationParameters]

bound(region, select_by=None)

Restrict the datasets to some region. Note that the SimulationCollection does not do any checking to ensure its members have identical boxes. As a result this method can in principle fail for some of the simulations in the collection and not others. This should never happen when working with official OpenCosmo data products.

See Regions for details of how to construct regions.

Parameters:
  • region (opencosmo.spatial.Region) – The region to query

  • select_by (str | None)

Returns:

dataset – The portion of the dataset inside the selected region

Return type:

opencosmo.Dataset

filter(*masks, **kwargs)

Filter the datasets in the collection. This method behaves exactly like opencosmo.Dataset.filter() or opencosmo.StructureCollection.filter(), but it applies the filter to all the datasets or collections within this collection. The result is a new collection.

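Parameters:
  • masks – The filters to apply to each dataset or collection.

  • kwargs – Keyword arguments forwarded to the underlying filter method.

Returns: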

A new collection with the same datasets, but only the rows that pass the filter.

Return type:

SimulationCollection

select(*args, **kwargs)

Select a subset of the datasets in the collection. This method calls the underlying method in opencosmo.Dataset or opencosmo.Collection, depending on the context. As such, its behavior and arguments can vary depending on what this collection contains.

Parameters:
  • args – The arguments to pass to the select method. This is usually a list of column names to select.

  • kwargs – The keyword arguments to pass to the select method. This is usually a dictionary of column names to select.

Return type:

SimulationCollection

take(n, at='random')

Take a subset of rows from all datasets or collections in this collection. This method delegates to the underlying method in opencosmo.Dataset or opencosmo.StructureCollection, depending on the context. As such, behavior may vary depending on what this collection contains. See their documentation for more info.

Parameters:
  • n (int) – The number of rows to take

  • at (str, default = "random") – The method to use to take rows. Must be one of “start”, “end”, “random”.

Return type:

SimulationCollection
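As a rough illustration of the three values of at listed above, the following sketch applies them to a plain Python list. take_rows is a hypothetical helper, not an opencosmo function, and the choice to sample without replacement while preserving row order is an assumption made for the example.

```python
import random


def take_rows(rows, n, at="random", seed=None):
    """Hypothetical helper illustrating the three 'at' modes on a list."""
    if n > len(rows):
        raise ValueError("cannot take more rows than are available")
    if at == "start":
        return rows[:n]
    if at == "end":
        return rows[len(rows) - n:]
    if at == "random":
        rng = random.Random(seed)
        # Assumption: sample without replacement, keeping original row order.
        indices = sorted(rng.sample(range(len(rows)), n))
        return [rows[i] for i in indices]
    raise ValueError(f"unknown value for 'at': {at!r}")


rows = list(range(10))
head = take_rows(rows, 3, at="start")   # first three rows
tail = take_rows(rows, 3, at="end")     # last three rows
sample = take_rows(rows, 3, at="random", seed=0)
```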

with_new_columns(*args, **kwargs)

Update the datasets within this collection with a set of new columns. This method simply calls opencosmo.Dataset.with_new_columns() or opencosmo.StructureCollection.with_new_columns(), as appropriate.

with_units(convention)

Transform all datasets or collections to use the given unit convention. This method behaves exactly like opencosmo.Dataset.with_units().

Parameters:

convention (str) – The unit convention to use. One of “unitless”, “scalefree”, “comoving”, or “physical”.

Return type:

SimulationCollection

class opencosmo.StructureCollection(properties, header, handlers, *args, **kwargs)

A collection of datasets that contain both high-level properties and lower level information (such as particles) for structures in the simulation. Currently these structures include halos and galaxies.

For now, these are always a combination of a properties dataset and several particle or profile datasets.

Parameters:
  • properties (oc.Dataset)

  • header (oc.header.OpenCosmoHeader)

  • handlers (dict[str, s.LinkedDatasetHandler])

property cosmology: Cosmology

The cosmology of the structure collection

property redshift: float | tuple[float, float]

For snapshots, return the redshift of the slice this dataset was drawn from. For lightcones, return the redshift range.

Returns:

redshift: float | tuple[float, float]

property simulation: SimulationParameters

Get the parameters of the simulation this dataset is drawn from.

Returns:

parameters

Return type:

opencosmo.parameters.SimulationParameters

property properties: Dataset

The properties dataset of the collection. This is either halo properties or galaxy properties.

keys()

Return the keys of the linked datasets.

Return type:

list[str]

values()

Return the linked datasets.

Return type:

list[Dataset]

items()

Return the linked datasets as key-value pairs.

Return type:

list[tuple[str, Dataset]]

bound(region, select_by=None)

Restrict this collection to only contain structures in the specified region. Querying will be done based on the halo centers, meaning some particles may fall outside the given region.

See Regions for details of how to construct regions.

Parameters:
  • region (opencosmo.spatial.Region)

  • select_by (str | None)

Returns:

dataset – The portion of the dataset inside the selected region

Return type:

opencosmo.Dataset

Raises:
  • ValueError – If the query region does not overlap with the region this dataset resides in

  • AttributeError – If the dataset does not contain a spatial index
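The consequence of querying on halo centers can be seen in a small pure-Python sketch. The halo names, positions and box below are made-up toy data, and inside is a hypothetical helper; none of this uses or reproduces the opencosmo region machinery.

```python
def inside(point, lower, upper):
    """True if a 3D point lies inside the axis-aligned box [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


# Hypothetical toy data: each halo has a center and some particle positions.
halos = {
    "halo_0": {"center": (9.5, 5.0, 5.0),
               "particles": [(9.0, 5.0, 5.0), (10.4, 5.0, 5.0)]},
    "halo_1": {"center": (15.0, 5.0, 5.0),
               "particles": [(15.2, 5.0, 5.0)]},
}

lower, upper = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)

# Selection uses the halo center only, not the particle positions.
selected = {name: h for name, h in halos.items()
            if inside(h["center"], lower, upper)}

# Particles that belong to a selected halo but lie outside the box.
stragglers = [p for h in selected.values() for p in h["particles"]
              if not inside(p, lower, upper)]
```

Here halo_0 is kept because its center is inside the box, even though one of its particles sits just beyond the upper x boundary.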

filter(*masks, on_galaxies=False)

Apply a filter to the properties dataset and propagate it to the linked datasets. Filters are constructed with opencosmo.col() and behave exactly as they would in opencosmo.Dataset.filter.

If the collection contains both halos and galaxies, the filter can be applied to the galaxy properties dataset by setting on_galaxies=True. Note that this selects halos that host at least one galaxy matching the filter. As a result, galaxies that do not match the filter will remain if another galaxy in their host halo does match.

Parameters:
  • *masks (Mask) – The filters to apply to the properties dataset, constructed with opencosmo.col().

  • on_galaxies (bool, optional) – If True, the filter is applied to the galaxy properties dataset.

Returns:

A new collection filtered by the given masks.

Return type:

StructureCollection

Raises:

ValueError – If on_galaxies is True but the collection does not contain a galaxy properties dataset.
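The on_galaxies behavior described above can be sketched in two steps with toy data. The galaxy records, the mass threshold and the passes function are all hypothetical; the sketch only mirrors the selection logic in the description and is not the library's implementation.

```python
# Hypothetical toy data: each galaxy records its host halo and a mass.
galaxies = [
    {"id": 0, "host": "halo_a", "mass": 5.0},
    {"id": 1, "host": "halo_a", "mass": 0.5},
    {"id": 2, "host": "halo_b", "mass": 0.8},
]


def passes(galaxy):
    """The galaxy-level filter: keep galaxies with mass above 1.0."""
    return galaxy["mass"] > 1.0


# Step 1: find halos that host at least one galaxy passing the filter.
kept_halos = {g["host"] for g in galaxies if passes(g)}

# Step 2: keep every galaxy in those halos, whether or not it passes itself.
kept_galaxies = [g["id"] for g in galaxies if g["host"] in kept_halos]
```

Galaxy 1 fails the filter on its own but survives because galaxy 0, in the same host halo, passes.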

select(columns, dataset=None)

Restrict one dataset in the collection to the specified columns. If no dataset is specified, the properties dataset is used.

Parameters:
  • columns (str | Iterable[str]) – The columns to select from the dataset.

  • dataset (str, optional) – The dataset to select from. If None, the properties dataset is used.

Returns:

A new collection with only the selected columns for the specified dataset.

Return type:

StructureCollection

Raises:

ValueError – If the specified dataset is not found in the collection.

with_units(convention)

Apply the given unit convention to the collection. See opencosmo.Dataset.with_units()

Parameters:

convention (str) – The unit convention to apply. One of “unitless”, “scalefree”, “comoving”, or “physical”.

Returns:

A new collection with the unit convention applied.

Return type:

StructureCollection

take(n, at='random')

Take some number of structures from the collection. See opencosmo.Dataset.take().

Parameters:
  • n (int) – The number of structures to take from the collection.

  • at (str, optional) – The method to use to take the structures. One of “start”, “end”, or “random”. Default is “random”.

Returns:

A new collection with the structures taken from the original.

Return type:

StructureCollection

with_new_columns(dataset, **new_columns)

Add new column(s) to one of the datasets in this collection. This behaves exactly like oc.Dataset.with_new_columns(), except that you must specify which dataset the columns should be added to.

import opencosmo as oc

pe = oc.col("phi") * oc.col("mass")
collection = collection.with_new_columns("dm_particles", pe=pe)

Parameters:
  • dataset (str) – The name of the dataset to add columns to

  • **new_columns (DerivedColumn) – The new columns, passed as keyword arguments. Each keyword is the name of the new column.

Returns:

new_collection – This collection with the additional columns added

Return type:

opencosmo.StructureCollection

Raises:

ValueError – If the dataset is not found in this collection
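To make the keyword-argument pattern concrete, here is a toy model of a derived column defined as the product of two existing columns. Col, Product and this with_new_columns function are hypothetical stand-ins written for illustration. They are not opencosmo.col or the library method, and real derived columns also carry units, which this sketch ignores.

```python
class Col:
    """Hypothetical stand-in for a column reference; NOT opencosmo.col."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, data):
        return data[self.name]

    def __mul__(self, other):
        return Product(self, other)


class Product:
    """A derived column defined as the row-wise product of two columns."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, data):
        lhs, rhs = self.left.evaluate(data), self.right.evaluate(data)
        return [a * b for a, b in zip(lhs, rhs)]


def with_new_columns(datasets, dataset, **new_columns):
    """Return a copy of `datasets` with derived columns added to one member."""
    if dataset not in datasets:
        raise ValueError(f"dataset {dataset!r} not found")
    updated = dict(datasets[dataset])
    for name, column in new_columns.items():
        updated[name] = column.evaluate(datasets[dataset])
    return {**datasets, dataset: updated}


datasets = {"dm_particles": {"phi": [-2.0, -3.0], "mass": [1.0, 2.0]}}
pe = Col("phi") * Col("mass")
result = with_new_columns(datasets, "dm_particles", pe=pe)
```

The original mapping is left untouched, and the keyword used in the call (pe) becomes the name of the new column.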

objects(data_types=None)

Iterate over the objects in this collection as pairs of (properties, datasets). For example, a halo collection could yield the halo properties and datasets for each of the associated particles.

If you don’t need all the datasets, you can specify a list of data types. For example:

for row, particles in collection.objects(data_types=["gas_particles", "star_particles"]):
    # do work with the halo properties and the particle datasets
    pass

At each iteration, “row” will be a dictionary of halo properties with associated units, and “particles” will be a dictionary of datasets with the same keys as the data types.

Parameters:

data_types (Iterable[str] | None)

Return type:

Iterable[tuple[dict[str, Any], Dataset | dict[str, Dataset]]]
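The shape of the values produced during iteration can be sketched with a plain generator. iter_objects and the toy data are hypothetical and only mimic the (row, datasets) pairs described above. In particular, this sketch always yields a dictionary of datasets, whereas the return type above also allows a single Dataset.

```python
def iter_objects(properties, linked, data_types=None):
    """Hypothetical generator mimicking the (row, datasets) pairs.

    properties: list of dicts, one per structure
    linked: dict mapping a data type name to a list of per-structure datasets
    """
    names = list(linked) if data_types is None else list(data_types)
    for i, row in enumerate(properties):
        yield row, {name: linked[name][i] for name in names}


properties = [{"mass": 1.0}, {"mass": 2.0}]
linked = {
    "gas_particles": [["g0"], ["g1", "g2"]],
    "star_particles": [["s0"], []],
    "dm_particles": [["d0"], ["d1"]],
}

counts = []
for row, particles in iter_objects(properties, linked,
                                   data_types=["gas_particles", "star_particles"]):
    # Record the mass, the dataset names received, and the gas particle count.
    counts.append((row["mass"], sorted(particles), len(particles["gas_particles"])))
```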

make_schema()
Return type:

StructCollectionSchema