Collections
Collections within a single file can always be loaded with opencosmo.open(). Collections can be treated like read-only dictionaries. Dataset names can be retrieved with keys(), the datasets can be accessed with values() or Collection[key], and iteration can be done with items().
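As a rough illustration of the read-only dictionary behavior described above, the sketch below uses a hypothetical stand-in class, ToyCollection, built on collections.abc.Mapping. It is not the opencosmo implementation, and the dataset names and contents are made up for illustration.

```python
from collections.abc import Mapping


class ToyCollection(Mapping):
    """Hypothetical stand-in for a collection: a read-only mapping of names to datasets."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def __getitem__(self, key):
        return self._datasets[key]

    def __iter__(self):
        return iter(self._datasets)

    def __len__(self):
        return len(self._datasets)


# Made-up dataset names; plain lists stand in for real datasets.
collection = ToyCollection({"halo_properties": [1, 2, 3], "dm_particles": [4, 5]})

names = list(collection.keys())        # dataset names
first = collection["halo_properties"]  # access by key
pairs = list(collection.items())       # (name, dataset) pairs

# Mapping defines no __setitem__, so item assignment fails: the collection is read-only.
try:
    collection["new"] = []
    assignment_allowed = True
except TypeError:
    assignment_allowed = False
```

With a real collection, the same keys(), values(), items() and Collection[key] calls apply to the object returned by opencosmo.open().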
- class opencosmo.SimulationCollection(datasets)
A collection of datasets of the same type from different simulations. In general this exposes the exact same API as the individual datasets, but maps the results across all of them.
- Parameters:
datasets (Mapping[str, oc.Dataset | Collection])
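The idea of mapping one operation across every simulation can be sketched as follows. ToyDataset and ToySimulationCollection are hypothetical stand-ins written for this example, not opencosmo classes, and the simplified take(n) here only keeps the first n rows.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset: just a list of row values."""

    def __init__(self, rows):
        self.rows = list(rows)

    def take(self, n):
        # Return a new dataset with the first n rows; the original is untouched.
        return ToyDataset(self.rows[:n])


class ToySimulationCollection:
    """Hypothetical stand-in that forwards each call to every member."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def __getitem__(self, key):
        return self._datasets[key]

    def keys(self):
        return list(self._datasets.keys())

    def take(self, n):
        # Apply the same operation to each member and wrap the results
        # in a new collection with the same keys.
        return ToySimulationCollection(
            {name: ds.take(n) for name, ds in self._datasets.items()}
        )


sims = ToySimulationCollection(
    {"sim_a": ToyDataset([1, 2, 3, 4]), "sim_b": ToyDataset([10, 20, 30])}
)
subset = sims.take(2)
```

The result keeps one entry per simulation, each holding the outcome of the operation on that simulation's dataset.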
- make_schema()
- Return type:
DataSchema
- property cosmology: dict[str, Cosmology]
Get the cosmologies of the simulations in the collection
- Returns:
cosmologies
- Return type:
dict[str, astropy.cosmology.Cosmology]
- property redshift: dict[str, float | tuple[float, float]]
Get the redshift slices or ranges for the simulations in the collection
- Returns:
redshifts
- Return type:
dict[str, float | tuple[float,float]]
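Because each entry may be either a single redshift or a (low, high) range, downstream code may want to normalize the two forms. The helper below, as_redshift_range, is a hypothetical name introduced for this sketch, and the values are made up.

```python
def as_redshift_range(z):
    """Return (z_low, z_high) for a single redshift or a (low, high) tuple."""
    if isinstance(z, tuple):
        low, high = z
        return (float(low), float(high))
    # A single snapshot redshift is treated as a zero-width range.
    return (float(z), float(z))


# Made-up values in the shape documented for the redshift property.
redshifts = {"sim_a": 0.5, "sim_b": (0.0, 0.1)}
ranges = {name: as_redshift_range(z) for name, z in redshifts.items()}
```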
- property simulation: dict[str, SimulationParameters]
Get the simulation parameters for the simulations in the collection
- Returns:
simulation_parameters
- Return type:
dict[str, opencosmo.parameters.SimulationParameters]
- bound(region, select_by=None)
Restrict the datasets to some region. Note that the SimulationCollection does not do any checking to ensure its members have identical boxes. As a result this method can in principle fail for some of the simulations in the collection and not others. This should never happen when working with official OpenCosmo data products.
See Regions for details of how to construct regions.
- Parameters:
region (opencosmo.spatial.Region) – The region to query
select_by (str | None)
- Returns:
dataset – The portion of the dataset inside the selected region
- Return type:
- filter(*masks, **kwargs)
Filter the datasets in the collection. This method behaves exactly like opencosmo.Dataset.filter() or opencosmo.StructureCollection.filter(), but it applies the filter to all the datasets or collections within this collection. The result is a new collection.
- Parameters:
filters – The filters constructed with opencosmo.col()
masks (Mask)
- Returns:
A new collection with the same datasets, but only the particles that pass the filter.
- Return type:
- select(*args, **kwargs)
Select a subset of the datasets in the collection. This method calls the underlying method in opencosmo.Dataset or opencosmo.Collection, depending on the context. As such, its behavior and arguments can vary depending on what this collection contains.
- Parameters:
args – The arguments to pass to the select method. This is usually a list of column names to select.
kwargs – The keyword arguments to pass to the select method. This is usually a dictionary of column names to select.
- Return type:
- take(n, at='random')
Take a subset of rows from all datasets or collections in this collection. This method delegates to the underlying method in opencosmo.Dataset or opencosmo.StructureCollection, depending on the context. As such, behavior may vary depending on what this collection contains. See their documentation for more information.
- Parameters:
n (int) – The number of rows to take
at (str, default = "random") – The method to use to take rows. Must be one of “start”, “end”, “random”.
- Return type:
- with_new_columns(*args, **kwargs)
Update the datasets within this collection with a set of new columns. This method simply calls opencosmo.Dataset.with_new_columns() or opencosmo.StructureCollection.with_new_columns(), as appropriate.
- with_units(convention)
Transform all datasets or collections to use the given unit convention. This method behaves exactly like opencosmo.Dataset.with_units().
- Parameters:
convention (str) – The unit convention to use. One of “unitless”, “scalefree”, “comoving”, or “physical”.
- Return type:
- class opencosmo.StructureCollection(source, header, datasets, links, *args, **kwargs)
A collection of datasets that contain both high-level properties and lower level information (such as particles) for structures in the simulation. Currently these structures include halos and galaxies.
For now, these are always a combination of a properties dataset and several particle or profile datasets.
- Parameters:
source (oc.Dataset)
header (oc.header.OpenCosmoHeader)
datasets (dict[str, oc.Dataset | StructureCollection])
links (dict[str, LinkedDatasetHandler])
- property cosmology: Cosmology
The cosmology of the structure collection
- property redshift: float | tuple[float, float]
For snapshots, return the redshift or redshift range this dataset was drawn from.
- Returns:
redshift
- Return type:
float | tuple[float, float]
- property simulation: SimulationParameters
Get the parameters of the simulation this dataset is drawn from.
- Returns:
parameters
- Return type:
- keys()
Return the names of the datasets in this collection.
- Return type:
list[str]
- values()
Return the datasets in this collection.
- Return type:
list[Dataset | StructureCollection]
- items()
Return the names and datasets as key-value pairs.
- Return type:
Generator[tuple[str, Dataset | StructureCollection], None, None]
- property region
- bound(region, select_by=None)
Restrict this collection to only contain structures in the specified region. Querying will be done based on the halo or galaxy centers, meaning some particles may fall outside the given region.
See Regions for details of how to construct regions.
- Parameters:
region (opencosmo.spatial.Region)
select_by (str | None)
- Returns:
dataset – The portion of the dataset inside the selected region
- Return type:
- Raises:
ValueError – If the query region does not overlap with the region this dataset resides in
AttributeError – If the dataset does not contain a spatial index
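The consequence of selecting by center can be sketched with plain Python. The in_box helper, the axis-aligned box and the halo data below are all made up for illustration and do not use the opencosmo region types.

```python
def in_box(point, lower, upper):
    """True if a 3D point lies inside the axis-aligned box [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


# Made-up halos: a center plus a few particle positions each.
halos = [
    {
        "halo_id": 1,
        "center": (9.5, 5.0, 5.0),
        "particles": [(9.0, 5.0, 5.0), (10.4, 5.0, 5.0)],
    },
    {
        "halo_id": 2,
        "center": (12.0, 5.0, 5.0),
        "particles": [(11.8, 5.0, 5.0)],
    },
]
lower, upper = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)

# Selection is by center only, so every particle of a selected halo is kept.
selected = [h for h in halos if in_box(h["center"], lower, upper)]

# Particles of selected halos that nevertheless lie outside the region.
outside = [p for h in selected for p in h["particles"] if not in_box(p, lower, upper)]
```

Here halo 1 is selected because its center is inside the box, yet one of its particles sits just beyond the box edge.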
- filter(*masks, on_galaxies=False)
Apply a filter to the halo or galaxy properties. Filters are constructed with opencosmo.col() and behave exactly as they would in opencosmo.Dataset.filter().
If the collection contains both halos and galaxies, the filter can be applied to the galaxy properties dataset by setting on_galaxies=True. However, this filters for halos that host at least one galaxy matching the filter. As a result, galaxies that do not match the filter will remain if another galaxy in their host halo does match.
See Querying In Collections for some examples.
- Parameters:
*filters (Mask) – The filters to apply to the properties dataset, constructed with opencosmo.col().
on_galaxies (bool, optional) – If True, the filter is applied to the galaxy properties dataset.
- Returns:
A new collection filtered by the given masks.
- Return type:
- Raises:
ValueError – If on_galaxies is True but the collection does not contain a galaxy properties dataset.
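The on_galaxies=True semantics described above can be sketched with plain Python data. The function filter_on_galaxies and the records below are hypothetical and only mimic the documented behavior: halos are kept when at least one hosted galaxy matches, and all galaxies of a kept halo stay.

```python
# Made-up data: each galaxy records its host halo and a mass.
halos = [{"halo_id": 1}, {"halo_id": 2}, {"halo_id": 3}]
galaxies = [
    {"gal_id": 10, "halo_id": 1, "mass": 5.0},
    {"gal_id": 11, "halo_id": 1, "mass": 0.5},  # fails the filter, but shares halo 1
    {"gal_id": 12, "halo_id": 2, "mass": 0.2},
    {"gal_id": 13, "halo_id": 3, "mass": 9.0},
]


def filter_on_galaxies(halos, galaxies, predicate):
    """Keep halos hosting at least one matching galaxy, plus all of their galaxies."""
    matching_hosts = {g["halo_id"] for g in galaxies if predicate(g)}
    kept_halos = [h for h in halos if h["halo_id"] in matching_hosts]
    # Galaxies are kept by host halo, not by whether they pass the filter themselves.
    kept_galaxies = [g for g in galaxies if g["halo_id"] in matching_hosts]
    return kept_halos, kept_galaxies


kept_halos, kept_galaxies = filter_on_galaxies(
    halos, galaxies, lambda g: g["mass"] > 1.0
)
```

Galaxy 11 does not pass the mass cut, but it survives because galaxy 10 in the same halo does.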
- select(columns, dataset=None)
Update the linked collection to only include the columns specified in the given dataset. If no dataset is specified, the properties dataset is used.
- Parameters:
columns (str | Iterable[str]) – The columns to select from the dataset.
dataset (str, optional) – The dataset to select from. If None, the properties dataset is used.
- Returns:
A new collection with only the selected columns for the specified dataset.
- Return type:
- Raises:
ValueError – If the specified dataset is not found in the collection.
- with_units(convention)
Apply the given unit convention to the collection. See opencosmo.Dataset.with_units().
- Parameters:
convention (str) – The unit convention to apply. One of “unitless”, “scalefree”, “comoving”, or “physical”.
- Returns:
A new collection with the unit convention applied.
- Return type:
- take(n, at='random')
Take some number of structures from the collection. See opencosmo.Dataset.take().
- Parameters:
n (int) – The number of structures to take from the collection.
at (str, optional) – The method to use to take the structures. One of “random”, “first”, or “last”. Default is “random”.
- Returns:
A new collection with the structures taken from the original.
- Return type:
- take_range(start, end)
- Parameters:
start (int)
end (int)
- with_new_columns(dataset, **new_columns)
Add new column(s) to one of the datasets in this collection. This behaves exactly like oc.Dataset.with_new_columns(), except that you must specify which dataset the columns should refer to.
    pe = oc.col("phi") * oc.col("mass")
    collection = collection.with_new_columns("dm_particles", pe=pe)
Structure collections can hold other structure collections. For example, a collection of halos may hold a structure collection that contains the galaxies of those halos. To update datasets within these collections, use dot syntax to specify a path:
    pe = oc.col("phi") * oc.col("mass")
    collection = collection.with_new_columns("galaxies.star_particles", pe=pe)
See Creating New Columns in Collections for examples.
- Parameters:
dataset (str) – The name of the dataset to add columns to
columns (**) – The new columns
new_columns (DerivedColumn)
- Returns:
new_collection – This collection with the additional columns added
- Return type:
- Raises:
ValueError – If the dataset is not found in this collection
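How a dot-separated path might be resolved against nested collections can be sketched with nested dictionaries. The resolve_path function and the layout below are assumptions made for illustration and say nothing about how opencosmo resolves paths internally.

```python
def resolve_path(collection, path):
    """Walk nested mappings along a dot-separated path like 'galaxies.star_particles'."""
    current = collection
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            raise ValueError(f"Dataset '{path}' not found in this collection")
        current = current[part]
    return current


# Made-up layout: a halo collection holding a nested galaxy collection.
halo_collection = {
    "halo_properties": ["halo rows"],
    "dm_particles": ["dm rows"],
    "galaxies": {
        "galaxy_properties": ["galaxy rows"],
        "star_particles": ["star rows"],
    },
}

top_level = resolve_path(halo_collection, "dm_particles")
nested = resolve_path(halo_collection, "galaxies.star_particles")
```

A path that names a missing dataset raises ValueError in this sketch, mirroring the error documented for this method.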
- with_index(index)
- Parameters:
index (DataIndex)
- objects(data_types=None)
Iterate over the objects in this collection as pairs of (properties, datasets). For example, a halo collection could yield the halo properties and datasets for each of the associated particles.
If you don’t need all the datasets, you can specify a list of data types. For example:
    for row, particles in collection.objects(data_types=["gas_particles", "star_particles"]):
        ...  # do work
At each iteration, “row” will be a dictionary of halo properties with associated units, and “particles” will be a dictionary of datasets with the same keys as the data types.
- Parameters:
data_types (Iterable[str] | None)
- Return type:
Iterable[dict[str, Any]]
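The shape of the (properties, datasets) pairs can be sketched with a small generator. The iter_objects function and its inputs are hypothetical stand-ins: real rows carry units and the real per-structure entries are datasets, not lists.

```python
def iter_objects(properties, particles, data_types=None):
    """Yield (row, datasets) pairs, one per structure.

    `properties` is a list of per-structure dicts; `particles` maps a data type
    to a list with one entry per structure. Both layouts are made up for this sketch.
    """
    wanted = list(particles) if data_types is None else list(data_types)
    for i, row in enumerate(properties):
        yield row, {name: particles[name][i] for name in wanted}


properties = [{"halo_id": 1, "mass": 2.0}, {"halo_id": 2, "mass": 3.5}]
particles = {
    "gas_particles": [[0.1, 0.2], [0.3]],
    "star_particles": [[1.0], [2.0, 3.0]],
    "dm_particles": [[9.0], [8.0]],
}

counts = {}
requested = ["gas_particles", "star_particles"]
for row, parts in iter_objects(properties, particles, requested):
    # `parts` has exactly the requested data types as keys.
    counts[row["halo_id"]] = {name: len(values) for name, values in parts.items()}
```

Restricting data_types means the unrequested dm_particles entry never appears in the per-structure dictionary.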
- halos(*args, **kwargs)
Alias for “objects” in the case that this StructureCollection contains halos.
- galaxies(*args, **kwargs)
Alias for “objects” in the case that this StructureCollection contains galaxies
- make_schema()
- Return type:
StructCollectionSchema