Columns
In OpenCosmo, references to columns are created independently of the datasets that contain them. You can create combinations of columns with basic arithmetic, as well as …
Columns created this way can be added to datasets or collections with the with_new_columns method. The actual values in these columns are evaluated lazily, so it is fine to create these new columns at the beginning of your analysis even if you plan to filter out significant numbers of the rows.
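To illustrate why lazy evaluation makes it cheap to define columns early, here is a toy sketch in plain Python. It is not OpenCosmo's implementation: the `LazyColumn` class, the `evaluate` function, and the sample rows are all hypothetical names and values invented for this illustration. Combining columns only records the operation, and values are computed only for the rows that survive filtering.

```python
import operator


class LazyColumn:
    """Hypothetical stand-in for a column reference; NOT OpenCosmo's class."""

    def __init__(self, name=None, op=None, operands=()):
        self.name = name          # set for plain column references
        self.op = op              # set for derived columns
        self.operands = operands  # the columns (or numbers) being combined

    def __mul__(self, other):
        # Record the multiplication instead of performing it.
        return LazyColumn(op=operator.mul, operands=(self, other))

    def __add__(self, other):
        # Record the addition instead of performing it.
        return LazyColumn(op=operator.add, operands=(self, other))


def evaluate(column, row):
    """Compute the value of a (possibly derived) column for a single row."""
    if not isinstance(column, LazyColumn):
        return column  # a plain number
    if column.op is None:
        return row[column.name]  # a reference to stored data
    return column.op(*(evaluate(c, row) for c in column.operands))


# Made-up data standing in for a dataset.
rows = [
    {"mass": 2.0, "vx": 10.0},
    {"mass": 5.0, "vx": -3.0},
    {"mass": 0.5, "vx": 4.0},
]

# Defining the derived column does no arithmetic yet.
px = LazyColumn("mass") * LazyColumn("vx")

# Filter first, then evaluate only for the rows that remain.
kept = [row for row in rows if row["mass"] > 1.0]
px_values = [evaluate(px, row) for row in kept]
print(px_values)
```

In this sketch the product is never computed for the row that the filter removes, which is the behavior the paragraph above describes.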
- opencosmo.col(name)
Create a reference to a column with a given name. These references can be combined to produce new columns or express queries that operate on the values in a given dataset. For example:
    import opencosmo as oc

    ds = oc.open("haloproperties.hdf5")
    query = oc.col("fof_halo_mass") > 1e14
    px = oc.col("fof_halo_mass") * oc.col("fof_halo_com_vx")
    ds = ds.with_new_columns(fof_halo_com_px = px).filter(query)

For more advanced usage, see Working with Columns.
- Parameters:
name (str)
- Return type:
Column
- class opencosmo.column.Column(name)
Represents a reference to a column with a given name. Column references are created independently of the datasets that actually contain data. You should not create this class directly; instead, use opencosmo.col().
Columns can be combined, and support comparison operators for masking datasets.
Combinations:
Basic arithmetic with +, -, *, and /
Powers with ** and column.sqrt()
Logarithms and exponentiation with column.log10() and column.exp10()
Comparison operators:
Arithmetic comparisons such as <, <=, >, ==, !=
Membership with column.isin
In general, combinations of columns produce a DerivedColumn, which can be treated in exactly the same way as a basic Column.
For example, to compute the x-component of a halo’s momentum and then filter out halos below a certain value of that momentum:
    import opencosmo as oc

    dataset = oc.open("haloproperties.hdf5")

    # Derived column: x-component of momentum = mass * x-velocity
    halo_px = oc.col("fof_halo_mass") * oc.col("fof_halo_com_vx")
    dataset = dataset.with_new_columns(fof_halo_com_px = halo_px)

    # Keep only halos whose momentum exceeds the threshold
    min_momentum_filter = oc.col("fof_halo_com_px") > 10**14
    dataset = dataset.filter(min_momentum_filter)
- Parameters:
name (str)
- arccos()
Create a derived column containing the arccosine of this column (in radians). The column must be dimensionless.
- Return type:
DerivedColumn
- arcsin()
Create a derived column containing the arcsine of this column (in radians). The column must be dimensionless.
- Return type:
DerivedColumn
- arctan2(other)
Create a derived column containing arctan2(self, other) in radians. Both columns must be dimensionless.
- Parameters:
other (ConstructedColumn | int | float | Quantity)
- Return type:
DerivedColumn
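As a rough illustration of what a two-argument arctangent computes, the sketch below uses Python's standard `math.atan2` on made-up dimensionless values. It assumes that `arctan2(self, other)` follows the same (y, x) argument order as `math.atan2`, which the signature suggests but the text above does not state explicitly.

```python
import math

# Row-wise components of some dimensionless vectors (made-up values).
y = [1.0, 1.0, -1.0]   # plays the role of `self` (assumed to be the y-like argument)
x = [1.0, -1.0, -1.0]  # plays the role of `other` (assumed to be the x-like argument)

# atan2(y, x) returns an angle in radians in the range [-pi, pi], using the
# signs of both arguments to select the correct quadrant.
angles = [math.atan2(yi, xi) for yi, xi in zip(y, x)]
print(angles)
```

Unlike a plain arctangent of the ratio y / x, the two-argument form distinguishes the second quadrant from the fourth and the third from the first.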
- exp10(expected_unit_container=<class 'astropy.units.function.logarithmic.DexUnit'>)
Create a derived column that will contain the base-10 exponentiation of the given column. If the column being exponentiated contains units, it must be an astropy LogUnit (e.g. Dex or Mag).
You can specify the type of LogUnit container you expect the column to have with expected_unit_container. Defaults to DexUnit.
- Parameters:
expected_unit_container (LogUnit)
- Return type:
DerivedColumn
- log10(unit_container=<class 'astropy.units.function.logarithmic.DexUnit'>)
Create a derived column that will compute the base-10 logarithm of the given column. If the column contains units, the units must not be an astropy LogUnit (such as Dex or Mag).
If you want the units of the new column to be a particular type of LogUnit, you can pass that type to the unit_container argument. Defaults to DexUnit.
- Parameters:
unit_container (LogUnit)
- Return type:
DerivedColumn
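The numerical relationship between log10() and exp10() can be sketched with the standard library alone. This example deliberately ignores units (the DexUnit handling described above is the library's job) and uses made-up mass values; it only shows that the two operations are inverses of each other.

```python
import math

# Made-up halo masses, treated here as plain unitless numbers.
masses = [1e12, 3.2e13, 1e14]

# Base-10 logarithm of each value, as log10() would compute row by row.
log_masses = [math.log10(m) for m in masses]

# Base-10 exponentiation undoes the logarithm, as exp10() would.
recovered = [10.0 ** lm for lm in log_masses]

for original, back in zip(masses, recovered):
    # Compare with a tolerance because of floating-point rounding.
    print(math.isclose(original, back, rel_tol=1e-9))
```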
- sqrt()
Create a derived column that will contain the square root of the given column.
- Return type:
DerivedColumn
Provided Column Combinations
There are a number of basic column combinations that are
- opencosmo.column.norm_cols(*columns)
Get the Euclidean norm of any number of columns. This function takes in the columns to combine and produces a DerivedColumn that can be passed into with_new_columns.
This function will never fail, but with_new_columns will if the columns do not have the same units.
- Parameters:
*columns (str | Column | DerivedColumn) – Any number of columns. You can pass in simple column names, columns constructed with opencosmo.col(), or columns created from combinations of other columns.
- Returns:
new_column – A new derived column that can be passed into with_new_columns.
- Return type:
DerivedColumn
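For reference, the Euclidean norm of several columns is the row-wise square root of the sum of squares. The sketch below computes it in plain Python on made-up velocity components; it illustrates the arithmetic only and is not how the library evaluates the column.

```python
import math

# Made-up velocity components, one entry per row.
vx = [3.0, 1.0, 0.0]
vy = [4.0, 2.0, 0.0]
vz = [0.0, 2.0, 5.0]

# Row-wise Euclidean norm: sqrt(vx^2 + vy^2 + vz^2).
speed = [math.sqrt(sum(c * c for c in comps)) for comps in zip(vx, vy, vz)]
print(speed)  # [5.0, 3.0, 5.0]
```

Because the squares are summed, all inputs need to share the same units, which is why mismatched units cause with_new_columns to fail.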
- opencosmo.column.add_mag_cols(*magnitudes)
Add together any number of magnitude columns to get a total magnitude. This function takes in the names of the magnitude columns and produces a DerivedColumn that can be passed into with_new_columns.
This function will never fail, but with_new_columns will if you include columns that are not magnitudes.
    import opencosmo as oc
    from opencosmo.column import add_mag_cols

    dataset = oc.open("catalog.hdf5")
    mag_total = add_mag_cols("mag_g", "mag_r", "mag_i", "mag_z", "mag_y")
    dataset = dataset.with_new_columns(mag_total=mag_total)
- Parameters:
*magnitudes (str | Column | DerivedColumn) – Any number of magnitude columns. You can pass in simple column names, columns constructed with opencosmo.col(), or columns created from combinations of other columns.
- Returns:
new_column – A new derived column that can be passed into with_new_columns.
- Return type:
DerivedColumn
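The text above does not spell out the formula used to combine magnitudes. A common astronomical convention, assumed here purely for illustration, is to convert each magnitude to a relative flux, sum the fluxes, and convert back. The `total_magnitude` helper below is a hypothetical name for this sketch and may not match what add_mag_cols does internally.

```python
import math


def total_magnitude(*mags):
    """Combine magnitudes by summing relative fluxes (assumed convention)."""
    # Each magnitude m corresponds to a relative flux of 10^(-0.4 * m).
    total_flux = sum(10.0 ** (-0.4 * m) for m in mags)
    # Convert the summed flux back to a magnitude.
    return -2.5 * math.log10(total_flux)


# Two equally bright sources: the total is brighter (numerically smaller)
# than either one by 2.5 * log10(2), roughly 0.75 mag.
combined = total_magnitude(20.0, 20.0)
print(combined)
```

Under this convention, magnitudes are not added arithmetically: the combined magnitude is always smaller than the smallest input.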
- opencosmo.column.offset_3d(coord_name_a, coord_name_b, labels=['x', 'y', 'z'])
Create a derived column that contains the magnitude of the offset between two sets of 3D coordinates. For example, to get the magnitude of the difference between the FoF halo centers and the SOD halo centers:
    from opencosmo.column import offset_3d
    import opencosmo as oc

    dataset = oc.open("haloproperties.hdf5")
    offset_column = offset_3d("fof_halo_com", "sod_halo_com")
    dataset = dataset.with_new_columns(offset=offset_column)
This function assumes that the columns are named “fof_halo_com_{x, y, z}” and “sod_halo_com_{x, y, z}”. You can choose different labels by setting the labels argument.
This function outputs a derived column that can be passed into with_new_columns. This function will never fail, but with_new_columns will if the columns do not all have the same units.
- Parameters:
coord_name_a (str) – The base name of the first coordinate
coord_name_b (str) – The base name of the second coordinate
labels (Iterable[str], default = ["x", "y", "z"]) – The coordinate labels. The names of the columns are assumed to be “{coord_name_a}_{labels}” and “{coord_name_b}_{labels}”
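The naming rule and the offset calculation described above can be sketched in plain Python. The helpers `column_names` and `offset_magnitude` and the sample row are hypothetical, written only to show how “{coord_name}_{label}” names are built and how the offset magnitude follows from them; the library itself produces a lazily evaluated DerivedColumn instead.

```python
import math


def column_names(coord_name, labels=("x", "y", "z")):
    """Build the per-axis column names, e.g. 'fof_halo_com_x'."""
    return [f"{coord_name}_{label}" for label in labels]


def offset_magnitude(row, coord_name_a, coord_name_b, labels=("x", "y", "z")):
    """Magnitude of the difference between two sets of coordinates in one row."""
    names_a = column_names(coord_name_a, labels)
    names_b = column_names(coord_name_b, labels)
    return math.sqrt(sum((row[a] - row[b]) ** 2 for a, b in zip(names_a, names_b)))


# A single made-up row with both sets of coordinates in the same units.
row = {
    "fof_halo_com_x": 1.0, "fof_halo_com_y": 2.0, "fof_halo_com_z": 3.0,
    "sod_halo_com_x": 4.0, "sod_halo_com_y": 6.0, "sod_halo_com_z": 3.0,
}

print(column_names("fof_halo_com"))
offset = offset_magnitude(row, "fof_halo_com", "sod_halo_com")
print(offset)  # 5.0
```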