Columns

In OpenCosmo, references to columns are created independently of the datasets that contain them. You can create combinations of columns with basic arithmetic, as well as …

Columns created this way can be added to datasets or collections with the with_new_columns method. The actual values in these columns are evaluated lazily, so it is fine to create these new columns at the beginning of your analysis even if you plan to filter out significant numbers of the rows.
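The idea of a column reference that exists apart from any data and is only evaluated on demand can be sketched in plain Python. This is a conceptual illustration only: the LazyCol class and its methods are hypothetical and are not how OpenCosmo implements Column or DerivedColumn.

```python
import operator


class LazyCol:
    """Hypothetical stand-in for a column reference (not OpenCosmo's Column)."""

    def __init__(self, evaluate_row):
        # evaluate_row maps one row (a dict of name -> value) to a value.
        self._evaluate_row = evaluate_row

    @classmethod
    def named(cls, name):
        # A reference to a column by name; no data is needed to create it.
        return cls(lambda row: row[name])

    def _combine(self, other, op):
        # Wrap plain numbers so both operands can be evaluated the same way.
        if isinstance(other, LazyCol):
            other_eval = other._evaluate_row
        else:
            other_eval = lambda row: other
        return LazyCol(lambda row: op(self._evaluate_row(row), other_eval(row)))

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def __gt__(self, other):
        return self._combine(other, operator.gt)

    def evaluate(self, rows):
        # Values are only computed here, for whichever rows are passed in.
        return [self._evaluate_row(row) for row in rows]


# Build the expressions first; nothing is computed yet.
px = LazyCol.named("mass") * LazyCol.named("vx")
is_heavy = LazyCol.named("mass") > 3.0

rows = [{"mass": 2.0, "vx": 3.0}, {"mass": 5.0, "vx": -1.0}]

# Filter first, then evaluate the derived column on the surviving rows only.
mask = is_heavy.evaluate(rows)
heavy_rows = [row for row, keep in zip(rows, mask) if keep]
px_values = px.evaluate(heavy_rows)
```

Because the product is evaluated only for the rows that remain after filtering, defining it early costs nothing for the rows that are later discarded.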

opencosmo.col(column_name)

Create a reference to a column with a given name. These references can be combined to produce new columns or express queries that operate on the values in a given dataset. For example:

import opencosmo as oc
ds = oc.open("haloproperties.hdf5")
query = oc.col("fof_halo_mass") > 1e14
px = oc.col("fof_halo_mass") * oc.col("fof_halo_com_vx")
ds = ds.with_new_columns(fof_halo_com_px = px).filter(query)

For more advanced usage, see Working with Columns.

Parameters:

column_name (str)

Return type:

Column

class opencosmo.column.Column(column_name)

Represents a reference to a column with a given name. Column references are created independently of the datasets that actually contain data. You should not create this class directly. Instead, use opencosmo.col().

Columns can be combined, and support comparison operators for masking datasets.

Combinations:

  • Basic arithmetic with +, -, *, and /

  • Powers with **, and column.sqrt()

  • log and exponentiation with column.log10() and column.exp10()

Comparison operators:

  • Arithmetic comparisons such as <, <=, >, ==, !=

  • Membership with column.isin

In general, combinations of columns produce a DerivedColumn, which can be treated in exactly the same way as a basic Column.

For example, to compute the x-component of a halo’s momentum, and then filter out halos below a certain value of that momentum:

import opencosmo as oc

dataset = oc.open("haloproperties.hdf5")
halo_px = oc.col("fof_halo_mass") * oc.col("fof_halo_com_vx")
dataset = dataset.with_new_columns(fof_halo_com_px = halo_px)

min_momentum_filter = oc.col("fof_halo_com_px") > 10**14
dataset = dataset.filter(min_momentum_filter)
Parameters:

column_name (str)

exp10(expected_unit_container=<class 'astropy.units.function.logarithmic.DexUnit'>)

Create a derived column that will contain the base-10 exponentiation of the given column. If the column being exponentiated contains units, its unit must be an astropy LogUnit (e.g. Dex or Mag).

You can specify the type of LogUnit container you expect the column to have with expected_unit_container. Defaults to DexUnit.

Parameters:

expected_unit_container (LogUnit)

Return type:

DerivedColumn

log10(unit_container=<class 'astropy.units.function.logarithmic.DexUnit'>)

Create a derived column that will contain the base-10 logarithm of the given column. If the column contains units, its unit must not be an astropy LogUnit (such as Dex or Mag).

If you want the units of the new column to be a particular type of LogUnit, you can pass that type to the unit_container argument. Defaults to DexUnit.

Parameters:

unit_container (LogUnit)

Return type:

DerivedColumn
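As a unit-free illustration of how log10() and exp10() relate, the sketch below applies the two operations to plain floats with the standard library. It does not use OpenCosmo or astropy. The sample values are made up, and unit handling (the LogUnit containers described above) is deliberately left out.

```python
import math

# Made-up example values standing in for a column of halo masses.
masses = [1.0e12, 3.5e13, 1.0e14]

# What a log10() column would hold, ignoring units.
log_masses = [math.log10(m) for m in masses]

# What exp10() of that column would hold: 10 raised to each value.
recovered = [10.0 ** lm for lm in log_masses]
```

Applying exp10 to the result of log10 recovers the original values up to floating-point rounding, which is why a column in logarithmic units is the expected input to exp10().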

sqrt()

Create a derived column that will contain the square root of the given column.

Return type:

DerivedColumn

Provided Column Combinations

There are a number of basic column combinations that are provided by OpenCosmo.

opencosmo.column.norm_cols(*columns)

Get the Euclidean norm of any number of columns. This function takes in columns or column names, and produces a DerivedColumn that can be passed into with_new_columns.

This function will never fail, but with_new_columns will if the columns do not have the same units.

Parameters:

*columns (str | Column | DerivedColumn) – Any number of columns. You can pass in simple column names, columns constructed with opencosmo.col(), or columns created from combinations of other columns.

Returns:

new_column – A new derived column that can be passed into with_new_columns

Return type:

DerivedColumn
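For reference, the Euclidean norm is the square root of the sum of the squared components, computed row by row. The sketch below shows that arithmetic on plain Python lists. The helper norm_of_columns and the sample velocity values are hypothetical and are not part of OpenCosmo.

```python
import math


def norm_of_columns(*columns):
    """Hypothetical helper: row-wise Euclidean norm of equal-length columns."""
    return [math.sqrt(sum(v * v for v in row)) for row in zip(*columns)]


# Made-up velocity components for two objects.
vx = [3.0, 0.0]
vy = [4.0, 0.0]
vz = [12.0, 2.0]

speed = norm_of_columns(vx, vy, vz)
```

The squared components are added together before the square root is taken, which is why the input columns need to share the same units.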

opencosmo.column.add_mag_cols(*magnitudes)

Add together any number of magnitude columns to get a total magnitude. This function takes in the names of the magnitude columns, and produces a DerivedColumn that can be passed into with_new_columns.

This function will never fail, but with_new_columns will if you include columns that are not magnitudes.

import opencosmo as oc
from opencosmo.column import add_mag_cols

dataset = oc.open("catalog.hdf5")
mag_total = add_mag_cols("mag_g", "mag_r", "mag_i", "mag_z", "mag_y")

dataset = dataset.with_new_columns(mag_total=mag_total)
Parameters:

*magnitudes (str | Column | DerivedColumn) – Any number of magnitude columns. You can pass in simple column names, columns constructed with opencosmo.col(), or columns created from combinations of other columns.

Returns:

new_column – A new derived column that can be passed into with_new_columns

Return type:

DerivedColumn
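Magnitudes are logarithmic, so a total magnitude is not a plain sum. The sketch below assumes the standard astronomical convention (convert each magnitude to a relative flux, sum the fluxes, convert back). The formula used by add_mag_cols is not stated here, so treat this as an illustration of the convention rather than a description of the implementation. The helper total_magnitude is hypothetical.

```python
import math


def total_magnitude(*magnitudes):
    """Hypothetical helper: combine magnitudes by summing relative fluxes."""
    # Each magnitude m corresponds to a relative flux of 10**(-0.4 * m).
    total_flux = sum(10.0 ** (-0.4 * m) for m in magnitudes)
    # Convert the summed flux back to a magnitude.
    return -2.5 * math.log10(total_flux)


# Two equally bright sources: the total is brighter (numerically smaller)
# than either one by 2.5 * log10(2), roughly 0.75 magnitudes.
m_total = total_magnitude(20.0, 20.0)
```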

opencosmo.column.offset_3d(coord_name_a, coord_name_b, labels=['x', 'y', 'z'])

Create a derived column that contains the magnitude of the offset between two sets of 3d coordinates. For example, to get the magnitude of the difference between the FoF halo centers and the SOD halo centers:

from opencosmo.column import offset_3d
import opencosmo as oc

dataset = oc.open("haloproperties.hdf5")

offset_column = offset_3d("fof_halo_com", "sod_halo_com")
dataset = dataset.with_new_columns(offset=offset_column)

This function assumes that the columns follow the pattern “{coord_name}_{label}”, so the example above reads “fof_halo_com_{x, y, z}” and “sod_halo_com_{x, y, z}”. You can choose different labels by setting the labels argument.

This function outputs a derived column that can be passed into with_new_columns. The function itself will never fail, but with_new_columns will if the columns do not all have the same units.

Parameters:
  • coord_name_a (str) – The base name of the first coordinate

  • coord_name_b (str) – The base name of the second coordinate

  • labels (Iterable[str], default = ["x", "y", "z"]) – The coordinate labels. The names of the columns are assumed to be “{coord_name_a}_{labels}” and “{coord_name_b}_{labels}”
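The naming rule and the resulting computation can be sketched in plain Python. The helpers offset_column_names and offset_magnitude and the sample row are hypothetical. They mirror the “{coord_name}_{label}” pattern described above but are not OpenCosmo code.

```python
import math


def offset_column_names(coord_name_a, coord_name_b, labels=("x", "y", "z")):
    """Hypothetical helper: pair up the column names for each coordinate label."""
    return [(f"{coord_name_a}_{label}", f"{coord_name_b}_{label}") for label in labels]


def offset_magnitude(row, coord_name_a, coord_name_b, labels=("x", "y", "z")):
    """Hypothetical helper: magnitude of the offset between two positions in one row."""
    pairs = offset_column_names(coord_name_a, coord_name_b, labels)
    return math.sqrt(sum((row[a] - row[b]) ** 2 for a, b in pairs))


# A made-up row with two sets of 3d coordinates.
row = {
    "fof_halo_com_x": 1.0, "fof_halo_com_y": 2.0, "fof_halo_com_z": 3.0,
    "sod_halo_com_x": 2.0, "sod_halo_com_y": 4.0, "sod_halo_com_z": 5.0,
}

names = offset_column_names("fof_halo_com", "sod_halo_com")
offset = offset_magnitude(row, "fof_halo_com", "sod_halo_com")
```

Passing a different labels sequence changes only the suffixes that are appended to the two base names.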