GitHub - jcmgray/xyzpy: Efficiently generate and analyse high dimensional data.

xyzpy is a Python library for efficiently generating, manipulating and plotting data with a lot of dimensions, of the type that often occurs in numerical simulations. It stands wholly atop the labelled N-dimensional array library xarray. The project's documentation is hosted on Read the Docs.

The aim is to take the pain and errors out of generating and exploring data with a high number of possible parameters. This means:

you don't have to write super nested for loops
you don't have to remember which arrays/dimensions belong to which variables/parameters
you don't have to parallelize over or distribute runs yourself
you don't have to worry about loading, saving and merging disjoint data
you don't have to guess when a set of runs is going to finish
you don't have to write batch submission scripts or leave the notebook to use SLURM, PBS or SGE
you don't have to lose progress if your run is interrupted
you don't have to fiddle with CUDA_VISIBLE_DEVICES or taskset to assign GPU devices or CPU cores to different runs

To this data generation functionality, xyzpy adds a simple plotting interface accessed via ds.xyz.plot() that automatically maps dataset dimensions to visual elements including color, marker, marker size, line style, line width, subplot rows and columns, and text annotations. It also adds various other utilities for timing and tracking memory usage, and for visualizing matrices and high dimensional tensors.

Quick-start

Here's a simple example of generating and plotting a 5D function that uses the high level driver xyz.cultivate() to handle a full cycle of data generation:

import xyzpy as xyz

def foo(x, delta, p, amp=1.0, C=0.0):
    return {"fx": amp * (x - delta) ** p + C}

# cultivate!
# 0. annotate the function
# 1. write missing parameter combinations to disk ('sow')
# 2. compute those, with results stored persistently to disk ('grow')
# 3. load results into an xarray.Dataset, merging with existing ('reap')
ds = xyz.cultivate(
    foo,
    # this specifies we'll return a dict of named data_vars ourselves
    var_names=None,
    # this specifies we'll harvest results to the file "foo.h5"
    data_name="foo.h5",
    # compute the outer product of these parameter combinations
    combos=dict(
        x=[-2 + i * 0.25 for i in range(17)],
        p=[1, 2, 3],
        delta=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
        C=[-2.0, 1.0, 4.0],
        amp=[-1.0, 1.0],
    ),
)

# plot!
# - we can map pretty much any coordinate to any visual property
# - we can map to a palette ("hue") as well as position within that ("color")
fig, axs = ds.xyz.plot(
    x="x",
    y="fx",
    yscale="symlog",
    ylabel="$f(x)$",
    hue="C",
    markeredgecolor="C",
    color="delta",
    marker="delta",
    col="p",
    row="amp",
    markersize=3,
)

# clean up!
# - if we didn't delete the dataset, next run will only compute missing data
# - `!rm` is IPython/Jupyter shell syntax, so this line only works in a notebook
!rm foo.h5
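
To make the "outer product" of `combos` concrete, here is a minimal standard-library sketch of the grid that the example above spans. It is not how xyzpy works internally: it only enumerates the same parameter combinations with `itertools.product` and calls `foo` on each one, without any of the persistence, merging or parallelization that `xyz.cultivate()` handles. The names `names`, `results` and `n_runs` are illustrative only.

```python
import itertools


def foo(x, delta, p, amp=1.0, C=0.0):
    return {"fx": amp * (x - delta) ** p + C}


# same parameter grid as in the quick-start example
combos = dict(
    x=[-2 + i * 0.25 for i in range(17)],
    p=[1, 2, 3],
    delta=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    C=[-2.0, 1.0, 4.0],
    amp=[-1.0, 1.0],
)

# illustrative only: a plain in-memory loop over every combination,
# keyed by the tuple of parameter values in the order (x, p, delta, C, amp)
names = list(combos)
results = {}
for values in itertools.product(*combos.values()):
    kwargs = dict(zip(names, values))
    results[values] = foo(**kwargs)["fx"]

# 17 * 3 * 6 * 3 * 2 combinations in total
n_runs = len(results)
```

Written by hand, this would be five nested for loops, with the results held in a dict keyed by bare tuples rather than in a labelled `xarray.Dataset`.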

Please see the docs for more information.
