v0.9.9.1
Extended functionality added:
- Filter by cell counts
- Filter by gene counts
- Save expression to raw layer
- Raw layer to main layer
- Save highly variable genes to main layer (X)
- Impose memory limits when instantiating the AnnSQL class
- PCA (highly experimental)
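The two filtering steps listed above can be illustrated with a minimal SQL sketch. It does not use AnnSQL's API. It assumes a wide table named X (one row per cell, one column per gene) held in Python's built-in sqlite3 module, and the table layout, helper names, and thresholds are all hypothetical.

```python
import sqlite3

# Hypothetical toy data: one row per cell, one column per gene (wide layout).
genes = ["gene_a", "gene_b", "gene_c"]
rows = [
    ("cell_1", 5, 0, 2),
    ("cell_2", 0, 0, 1),
    ("cell_3", 3, 0, 4),
]

con = sqlite3.connect(":memory:")
cols = ", ".join(f"{g} REAL" for g in genes)
con.execute(f"CREATE TABLE X (cell_id TEXT PRIMARY KEY, {cols})")
con.executemany("INSERT INTO X VALUES (?, ?, ?, ?)", rows)


def filter_cells(con, genes, min_counts):
    """Delete cells whose summed expression is below min_counts."""
    total = " + ".join(genes)
    con.execute(f"DELETE FROM X WHERE ({total}) < ?", (min_counts,))


def genes_passing(con, genes, min_counts):
    """Return genes whose summed expression over the remaining cells is at least min_counts."""
    sums = ", ".join(f"SUM({g})" for g in genes)
    totals = con.execute(f"SELECT {sums} FROM X").fetchone()
    return [g for g, t in zip(genes, totals) if t is not None and t >= min_counts]


# Filter cells first, then decide which genes to keep among the remaining cells.
filter_cells(con, genes, min_counts=2)
kept_cells = [r[0] for r in con.execute("SELECT cell_id FROM X ORDER BY cell_id")]
kept_genes = genes_passing(con, genes, min_counts=1)
print(kept_cells, kept_genes)
```

Because each gene is a column in this layout, the per-cell total is a row-wise sum written directly into the query, while per-gene totals are column aggregates. Gene names are interpolated into the SQL string here, so a real implementation would need to validate or quote them.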
Analysis Benchmarks added
- Filtering runtime comparisons for Seurat added (figure coming soon)
- Filtering memory comparisons for AnnSQL, AnnData, Seurat
- Benchmark dataset generation using Splatter added, producing sparser datasets for filtering runtime and memory profiles
Known Issues
- Importing h5ad files with more than 30k columns. We are addressing this issue in the next release.
- PCA runtime is slow; however, it is memory efficient for larger datasets. We currently do not have plans to optimize this, as we consider it highly experimental functionality. No PCA implementations currently exist using SQL, and this is a hybrid SQL/Python approach. Additionally, the PCA method is resource intensive and will use all threads available to the system. We will release an update that limits thread usage in the near future.
Forward Functionality
We will be developing extended functionality for the methods listed below. These methods will allow users to complete a very basic, full preprocessing workflow for single-cell/nuclei data.
- Nearest neighbors
- Leiden clustering
- UMAP
- Differential expression