Changelog 2025
2025-06-03 db 1.6.1 | bionty 1.5.0
Bionty.
✨ Flexible ontology sources PR @sunnyosun
LaminDB.
🚸 Enable passing --branch and --space to lamin save PR @falexwolf
🐛 Fix query of feature-associated labels from non-ULabel registries PR @sunnyosun
2025-06-01 db 1.6.0 | bionty 1.4.0
⚠️ Consider lamin migrate deploy
All instances connected to LaminHub have been migrated and there is no need to act.
If you are an admin of a self-managed instance, please migrate your database with lamin migrate deploy.
The migrations in this release do not break old LaminDB clients with the exception of writing to the Param registry: the data in the corresponding SQL table got moved into the Feature registry.
The bulk of database-level changes was made in this PR @falexwolf.
- remove unique constraint from Feature.name
- replace hard unique constraint on Transform.hash and Artifact.hash with a conditional unique constraint: hash can be duplicated for different keys
- new names for how instances are referred to for type, these don't clash with the new record concept: ULabel.ulabels, Feature.features, Schema.schemas, Project.projects
- default space uid is now "A" for "All"
- Feature._expect_many now defaults to None so that the auto-display of single values as opposed to a set makes sense, and a user can enforce one (single) or the other (many) in the future
- hash is populated for all FeatureValue records so that there is an easy way to universally identify a unique feature value
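The idea behind a conditional unique constraint can be pictured with a small SQLite sketch. The table layout, index names, and the exact conditions below are assumptions made for illustration; they are not the constraints that the LaminDB migration creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for an artifact table with a key and a content hash.
conn.execute("CREATE TABLE artifact (id INTEGER PRIMARY KEY, key TEXT, hash TEXT)")
# Assumed condition 1: rows that share a key must not share a hash.
conn.execute(
    "CREATE UNIQUE INDEX unique_hash_per_key "
    "ON artifact (key, hash) WHERE key IS NOT NULL"
)
# Assumed condition 2: among rows without a key, the hash stays unique.
conn.execute(
    "CREATE UNIQUE INDEX unique_hash_without_key "
    "ON artifact (hash) WHERE key IS NULL"
)


def try_insert(key, hash_value):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO artifact (key, hash) VALUES (?, ?)", (key, hash_value)
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("a.parquet", "abc"),  # first record
    try_insert("b.parquet", "abc"),  # same hash, different key
    try_insert("a.parquet", "abc"),  # same hash, same key
    try_insert(None, "abc"),         # no key, hash not yet used without a key
    try_insert(None, "abc"),         # no key, hash already used without a key
]
```

A hard unique constraint on hash alone would have rejected the second insert; the partial indexes only reject the rows that collide under the stated conditions.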
Changes to registries.
🏗️ Integrate the Param into the Feature registry PR @falexwolf
- the change is backward compatible on the Python/R level
- on the SQL level, records are transferred from the lamindb_param table to the lamindb_feature table during migrations
✨ Introduce a Branch registry PR @falexwolf
♻️ Rename Record to SQLRecord PR PR @falexwolf
✨ Introduce a flexible Record registry to manage any kind of entity without database migrations PR @falexwolf
Data curation.
✨ Add schema-based TiledbsomaExperimentCurator PR @Zethson
✨ Support curating lists as values in DataFrameCurator PR @sunnyosun
Bug fixes.
🐛 Fix transfer for cases in which genes are insufficiently populated PR @falexwolf
Dependency changes.
⬆️ No longer install contenttypes PR @falexwolf
UX improvements.
🚸 No longer duplicate tracking of predecessors through the corresponding link table on Transform PR @falexwolf
🚸 Add is_run_input to Artifact.get() and Collection.get() PR @Koncopd
🚸 Clearer error in parse_cat_dtype if cat dtype contains a module name and the module is not found PR @Koncopd
🚸 Better error message when user passes manual uid to track() + anticipate that the user might want to create new transforms in some cases also if hash matches PR @falexwolf
🚸 Improve setting relationships of unsaved records UX PR @Zethson
🚸 Improve DoesNotExist error message upon DBRecord.get() PR @Zethson
♻️ Set current space when transferring records PR @Koncopd
♻️ Mark internal lamindb-produced artifacts with kind="__lamindb__" instead of _branch_code=0 PR @falexwolf
2025-05-13 db 1.5.3
2025-05-13 db 1.5.2
🐛 Reset SpatialData path when accessing the in-memory representation PR @Zethson
🚸 Do not validate twice within Artifact.from_X(...) when passing schema PR @falexwolf
2025-05-08 db 1.5.1
🐛 Fix a too strict unique constraint in composite schemas PR @falexwolf
🐛 Fix display of parents & children in view_parents(with_children) PR @falexwolf
⬆️ Adapt save_tiledbsoma_experiment to tiledbsoma==1.16.2 PR @Koncopd
2025-05-07 db 1.5.0 | bionty 1.3.2
Data lineage.
🚸 Make notebook & script tracking via ln.track() robust to renames PR @falexwolf
✨ Enable executing notebooks via jupyter nbconvert --execute PR @falexwolf
CLI updates.
✨ Enable cloud paths for lamin save PR @Koncopd
lamin save s3://my-bucket/my-file.txt
✨ Enable labeling with project during lamin save PR @falexwolf
lamin save ./my-folder --project my-project
Streaming artifacts.
✨ Enable polars in Artifact.open() and Collection.open() PR @Koncopd
✨ Enable .load(), .open(), and .mapped() on query sets of artifacts PR @Koncopd
Curation & schemas.
✨ Enable curating the index of a dataframe PR @falexwolf
schema = ln.Schema(
    features=[
        ln.Feature(name="required_feature", dtype=str).save(),
    ],
    index=ln.Feature(name="sample", dtype=ln.ULabel).save(),
).save()
🚸 Enable passing a ULabel type to dtype PR @falexwolf
perturbation_type = ln.ULabel.get(name="Perturbation")  # perturbation_type.is_type is True
ln.Feature(name="perturbation", dtype=perturbation_type)
🚸 Handle schema updates decently PR @falexwolf
🚸 Do not annotate with more than n_max_records = 1000 PR @falexwolf
🚸 Introduce a submodule lamindb.examples with schemas PR @falexwolf
🚸 Enable validating against nested dicts in spatialdata PR @falexwolf
🚸 Better handle validation of ensembl gene IDs and add curator representation PR @Zethson
🚸 Prettier Schema.describe() PR @sunnyosun
🚸 AnnData: enable explicit transposition in var schema definition PR @falexwolf
🚚 Rename the components argument of Schema() to slots PR @falexwolf
🐛 Fix respecting schema.ordered_set in DataFrame validation PR @sunnyosun
Bulk annotation with features & queries via features.
✨ Support feature dtype dict PR @falexwolf
ln.Feature(name="metadata_details", dtype=dict).save()
🚸 For artifacts, improve (1) bulk annotation with features + (2) queries by features PR @falexwolf
General UX improvements.
🚸 Do not raise exceptions on problems with copy_or_move_to_cache within Artifact.save PR @Koncopd
🚸 Allow passing key to save_vitessce_config() PR @namsaraeva
Docs.
📝 Document uid generation, prettify API reference docs PR @falexwolf
Refactoring.
General refactoring.
♻️ Eliminate monkey patching of django.db.models.QuerySet and django.db.models.Manager PR @Koncopd
♻️ Avoid non-lazy loads of settings on import of lamindb.models PR @Koncopd
Refactoring for curation & schemas.
♻️ Restore validation error messages & add their fine-grained testing PR @falexwolf
♻️ Can save csv artifacts in DataFrameCurator PR @sunnyosun
♻️ Clearer naming conventions in the internal curator codebase PR @falexwolf
♻️ Separate CatManager usage for .cat attribute and as legacy interface PR @falexwolf
♻️ Separate legacy curators from new curators PR @falexwolf
♻️ Execute curator examples and also show them in the curation guide PR @falexwolf
♻️ Refactor annotating with inferred feature sets PR @falexwolf
Fine-grained access management (in beta).
🚸 Better access management errors on Record.save() PR @Koncopd
🐛 Fix .using with fine-grained access instances and permissions test PR @Koncopd
✅ Temp table based authentication (adapt tests) PR @Koncopd
🚸 Delete version family if user wants to retain storage by passing storage=False to artifact.delete(), but retain warning PR @falexwolf
2025-04-25 bionty 1.3.1
🐛 Fixed downloading old Ensembl versions. PR @sunnyosun
If you upgraded to bionty 1.3.0 and used Ensembl versions below 108, please clear the cached ontology source files.
import bionty as bt
import shutil
shutil.rmtree(bt.base.settings.dynamicdir)
2025-04-24 R 1.1.0
LaminR is now documented on docs.lamin.ai.
The previous docs site laminr.lamin.ai continues to host developer docs.
📝 Update documentation site to match the main docs website PR PR @lazappi
👷 Separate Seurat analysis from rest of the introduction notebook PR @falexwolf
♻️ Make R and Python quickstarts parallel PR @falexwolf
♻️ Move setup.Rmd to lamin-docs PR @falexwolf
New features.
✨ Improved Python dependency management with reticulate, deprecated install_lamindb() PR @lazappi
✨ Add tracking of the R environment using pak lockfiles PR @lazappi
Bug fixes.
🐛 Enable setting wrapped object slots like artifact$description, artifact$key, etc. PR @lazappi
🐛 Fix an issue that was preventing lamin_connect() from being run multiple times with the same instance PR @lazappi
🐛 Properly clear and delete temporary instances created using lamin_init_temp() PR @lazappi
Other changes.
2025-04-15 db 1.4.0 | bionty 1.3.0
✨ Add schema as an argument to Artifact.from_X(). PR @falexwolf
artifact = ln.Artifact.from_df(df, key="my_dataset.parquet", schema=schema).save()
✨ Enable defining simple schemas that merely enforce a feature identifier type. PR @falexwolf
schema = ln.Schema(itype=ln.Feature).save()  # <-- enforce valid feature identifiers, no need to define specific required features
✨ Enable defining optional features on a per-schema level & improve schema hash calculation. PR @sunnyosun
schema = ln.Schema(
    features=[
        ln.Feature(name="sample_id", dtype=str).save(),  # required
        ln.Feature(name="sample_name", dtype=str).with_config(optional=True),  # optional
    ],
).save()
✨ Introduce lamin run with a Modal backend. PR @ragyhaddad
lamin run my_script.py --project my_project  # <-- will run the script on Modal
✨ Support auto-download of Ensembl genes of all organisms. Guide PR @sunnyosun
gene_ontology = bt.base.Gene(source="ensembl", organism="rabbit", version='release-103')
gene_ontology.register_source_in_lamindb()  # register the new ontology source in lamindb
source = bt.Source.get(entity="bionty.Gene", name="ensembl", organism="rabbit", version='release-103')
bt.Gene.import_source(source=source)  # import all genes from that source
🚸 Enable querying by features & params through Artifact.filter() and Run.filter(). Guide PR @falexwolf
ln.Artifact.filter(scientist="Barbara McClintock")
User experience.
🚸 from_source no longer returns None but throws a NoResultFound exception if the lookup in the public ontology fails PR @sunnyosun
🚸 Allow renaming artifacts & transforms within the same version family PR @falexwolf
🚸 Better support minimal_set, maximal_set, ordered_set in curators PR @sunnyosun
🚸 Enable passing the stem uid to lamin save PR @falexwolf
🚸 No longer throw an error but merely print a warning when attempting to update a schema PR @falexwolf
🚸 Enable plain notebook uploads by making a default run for notebook in case no run is found PR @falexwolf
🚸 Enable authenticating and setting the current instance through environment variables PR @falexwolf
🚸 Show link to hub in view_lineage() and render lineage through graphviz also in scripts PR @falexwolf
🚸 Order IsVersioned.versions query set PR @falexwolf
🚸 Do not print warning about missing schema modules PR @falexwolf
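To make the three schema flags mentioned above more concrete, here is a minimal plain-Python sketch of how they could constrain a dataframe's columns. The function check_columns is hypothetical, and the semantics (minimal_set requires every schema feature, maximal_set forbids additional columns, ordered_set requires schema order) as well as the defaults are assumptions for illustration, not the curator implementation.

```python
def check_columns(columns, features, minimal_set=True, maximal_set=False, ordered_set=False):
    """Return a list of problems; an empty list means the columns pass.

    Hypothetical helper: `columns` are the dataset's column names and
    `features` are the feature names listed in a schema.
    """
    problems = []
    missing = [f for f in features if f not in columns]
    extra = [c for c in columns if c not in features]
    if minimal_set and missing:
        # every feature of the schema is required
        problems.append(f"missing required features: {missing}")
    if maximal_set and extra:
        # no columns beyond the schema's features are allowed
        problems.append(f"unexpected columns: {extra}")
    if ordered_set:
        # shared columns must appear in the same order as in the schema
        present = [c for c in columns if c in features]
        expected = [f for f in features if f in columns]
        if present != expected:
            problems.append("columns are not in schema order")
    return problems
```

For example, check_columns(["a", "b", "c"], ["a", "b"]) passes with the assumed defaults, while adding maximal_set=True flags the extra column "c".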
Refactors.
♻️ Eliminate duplicated parsing & record creation during curation PR @falexwolf
♻️ Remove verbosity and organism arguments on CatManager level PR PR @falexwolf
♻️ Organize categorical curation code with CatColumn PR @sunnyosun
♻️ Add return_graph argument to view_lineage() PR @lazappi
Docs.
📝 Compare lamindb with pydantic and pandera in an FAQ doc PR @falexwolf
📝 Document how to access any Ensembl genes PR @sunnyosun
Bugs.
🐛 Fix validation of var_index PR @sunnyosun
🐛 Fix numcodecs==0.16.0 incompatibility with zarr v2 PR @Koncopd
🐛 Fix organism passing to from_source PR @sunnyosun
🐛 Return an empty set not a dict for modules in instance settings PR @falexwolf
Bionty.
🚸 Make the default organism "human" instead of None PR @falexwolf
⬆️ Support Python 3.13 & remove support for Python 3.9 PR @Zethson
♻️ Improve Ensembl prefix detection PR @sunnyosun
♻️ Use UPath.synchronize in s3_bionty_assets PR @Koncopd
2025-03-27 db 1.3.2 | bionty 1.2.1
🐛 Fix bionty ontology sources sync through reticulate PR @falexwolf
🐛 Fix data transfer when the target instance has no schema modules PR @falexwolf
2025-03-26 db 1.3.1 | bionty 1.2.0
In Bionty, you can now add custom ontology sources through the Source registry.
df = pd.read_csv("./our_inhouse_genes.csv") # a csv describing gene metadata e.g. from parsing a GTF file
custom_source = bt.Source(entity="bionty.Gene", organism="human", name="Our genes", version="2025-04-01").save()
bt.Gene.add_source(custom_source, df=df) # couple the custom source to the Gene registry
Detailed changes
Bionty now relies on a single file source.yaml to reference public sources.
✨ Enable updating existing records to a new ontology PR PR @sunnyosun
✨ Robust support of custom sources PR @sunnyosun
♻️ Refactor sync_public_sources PR @sunnyosun
♻️ Refactor default source configuration PR @sunnyosun
♻️ Make EFO parsing the same as other ontologies PR @sunnyosun
♻️ No longer use local source yaml files PR @sunnyosun
♻️ Move source tests from lamindb to bionty PR @sunnyosun
♻️ Standardize organism scientific names from ensembl source PR @sunnyosun
♻️ Increase uid length for Source to 8 chars PR @falexwolf
LaminDB changes.
🐛 Enable transferring features pointing to multiple labels PR @sunnyosun
🐛 More extensive validation for updates to artifact.key and artifact.suffix PR @falexwolf
🚸 Refactor conventions for files written during init: the SQLite file is now .lamindb/lamin.db and the storage marker is .lamindb/storage_uid.txt PR @falexwolf
🚸 Make upload of large directories more robust by reducing batch size PR @Koncopd
🚸 Avoid requiring coerce_dtype for "int" and "float" in case an integer or float pd.Series.dtype only deviates by numerical precision/range PR @falexwolf
🚸 In AnnDataCurator, make 'obs' schema optional and allow 'uns' schema PR @falexwolf
2025-03-16 db 1.3.0
New features.
✨ Add schema-based SpatialDataCurator PR1 PR2 PR3 @Zethson
✨ Add schema-based MuDataCurator PR @sunnyosun
✨ Add lamin get for artifacts and lamin load for collections PR @Zethson @falexwolf
Other changes.
⬆️ Support CELLxGENE schema 5.2.0 PR1 PR2 @sunnyosun
🚸 Skip ln.track() when connected in read-only mode PR @falexwolf
🚸 Error if trying to register an instance without a storage in the hub PR @Koncopd
🚸 Refactor organism constraints during validation PR @sunnyosun
🚸 Add more constructor signatures and specific inherited types PR @falexwolf
🚸 No logging message if database is behind by minor version PR @falexwolf
📝 Re-structure curation guides PR1 PR2 @falexwolf
📝 Integrate tutorials into introduction guide PR @falexwolf
2025-03-10 R 1.0.0
✨ laminr now has feature parity with lamindb. PR @lazappi
- Run install_lamindb(), which will ensure lamindb >= 1.2 in the Python environment used by reticulate.
- Replace db <- connect() with ln <- import_module("lamindb") and see the "Detailed changes" dropdown.
The ln object is largely similar to the db object in laminr < v1 and matches lamindb's Python API (. → $).
Detailed changes
| What | Before | After |
|---|---|---|
| Connect to the default LaminDB instance | | |
| Start tracking | | |
| Get an artifact from another instance | | |
| Create an artifact from a path | | |
| Finish tracking | | |
See the updated "Get started" vignette for more information.
User-facing changes:
- Add an import_module() function to import Python modules with additional functionality, e.g., import_module("lamindb") for lamindb
- Add functions for accessing more lamin CLI commands
- Add a new "Introduction" vignette that replicates the code from the Python lamindb introduction guide
Internal changes:
- Add an internal wrap_python() function to wrap Python objects while replacing Python methods with R methods as needed, leaving most work to {reticulate}
- Update the internal check_requires() function to handle Python packages
- Add custom cache()/load() methods to the Artifact class
- Add custom track()/finish() methods to the lamindb module
2025-03-09 db 1.2.0
✨ Enable auto-linking entities to projects. Guide PR @falexwolf
ln.track(project="My project")
🚸 Better support for spatialdata with Artifact.from_spatialdata() and artifact.load(). PR1 PR2 @Zethson
🚸 Introduce .slots in Schema, Curator, and artifact.features to access schemas and curators by dataset slot. PR @sunnyosun
schema.slots["obs"] # -> schema for .obs slot of AnnData
curator.slots["obs"] # -> curator for .obs slot of AnnData
artifact.features["obs"] # -> feature set for .obs slot of AnnData
🏗️ Re-structured the internal API away from monkey-patching Django models. PR @falexwolf
⚠️ Use of internal API
If you used the internal API, you might experience a breaking change. The most drastic change is that all internal registry-related functionality is now re-exported under lamindb.models.
🚸 When re-creating an Artifact, link subsequent runs instead of updating .run and linking previous runs. PR @falexwolf
On the hub.
More details here. @chaichontat
| Before | After |
|---|---|
| An artifact is only shown as an output for the latest run that created the artifact. Previous runs don't show it. | All runs that (re-)create an artifact show it as an output. |
More changes:
✨ Enable Artifact.open() and Artifact.load() for .gz files PR @Koncopd
🐛 Fix passing a path to ln.track() when no path found by nbproject PR @Koncopd
🐛 Do not overwrite ._state_db of records when the current instance is passed to .using PR @Koncopd
🚸 Do not show track warning for read-only connections PR @Koncopd
🚸 Raise NotImplementedError in Artifact.load() if there is no loader PR @Koncopd
2025-02-27 db 1.1.1
🚸 Make the obs and var DataFrameCurator objects accessible via AnnDataCurator.slots PR @sunnyosun
🚸 Better error message upon re-creation of schema with same name and different hash PR @falexwolf
🚸 Raise consistency error if a source path suffix doesn't match the artifact key suffix PR @falexwolf
🚸 Automatically add missing columns upon DataFrameCurator.standardize() if nullable is True PR @falexwolf
🚸 Allow specifying fsspec upload options in Artifact.save PR @Koncopd
🚸 Populate Artifact.n_observations in Artifact.from_df() PR @Koncopd
🐛 Run pip freeze with current python interpreter PR @ap--
🐛 Fix notebook re-run with same hash PR @falexwolf
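The suffix consistency check listed above can be pictured with a few lines of standard-library Python. The helper name check_suffix_consistency and the use of ValueError are assumptions for illustration; the error type and message that lamindb raises may differ.

```python
from pathlib import PurePosixPath


def check_suffix_consistency(source_path: str, key: str) -> None:
    """Raise if the file suffix of the source path differs from the key's suffix.

    Hypothetical helper: only the last suffix is compared, so "x.tar.gz"
    is treated as ".gz".
    """
    source_suffix = PurePosixPath(source_path).suffix
    key_suffix = PurePosixPath(key).suffix
    if source_suffix != key_suffix:
        raise ValueError(
            f"suffix {source_suffix!r} of the source path does not match "
            f"suffix {key_suffix!r} of the key"
        )


# A matching pair passes silently.
check_suffix_consistency("local/data.parquet", "datasets/data.parquet")
```

Passing a .csv source path with a .parquet key would raise in this sketch, which is the kind of mismatch the consistency error is meant to catch early.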
2025-02-18 db 1.1.0
⚠️ The FeatureSet registry got renamed to Schema.
All your code is backward compatible. The Schema registry encompasses feature sets as a special case.
✨ Conveniently track functions including inputs, outputs, and parameters with a decorator: ln.tracked(). PR1 PR2 @falexwolf
@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,  # all arguments tracked as parameters of the function run
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    df = artifact.load()  # auto-tracked as input
    new_df = df.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_df(new_df, key=output_artifact_key).save()  # auto-tracked as output
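How a decorator can record every argument of a call as a parameter is sketched below with the standard library only. tracked_sketch is a hypothetical stand-in, not ln.tracked(): it keeps an in-memory list instead of creating run records, and it does not track input or output artifacts.

```python
import functools
import inspect


def tracked_sketch(func):
    """Record the bound arguments of every call (hypothetical stand-in)."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # include parameters left at their defaults
        wrapper.runs.append(
            {"function": func.__name__, "params": dict(bound.arguments)}
        )
        return func(*args, **kwargs)

    wrapper.runs = []  # in-memory log; a real system would persist this
    return wrapper


@tracked_sketch
def subset(input_key: str, subset_rows: int = 2, subset_cols: int = 2) -> str:
    return f"{input_key}[:{subset_rows}, :{subset_cols}]"


result = subset("my_dataset.parquet", subset_rows=3)
```

After the call, subset.runs holds one entry whose params include subset_cols at its default value, which is the property that makes a run reproducible from its recorded parameters.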
✨ Make sub-types of ULabel, Feature, Schema, Project, Param, and Reference. PR @falexwolf
On the hub.
More details here. @awgaan @chaichontat
Before |
After |
|---|---|
perturbation = ln.ULabel(name="Perturbation", is_type=True).save()
ln.ULabel(name="DMSO", type=perturbation).save()
ln.ULabel(name="IFNG", type=perturbation).save()
✨ Use an overhauled dataset curation flow. @falexwolf @Zethson @sunnyosun
- support persisting validation constraints as a pandera-compatible schema
- support validating any feature type, no longer just categoricals
- make the relationship between features, dataset schema, and curator evident
Detailed changes for the overhauled curation flow.
⚠️ The API gained the lamindb.curators module as the new way to access Curator classes for different data structures.
- This release introduces the schema-based DataFrameCurator and AnnDataCurator
- The old-style curation flow for categoricals based on lamindb.Curator.from_objecttype() continues to work
Before |
After |
|---|---|
Key PRs.
✨ Overhaul curation guides + enable default values and filters on valid categories for features PR @falexwolf
✨ Schema-based curators: AnnDataCurator PR @falexwolf
✨ Schema-based curators: DataFrameCurator PR @falexwolf
Enabling PRs.
✨ Allow passing artifact to Curator PR @sunnyosun
🎨 A ManyToMany between Schema.components and .composites PR @falexwolf
♻️ Mark Schema fields as non-editable PR @falexwolf
✨ Add auxiliary field nullable to Feature PR @falexwolf
♻️ Prettify AnnDataCurator implementation PR @falexwolf
🚸 Better error for malformed categorical dtype PR @falexwolf
🚚 Restore .feature_sets as a ManyToManyField PR @falexwolf
🚚 Rename CatCurator to CatManager PR @falexwolf
🎨 Let Curator.validate() throw an error PR @falexwolf
♻️ Re-purpose BaseCurator as Curator, introduce CatCurator and consolidate shared logic under CatCurator PR @falexwolf
♻️ Refactor organism handling in curators PR @falexwolf
🔥 Eliminate all logic related to using_key in curators PR @falexwolf
🚚 Bulk-rename old-style curators to CatCurator PR @falexwolf
🎨 Self-contained definition of CellxGene schema / validation constraints PR @falexwolf
🚚 Move PertCurator from wetlab here and add CellxGeneCurator test PR @falexwolf
🚚 Move CellXGene Curator from cellxgene-lamin here PR @falexwolf
schema = ln.Schema(
    name="small_dataset1_obs_level_metadata",
    features=[
        ln.Feature(name="CD8A", dtype=int).save(),  # integer counts for CD8A marker
        ln.Feature(name="perturbation", dtype=ln.ULabel).save(),  # a categorical feature that validates against the ULabel registry
        ln.Feature(name="sample_note", dtype=str).save(),  # a note for the sample
    ],
).save()
df = pd.DataFrame({
    "CD8A": [1, 4, 0],
    "perturbation": ["DMSO", "IFNG", "DMSO"],  # one label per row; all columns need the same length
    "sample_note": ["value_1", "value_2", "value_3"],
    "temperature": [22.2, 25.7, 27.3],
})
curator = ln.curators.DataFrameCurator(df, schema)
artifact = curator.save_artifact(key="example_datasets/dataset1.parquet")  # validates compliance with schema, annotates with metadata
assert artifact.schema == schema  # the validating schema
✨ Easily filter on a validating schema. @falexwolf @Zethson @sunnyosun
On the hub.
With the Schema filter button, find all datasets that satisfy a given schema (→ explore).
schema = ln.Schema.get(name="small_dataset1_obs_level_metadata") # get a schema
ln.Artifact.filter(schema=schema).df() # filter all datasets that were validated by the schema
✨ Collection.open() returns a pyarrow dataset. PR @Koncopd
df = pd.DataFrame({"feat1": [0, 0, 1, 1], "feat2": [6, 7, 8, 9]})
df[:2].to_parquet("df1.parquet", engine="pyarrow")
df[2:].to_parquet("df2.parquet", engine="pyarrow")
artifact1 = ln.Artifact("df1.parquet", key="df1.parquet").save()  # first shard written above
artifact2 = ln.Artifact("df2.parquet", key="df2.parquet").save()  # second shard written above
collection = ln.Collection([artifact1, artifact2], key="parquet_col")
dataset = collection.open()  # backed by files in the cloud storage
dataset.to_table().to_pandas().head()
✨ Support s3-compatible endpoint urls, say your on-prem MinIO deployment. PR @Koncopd
Speed up instance creation through squashed migrations.
⚡ Squash migrations PR1 PR2 @falexwolf
Tiledbsoma.
✨ Support endpoint_url in operations with tiledbsoma PR1 PR2 @Koncopd
✨ Add Artifact.from_tiledbsoma to populate n_observations PR @Koncopd
MappedCollection.
🐛 Allow filtering on np.nan in obs_filter of MappedCollection PR @Koncopd
🐛 Fix labels for NaN in categorical columns for MappedCollection PR @Koncopd
SpatialDataCurator.
🐛 Fix var_index standardization of SpatialDataCurator PR1 PR2 @Zethson
🐛 Fix optional sample-level metadata in SpatialDataCatManager PR @Zethson
Core functionality.
✨ Allow checking the need for syncing without actually syncing PR @Koncopd
✨ Check for corrupted cache in Artifact.load() & Artifact.open() PR PR @Koncopd
✨ Infer n_observations in Artifact.from_anndata PR @Koncopd
🐛 Account for VSCode appending languageid to markdown cell in notebook tracking PR @falexwolf
🐛 Normalize module names for robust checking in _check_instance_setup() PR @Koncopd
🐛 Fix idempotency of Feature creation when description is passed and improve filter and get error behavior PR @Zethson
🚸 Make new version upon passing existing key to Collection PR @falexwolf
🚸 Throw better error upon checking instance.modules when loading a lamindb schema module PR @Koncopd
🚸 Validate existing records in the DB irrespective of whether an ontology source is passed or not PR @sunnyosun
🚸 Full guarantee of avoiding duplicating Transform, Artifact & Collection in concurrent runs PR @falexwolf
🚸 Better user feedback during keyword validation in Record constructor PR @Zethson
🚸 Improve local storage not found warning message PR @Zethson
🚸 Better error message when attempting to save a file while not being connected to an instance PR @Zethson
🚸 Error for non-keyword parameters for Artifact.from_x methods PR @Zethson
Housekeeping.
2025-01-23 db 1.0.5
2025-01-21 db 1.0.4
🚚 Revert Collection.description back to unlimited length TextField. PR @falexwolf
2025-01-21 db 1.0.3
🚸 In track(), improve logging in RStudio sessions. PR @falexwolf
2025-01-20 R 0.4.0
🚚 Migrate to lamindb v1 PR @falexwolf
🚸 Improve the user experience for setting up Python & reticulate PR @lazappi
2025-01-20 db 1.0.2
🚚 Improvements for lamindb v1 migrations. PR @falexwolf
- add a .description field to Schema
- enable labeling Run with ULabel
- add a .predecessors and .successors field to Project akin to what's present on Transform
- make .uid fields not editable
2025-01-18 db 1.0.1
🔒 Block non-admin users from confirming the dialogue for integrating lnschema-core. PR @falexwolf
2025-01-17 db 1.0.0
This release makes the API consistent, integrates lnschema_core & ourprojects into the lamindb package, and introduces a breadth of database migrations to enable future features without disruption. You'll now need at least Python 3.10.
Your code will continue to run as is, but you will receive warnings about a few renamed API components.
| What | Before | After |
|---|---|---|
| Dataset vs. model | | |
| Python object for | | |
| Number of files | | |
| | | |
| | | |
| Consecutiveness field | | |
| Run initiator | | |
| | | |
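One common way to keep old code running while warning about a renamed component is a deprecated alias. The sketch below uses the n_objects to n_files rename from this release as its example; the class ArtifactSketch is hypothetical, and whether lamindb implements its warnings this way is an assumption.

```python
import warnings


class ArtifactSketch:
    """Hypothetical record with a renamed attribute."""

    def __init__(self, n_files: int):
        self.n_files = n_files  # new name

    @property
    def n_objects(self) -> int:
        # Old name: still works, but tells the caller which name to use.
        warnings.warn(
            "n_objects is deprecated, use n_files instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.n_files


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    artifact = ArtifactSketch(n_files=3)
    value = artifact.n_objects  # old code path keeps returning the value
```

The old attribute returns the same value as the new one, so existing scripts keep their behavior and only gain a warning.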
Migration guide:
1. Upon lamin connect account/instance you will be prompted to confirm migrating away from lnschema_core
2. After that, you will be prompted to call lamin migrate deploy to apply database migrations
New features:
✨ Allow filtering by multiple obs columns in MappedCollection PR @Koncopd
✨ In git sync, also search git blob hash in non-default branches PR @Zethson
✨ Add relationship with Project to everything except Run, Storage & User so that you can easily filter for the entities relevant to your project PR @falexwolf
✨ Capture logs of scripts during ln.track() PR1 PR2 @falexwolf @Koncopd
✨ Support "|"-separated multi-values in Curator PR @sunnyosun
🚸 Accept None in connect() and improve migration dialogue PR @falexwolf
UX improvements:
🚸 Simplify the ln.track() experience PR @falexwolf
1. you can omit the uid argument
2. you can organize transforms in folders
3. versioning is fully automated (requirement for 1.)
4. you can save scripts and notebooks without running them (corollary of 1.)
5. you avoid the interactive prompt in a notebook and the throwing of an error in a script (corollary of 1.)
6. you are no longer required to add a title in a notebook
🚸 Raise error when modifying Artifact.key in problematic ways PR1 PR2 @sunnyosun @Koncopd
🚸 Better error message on running ln.track() within Python terminal PR @Koncopd
🚸 Hide traceback for InstanceNotEmpty using Click Exception PR @Zethson
🚸 Only auto-search ._name_field in sub-classes of CanCurate PR @falexwolf
🚸 Simplify installation & API overview PR @falexwolf
🚸 Make lamin_run_uid categorical in tiledbsoma stores PR @Koncopd
🚸 Raise ValueError when trying to search a None value PR @Zethson
Bug fixes:
🐛 Skip deleting storage when deleting outdated versions of folder-like artifacts PR @Koncopd
🐛 Let SOMACurator() validate and annotate all .obs columns PR @falexwolf
🐛 Fix renaming of feature sets PR @sunnyosun
🐛 Do not raise an exception when default AWS credentials fail PR @Koncopd
🐛 Only map synonyms when field is name PR @sunnyosun
🐛 Fix source in .from_values PR @sunnyosun
🐛 Fix creating instances with storage in the current local working directory PR @Koncopd
🐛 Fix NA values in Curator.add_new_from() PR @sunnyosun
Refactors, renames & maintenance:
🏗️ Integrate lnschema-core into lamindb PR1 PR2 @falexwolf @Koncopd
🏗️ Integrate ourprojects into lamindb PR @falexwolf
♻️ Manage created_at, updated_at on the database-level, make created_by not editable PR @falexwolf
🚚 Rename transform type "glue" to "linker" PR @falexwolf
🚚 Deprecate the --schema argument of lamin init in favor of --modules PR @falexwolf
DevOps:
Detailed list of database migrations
Those not yet announced above will be announced with the functionality they enable.
♻️ Add contenttypes Django plugin PR @falexwolf
🚚 Prepare introduction of persistable Curator objects by renaming FeatureSet to Schema on the database-level PR @falexwolf
🚚 Add a .type foreign key to ULabel, Feature, FeatureSet, Reference, Param PR @falexwolf
🚚 Introduce RunData, TidyTable, and TidyTableData in the database PR @falexwolf
All remaining database schema changes were made in this PR @falexwolf. Data migrations happen automatically.
- remove _source_code_artifact from Transform, it's been deprecated since 0.75
  - data migration: for all transforms that have _source_code_artifact populated, populate source_code
- rename Transform.name to Transform.description because it's analogous to Artifact.description
  - backward compat:
    - in the Transform constructor use name to populate key in all cases in which only name is passed
    - return the same transform based on key in case source_code is None via ._name_field = "key"
  - data migrations:
    - there already was a legacy description field that was never exposed on the constructor; to be safe, we concatenated potential data in it on the new description field
    - for all transforms that have key=None and name!=None, use name to pre-populate key
- rename Collection.name to Collection.key for consistency with Artifact & Transform and the high likelihood of you wanting to organize them hierarchically
- a _branch_code integer on every record to model pull requests
  - include visibility within that code
  - repurpose visibility=0 as _branch_code=0 as "archive"
  - put an index on it
  - code a "draft" as _branch_code = 2, and "draft prs" as negative branch codes
- rename values "number" to "num" in dtype
- an ._aux json field on Record
- a SmallInteger run._status_code that allows writing finished_at in clean up operations so that there is a run time also for aborted runs
- rename Run.is_consecutive to Run._is_consecutive
- a _template_id FK to store the information of the generating template (whether a record is a template is coded via _branch_code)
- rename _accessor to otype to publicly declare the data format as suffix, accessor
- rename Artifact.type to Artifact.kind
- a FK to artifact run._logfile which holds logs
- a hash field on ParamValue and FeatureValue to enforce uniqueness without running the danger of failure for large dictionaries
- add a boolean field ._expect_many to Feature/Param that defaults to True/False and indicates whether values for this feature/param are expected to occur a single or multiple times for every single artifact/run
  - for feature
    - if it's True (default), the values come from an observation-level aggregation and a dtype of datetime on the observation-level means set[datetime] on the artifact-level
    - if it's False it's an artifact-level value and datetime means datetime; this is an edge case because an arbitrary artifact would always be a set of arbitrary measurements that would need to be aggregated ("one just happens to measure a single cell line in that artifact")
  - for param
    - if it's False (default), the values mean artifact/run-level values and datetime means datetime
    - if it's True, the values would be from an aggregation, this seems like an edge case but say when characterizing a model ensemble trained with different parameters it could be relevant
- remove the .transform foreign key from artifact and collection for consistency with all other records; introduce a property and a simple filter statement instead that maintains the same UX
- store provenance metadata for TransformULabel, RunParamValue, ArtifactParamValue
- enable linking projects & references to transforms & collections
- rename Run.parent to Run.initiated_by_run
- introduce a boolean flag on artifact that's called _overwrite_versions, which indicates whether versions are overwritten or stored separately; it defaults to False for file-like artifacts and to True for folder-like artifacts
- rename n_objects to n_files for more clarity
- add a Space registry to lamindb with an FK on every BasicRecord
- add a name column to Run so that a specific run can be used as a named specific analysis
- remove _previous_runs field on everything except Artifact & Collection
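The motivation for the hash field on ParamValue and FeatureValue can be illustrated with a short sketch: instead of putting a uniqueness constraint on a potentially very large JSON value, one stores a fixed-length digest of a canonical serialization and constrains that. The function name, the choice of SHA-256, and the serialization settings are assumptions for illustration, not the hashing scheme lamindb uses.

```python
import hashlib
import json


def hash_value(value) -> str:
    """Return a fixed-length digest of a JSON-serializable value.

    Sorting keys makes the digest independent of dictionary insertion order.
    """
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_a = hash_value({"learning_rate": 0.01, "layers": [64, 32]})
same_b = hash_value({"layers": [64, 32], "learning_rate": 0.01})
different = hash_value({"learning_rate": 0.02, "layers": [64, 32]})
```

Two dictionaries with the same content map to the same digest regardless of key order, while any change in a value yields a different digest, so a unique index on the digest column is enough to identify a value.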