Redis-backed index for tracking datasets in a repository.
Implements the AbstractIndex protocol. Maintains a registry of LocalDatasetEntry objects in Redis, allowing enumeration and lookup of stored datasets.
When initialized with a data_store, insert_dataset() will write dataset shards to storage before indexing. Without a data_store, insert_dataset() only indexes existing URLs.
Attributes
_redis: Redis connection for index storage.
_data_store: Optional AbstractDataStore for writing dataset shards.
Number of stub files removed, or 0 if auto_stubs is disabled.
decode_schema
local.Index.decode_schema(ref)
Reconstruct a Python PackableSample type from a stored schema.
This method enables loading datasets without knowing the sample type ahead of time. The index retrieves the schema record and dynamically generates a PackableSample subclass matching the schema definition.
If auto_stubs is enabled, a Python module is generated and the class is imported from it, so the returned class carries type information that IDEs can use for autocomplete.
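To make the idea of rebuilding a type from a stored schema concrete, here is a minimal sketch using the standard library's `dataclasses.make_dataclass`. The record layout, `TYPE_MAP`, and `build_sample_type` are assumptions made for illustration; this reference does not show the stored schema format or how PackableSample subclasses are actually generated.

```python
from dataclasses import fields, make_dataclass

# Hypothetical schema record; the real stored format is not shown in this reference.
schema = {
    "name": "MySample",
    "version": "1.0.0",
    "fields": [
        {"name": "text", "type": "str"},
        {"name": "value", "type": "int"},
    ],
}

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {"str": str, "int": int, "float": float, "bytes": bytes}


def build_sample_type(record):
    """Create a dataclass whose fields mirror the schema record."""
    spec = [(f["name"], TYPE_MAP[f["type"]]) for f in record["fields"]]
    return make_dataclass(record["name"], spec)


# The caller never wrote a MySample class; it is generated from the record.
MySample = build_sample_type(schema)
sample = MySample(text="hello", value=42)
```

A class built this way exists only at runtime, which is why a generated stub module is needed before an IDE can offer autocomplete for it.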
decode_schema_as
local.Index.decode_schema_as(ref, type_hint)
Decode a schema with an explicit type hint for IDE support.
This is a typed wrapper around decode_schema() that preserves the type information for IDE autocomplete. Use this when you have a stub file for the schema and want full IDE support.
The decoded type, cast to match the type_hint for IDE support.
Examples
>>> # After enabling auto_stubs and configuring IDE extraPaths:
>>> from local.MySample_1_0_0 import MySample
>>>
>>> # This gives full IDE autocomplete:
>>> DecodedType = index.decode_schema_as(ref, MySample)
>>> sample = DecodedType(text="hello", value=42)  # IDE knows signature!
Note
The type_hint is used only for static type checking; at runtime, the actual decoded type from the schema is returned. Ensure the stub matches the schema to avoid runtime surprises.
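The distinction in this note can be illustrated with `typing.cast`, which informs the type checker but does nothing at runtime. The sketch below is an assumption about the general pattern, not the library's implementation; `decode_as`, `StubSample`, and `RuntimeSample` are hypothetical names.

```python
from typing import Type, TypeVar, cast

T = TypeVar("T")


class StubSample:
    """Stands in for a class imported from a generated stub module."""

    def __init__(self, text: str, value: int) -> None:
        self.text = text
        self.value = value


class RuntimeSample:
    """Stands in for the class actually rebuilt from the stored schema."""

    def __init__(self, text: str, value: int) -> None:
        self.text = text
        self.value = value


def decode_as(decoded: type, type_hint: Type[T]) -> Type[T]:
    # cast() only informs the type checker; it returns `decoded` unchanged.
    return cast(Type[T], decoded)


DecodedType = decode_as(RuntimeSample, StubSample)
sample = DecodedType(text="hello", value=42)
```

Here a type checker treats `sample` as a `StubSample`, while the object is really a `RuntimeSample`. If the two classes drift apart, the mismatch only shows up at runtime.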
get_dataset
local.Index.get_dataset(ref)
Get a dataset entry by name (AbstractIndex protocol).
>>> index = LocalIndex(auto_stubs=True)
>>> ref = index.publish_schema(MySample, version="1.0.0")
>>> index.load_schema(ref)
>>> print(index.get_import_path(ref))
local.MySample_1_0_0
>>> # Then in your code:
>>> # from local.MySample_1_0_0 import MySample
get_schema
local.Index.get_schema(ref)
Get a schema record by reference (AbstractIndex protocol).
Schema reference string. Supports both new format (atdata://local/sampleSchema/{name}@version) and legacy format (local://schemas/{module.Class}@version).
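As an illustration of the two reference formats, the following sketch splits a reference into its format, name, and version with regular expressions. `parse_schema_ref` and both patterns are hypothetical and derived only from the format strings quoted above; the library's own parsing rules may be stricter or looser.

```python
import re

# Patterns derived from the two documented formats (an assumption, not library code).
NEW_FORMAT = re.compile(
    r"^atdata://local/sampleSchema/(?P<name>[^@/]+)@(?P<version>.+)$"
)
LEGACY_FORMAT = re.compile(
    r"^local://schemas/(?P<name>[^@/]+)@(?P<version>.+)$"
)


def parse_schema_ref(ref):
    """Return (format, name, version) for a schema reference string."""
    for fmt, pattern in (("new", NEW_FORMAT), ("legacy", LEGACY_FORMAT)):
        match = pattern.match(ref)
        if match:
            return fmt, match["name"], match["version"]
    raise ValueError(f"Unrecognized schema reference: {ref!r}")


new_ref = parse_schema_ref("atdata://local/sampleSchema/MySample@1.0.0")
legacy_ref = parse_schema_ref("local://schemas/mypkg.MySample@1.0.0")
```

In the legacy form the name part carries the module path as well as the class name, so a caller that needs only the class name would still have to split on the last dot.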
insert_dataset
Insert a dataset into the index (AbstractIndex protocol).
If a data_store was provided at initialization, writes dataset shards to storage first, then indexes the new URLs. Otherwise, indexes the dataset’s existing URL.
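The two insertion paths can be sketched with a toy index that uses a dict in place of Redis. `ToyIndex`, `InMemoryStore`, and `write_shards` are hypothetical names chosen for this example; the real AbstractDataStore interface is not shown in this reference.

```python
class InMemoryStore:
    """Hypothetical stand-in for a data store; it 'writes' shards by rewriting URLs."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.written = []

    def write_shards(self, urls):
        # Keep each shard's file name and place it under the store's prefix.
        new_urls = [f"{self.prefix}/{url.rsplit('/', 1)[-1]}" for url in urls]
        self.written.extend(new_urls)
        return new_urls


class ToyIndex:
    """Toy index backed by a dict instead of Redis."""

    def __init__(self, data_store=None):
        self._data_store = data_store
        self._entries = {}

    def insert_dataset(self, name, urls):
        # With a data store: write shards first, then index the new URLs.
        if self._data_store is not None:
            urls = self._data_store.write_shards(urls)
        # Without one: index the URLs exactly as given.
        self._entries[name] = list(urls)
        return self._entries[name]


plain = ToyIndex()
plain_urls = plain.insert_dataset("demo", ["file:///tmp/a.tar"])

stored = ToyIndex(InMemoryStore("s3://bucket/demo"))
stored_urls = stored.insert_dataset("demo", ["file:///tmp/a.tar"])
```

In this sketch `plain_urls` keeps the original location, while `stored_urls` points at the copy produced by the store.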
load_schema
Load a schema and make it available in the types namespace.
This method decodes the schema, optionally generates a Python module for IDE support (if auto_stubs is enabled), and registers the type in the types namespace for easy access.
>>> # Load and use immediately
>>> MyType = index.load_schema("atdata://local/sampleSchema/MySample@1.0.0")
>>> sample = MyType(name="hello", value=42)
>>>
>>> # Or access later via namespace
>>> index.load_schema("atdata://local/sampleSchema/OtherType@1.0.0")
>>> other = index.types.OtherType(data="test")
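One plausible way to picture the types namespace is an object whose attributes are the loaded classes. The sketch below uses `types.SimpleNamespace` for that purpose; `ToySchemaLoader` and its `load_schema` signature are assumptions for illustration and differ from the real method, which takes a schema reference.

```python
from dataclasses import make_dataclass
from types import SimpleNamespace


class ToySchemaLoader:
    """Hypothetical loader that registers each generated class by name."""

    def __init__(self):
        self.types = SimpleNamespace()

    def load_schema(self, name, field_spec):
        cls = make_dataclass(name, field_spec)
        # Register under the class name so it is reachable as loader.types.<name>.
        setattr(self.types, name, cls)
        return cls


loader = ToySchemaLoader()
OtherType = loader.load_schema("OtherType", [("data", str)])
other = loader.types.OtherType(data="test")
```

Both access paths yield the same class object, so code can use whichever is more convenient.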
Semantic version string (e.g., ‘1.0.0’). If None, auto-increments from the latest published version (patch bump), or starts at ‘1.0.0’ if no previous version exists.
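The version-selection rule described here can be sketched as a small helper. `next_version` is a hypothetical function written for this example, and it assumes plain `major.minor.patch` strings with no pre-release or build suffixes.

```python
def next_version(published, requested=None):
    """Pick the version to publish under, following the rule described above."""
    if requested is not None:
        return requested  # an explicit version is used as given
    if not published:
        return "1.0.0"  # no previous version exists
    # Compare numerically so that 1.0.10 sorts after 1.0.9.
    latest = max(tuple(int(part) for part in v.split(".")) for v in published)
    major, minor, patch = latest
    return f"{major}.{minor}.{patch + 1}"


first = next_version([])
bumped = next_version(["1.0.0", "1.0.9", "1.0.10"])
explicit = next_version(["1.2.3"], requested="2.0.0")
```

Parsing the parts as integers matters: a plain string comparison would rank "1.0.9" above "1.0.10" and produce a duplicate version.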