yaml_include.constructor module

Include other YAML files in YAML

class yaml_include.constructor.Constructor(fs=<factory>, base_dir=None, autoload=True, custom_loader=None)

Bases: object

The include constructor for PyYAML Loaders

Use yaml.add_constructor() to register it with PyYAML’s Loaders.

Example

  1. In Python source code, register it with a Loader class:

    import yaml
    import yaml_include
    
    yaml.add_constructor("!inc", yaml_include.Constructor(), yaml.Loader)
    
  2. In a YAML file, use !inc tags to include other YAML files. You can:

    • Include a file from the local file system, either absolute or relative:

      file: !inc /absolute/dir/of/foo/baz.yml
      
      file: !inc ../../foo/baz.yml
      
    • Include a file from a website:

      file: !inc http://localhost:8080/foo/baz.yml
      
    • Include files by wildcard:

      files: !inc foo/**/*.yml
      
  3. Load the YAML in Python:

    data = yaml.load(yaml_string, yaml.Loader)
    

    The variable data contains the parsed Python object(s) from the included file(s).

Parameters:
  • fs (fsspec.AbstractFileSystem)

  • base_dir (Union[str, PathLike, Callable[[], Union[str, PathLike]], None])

  • autoload (bool)

  • custom_loader (Optional[Callable[[str, _ReadStream, Type[Union[_Loader, _CLoader]]], Any]])

autoload: bool = True

Determines whether to open and parse included file(s) when called.

  • If True: Open the included file(s), parse their content using the current PyYAML Loader, and return the parsed result.

  • If False: Do not open the included file(s). Instead, return a Data object that stores the include statement.

base_dir: str | PathLike | Callable[[], str | PathLike] | None = None

Base directory used to open or search for included YAML files when the include path is relative.

  • If it is None: The actual base directory is determined by the fsspec file-system implementation in use.

    • For example, for LocalFileSystem, the default base directory is the current working directory (cwd).

    • For HTTPFileSystem, the base directory is set to the value of client_kwargs.base_url.

  • If it is callable: The actual base directory will be the return value of the callable.

  • Otherwise: It will be used directly as the actual base directory.
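
The three cases above can be sketched in plain Python. This is only an illustration of the documented rules, not the library's implementation: `resolve_base_dir` and its `fs_default` argument are hypothetical names, with `fs_default` standing in for whatever default the fsspec file system in use would supply.

```python
import os
from pathlib import PurePosixPath
from typing import Callable, Optional, Union

PathOrStr = Union[str, os.PathLike]
BaseDir = Optional[Union[PathOrStr, Callable[[], PathOrStr]]]


def resolve_base_dir(base_dir: BaseDir, fs_default: PathOrStr) -> str:
    """Hypothetical helper mirroring the three documented cases."""
    if base_dir is None:
        # Case 1: fall back to the file system's own default (assumed input).
        return os.fspath(fs_default)
    if callable(base_dir):
        # Case 2: call it and use the return value.
        return os.fspath(base_dir())
    # Case 3: use the given value directly.
    return os.fspath(base_dir)


from_default = resolve_base_dir(None, "/cwd")
from_callable = resolve_base_dir(lambda: "/etc/app", "/cwd")
from_path = resolve_base_dir(PurePosixPath("/srv/conf"), "/cwd")
```

A callable is handy when the directory is not known at the time the constructor is created, for example when it depends on configuration read later.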

custom_loader: Callable[[str, _ReadStream, Type[_Loader | _CLoader]], Any] | None = None

Custom loader/parser function called when an included file is about to be parsed.

If None, the file is parsed as ordinary YAML using the current Loader class.

Otherwise, it should be a callable object that replaces the ordinary YAML Loader.

Example

The parameter may be defined as follows:

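import json

import toml  # third-party package
import yaml

def my_loader(urlpath, file, Loader):
    # Dispatch on the file extension; fall back to ordinary YAML parsing.
    if urlpath.endswith(".json"):
        return json.load(file)
    if urlpath.endswith(".toml"):
        return toml.load(file)
    return yaml.load(file, Loader)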

The definition of the callable parameter is:

Parameters:
  • urlpath (str) –

    URL or path of the file.

    The value passed to this argument may be:

    • The original URL/path string defined in YAML, in cases where:
      • Neither a wildcard nor a scheme is present in the include statement (e.g., !inc foo/baz.yml).

      • Both a wildcard and a scheme are present in the include statement (e.g., !inc http://host/foo/*.yml).

    • Each file name returned by fsspec.spec.AbstractFileSystem.glob(), if the include statement contains a wildcard but no scheme (e.g., !inc foobar/**/*.yml).

  • file (bytes | str | SupportsRead[bytes | str]) –

    The object returned by fsspec.open() or a member of the list returned by fsspec.open_files().

    This parameter will be used in yaml.load() and can be:

    • An instance of bytes or str.

    • An object that implements the following interface:

      from typing import Protocol

      class SupportsRead(Protocol):
          def read(self, length: int = ...) -> bytes | str: ...
      

    Tip

    The open method of fsspec file-system implementations typically returns a fsspec.spec.AbstractBufferedFile object. However, the exact type is not guaranteed, as open methods can vary across different fsspec file-system implementations.

  • Loader (Type) – The type (not an instance) of the PyYAML Loader currently in use.

Returns:

The parsed result.

Return type:

Any
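
Because file may be either raw bytes/str or a readable object, a custom loader may want to normalize it first. The following is a minimal sketch under stated assumptions: `read_all` and `json_or_text_loader` are hypothetical names, and the non-JSON fallback simply returns the decoded text so that the sketch needs only the standard library (a real loader would more likely call yaml.load(content, Loader) there).

```python
import io
import json
from typing import Any, Union


def read_all(file: Any) -> Union[bytes, str]:
    """Return the whole content, whether `file` is bytes/str or has read()."""
    if isinstance(file, (bytes, str)):
        return file
    return file.read()


def json_or_text_loader(urlpath: str, file: Any, Loader: Any) -> Any:
    """Hypothetical custom loader: parse *.json, return other files as text."""
    content = read_all(file)
    if urlpath.endswith(".json"):
        # json.loads accepts both str and UTF-8 encoded bytes.
        return json.loads(content)
    # Assumption for this sketch only: non-JSON content is UTF-8 text.
    return content.decode("utf-8") if isinstance(content, bytes) else content


from_stream = json_or_text_loader("a.json", io.BytesIO(b'{"x": 1}'), None)
from_string = json_or_text_loader("a.json", '{"x": 1}', None)
from_other = json_or_text_loader("note.txt", io.BytesIO(b"hi"), None)
```

The Loader argument is unused here; it is kept only so the function has the three-argument signature described above.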

fs: fsspec.AbstractFileSystem

fsspec file-system object used to parse paths/URLs and open included files. Defaults to LocalFileSystem.

load(loader_type, data)

This method is invoked when the PyYAML Loader class encounters an include tag (e.g., !inc).

Parameters:
  • loader_type (Type[Union[_Loader, _CLoader]]) – The type of the current PyYAML Loader class in use.

  • data (Data) – The data object representing the include statement.

Return type:

Any

Returns:

Data from the included YAML file, parsed by a PyYAML Loader class.

Caution

This method is primarily invoked internally by yaml.load(). It is not recommended to call this method directly.

Notes

  • Additional positional or named parameters in YAML include statements are stored in Data.sequence_params and Data.mapping_params. They are then forwarded as *args and **kwargs to the fsspec file system, where they act as implementation-specific options.

  • The use of positional parameters in YAML include statements is discouraged.

The function operates as follows:

  • If there is a protocol/scheme and no wildcard in the YAML include:

    Data.sequence_params and Data.mapping_params of data are passed to fsspec.open() as *args and **kwargs.

    Example

    The YAML

    key: !inc {urlpath: s3://my-bucket/my-file.yml.gz, compression: gzip}

    translates to:

    with fsspec.open("s3://my-bucket/my-file.yml.gz", compression="gzip") as f:
        yaml.load(f, Loader)

  • If there is a protocol/scheme and a wildcard in the YAML include:

    Data.sequence_params and Data.mapping_params of data are passed to fsspec.open_files() as *args and **kwargs.

    Example

    The YAML

    key: !inc {urlpath: s3://my-bucket/*.yml.gz, compression: gzip}

    translates to:

    with fsspec.open_files("s3://my-bucket/*.yml.gz", compression="gzip") as files:
        for file in files:
            yaml.load(file, Loader)

  • If there is no protocol/scheme and no wildcard in the YAML include:

    Data.sequence_params and Data.mapping_params of data are passed to the fsspec file-system’s open method (derived from fsspec.spec.AbstractFileSystem.open()) as *args and **kwargs.

  • If there is no protocol/scheme and a wildcard in the YAML include, the behavior depends on the form of the include statement:

    • For positional-parameter form:

      • If there is one argument, it is passed as the maxdepth parameter of fsspec.spec.AbstractFileSystem.glob().

      • If there are multiple arguments:

        • The first argument is passed to the glob method.

        • The second argument is passed to the open method.

        • Additional arguments are ignored.

    • For named-parameter form:

      • A key named glob passes its value to the glob method.

      • A key named open passes its value to the open method.

Examples

  • The YAML

    key: !inc [foo/**/*.yml, 2]
    

    translates to:

    for file in fs.glob("foo/**/*.yml", maxdepth=2):
        with fs.open(file) as fp:
            yaml.load(fp, Loader)
    
  • The YAML

    key: !inc {urlpath: foo/**/*.yml.gz, glob: {maxdepth: 2}, open: {compression: gzip}}
    

    translates to:

    for file in fs.glob("foo/**/*.yml.gz", maxdepth=2):
        with fs.open(file, compression="gzip") as fp:
            yaml.load(fp, Loader)
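
The branching described in this section can be summarized as a small decision function. This is a sketch of the documented behavior only, not the library's code: the helper names are hypothetical, and the way a scheme ("://") and a wildcard (glob metacharacters) are detected here is an assumption made for illustration.

```python
from typing import Dict, Tuple


def has_scheme(urlpath: str) -> bool:
    # Assumption: a protocol/scheme is recognized by the "://" separator.
    return "://" in urlpath


def has_wildcard(urlpath: str) -> bool:
    # Assumption: glob metacharacters mark a wildcard include.
    return any(ch in urlpath for ch in "*?[")


def choose_call(urlpath: str) -> str:
    """Name the call(s) the documentation associates with each case."""
    scheme, wildcard = has_scheme(urlpath), has_wildcard(urlpath)
    if scheme and wildcard:
        return "fsspec.open_files"
    if scheme:
        return "fsspec.open"
    if wildcard:
        return "fs.glob + fs.open"
    return "fs.open"


def split_named_params(mapping_params: Dict[str, dict]) -> Tuple[dict, dict]:
    """Named-parameter form: route the 'glob' and 'open' keys separately."""
    glob_kwargs = dict(mapping_params.get("glob") or {})
    open_kwargs = dict(mapping_params.get("open") or {})
    return glob_kwargs, open_kwargs


glob_kwargs, open_kwargs = split_named_params(
    {"glob": {"maxdepth": 2}, "open": {"compression": "gzip"}}
)
```

With the last YAML example above, glob_kwargs would feed fs.glob() and open_kwargs would feed fs.open().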
    
managed_autoload(autoload)

Context manager for temporarily setting the autoload attribute.

This context manager allows you to set a temporary value for autoload within a with statement. The original value is restored once the block is exited.

Parameters:

autoload (bool) – The temporary value to assign to autoload within the with statement.

Yields:

The current instance of Constructor.

Return type:

Iterator[Self]

Example

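import yaml
import yaml_include

ctor = yaml_include.Constructor()
# Register with FullLoader, which is the Loader that yaml.full_load() uses.
yaml.add_constructor("!inc", ctor, yaml.FullLoader)
# autoload is True here

with ctor.managed_autoload(False):
    # autoload is temporarily set to False
    yaml.full_load(YAML_TEXT)  # YAML_TEXT: placeholder for your YAML string
# autoload is automatically restored to True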
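
The set-then-restore behavior can be illustrated with a standard-library stand-in. This is a sketch of the general pattern, not the library's implementation: `FakeConstructor` is a hypothetical class that has only an `autoload` attribute.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeConstructor:
    """Hypothetical stand-in that carries only an `autoload` attribute."""

    def __init__(self, autoload: bool = True) -> None:
        self.autoload = autoload

    @contextmanager
    def managed_autoload(self, autoload: bool) -> Iterator["FakeConstructor"]:
        saved = self.autoload
        self.autoload = autoload
        try:
            # Yield the instance itself, matching the "Yields" entry above.
            yield self
        finally:
            # Restore the original value even if the block raises.
            self.autoload = saved


ctor = FakeConstructor()
with ctor.managed_autoload(False) as same:
    inside = ctor.autoload
after = ctor.autoload
```

The try/finally in this sketch is what restores the value even when the with block exits through an exception.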