yaml_include.constructor module

Include other YAML files in YAML

class yaml_include.constructor.Constructor(fs: fsspec.AbstractFileSystem = <factory>, base_dir: Union[str, PathLike, Callable[[], Union[str, PathLike]], None] = None, autoload: bool = True, custom_loader: Optional[Callable[[str, YAML_ReadStream, Type[Union[YAML_Loader, YAML_CLoader]]], Any]] = None)

Bases: object

The include constructor for PyYAML Loaders

Use yaml.add_constructor() to register it on PyYAML’s Loaders.

Example

  1. In Python source code, register it to a Loader class:

    import yaml
    import yaml_include
    
    yaml.add_constructor("!inc", yaml_include.Constructor(), yaml.Loader)
    
  2. In a YAML file, write !inc tags to include other YAML files. We can:

    • include a file from the local file system, by absolute or relative path

      file: !inc /absolute/dir/of/foo/baz.yml
      
      file: !inc ../../foo/baz.yml
      
    • include a file from a website

      file: !inc http://localhost:8080/foo/baz.yml
      
    • include multiple files by wildcards

      files: !inc foo/**/*.yml
      
  3. Load the YAML in Python:

    data = yaml.load(yaml_string, yaml.Loader)
    

    The variable data contains the parsed Python object(s) from the included file(s).

Parameters:
  • fs (fsspec.AbstractFileSystem)

  • base_dir (Union[str, PathLike, Callable[[], Union[str, PathLike]], None])

  • autoload (bool)

  • custom_loader (Optional[Callable[[str, YAML_ReadStream, Type[Union[YAML_Loader, YAML_CLoader]]], Any]])

fs: fsspec.AbstractFileSystem

fsspec file-system object used to parse paths/URLs and open included files. Defaults to LocalFileSystem.

base_dir: str | PathLike | Callable[[], str | PathLike] | None = None

Base directory in which to open or search for included YAML files when the path is relative.

  • If it is None, the actual base directory is decided by the fsspec file-system implementation in use. For example, base_dir defaults to the current working directory for LocalFileSystem, and to the value of client_kwargs.base_url for HTTPFileSystem.

  • Else if it is callable, the actual base directory will be its return value.

  • Else it will be used directly as the actual base directory.
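The three rules above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's own code: the helper name resolve_base_dir and its fs_default argument (standing in for whatever default the file-system implementation chooses) are hypothetical.

```python
import os


def resolve_base_dir(base_dir, fs_default):
    """Hypothetical helper mirroring the three documented base_dir rules."""
    if base_dir is None:
        # Rule 1: defer to the file-system implementation's own default
        # (represented here by the fs_default argument).
        return fs_default
    if callable(base_dir):
        # Rule 2: the callable's return value is the base directory.
        return os.fspath(base_dir())
    # Rule 3: use the given str / PathLike value directly.
    return os.fspath(base_dir)
```

Passing a callable is useful when the base directory is only known at load time, for example when it depends on the file currently being processed.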

autoload: bool = True

Whether to open and parse the included file(s) when the constructor is called.

  • If True: open the included file(s), parse the content with the current PyYAML Loader, and return the parsed result.

  • If False: do NOT open the included file(s); the return value is a Data object that stores the include statement.

custom_loader: Callable[[str, YAML_ReadStream, Type[YAML_Loader | YAML_CLoader]], Any] | None = None

Custom loader/parser function called when an included file is about to be parsed.

If None, parse the file as ordinary YAML with the current Loader class.

Otherwise it must be a callable object, used in place of the ordinary YAML Loader.

Example

The parameter may be like:

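import json
import toml
import yaml

def my_loader(urlpath, file, Loader):
    # Dispatch on the file extension; fall back to ordinary YAML.
    if urlpath.endswith(".json"):
        return json.load(file)
    if urlpath.endswith(".toml"):
        return toml.load(file)
    return yaml.load(file, Loader)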

The definition of the callable parameter is:

Parameters:
  • arg1 (str) –

    URL or path of the file.

    The value passed in may be:

    • The original URL/path string defined in YAML, when:
      • neither a wildcard nor a scheme exists in the include statement (e.g. !inc foo/baz.yml),

      • both a wildcard and a scheme exist in the include statement (e.g. !inc http://host/foo/*.yml)

    • Each file name returned by fsspec.spec.AbstractFileSystem.glob(), if there is a wildcard and no scheme in the include statement (e.g. !inc foobar/**/*.yml).

  • arg2 (bytes | str | SupportsRead[bytes | str]) –

    Whatever is returned by fsspec.open(), or each member of the list returned by fsspec.open_files(), is passed as this argument.

    The argument may later be passed to yaml.load(). It could be:

    • bytes or str

    • An object that implements:

      class SupportsRead(bytes | str):
          def read(self, length: int = ..., /) -> bytes | str: ...
      

    Tip

    The open method of fsspec file-system implementations usually returns an fsspec.spec.AbstractBufferedFile object. However, the type is NOT guaranteed, because the open methods of different fsspec file-system implementations vary.

  • arg3 (Type) – Type (not instance) of PyYAML’s Loader currently in use.

Returns:

Parsed result

Return type:

Any
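Because the second argument may be bytes, str, or a readable object, a custom loader that does its own decoding has to cope with all three. The following is a minimal sketch under stated assumptions: the names read_text and json_only_loader are hypothetical, UTF-8 is assumed as the encoding, and the loader handles JSON only.

```python
import json


def read_text(file, encoding="utf-8"):
    """Normalize the documented shapes of the second argument to str."""
    if hasattr(file, "read"):
        # File-like object: read() yields bytes or str.
        file = file.read()
    if isinstance(file, bytes):
        # Assumption: the content is UTF-8 unless stated otherwise.
        return file.decode(encoding)
    return file


def json_only_loader(urlpath, file, Loader):
    """Hypothetical custom_loader; the Loader argument is unused here."""
    if not urlpath.endswith(".json"):
        raise ValueError(f"unsupported file type: {urlpath}")
    return json.loads(read_text(file))
```

A function with this signature could then be passed as the custom_loader argument of the constructor.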

managed_autoload(autoload: bool) -> Generator[Self, None, None]

Context manager, for use in a with statement, that temporarily sets autoload.

Parameters:

autoload (bool) – Temporary value of autoload to be set inside the with statement

Return type:

Generator[Self, None, None]
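The intended pattern can be illustrated with a stand-in class. This is a sketch only: ToyConstructor is hypothetical and has nothing but an autoload flag, and restoring the previous value on exit is an assumption drawn from the word "temporary" in the parameter description.

```python
from contextlib import contextmanager


class ToyConstructor:
    """Stand-in with only an autoload flag; not the real Constructor."""

    def __init__(self, autoload=True):
        self.autoload = autoload

    @contextmanager
    def managed_autoload(self, autoload):
        # Swap in the temporary value and remember the old one.
        saved, self.autoload = self.autoload, bool(autoload)
        try:
            yield self
        finally:
            # Put the old value back, even if the body raised.
            self.autoload = saved


ctor = ToyConstructor()
with ctor.managed_autoload(False) as c:
    inside = c.autoload  # temporary value is visible here
after = ctor.autoload    # previous value is back after the block
```

With the real class, the body of the with statement would typically contain the yaml.load() call whose includes should be left unopened.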

load(loader_type: Type[YAML_Loader | YAML_CLoader], data: Data) -> Any

This method is invoked when PyYAML’s Loader class calls the constructor, which happens when an include tag (e.g. "!inc") is encountered.

Parameters:
  • loader_type (Type[Union[YAML_Loader, YAML_CLoader]]) – Type of the PyYAML Loader class currently in use

  • data (Data) – The data class of the include statement

Returns:

Data from the actual included YAML file, parsed by a PyYAML Loader class.

Return type:

Any

Caution

It is mainly invoked by yaml.load(); calling it yourself is NOT advised.

Note

Additional positional or named parameters in a YAML include statement are stored in Data.sequence_params and Data.mapping_params, and passed on as *args and **kwargs. The class passes them to the fsspec file-system as implementation-specific options.

Note:

Using positional parameters in a YAML include statement is discouraged.

  • If there is a protocol/scheme and no wildcard in the YAML include statement, *args and **kwargs will be passed to fsspec.open().

    Example:

    The YAML

    key: !inc {urlpath: s3://my-bucket/my-file.yml.gz, compression: gzip}
    

    means:

    with fsspec.open("s3://my-bucket/my-file.yml.gz", compression="gzip") as f:
        yaml.load(f, Loader)
    
  • If there is a protocol/scheme and also a wildcard in the YAML include statement, Data.sequence_params and Data.mapping_params of data will be passed to fsspec.open_files() as *args and **kwargs.

    Example:

    The YAML

    key: !inc {urlpath: s3://my-bucket/*.yml.gz, compression: gzip}
    

    means:

    with fsspec.open_files("s3://my-bucket/*.yml.gz", compression="gzip") as files:
        for file in files:
            yaml.load(file, Loader)
    
  • If there is no protocol/scheme and no wildcard in the YAML include statement, Data.sequence_params and Data.mapping_params of data will be passed as *args and **kwargs to the open method of the fsspec file-system implementation (derived from fsspec.spec.AbstractFileSystem.open()).

  • If there is no protocol/scheme but a wildcard is defined in the YAML include statement, the situation is more complex, because the file system's glob and open methods are both called:

    Example:

    • The YAML

      key: !inc [foo/**/*.yml, 2]
      

      means:

      for file in fs.glob("foo/**/*.yml", maxdepth=2):
          with fs.open(file) as fp:
              yaml.load(fp, Loader)
      
    • The YAML

      key: !inc {urlpath: foo/**/*.yml.gz, glob: {maxdepth: 2}, open: {compression: gzip}}
      

      means:

      for file in fs.glob("foo/**/*.yml.gz", maxdepth=2):
          with fs.open(file, compression="gzip") as fp:
              yaml.load(fp, Loader)
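The four cases above amount to a two-by-two decision on "has a scheme" and "has a wildcard". The sketch below shows that decision in isolation. It is not the library's implementation: the function name pick_fsspec_call is hypothetical, and detecting a scheme with urlsplit and a wildcard with the characters `*`, `?`, `[`, `]` are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumed definition of "wildcard": any glob metacharacter in the path.
WILDCARD = re.compile(r"[*?\[\]]")


def pick_fsspec_call(urlpath):
    """Return a label for the fsspec call(s) the documented rules select."""
    has_scheme = bool(urlsplit(urlpath).scheme)
    has_wildcard = bool(WILDCARD.search(urlpath))
    if has_scheme:
        # With a scheme, the module-level fsspec functions are used.
        return "fsspec.open_files" if has_wildcard else "fsspec.open"
    # Without a scheme, the constructor's own fs object is used.
    return "fs.glob + fs.open" if has_wildcard else "fs.open"
```

Note that under this assumed scheme detection a Windows path with a drive letter would be mistaken for a URL, so a real implementation would need extra care there.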