The pyrallis API
Pyrallis is designed to have a relatively small API, you can get familiar with it below 📚
Parsing Interface
pyrallis.parse
def parse(config_class: Type[T],
config_path: Optional[Union[Path, str]] = None,
args: Optional[Sequence[str]] = None) -> T:
config_class
.
Each dataclass attribute is mapped to a matching argparse argument, where nested arguments are concatenated with the dot notation. That is, if config_class
contains a config_class.compute.workers
attribute, the matching argparse argument will simply be --compute.workers=42
. The full set of arguments is visible using the --help
command.
Pyrallis also searches for an optional yaml
configuration file defined either with the default config_path
parameter, or specified from command-line using the dedicated --config_path
argument. The configuration file is mapped according to the yaml
hierarchy.
That is, if config_class
contains a config_class.compute.workers
attribute, the matching yaml
argument would be
The overloading mechanism is defined so that default values can be overridden by yaml
values which can be overridden by command-line arguments. The actual dataclass is initialized only once using the final set of arguments.
Parameters
- config_class (A dataclass class) - The dataclass that will define the parser
- config_path (str) - An optional path to a default
yaml
configuration file. The parser will first load the arguments from there and will override them with command-line arguments. - args - The arguments to parse, as in the
argparse.ArgumentParser.parse_args()
call. If None, parses from command-line
Returns
An initialized dataclass of type config_class
.
Example:
A small working example would look like that
yaml
configuration file, or both
$ python train_model.py --config_path=some_config.yaml --exp_name=my_first_exp
Training my_first_exp with 42 workers...
--help
command 🪄
$ python train_model.py --help
usage: train_model.py [-h] [--config_path str] [--workers int] [--exp_name str]
optional arguments:
-h, --help show this help message and exit
--config_path str Path for a config file to parse with pyrallis (default:
None)
TrainConfig ['options']:
Training config for Machine Learning
--workers int The number of workers for training (default: 8)
--exp_name str The experiment name (default: default_exp)
pyrallis.wrap
Inspired by Hydra, Pyrallis also offers an alternative decorator that wraps your main function and automatically initializes a configuration class to match your function signature.Parameters
- config_path (str) - An optional path to a default
yaml
configuration file. The parser will first load the arguments from there and will override them with command-line arguments.
Returns
A wrapped function, where the first argument is implicitly initialized with pyrallis.
Example
Types and Registration
pyrallis.decode
Parse a given raw value and produce the corresponding type. This function can parse any of the supported pyrallis types, from standard types to enums and dataclasses.Parameters
- t (Type[T]) - The type to parse into
- raw_value - The input to parse
Returns
A value of type T
constructed from raw_value
Examples:
Pyrallis can decode a variety of different types
pyrallis.decode(float, '0.7') # Output: 0.7
pyrallis.decode(typing.List[int],['6','3','4']) # Output: [6, 3, 4]
pyrallis.decode(typing.Union[str, float],'2.3') # Output: '2.3'
The pyrallis.decode
function is also used to parse dataclasses from dictionaires.
The dictionary is assumed to have a structure that matches the dataclass, with possibly missing values.
For example, assuming the following dataclass
@dataclass
class ComputeConfig:
""" Config for training resources """
# The number of workers for training
workers: int = field(default=8)
# The number of workers for training
eval_workers: Optional[int] = field(default=None)
@dataclass
class TrainConfig:
compute: ComputeConfig = field(default_factory=ComputeConfig)
A valid dictionary can be
Where pyrallis.decode(TrainConfig, d)
will generate the matching dataclass. The values will be fed into the dataclass constructor. Values can be missing if they have a default value, otherwise pyrallis will just fail to construct the class.
pyrallis.decode.register
Pyrallis can decode many different types, but there's still a chance you might want to a type that isn't already built-in into pyrallis. This can be easily done using the register mechanism.The @withregistry wrapper
The register method is a result of using the @withregistry wrapper over the pyrallis.decode function, a variant for pyrallis of the @singledispatch that only creates a registry and does not override the default entrypoint.
Parameters
- cls - The type to register
- func - The decoding function to register
- include_subclasses - Whether to also register the function for the
cls
subclasses. If True,func
will also expect the type parameter.
Returns
Registers the function, no return value.
Examples:
Say we want to add an option to decode a numpy array into pyrallis.
The register function can also be used as a wrapper
For inherited types one can either register it directly, or register a parent class
pyrallis.decode.register(SomeClass, lambda x: SomeClass(x))
# or
pyrallis.decode.register(BaseClass, lambda t, x: t(x), include_subclasses=True)
t
would be some subclass of BaseClass
.
pyrallis.encode
Parse a given object into a yaml compatible object that can be serialized.Parameters
- obj - The object to encode
Returns
A yaml-compatible object. For dataclasses, a matching dictionary will be generated
Examples:
For many types the encode function will simply return the same object, as they are already yaml-compatible.
pyrallis.encode(0.7) # Output: 0.7
pyrallis.encode(['6','3','4']) # Output: ['6','3','4']
pyrallis.encode((0.2, 0.3)) # Output: [0.2, 0.3]
Most of the logic in pyrallis.encode
is related to encoding of dataclasses. A given dataclass will be encoded to a dictionary representing the same hierarchy and values.
For example, assuming the following dataclass and its instance
@dataclass
class ComputeConfig:
""" Config for training resources """
# The number of workers for training
workers: int = field(default=8)
# The number of workers for training
eval_workers: Optional[int] = field(default=None)
@dataclass
class TrainConfig:
compute: ComputeConfig = field(default_factory=ComputeConfig)
cfg = TrainConfig(compute=ComputeConfig(workers=2, eval_workers=8))
The output of pyrallis.encode(cfg)
would be
pyrallis.encode.register
If an object you want to encode is not serializable, you can easily register a function that encodes it into a supported type using theregister
mechanism.
The @singledispatch wrapper
The register method is a result of using the @singledispatch wrapper over the pyrallis.encode function, allowing to override it with new encoders.
Parameters
- cls - The type to register
- func - The decoding function to register
Returns
Registers the function, no return value.
Examples:
Say we want to add an option to encode a numpy array into pyrallis.
The register function can also be used as a wrapper
Supported Types
pyrallis comes with many supported types, as detailed below. Additional types can be added using the pyrallis.decode.register
and pyrallis.encode.register
functionality.
Standard Types
str
float
int
bool
bytes
For each standard type values are converted to the type using the type as constructor (i.e., float('1.3')
)
Collection Types
tuple
(or typing.Tuple
)
list
(or typing.List
)
dict
(or typing.Dict
)
set
(or typing.Set
)
When using the collection definition from typing
the inner types will also be checked recursively. For example, one can define a typing.Tuple[int,float,typing.List[int]]
type and the conversion will be done accordingly.
Reading sequences and dictionaries is done using the yaml
syntax. Notice that tuples and sets use the same syntax as lists, where the conversion happens after loading the arguments. For dictionaries, the keys must be space separated so the entire dictionary should be wrapped in string quotation marks.
Typing Types
typing.Any
for typing.Any
no explicit conversion is applied, this means only the base conversion done when loading a yaml file/string is applied.
typing.Optional
Using typing.Optional
you can pass None
values to an argument. Following the yaml
syntax this is specified using the null
keyword.
typing.Union
For unions pyrallis
sequentially tries to apply conversion to each type and sticks with the first one that works. This means there could be cases where the order of values in the Union affects the end result (i.e. typing.Union[int, float]
).
Other Types
Enum
Enums can be really useful for configurations, pyrallis
inherently supports the enum.Enum
class and parses enums using the matching keyword values.
pathlib.Path
The pathlib.Path
type is useful for path manipulation, and hence a useful type for our configuration dataclasses. pyrallis
inherently supports the class and can convert an str
to Path
and back, according to the type hints.
dataclasses
Dataclasses
are also supported by pyrallis.decode
and pyrallis.encode
. In fact, this is how pyrallis operates behind the scenes.
Working with Files
pyrallis.dump
Serialize a configuration dataclass into a file stream. If stream is None, return the produced string instead. Additional arguments are passed along to thedump
function of the current configuration format.
In practice this function uses pyrallis.encode to create a standard dictionary from the dataclass which is then passed along to the configuration parser.
Parameters
- config (dataclass) - The dataclass to serialize
- omit_defaults - If true, does not dump values that are equal to the default Dataclass values.
- stream - An output stream to dump into. If None, the produced string is returned.
Returns
If no stream is provided, returns the produced string. Otherwise, no return value.
Example:
@pyrallis.wrap()
def main(cfg: TrainConfig):
print('The generated config is:')
print(pyrallis.dump(cfg))
# Saving to file
pyrallis.dump(cfg, open('/configs/train_config.yaml','w'))
pyrallis.load
Parse the document from the stream and produce the corresponding dataclass In practice this function first loads a dictionary from the stream and then usespyrallis.decode
to generate a valid dataclass from it.
Parameters
- t (Type[dataclass]) - The dataclass type to load into
- stream - The input stream to load
Returns
A dataclass instance from type t
.
Example:
cfg = pyrallis.load(TrainConfig, open('/configs/train_config.yaml','r'))
print('Loaded config has {cfg.workers} workers')
pyrallis.set_config_type
By default, pyrallis usesyaml
for parsing string and files and this is its recommended format. However, pyrallis also supports json
and toml
. When using pyrallis.set_config_type
the global context of pyrallis will change so that the matching configuration format will be used for pyrallis.parse
, pyrallis.dump
and pyrallis.load
.
To change the configuration type for a specific context use the with
context
Info
Note that the pyrallis.parse
function is also dependent on the configuration format as strings from cmd are first parsed based on the current configuration format. This ensures a unified behavior when parsing from files and cmd arguments.
Parameters
- type_val - A string representing the desired format (
"json", "yaml", "toml"
) or the matching enum value (pyrallis.ConfigType.YAML
)
Example
import pyrallis
pyrallis.set_config_type('json')
@dataclass
class TrainConfig:
""" Training config for Machine Learning """
workers: int = 8 # The number of workers for training
exp_name: str = 'default_exp' # The experiment name
@pyrallis.wrap()
def main(cfg: TrainConfig):
pyrallis.dump(cfg)
with pyrallis.config_type('yaml'):
pyrallis.dump(cfg)
Helper Functions
pyrallis.field
def field(*, default=dataclasses.MISSING, default_factory=dataclasses.MISSING, init=True, repr=True,
hash=None, compare=True, metadata=None, is_mutable=False)
Extends the standard dataclass.field with an additional is_mutable
flag. If this flag is set to True
the default value is considered as mutable and a matching default_factory is generated instead.
Parameters
All the standard dataclasses.field parameters with the additional
- is_mutable - Whether the specified default value is mutable
Returns
An object to identify dataclass fields, same as in dataclasses.field
Example
say we try to code the following dataclass
@dataclass
class OptimConfig:
worker_inds: List[int] = []
# Or the more explicit version
worker_inds: List[int] = field(default=[])
[]
is mutable we would actually initialize every instance of this dataclass with the same list instance, and thus is not allowed. Instead dataclasses
would direct you the default_factory function, which calls a factory function for generating the field in every new instance of your dataclass.
Now, this works great for empty collections, but what would be the alternative for
Well, you would have to create a dedicated factory function that regenerates the object, for example Kind of annoying and could be confusing for a new guest reading your code . This is wherepyrallis.field
can be helpful
The pyrallis.field
behaves like the regular dataclasses.field
with an additional is_mutable
flag. When toggled, the default_factory
is created automatically, offering the same functionally with a more reader-friendly syntax.