run_r2d2_simfish module

Runs R2D2 on the fish environment.

class run_r2d2_simfish.SimfishR2D2Builder(config: R2D2Config)[source]

Bases: R2D2Builder

make_learner(random_key: PRNGKeyArray, networks: UnrollableNetwork, dataset: Iterator[PrefetchingSplit], logger_fn: LoggerFactory, environment_spec: EnvironmentSpec, replay_client: Client | None = None, counter: Counter | None = None) Learner[source]

Creates an instance of the learner.

Parameters:
  • random_key – A key for random number generation.

  • networks – Struct describing the networks needed by the learner; this can be specific to the learner in question.

  • dataset – Iterator over samples from replay.

  • logger_fn – Factory providing loggers used for logging progress.

  • environment_spec – A container for all relevant environment specs.

  • replay_client – Client which allows communication with replay. Note that this is only intended to be used for updating priorities. Samples should be obtained from dataset.

  • counter – A Counter which allows for recording of counts (learner steps, actor steps, etc.) distributed throughout the agent.
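
The split between dataset and replay_client described above can be illustrated with a small sketch. Everything here is a hypothetical stand-in: FakeReplayClient, toy_learner_step and the sample layout (keys, td_errors) are invented for illustration and are not the learner that make_learner returns. The point is only that samples are pulled from the dataset iterator, while the optional client is used solely to write priorities back.

```python
from typing import Dict, Iterator, Optional


class FakeReplayClient:
    """Hypothetical stand-in for a replay client; it only records priorities."""

    def __init__(self) -> None:
        self.priorities: Dict[int, float] = {}

    def mutate_priorities(self, table: str, updates: Dict[int, float]) -> None:
        # A real client would send these to the replay server.
        self.priorities.update(updates)


def toy_learner_step(
    dataset: Iterator[dict],
    replay_client: Optional[FakeReplayClient] = None,
    table: str = "priority_table",
) -> Dict[int, float]:
    """One illustrative step: sample from dataset, then update priorities."""
    sample = next(dataset)  # samples always come from the dataset iterator
    # Assumption: priority is the absolute TD error of each sampled item.
    new_priorities = {
        key: abs(err) for key, err in zip(sample["keys"], sample["td_errors"])
    }
    if replay_client is not None:  # the client is optional, as in the signature
        replay_client.mutate_priorities(table, new_priorities)
    return new_priorities


client = FakeReplayClient()
batches = iter([{"keys": [1, 2], "td_errors": [-0.5, 2.0]}])
result = toy_learner_step(batches, client)
```

Passing replay_client=None still returns the computed priorities, but nothing is written back to replay.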

class run_r2d2_simfish.SimfishR2D2Config(discount: float = 0.997, target_update_period: int = 2500, evaluation_epsilon: float = 0.0, num_epsilons: int = 256, variable_update_period: int = 400, burn_in_length: int = 40, trace_length: int = 80, sequence_period: int = 40, learning_rate: float = 0.001, bootstrap_n: int = 5, clip_rewards: bool = False, tx_pair: rlax._src.nonlinear_bellman.TxPair = (<function signed_hyperbolic>, <function signed_parabolic>), samples_per_insert_tolerance_rate: float = 0.1, samples_per_insert: float = 4.0, min_replay_size: int = 50000, max_replay_size: int = 100000, batch_size: int = 64, prefetch_size: int = 2, num_parallel_calls: int = 16, replay_table_name: str = 'priority_table', importance_sampling_exponent: float = 0.6, priority_exponent: float = 0.9, max_priority_weight: float = 0.9, actions: simulation.define_actions.Actions = <simulation.define_actions.Actions object>)[source]

Bases: R2D2Config

Configuration options for the R2D2 agent.
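
As a rough reading of the defaults in the signature above, the following sketch works out how the sequence-related values relate to each other. It assumes that a stored sequence covers the burn-in steps plus the training trace; the underlying library may add an extra step for bootstrapping, so treat the numbers as approximate.

```python
# Default values copied from the SimfishR2D2Config signature.
burn_in_length = 40
trace_length = 80
sequence_period = 40
discount = 0.997

# Assumption: one stored sequence = burn-in steps + training trace.
steps_per_sequence = burn_in_length + trace_length  # 120

# A new sequence starts every `sequence_period` steps, so consecutive
# sequences share this many steps.
overlap = steps_per_sequence - sequence_period  # 80

# Rule-of-thumb horizon implied by the discount, 1 / (1 - discount).
effective_horizon = 1.0 / (1.0 - discount)  # roughly 333 steps
```

Under this assumption, each environment step appears in up to three stored sequences (120 / 40), and rewards more than a few hundred steps ahead contribute little to the return.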

actions: Actions = <simulation.define_actions.Actions object>

run_r2d2_simfish.build_experiment_config(training_parameters: dict) ExperimentConfig[source]

Builds R2D2 experiment config which can be executed in different ways.
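
The documentation does not state which keys training_parameters may contain or how they are applied. One plausible pattern, sketched here with a hypothetical three-field ToyConfig standing in for SimfishR2D2Config and a hypothetical helper config_from_parameters, is to override dataclass defaults from the dict and reject unknown keys.

```python
import dataclasses


@dataclasses.dataclass
class ToyConfig:
    """Hypothetical stand-in for SimfishR2D2Config (three fields only)."""

    discount: float = 0.997
    learning_rate: float = 0.001
    batch_size: int = 64


def config_from_parameters(training_parameters: dict) -> ToyConfig:
    """Override ToyConfig defaults with values from a parameter dict."""
    known = {field.name for field in dataclasses.fields(ToyConfig)}
    unknown = set(training_parameters) - known
    if unknown:
        # Fail early on misspelled keys instead of silently ignoring them.
        raise KeyError(f"Unknown config fields: {sorted(unknown)}")
    return dataclasses.replace(ToyConfig(), **training_parameters)


config = config_from_parameters({"learning_rate": 3e-4, "batch_size": 32})
```

Fields that are not mentioned in the dict, such as discount here, keep their defaults.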

run_r2d2_simfish.main(_)[source]