Easy Parallelization

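Optuna supports several ways to run optimization in parallel.

Multi-thread optimization:
- Set the `n_jobs` argument to run multiple trials in parallel within a single process.
Multi-process optimization:
- Run multiple processes that share the same storage backend, such as an RDB or a file.
Multi-node optimization:
- Run the same optimization study on multiple machines.
- If you need to scale out to thousands of processing nodes, use `GrpcStorageProxy` to run distributed optimization across many machines.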

The decision flow below shows which strategy fits which use case.

Multi-thread or multi-process?
- Multi-thread: `InMemoryStorage` or `JournalStorage`
- Multi-process: single node or multi-node?
  - Single node: `JournalStorage` or `RDBStorage`
  - Multi-node: do you need a very large number of nodes?
    - No: `RDBStorage`
    - Yes: `GrpcStorageProxy`

Multi-thread Optimization

Note

Recommended backends (from the decision flow above): `InMemoryStorage`, `JournalStorage`

To run multiple trials in parallel, set the `n_jobs` argument of `optimize()`.

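Multi-threaded optimization has traditionally been inefficient in Python because the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time. Free-threaded builds of CPython, which run without the GIL, have been available as an optional build since Python 3.13 and are officially supported from Python 3.14, although the default build still includes the GIL. On a free-threaded interpreter, multi-threading becomes a good option for parallel optimization.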
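Whether threads can actually run in parallel depends on the interpreter you launch. The snippet below is a minimal sketch, using only the standard library, of how one might check this before relying on `n_jobs` for CPU-bound objectives. The helper name `gil_status` is hypothetical, and `sys._is_gil_enabled()` is a private CPython function that exists only on Python 3.13 and later, so the sketch falls back to "GIL enabled" when it is missing.

```python
import sys
import sysconfig


def gil_status() -> str:
    """Describe whether this interpreter can run threads without the GIL."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() is missing before Python 3.13, where the GIL is always on.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    if free_threaded_build and not is_gil_enabled:
        return "free-threaded build, GIL disabled"
    if free_threaded_build:
        return "free-threaded build, GIL re-enabled at runtime"
    return "standard build, GIL enabled"


print(gil_status())
```

Either way, the `n_jobs` example that follows runs unchanged; what differs is how much CPU-bound objective code can overlap across threads.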

import optuna
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend
from optuna.trial import Trial
import threading


def objective(trial: Trial):
    print(f"Running trial {trial.number=} in {threading.current_thread().name}")
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


study = optuna.create_study(
    storage=JournalStorage(JournalFileBackend(file_path="./journal.log")),
)
study.optimize(objective, n_trials=20, n_jobs=4)

Running trial trial.number=0 in ThreadPoolExecutor-1_0
Running trial trial.number=1 in ThreadPoolExecutor-1_2
Running trial trial.number=2 in ThreadPoolExecutor-1_1
Running trial trial.number=3 in ThreadPoolExecutor-1_3
Running trial trial.number=4 in ThreadPoolExecutor-1_3
Running trial trial.number=5 in ThreadPoolExecutor-1_1
Running trial trial.number=6 in ThreadPoolExecutor-1_0
Running trial trial.number=7 in ThreadPoolExecutor-1_2
Running trial trial.number=8 in ThreadPoolExecutor-1_1
Running trial trial.number=9 in ThreadPoolExecutor-1_0
Running trial trial.number=10 in ThreadPoolExecutor-1_3
Running trial trial.number=11 in ThreadPoolExecutor-1_2
Running trial trial.number=12 in ThreadPoolExecutor-1_1
Running trial trial.number=13 in ThreadPoolExecutor-1_3
Running trial trial.number=14 in ThreadPoolExecutor-1_0
Running trial trial.number=15 in ThreadPoolExecutor-1_2
Running trial trial.number=16 in ThreadPoolExecutor-1_1
Running trial trial.number=17 in ThreadPoolExecutor-1_3
Running trial trial.number=18 in ThreadPoolExecutor-1_0
Running trial trial.number=19 in ThreadPoolExecutor-1_2

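Multi-process Optimization with JournalStorage

Note

Recommended backends (from the decision flow above): `JournalStorage`, `RDBStorage`

You can run optimization in multiple processes by pointing them at a shared storage. `InMemoryStorage` cannot be used for this because it is not designed to be shared across processes.

The following example shows multi-process optimization with `JournalStorage` and the `multiprocessing` module. Each of the 12 tasks runs 3 trials, so the study receives 36 trials in total, spread over 4 worker processes.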

import optuna
from multiprocessing import Pool
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend
import os


def objective(trial):
    print(f"Running trial {trial.number=} in process {os.getpid()}")
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


def run_optimization(_):
    study = optuna.create_study(
        study_name="journal_storage_multiprocess",
        storage=JournalStorage(JournalFileBackend(file_path="./journal.log")),
        load_if_exists=True,  # Useful for multi-process or multi-node optimization.
    )
    study.optimize(objective, n_trials=3)

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        pool.map(run_optimization, range(12))

Output

$ python3 multiprocess_example.py
Running trial trial.number=1 in process 4605
Running trial trial.number=2 in process 4604
Running trial trial.number=3 in process 4607
Running trial trial.number=4 in process 4606
Running trial trial.number=5 in process 4605
Running trial trial.number=6 in process 4607
Running trial trial.number=7 in process 4604
Running trial trial.number=8 in process 4605
...

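Multi-node Optimization with RDBStorage

Because `JournalFileBackend` relies on file locks on the local filesystem, it is safe for multiple processes running on the same host. When the file is accessed from several machines at once over NFS (or a similar network filesystem), however, the file locks may not work correctly, which can lead to race conditions.

For multi-node optimization, `RDBStorage` is therefore recommended. You can use MySQL, PostgreSQL, or another RDB backend.

With MySQL, for example, you first set up a MySQL server and create a database for Optuna.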

$ mysql -u username -e "CREATE DATABASE IF NOT EXISTS example"

You can then use this MySQL database as the storage backend by passing its URL as the `storage` argument of `create_study()`.

import optuna


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
    study = optuna.create_study(
        study_name="distributed_test",
        storage="mysql://username:password@127.0.0.1:3306/example",
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=100)

You can run this example on multiple machines.

Machine 1

$ python3 distributed_example.py
[I 2025-06-03 14:07:45,306] A new study created in RDB with name: distributed_test
[I 2025-06-03 14:08:45,450] Trial 0 finished with value: 12.694308312865278 and parameters: {'x': -1.5629072837873959}. Best is trial 0 with value: 12.694308312865278.
[I 2025-06-03 14:09:45,482] Trial 2 finished with value: 121.80632032697125 and parameters: {'x': -9.036590067904635}. Best is trial 0 with value: 12.694308312865278.

Machine 2

$ python3 distributed_example.py
[I 2025-06-03 14:07:49,318] Using an existing study with name 'distributed_test' instead of creating a new one.
[I 2025-06-03 14:08:49,442] Trial 1 finished with value: 0.21258674253407828 and parameters: {'x': 1.5389287012466746}. Best is trial 31 with value: 9.19159178106083e-05.
[I 2025-06-03 14:09:49,480] Trial 3 finished with value: 0.24343413718999274 and parameters: {'x': 2.493390451052706}. Best is trial 31 with value: 9.19159178106083e-05.

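Multi-node Optimization with GrpcStorageProxy

If you run thousands of processing nodes, however, a single RDB server may not be able to handle the load. In that case, you can use `GrpcStorageProxy` to distribute the server load.

`GrpcStorageProxy` is a proxy storage layer that uses `RDBStorage` internally as its backend. It can efficiently handle high-throughput concurrent requests from many machines.

The following example shows how to use `GrpcStorageProxy`. Because it is a proxy storage, you first need to run a gRPC server backed by `RDBStorage`.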

from optuna.storages import run_grpc_proxy_server
from optuna.storages import get_storage

storage = get_storage("mysql+pymysql://username:password@127.0.0.1:3306/example")
run_grpc_proxy_server(storage, host="localhost", port=13000)

Output

$ python3 grpc_proxy_server.py
[I 2025-06-03 13:57:38,328] Server started at localhost:13000
[I 2025-06-03 13:57:38,328] Listening...

Then, on each machine, run the following code to connect to the gRPC proxy storage.

import optuna

from optuna.storages import GrpcStorageProxy


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
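    # "localhost" only reaches a proxy server on this same machine;
    # on other machines, pass the address of the host running the server.
    storage = GrpcStorageProxy(host="localhost", port=13000)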
    study = optuna.create_study(
        study_name="grpc_proxy_multinode",
        storage=storage,
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=50)
