Note
Go to the end to download the full example code.
Easy Parallelization
Optuna supports several ways to run parallel optimization:
- You can run multiple trials in parallel within a single process by using the `n_jobs` argument.
- You can run multiple processes that share the same storage backend, such as an RDB or a file.
- You can run the same optimization study on multiple machines.
If you need to run optimization across thousands of processing nodes, you can use `GrpcStorageProxy` to run distributed optimization on many machines.
The following diagram shows which strategy fits which use case.
![Storage selection flowchart: multi-thread → InMemoryStorage or JournalStorage; multi-process on a single node → JournalStorage or RDBStorage; multi-node → RDBStorage; a very large number of nodes → GrpcStorageProxy](../../_images/graphviz-e03a9a38f64c8de64221421b71bdc88bee6871be.png)
Multi-thread Optimization
Note
You can run multiple trials in parallel simply by setting the `n_jobs` argument of `optimize()`.
Due to the Global Interpreter Lock (GIL), multi-thread optimization has traditionally been inefficient in Python. However, starting with Python 3.14 (pending its official release), the GIL is expected to be removed. This change makes multi-threading a good option, especially for parallel optimization.
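If you want to confirm whether your interpreter actually runs without the GIL, the minimal sketch below checks it at runtime; note that `sys._is_gil_enabled()` is only available on Python 3.13 and later.

```python
import sys

# Report whether this interpreter runs with the GIL enabled.
# sys._is_gil_enabled() was added in Python 3.13; older versions always use the GIL.
if hasattr(sys, "_is_gil_enabled"):
    print(f"GIL enabled: {sys._is_gil_enabled()}")
else:
    print("This Python version always runs with the GIL enabled.")
```

The following example runs 20 trials across four worker threads by setting `n_jobs=4`: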
```python
import optuna
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend
from optuna.trial import Trial
import threading


def objective(trial: Trial):
    print(f"Running trial {trial.number=} in {threading.current_thread().name}")
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


study = optuna.create_study(
    storage=JournalStorage(JournalFileBackend(file_path="./journal.log")),
)
study.optimize(objective, n_trials=20, n_jobs=4)
```
```
Running trial trial.number=0 in ThreadPoolExecutor-1_0
Running trial trial.number=1 in ThreadPoolExecutor-1_2
Running trial trial.number=2 in ThreadPoolExecutor-1_1
Running trial trial.number=3 in ThreadPoolExecutor-1_3
Running trial trial.number=4 in ThreadPoolExecutor-1_3
Running trial trial.number=5 in ThreadPoolExecutor-1_1
Running trial trial.number=6 in ThreadPoolExecutor-1_0
Running trial trial.number=7 in ThreadPoolExecutor-1_2
Running trial trial.number=8 in ThreadPoolExecutor-1_1
Running trial trial.number=9 in ThreadPoolExecutor-1_0
Running trial trial.number=10 in ThreadPoolExecutor-1_3
Running trial trial.number=11 in ThreadPoolExecutor-1_2
Running trial trial.number=12 in ThreadPoolExecutor-1_1
Running trial trial.number=13 in ThreadPoolExecutor-1_3
Running trial trial.number=14 in ThreadPoolExecutor-1_0
Running trial trial.number=15 in ThreadPoolExecutor-1_2
Running trial trial.number=16 in ThreadPoolExecutor-1_1
Running trial trial.number=17 in ThreadPoolExecutor-1_3
Running trial trial.number=18 in ThreadPoolExecutor-1_0
Running trial trial.number=19 in ThreadPoolExecutor-1_2
```
Multi-process Optimization with JournalStorage
Note
- Recommended backends: `JournalStorage`, `RDBStorage`
You can run optimization with multiple processes by using a shared storage backend. Because `InMemoryStorage` is not designed to be shared across processes, it cannot be used for multi-process optimization.
The following example shows how to run multi-process optimization with `JournalStorage` and the `multiprocessing` module.
```python
import optuna
from multiprocessing import Pool
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend
import os


def objective(trial):
    print(f"Running trial {trial.number=} in process {os.getpid()}")
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


def run_optimization(_):
    study = optuna.create_study(
        study_name="journal_storage_multiprocess",
        storage=JournalStorage(JournalFileBackend(file_path="./journal.log")),
        load_if_exists=True,  # Useful for multi-process or multi-node optimization.
    )
    study.optimize(objective, n_trials=3)


if __name__ == "__main__":
    with Pool(processes=4) as pool:
        pool.map(run_optimization, range(12))
```
Output
```
$ python3 multiprocess_example.py
Running trial trial.number=1 in process 4605
Running trial trial.number=2 in process 4604
Running trial trial.number=3 in process 4607
Running trial trial.number=4 in process 4606
Running trial trial.number=5 in process 4605
Running trial trial.number=6 in process 4607
Running trial trial.number=7 in process 4604
Running trial trial.number=8 in process 4605
...
```
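Because every worker process appends to the same journal file, all trials end up in a single study. As a minimal sketch (reusing the `journal_storage_multiprocess` study name from the example above), you can reload the merged study afterwards and inspect the best trial:

```python
import optuna
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend

# Reload the study populated by the worker processes and report the best result.
storage = JournalStorage(JournalFileBackend(file_path="./journal.log"))
study = optuna.load_study(study_name="journal_storage_multiprocess", storage=storage)
print(f"Best value: {study.best_value} (params: {study.best_params})")
```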
Multi-node Optimization with RDBStorage
Since `JournalFileBackend` relies on file locks provided by the local file system, it is safe to use with multiple processes running on the same host. However, when the file is accessed simultaneously from multiple machines over NFS (or a similar network file system), the file locks may not work correctly, which can lead to race conditions.
Therefore, `RDBStorage` is recommended for multi-node optimization. You can use MySQL, PostgreSQL, or another RDB backend.
For example, to use MySQL, you need to set up a MySQL server and create a database for Optuna.
```
$ mysql -u username -e "CREATE DATABASE IF NOT EXISTS example"
```
Then you can use this MySQL database as the storage backend by passing its MySQL URL to the `storage` argument of `create_study()`.
```python
import optuna


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
    study = optuna.create_study(
        study_name="distributed_test",
        storage="mysql://username:password@127.0.0.1:3306/example",
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=100)
```
You can run this example on multiple machines:
Machine 1
```
$ python3 distributed_example.py
[I 2025-06-03 14:07:45,306] A new study created in RDB with name: distributed_test
[I 2025-06-03 14:08:45,450] Trial 0 finished with value: 12.694308312865278 and parameters: {'x': -1.5629072837873959}. Best is trial 0 with value: 12.694308312865278.
[I 2025-06-03 14:09:45,482] Trial 2 finished with value: 121.80632032697125 and parameters: {'x': -9.036590067904635}. Best is trial 0 with value: 12.694308312865278.
```
Machine 2
```
$ python3 distributed_example.py
[I 2025-06-03 14:07:49,318] Using an existing study with name 'distributed_test' instead of creating a new one.
[I 2025-06-03 14:08:49,442] Trial 1 finished with value: 0.21258674253407828 and parameters: {'x': 1.5389287012466746}. Best is trial 31 with value: 9.19159178106083e-05.
[I 2025-06-03 14:09:49,480] Trial 3 finished with value: 0.24343413718999274 and parameters: {'x': 2.493390451052706}. Best is trial 31 with value: 9.19159178106083e-05.
```
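When many workers share one database server, it can also help to construct `RDBStorage` explicitly so that you can tune SQLAlchemy's connection pool. The snippet below is only a sketch: the `pool_size`/`max_overflow` values are illustrative, and the `mysql+pymysql` URL assumes the `pymysql` driver is installed.

```python
import optuna
from optuna.storages import RDBStorage

# An explicit RDBStorage lets you pass engine_kwargs through to SQLAlchemy,
# e.g. to limit the number of pooled connections each worker keeps open.
storage = RDBStorage(
    url="mysql+pymysql://username:password@127.0.0.1:3306/example",
    engine_kwargs={"pool_size": 5, "max_overflow": 10},
)
study = optuna.create_study(
    study_name="distributed_test",
    storage=storage,
    load_if_exists=True,
)
```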
Multi-node Optimization with GrpcStorageProxy
However, if you run thousands of processing nodes, the RDB server may not be able to handle the load. In that case, you can use `GrpcStorageProxy` to distribute the server load.
`GrpcStorageProxy` is a proxy storage layer that internally uses `RDBStorage` as its backend. It can efficiently handle high-throughput concurrent requests coming from many machines.
The following example shows how to use `GrpcStorageProxy`. Since `GrpcStorageProxy` is a proxy storage, you first need to run a gRPC server backed by `RDBStorage`.
```python
from optuna.storages import run_grpc_proxy_server
from optuna.storages import get_storage

storage = get_storage("mysql+pymysql://username:password@127.0.0.1:3306/example")
run_grpc_proxy_server(storage, host="localhost", port=13000)
```
Output
```
$ python3 grpc_proxy_server.py
[I 2025-06-03 13:57:38,328] Server started at localhost:13000
[I 2025-06-03 13:57:38,328] Listening...
```
Then, on each machine, run the following code to connect to the gRPC proxy storage.
```python
import optuna
from optuna.storages import GrpcStorageProxy


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
    storage = GrpcStorageProxy(host="localhost", port=13000)
    study = optuna.create_study(
        study_name="grpc_proxy_multinode",
        storage=storage,
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=50)
```
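If a single proxy server becomes a bottleneck, one way to spread the load is to launch several `run_grpc_proxy_server` instances (on different hosts or ports) in front of the same RDB and assign workers to them. The sketch below assumes two such endpoints; the host names and the pid-based assignment are purely illustrative.

```python
import os

import optuna
from optuna.storages import GrpcStorageProxy

# Hypothetical proxy endpoints; replace with the hosts/ports of your own proxy servers.
PROXIES = [("proxy-host-a", 13000), ("proxy-host-b", 13000)]


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
    # Statically shard workers across proxies based on the process id.
    host, port = PROXIES[os.getpid() % len(PROXIES)]
    storage = GrpcStorageProxy(host=host, port=port)
    study = optuna.create_study(
        study_name="grpc_proxy_multinode",
        storage=storage,
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=50)
```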
Total running time of the script: (0 minutes 0.227 seconds)