I have designed a C++ system that calls user-defined callbacks from a procedure running in a separate thread. A simplified system.hpp looks like this:
    #pragma once

    #include <atomic>
    #include <chrono>
    #include <functional>
    #include <thread>

    class System {
    public:
      using Callback = std::function<void(int)>;

      System(): t_(), cb_(), stop_(true) {}

      ~System() { stop(); }

      bool start() {
        if (t_.joinable()) return false;
        stop_ = false;
        t_ = std::thread([this]() {
          while (!stop_) {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            if (cb_) cb_(1234);
          }
        });
        return true;
      }

      bool stop() {
        if (!t_.joinable()) return false;
        stop_ = true;
        t_.join();
        return true;
      }

      bool registerCallback(Callback cb) {
        if (t_.joinable()) return false;
        cb_ = cb;
        return true;
      }

    private:
      std::thread t_;
      Callback cb_;
      std::atomic_bool stop_;
    };
It works fine and can be tested with this short example, main.cpp:
    #include <iostream>
    #include "system.hpp"

    int g_counter = 0;

    void foo(int i) {
      std::cout << i << std::endl;
      g_counter++;
    }

    int main() {
      System s;
      s.registerCallback(foo);
      s.start();
      while (g_counter < 3) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
      s.stop();
      return 0;
    }
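It prints 1234 a few times and then stops. However, I ran into a problem when trying to create Python bindings for my System. If I register a Python function as a callback, my program deadlocks after calling System::stop. I looked into the topic a bit, and it seems I am running into a problem with the GIL. Reproducible example: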
binding.cpp:
    #include "pybind11/functional.h"
    #include "pybind11/pybind11.h"
    #include "system.hpp"

    namespace py = pybind11;

    PYBIND11_MODULE(mysystembinding, m) {
      py::class_<System>(m, "System")
        .def(py::init<>())
        .def("start", &System::start)
        .def("stop", &System::stop)
        .def("registerCallback", &System::registerCallback);
    }
Python script:
    #!/usr/bin/env python
    import mysystembinding
    import time

    g_counter = 0

    def foo(i):
        global g_counter
        print(i)
        g_counter = g_counter + 1

    s = mysystembinding.System()
    s.registerCallback(foo)
    s.start()
    while g_counter < 3:
        time.sleep(1)
    s.stop()
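I have read the section of the pybind11 documentation that covers acquiring and releasing the GIL on the C++ side. However, I did not manage to get rid of the deadlock that occurs in this case: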
    PYBIND11_MODULE(mysystembinding, m) {
      py::class_<System>(m, "System")
        .def(py::init<>())
        .def("start", &System::start)
        .def("stop", &System::stop)
        .def("registerCallback", [](System* s, System::Callback cb) {
          s->registerCallback([cb](int i) {
            // py::gil_scoped_acquire acquire;
            // py::gil_scoped_release release;
            cb(i);
          });
        });
    }
If I add py::gil_scoped_acquire acquire; before invoking the callback, the deadlock happens anyway. If I add py::gil_scoped_release release; before invoking the callback, I get
    Fatal Python error: PyEval_SaveThread: NULL tstate
What should I do to register a Python function as a callback and avoid the deadlock?
I found that guarding the functions that start and join the C++ thread with gil_scoped_release seems to solve the problem:
    PYBIND11_MODULE(mysystembinding, m) {
      py::class_<System>(m, "System")
        .def(py::init<>())
        .def("start", &System::start, py::call_guard<py::gil_scoped_release>())
        .def("stop", &System::stop, py::call_guard<py::gil_scoped_release>())
        .def("registerCallback", &System::registerCallback);
    }
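Apparently, the deadlock occurs because Python holds the GIL while calling the bindings that are responsible for the C++ thread operations: stop() blocks in t_.join() with the lock held, while the worker thread is waiting for that same lock in order to run the Python callback. I am still not sure my reasoning is correct, so I would appreciate comments from anyone with more expertise.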