Embedding Python in C/C++: Part I
Introduction介绍Inspired by the article "Embedding Python in Multi-Threaded C/C++ Applications" (LinuxJournal), I felt the need for a more comprehensive coverage on the topic of embedding P
Introduction
介绍
Inspired by the article "Embedding Python in Multi-Threaded C/C++ Applications" (Linux Journal), I felt the need for a more comprehensive coverage on the topic of embedding Python. While writing this article, I had two objectives:
受《在多线程C/C++应用程序中嵌入Python》(Linux杂志)这篇文章的启发,我觉得自已需要对Pyrthon的嵌入作更全面的理解。当写这篇文章时,我有两个目标:
- This is written for programmers who are more experienced in C/C++ than in Python, the tutorial takes a practical approach and omits all theoretical discussions.这篇文章是写给那些相比Python,更熟悉C/C++的程序员的,教程以实际编程为主,同时也引发所有理论探讨。
- Try to maintain Python's cross-platform compatibility when writing the embedding code.当写嵌入的代码时,尝试去保持Python的跨平台的兼容性,
Now, you have got some modules written in Python by others and you want to use them. You are experienced in C/C++, but fairly new to Python. You might wonder if you can find a tool to convert them to C code, like the conversion from FORTRAN. The answer is no. Some tools can help you generate the executable from a Python module. Is the problem solved? No. Converting code into executables usually makes things even more complicated, as you must figure out how your C/C++ application communicates with the executable "black-box".
现在,你已经得到一些由其它人写的Python模块,并且你想使用它们。你熟悉C/C++,但是对Python却很陌生。你可能想知道是不是有一个工具,它可以将这些Python代码转化成C代码,就像FORTRAN一样。答案是否定的。一些工具可以帮助你从Python模块中生成可执行的东西。问题解决了吗?没有。将代码转化成可执行文件通常将事情变得更加复杂,因为你必须弄清楚你的C/C++应用程序怎样与这些可执行的黑例子通信。
I am going to introduce C/C++ programmers to Python/C API, a C library that helps to embed python modules into C/C++ applications. The API library provides a bunch of C routines to initialize the Python Interpreter, call into your Python modules and finish up the embedding. The library is built with Python and distributed with all the recent Python releases.
我将向C/C++程序员介绍Pythogn/C的API,一个帮助将Puthon模块嵌入到C/C++程序中的C库。这个API库提供一个C选择时分支来初始化Python解释器,调用你的Python模块并且完成嵌入。这个库是和Python一起编译的,而且安装在所有最近的发布版中。
Part I of this article series discusses the basics of Python embedding. Part II will move on to more advanced topics. This tutorial does not teach the Python language systematically, but I will briefly describe how the Python code works when it comes up. The emphasis will be on how to integrate Python modules with your C/C++ applications. See article: "Embedding Python in C/C++: Part II".
这篇文章的第一部分讨论Python嵌入的基础。第二部分讨论更高级的话题。这篇教程不教Python语法,但是当有需要时,我将会简洁的描述Python代码是怎样工作的。重点是怎样将Python模块整合进你的C/C++应用程序中。见文章《在C/C++中嵌入Python:Part: II》
In order to use the source code, you should install a recent Python release, Visual C++ (or GCC compiler on Linux). The environment that I have used to test is: Python 2.4 (Windows and Linux), Visual C++ 6.0 (Windows) or GCC 3.2 (RedHat 8.0 Linux). With Visual C++, select the Release configuration to build. Debug configuration requires Python debug library "python24_d.lib", which is not delivered with normal distributions.
为了使用源码,你应该安装一个最近的Python版本,VisualC++(或者Linux上的GCC编译器)。我用来测试的环境是:。。。。略
Background
背景
Python is a powerful interpreted language, like Java, Perl and PHP. It supports a long list of great features that any programmer would expect, two of my favorite features are "simple" and "portable". Along with the available tools and libraries, Python makes a good language for modeling and simulation developers. Best of all, it's free and the tools and libraries written for Python programmers are also free. For more details on the language, visit the official website.
Python是一个强大的粘合语言,像Java, Perl和PHP。它支持一长串的任何程序员都期望的特性。我最喜欢的两个特性是“简单”和“轻便”。结合一些可用的工具和库,Pyhotn成为一个优秀的建模语言和模块开发者。最优的是,它以及所有用它写成的库也是自由免费的。更多细节见官方主页。
Embedding basics: functions, classes and methods
基础嵌入:函数,类和方法
First, let us start from a sample C program that calls a function within a Python module. Here is the source file "call_function.c":
首先,让我们以一个简单的C程序调用一个在Python模块中的函数起步。下面是“call_function.c”中的代码:
// call_function.c - A sample of calling
// python functions from C code
//
#include <Python.h>
int main(int argc, char *argv[])
{
PyObject *pName, *pModule, *pDict, *pFunc, *pValue;
if (argc < 3)
{
printf("Usage: exe_name python_source function_name\n");
return 1;
}
// Initialize the Python Interpreter
Py_Initialize();
// Build the name object
pName = PyString_FromString(argv[1]);
// Load the module object
pModule = PyImport_Import(pName);
// pDict is a borrowed reference
pDict = PyModule_GetDict(pModule);
// pFunc is also a borrowed reference
pFunc = PyDict_GetItemString(pDict, argv[2]);
if (PyCallable_Check(pFunc))
{
PyObject_CallObject(pFunc, NULL);
} else
{
PyErr_Print();
}
// Clean up
Py_DECREF(pModule);
Py_DECREF(pName);
// Finish the Python Interpreter
Py_Finalize();
return 0;
}
The Python source file "
py_function.py
" is as follows:
Python源码文件“py_function.py”如下:
'''py_function.py - Python source designed to '''
'''demonstrate the use of python embedding'''
def multiply():
c = 12345*6789
print 'The result of 12345 x 6789 :', c
return c
Note that checks for the validity of objects are omitted for brevity. On Windows, simply compile the C source and get the executable, which we call "
call_function.exe
". To run it, enter the command line "
call_function py_function multiply
". The second argument is the name of the Python file (without extension), which when loaded becomes the module name. The third argument is the name of the Python function you are going to call within the module. The Python source does not take part in compiling or linking; it's only loaded and interpreted at run-time. The output from the execution is:
注意,为了简明起见,去掉了检查对象的有效性。在Window上,直接编译C源码得到可执行程序“call_function.exe”。为了运行它, 在命令行输入“call_function py_function multiply”, 第二个参数是Python文件的名字(除去后缀),当加载时它会变成模块名。第三个参数是你即将要从这个模块中调用的Python函数的名字。Python源文件没有参与编译或是链接。它只是是运行时加载和翻译。可执行文件的输出如下:
The result of 12345 x 6789 : 83810205
The C code itself is self-explanatory, except that:
C代码是自解释的,除了以下:
- Everything in Python is an object.
pDict
andpFunc
are borrowed references so we don't need toPy_DECREF()
them.Python中的一切都是对象,pDict和pFunc是borrowed reference,所以我们不需要在它们上运行Py_DECREF() - All the Py_XXX and PyXXX_XXX calls are Python/C API calls.所有的Py_XXX和PyXXX_XXX调用都是Python/C的API调用
- The code will compile and run on all the platforms that Python supports.这段代码可以在所有Python支持的平台上编译运行。
Now, we want to pass arguments to the Python function. We add a block to handle arguments to the call:
现在,我们想给Python函数传递参数,我们增加一个代码块来处理参数调用:
if (PyCallable_Check(pFunc))
{
// Prepare the argument list for the call
if( argc > 3 )
{
pArgs = PyTuple_New(argc - 3);
for (i = 0; i < argc - 3; i++)
{
pValue = PyInt_FromLong(atoi(argv[i + 3]));
if (!pValue)
{
PyErr_Print();
return 1;
}
PyTuple_SetItem(pArgs, i, pValue);
}
pValue = PyObject_CallObject(pFunc, pArgs);
if (pArgs != NULL)
{
Py_DECREF(pArgs);
}
} else
{
pValue = PyObject_CallObject(pFunc, NULL);
}
if (pValue != NULL)
{
printf("Return of call : %d\n", PyInt_AsLong(pValue));
Py_DECREF(pValue);
}
else
{
PyErr_Print();
}
// some code omitted...
}
The new C source adds a block of "Prepare the argument list for the call" and a check of the returned value. It creates a tuple (list-like type) to store all the parameters for the call. You can run the command "
call_function py_source multiply1 6 7
" and get the output:
新的C代码增加一块“为调用准备参数列表”的代码块,并且检查了返回值。它新建了一个元组(像列表类型)来存储所有的调用参数。你可以运行命令“call_function py_source multiply1 6 7”,得到的结果如下:
The result of 6 x 7 : 42
Return of call : 42
Writing classes in Python is easy. It is also easy to use a Python class in your C code. All you need to do is create an instance of the object and call its methods, just as you call normal functions. Here is an example:
写一个类在Python中是一件简单的事件,在C代码中使用Python的类也同样简单。你所需要做的就是创建一个对象实例,然后调用它的方法,就像你调用普通函数一样,下面是例子:
// call_class.c - A sample of python embedding
// (calling python classes from C code)
//
#include <Python.h>
int main(int argc, char *argv[])
{
PyObject *pName, *pModule, *pDict,
*pClass, *pInstance, *pValue;
int i, arg[2];
if (argc < 4)
{
printf(
"Usage: exe_name python_fileclass_name function_name\n");
return 1;
}
// some code omitted...
// Build the name of a callable class
pClass = PyDict_GetItemString(pDict, argv[2]);
// Create an instance of the class
if (PyCallable_Check(pClass))
{
pInstance = PyObject_CallObject(pClass, NULL);
}
// Build the parameter list
if( argc > 4 )
{
for (i = 0; i < argc - 4; i++)
{
arg[i] = atoi(argv[i + 4]);
}
// Call a method of the class with two parameters
pValue = PyObject_CallMethod(pInstance,
argv[3], "(ii)", arg[0], arg[1]);
} else
{
// Call a method of the class with no parameters
pValue = PyObject_CallMethod(pInstance, argv[3], NULL);
}
if (pValue != NULL)
{
printf("Return of call : %d\n", PyInt_AsLong(pValue));
Py_DECREF(pValue);
}
else
{
PyErr_Print();
}
// some code omitted...
}
The third parameter to
PyObject_CallMethod()
,
"
(ii)
"
is a format string, which indicates that the next arguments are two integers. Note that
PyObject_CallMethod()
takes C variables types as its arguments, not Python objects. This is different from the other calls we have seen so far. The Python source "
py_class.py
" is as follows:
PyObject_CallMethod()的第三个参数“(ii)”是一个格式字符串,它指示接下来的参数是两个整数,注意,PyObject_CallMethod()使用C变作类型作为它的参数,而不是Python objects。这与我们以前所见的情况不一样。Py_call.py源码中的内容如下:
'''py_class.py - Python source designed to demonstrate'''
'''the use of python embedding'''
class Multiply:
def __init__(self):
self.a = 6
self.b = 5
def multiply(self):
c = self.a*self.b
print 'The result of', self.a, 'x', self.b, ':', c
return c
def multiply2(self, a, b):
c = a*b
print 'The result of', a, 'x', b, ':', c
return c
To run the application, you add a class name between the module and function names, which is "Multiply
" in this case. The command line becomes "call_class py_class Multiply multiply
" or "call_class py_class Multiply multiply2 9 9
".
为了运行程序,你在模块名和函数名之间增加一个类名, 在当前情况下就是"Multiply",命令行变成"call_call py_class Multiply multiply "或者"call_class py_class Multiply multiply2 9 9"
Multi-threaded Python embedding
多线程下的Python嵌入
With the above preparations, we are ready for some serious business. The Python module and your C/C++ application have to run simultaneously from time to time. This is not uncommon in simulation communities. For instance, the Python module to be embedded is part of a real-time simulation and you run it in parallel with the rest of your simulation. Meanwhile, it interacts with the rest at run-time. One conventional technique is multi-threading. There are a number of options for multi-threaded embedding. We are going to discuss two of them here.
有了以上的准备,我们已经可以开始一些严肃事情了。Python模块和你的C/C++应用程序必须同时运行。这在模拟通信中并不罕见。例如,Python模块嵌入到一个实时的模块系统中,它与你其它的仿真同时运行。同时,在运行时它也与其余的仿真交互。一个方便技术是多线程。已经有大量的关系多线程嵌入的观点。我们将讨论其中的两个:
In one approach you create a separate thread in C and call the Python module from the thread function. This is natural and correct, except that you need to protect the Python Interpreter state. Basically, we lock the Python Interpreter before you use it and release after the use so that Python can track its states for the different calling threads. Python provides the global lock for this purpose. Let us look at some source code first. Here is the complete content of "call_thread.c":
一种方法是你在C产生一个独立的线程,然后从线程函数里调用Python模块。这是非常自然和正确的,除非你需要保护Python解释器状态。基本上,我们在你使用Python解释器之前锁住它,然后在你使用完后释放它,以致Python能够跟踪到不同的线程调用。Python提供全局锁来达到这个目的。让我们先看一下源码。下面是“call_thread.c”的完整内容:
// call_thread.c - A sample of python embedding
// (C thread calling python functions)
//
#include <Python.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#ifdef WIN32 // Windows includes
#include <Windows.h>
#include <process.h>
#define sleep(x) Sleep(1000*x)
HANDLE handle;
#else // POSIX includes
#include <pthread.h>
pthread_t mythread;
#endif
void ThreadProc(void*);
#define NUM_ARGUMENTS 5
typedef struct
{
int argc;
char *argv[NUM_ARGUMENTS];
} CMD_LINE_STRUCT;
int main(int argc, char *argv[])
{
int i;
CMD_LINE_STRUCT cmd;
pthread_t mythread;
cmd.argc = argc;
for( i = 0; i < NUM_ARGUMENTS; i++ )
{
cmd.argv[i] = argv[i];
}
if (argc < 3)
{
fprintf(stderr,
"Usage: call python_filename function_name [args]\n");
return 1;
}
// Create a thread
#ifdef WIN32
// Windows code
handle = (HANDLE) _beginthread( ThreadProc,0,&cmd);
#else
// POSIX code
pthread_create( &mythread, NULL,
ThreadProc, (void*)&cmd );
#endif
// Random testing code
for(i = 0; i < 10; i++)
{
printf("Printed from the main thread.\n");
sleep(1);
}
printf("Main Thread waiting for My Thread to complete...\n");
// Join and wait for the created thread to complete...
#ifdef WIN32
// Windows code
WaitForSingleObject(handle,INFINITE);
#else
// POSIX code
pthread_join(mythread, NULL);
#endif
printf("Main thread finished gracefully.\n");
return 0;
}
void ThreadProc( void *data )
{
int i;
PyObject *pName, *pModule, *pDict,
*pFunc, *pInstance, *pArgs, *pValue;
PyThreadState *mainThreadState, *myThreadState, *tempState;
PyInterpreterState *mainInterpreterState;
CMD_LINE_STRUCT* arg = (CMD_LINE_STRUCT*)data;
// Random testing code
for(i = 0; i < 15; i++)
{
printf("...Printed from my thread.\n");
sleep(1);
}
// Initialize python inerpreter
Py_Initialize();
// Initialize thread support
PyEval_InitThreads();
// Save a pointer to the main PyThreadState object
mainThreadState = PyThreadState_Get();
// Get a reference to the PyInterpreterState
mainInterpreterState = mainThreadState->interp;
// Create a thread state object for this thread
myThreadState = PyThreadState_New(mainInterpreterState);
// Release global lock
PyEval_ReleaseLock();
// Acquire global lock
PyEval_AcquireLock();
// Swap in my thread state
tempState = PyThreadState_Swap(myThreadState);
// Now execute some python code (call python functions)
pName = PyString_FromString(arg->argv[1]);
pModule = PyImport_Import(pName);
// pDict and pFunc are borrowed references
pDict = PyModule_GetDict(pModule);
pFunc = PyDict_GetItemString(pDict, arg->argv[2]);
if (PyCallable_Check(pFunc))
{
pValue = PyObject_CallObject(pFunc, NULL);
}
else {
PyErr_Print();
}
// Clean up
Py_DECREF(pModule);
Py_DECREF(pName);
// Swap out the current thread
PyThreadState_Swap(tempState);
// Release global lock
PyEval_ReleaseLock();
// Clean up thread state
PyThreadState_Clear(myThreadState);
PyThreadState_Delete(myThreadState);
Py_Finalize();
printf("My thread is finishing...\n");
// Exiting the thread
#ifdef WIN32
// Windows code
_endthread();
#else
// POSIX code
pthread_exit(NULL);
#endif
}
The thread function needs a bit of explanation. PyEval_InitThreads()
initializes Python's thread support.PyThreadState_Swap(myThreadState)
swaps in the state for the current thread, andPyThreadState_Swap(tempState)
swaps it out. The Python Interpreter will save what happens between the two calls as the state data related to this thread. As a matter of fact, Python saves the data for each thread that is using the Interpreter so that the thread states are mutually exclusive. But it is your responsibility to create and maintain a state for each C thread. You may wonder why we didn't call the first PyEvel_AcquireLock()
. Because PyEval_InitThreads()
does so by default. In other cases, we do need to use PyEvel_AcquireLock()
and PyEvel_ReleaseLock()
in pairs.
线程函数需要一些解释。PyEval_InitThreads初始化Python的线程支持。PyThreadState_Swap(myThreadState)将当前进程状态换入,PyThreadState_Swap(tempState)则是换出。Python解释器将保存这两个换进换出之间的状态数据。事实上,Python保存每个使用解释器的线程数据,所以线程状态是互斥的。但是,为每个C线程道理和维护这个状态是你的责任。你可能会好奇为什么我们不先调用PyEvel_AcquireLock().因为它默认做这些事情。在其它情况下,我们确实需要使用PyEvel_AcquireLock()和PyEvel_ReleaseLock()成对的使用
Run "call_thread py_thread pythonFunc
" and you can get the output as shown below. The file "py_thread.py" defines a function called pythonFunc()
in which a similar random testing block prints "print from pythonFunc..." to screen fifteen times.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
...Printed from my thread.
Printed from the main thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Main Thread waiting for My Thread to complete...
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
My thread is finishing...
Main thread finished gracefully.
Obviously, the implementation is getting complicated because writing multi-threading code in C/C++ is not trivial. Although the code is portable, it contains numerous patches, which require detailed knowledge of the system call interfaces for specific platforms. Fortunately, Python has done most of this for us, which brings up the second solution to our question under discussion, namely, letting Python handle the multi-threading. This time the Python code is enhanced to add a threading model:
''' Demonstrate the use of python threading'''
import time
import threading
class MyThread(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
def run(self):
for i in range(15):
print 'printed from MyThread...'
time.sleep(1)
def createThread():
print 'Create and run MyThread'
background = MyThread()
background.start()
print 'Main thread continues to run in foreground.'
for i in range(10):
print 'printed from Main thread.'
time.sleep(1)
print 'Main thread joins MyThread and waits until it is done...'
background.join() # Wait for the background task to finish
print 'The program completed gracefully.'
The C code does not handle threading any more. All it needs to do is just call createThread()
. Try to use the previous "call_function.c". You can run "call_function py_thread createThread
" to see how the output looks like. In this case, the second solution is cleaner and simpler. More importantly, Python threading model is portable. While the C threading code for Unix and Windows is different, it remains the same in Python.
The Python function createThread()
is not required if our C code calls the thread class' start()
and joint()
methods. The relevant changes are listed in the following (from the C source file "call_thread_2.c"):
// Create instance
pInstance = PyObject_CallObject(pClass, NULL);
PyObject_CallMethod(pInstance, "start", NULL);
i = 0;
while(i<10)
{
printf("Printed from C thread...\n");
// !!!Important!!! C thread will not release CPU to
// Python thread without the following call.
PyObject_CallMethod(pInstance, "join", "(f)", 0.001);
Sleep(1000);
i++;
}
printf(
"C thread join and wait for Python thread to complete...\n");
PyObject_CallMethod(pInstance, "join", NULL);
printf("Program completed gracefully.\n");
Basically, after you create the class instance, call its
start()
method to create a new thread and execute its
run()
method. Note that without frequent short joins to the created thread, the created thread can only get executed at the beginning and the main thread will not release any CPU to it until it's done. You may try this out by commenting the joint call within the
while
loop. The behavior is somehow different from that of the previous case where we had called
start()
from within the Python module. This seems to be a feature of multi-threading which is not documented in the
Python Library Reference
.
更多推荐
所有评论(0)