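Table of Contents
- Background
- Import error
- Is it a version problem?
- Uninstalling pandas
- Setting up the Python environment in Visual Studio
- Why install Python packages again inside the Visual Studio IDE
- Installing numpy and pandas in Visual Studio
- PYTHONPATH
- The error persists in Debug mode
- Configuring the Release project
- Optimization
- Inconvenience 1
- Inconvenience 2
- Next steps
- References
Background
Add a Python script to a C++ application project so that C++ and Python are used together in one program.
Import error
The error appears when the Python script imports pandas: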
import os
import pandas as pd
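Is it a version problem?
References:
- numpy and Python version mismatch - ImportError: Unable to import required dependencies: numpy
- Archived: Unofficial Windows Binaries for Python Extension Packages. The link given in the blog post above shows which package versions correspond to which Python versions and Windows platforms.
- Commands for checking the installed versions: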
C:\Windows\System32>pip show numpy
WARNING: Ignoring invalid distribution -umpy (c:\users\junjie\appdata\roaming\python\python310\site-packages)
Name: numpy
Version: 1.22.4
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email:
License: BSD
Location: c:\program files\python310\lib\site-packages
Requires:
Required-by: pandas
C:\Windows\System32>pip show pandas
WARNING: Ignoring invalid distribution -umpy (c:\users\junjie\appdata\roaming\python\python310\site-packages)
Name: pandas
Version: 1.4.3
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: pandas-dev@python.org
License: BSD-3-Clause
Location: c:\program files\python310\lib\site-packages
Requires: numpy, python-dateutil, pytz
Required-by:
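The same information can also be collected from inside Python, which is useful because it reports what the interpreter running the script actually sees. The following is a minimal sketch, assuming the distribution name and the import name are identical (which holds for numpy and pandas); `describe_package` is a hypothetical helper name, not something from the original project.

```python
import importlib.util
from importlib import metadata


def describe_package(name):
    """Return (version, location) for a package, or None for whatever is missing."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec locates a top-level module on sys.path without executing it.
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return version, location


if __name__ == "__main__":
    for pkg in ("numpy", "pandas"):
        version, location = describe_package(pkg)
        print(f"{pkg}: version={version}, location={location}")
```

Because `find_spec` does not run the package's code, this check still works when importing the package itself would fail.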
Uninstalling pandas
C:\Windows\System32>pip uninstall pandas -y
WARNING: Ignoring invalid distribution -umpy (c:\users\junjie\appdata\roaming\python\python310\site-packages)
WARNING: Ignoring invalid distribution -umpy (c:\users\junjie\appdata\roaming\python\python310\site-packages)
Found existing installation: pandas 1.4.3
Uninstalling pandas-1.4.3:
Successfully uninstalled pandas-1.4.3
At this point VS Code reports an error:
import os
import numpy as np
import pandas as pd  # reported as "module not found"

print("PYTHONPATH:", os.environ.get('PYTHONPATH'))
print("PATH:", os.environ.get('PATH'))

print("pandas version:", pd.__version__)
print("numpy version:", np.__version__)
Running the application again also fails, and this time the error is clearly different from before: the pandas module cannot be found.
Traceback (most recent call last):
  File "C:\Resource\Git_Source\perf_utils\perf_utils\TextParser\x64\Debug\mergePython.py", line 2, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Surprisingly, however, running the following in cmd:
pip uninstall pandas -y
does not uninstall the pandas that was installed through Visual Studio.
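One way to investigate this is to ask each Python which interpreter and which site-packages directories it is actually using. The sketch below is not part of the original setup; `environment_report` is a hypothetical helper built only on the standard `sys` and `site` modules. Running it once from cmd and once from the Visual Studio Python environment should make any difference in install locations visible.

```python
import site
import sys


def environment_report():
    """Collect the paths that identify the running Python environment."""
    return {
        "executable": sys.executable,              # interpreter (or host exe when embedded)
        "prefix": sys.prefix,                      # root of this installation
        "user_site": site.getusersitepackages(),   # per-user site-packages
        "global_site": site.getsitepackages(),     # installation-wide site-packages
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Invoking pip as `python -m pip ...` with a specific interpreter runs the pip that belongs to that interpreter, which avoids guessing which `pip` is first on PATH.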
Setting up the Python environment in Visual Studio
Why install Python packages again inside the Visual Studio IDE
- The AI's answer:
Good catch! Seeing an “unresolved import” error for numpy but not os in Visual Studio specifically points to a configuration issue with the IDE. A few things to check:
- Make sure the Python environment being used in Visual Studio has numpy installed. The VS Python environment is separate from your global Python install.
- Go to Tools -> Python Environments in VS and verify numpy is installed for the environment being used. If not, install it.
- Open the Python Environments window (View -> Other Windows -> Python Environments), right click on your environment and select “Install Python Package” and install numpy.
- Check that the Python environment and interpreter path are correctly configured in VS under Tools -> Python Tools -> Environment Options.
- Review the Python tutorial in Visual Studio to ensure Python tools, environments, and paths are set up correctly: https://learn.microsoft.com/en-us/visualstudio/python/tutorial-working-with-python-in-visual-studio-step-05-installing-packages?view=vs-2022
Installing numpy and pandas in Visual Studio
As mentioned above, the Python environment in VS is separate from the global Python installation, so I installed pandas and numpy in the VS environment, where they then show up as installed.
We can also see the additional packages that were installed along with pandas:
Installing collected packages: pytz, tzdata, six, python-dateutil, pandas
Although pandas was uninstalled from cmd, its version can still be queried inside the Visual Studio environment. This shows that the earlier pip install and pip uninstall operate on a different location from the packages installed through Visual Studio.
- Install directory used by pip install in Visual Studio:
C:\Users\Junjie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages
- Install directory used by pip install in cmd:
C:\Users\Junjie\AppData\Roaming\Python\Python310\site-packages
However, after I uninstalled pandas from the command line, Visual Studio did not look for pandas in its own install directory, so the run failed with:
ModuleNotFoundError: No module named 'pandas'
Does this mean my project directories are configured incorrectly? I asked a follow-up question:
Yes, I have installed pandas from the VS IDE and uninstalled pandas from the cmd side with this command: pip uninstall pandas -y. After this action, I thought the application would import pandas correctly from the directory the VS IDE installed it into. But in fact, when I run the C++ application, an error occurred: “import pandas as pd ModuleNotFoundError: No module named ‘pandas’”. It seems that the application imports pandas from the default pip install directory, not the VS install directory. I forgot to mention that the app is built from a C++ project, not a Python project. In this C++ project, I need to call PyImport_ImportModule to run a Python script, which needs to “import pandas as pd”, so what should I do next?
Here is the AI's answer:
Ah I see, since this is a C++ application importing Python code, the Python environment and module import behavior will be different than just running Python code directly.
Here are a few things you can try:
- Make sure the Python environment you have configured in Visual Studio is the same one being used by PyImport_ImportModule in your C++ code. Double check the Python interpreter path and version.
- Set the PYTHONPATH environment variable before calling PyImport_ImportModule to include the path to the VS installed pandas module. This will override the default paths.
- Alternatively, you can manipulate sys.path before importing pandas to prepend the VS installed site-packages dir.
- As a test, try installing pandas globally via pip instead of in VS, and see if your C++ code picks it up from the default path.
Consider using virtual environments to better control the Python environment, packages, and paths for your C++ project separately from any global installs.
This reminded me that when I printed PYTHONPATH earlier, it was empty:
import os
print("PYTHONPATH:", os.environ.get('PYTHONPATH'))
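One of the suggestions in the answer says:
Set the PYTHONPATH environment variable before calling PyImport_ImportModule to include the path to the VS installed pandas module. This will override the default paths.
So the question becomes what PYTHONPATH is and how to set it. The answer also offers an alternative:
Alternatively, you can manipulate sys.path before importing pandas to prepend the VS installed site-packages dir.
PYTHONPATH
PYTHONPATH is an environment variable that you can set to add extra directories in which Python will look for modules and packages. I tried the sys.path alternative first:
import sys
print(sys.path)
sys.path.append('C:\\Users\\Junjie\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python310\\site-packages')
print(sys.path)
Output: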
['C:\\Program Files\\Python310\\python310.zip', 'C:\\Program Files\\Python310\\Lib', 'C:\\Program Files\\Python310\\DLLs', 'C:\\Resource\\Git_Source\\perf_utils\\perf_utils\\TextParser\\x64\\Release', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\win32', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\win32\\lib', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\Pythonwin', 'C:\\Program Files\\Python310', 'C:\\Program Files\\Python310\\lib\\site-packages']
['C:\\Program Files\\Python310\\python310.zip', 'C:\\Program Files\\Python310\\Lib', 'C:\\Program Files\\Python310\\DLLs', 'C:\\Resource\\Git_Source\\perf_utils\\perf_utils\\TextParser\\x64\\Release', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\win32', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\win32\\lib', 'C:\\Users\\Junjie\\AppData\\Roaming\\Python\\Python310\\site-packages\\Pythonwin', 'C:\\Program Files\\Python310', 'C:\\Program Files\\Python310\\lib\\site-packages', 'C:\\Users\\Junjie\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python310\\site-packages']
As you can see, the second print has exactly one extra entry: 'C:\Users\Junjie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages'.
This is the directory into which Visual Studio installs packages.
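The PYTHONPATH route can be checked in a similar way. The sketch below is an illustration under assumptions, not the configuration used in this project: `child_sys_path` is a hypothetical helper that starts a fresh interpreter with PYTHONPATH set to a temporary directory and returns the `sys.path` that interpreter ends up with.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Start a fresh interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as extra:
        for entry in child_sys_path(extra):
            print(entry)
```

The interpreter reads PYTHONPATH while it starts up, so an embedding application would need the variable to be set before `Py_Initialize()` runs; changing it afterwards does not alter `sys.path`.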
The error persists in Debug mode
The error is the same as the one at the very beginning of the first section. The error message itself points to the following hint:
Numpy ImportError
- Debug build on Windows
Rather than building your project in DEBUG mode on windows, try building in RELEASE mode with debug symbols and no optimization. Full DEBUG mode on windows changes the names of the DLLs python expects to find, so if you wish to truly work in DEBUG mode you will need to recompile the entire stack of python modules you work with including NumPy
Recommendation: build in Release mode, keep the debug symbols, and disable optimization:
try building in RELEASE mode with debug symbols and no optimization
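To see which kind of interpreter the script is actually running under, a small check can be added at the top of the Python script. This is a sketch under assumptions: `is_debug_build` and `extension_suffixes` are hypothetical helper names, and testing for `sys.gettotalrefcount` is a common heuristic for detecting a debug build of CPython rather than an official switch.

```python
import importlib.machinery
import sys


def is_debug_build():
    """Heuristic: sys.gettotalrefcount is normally present only in debug builds."""
    return hasattr(sys, "gettotalrefcount")


def extension_suffixes():
    """File suffixes this interpreter accepts for compiled extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


if __name__ == "__main__":
    print("debug build:", is_debug_build())
    print("extension suffixes:", extension_suffixes())
```

On a debug interpreter on Windows the suffix list is expected to contain `_d` variants, which would explain why compiled modules from ordinary release packages such as NumPy are not found; it is worth confirming this on the actual installation.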
Configuring the Release project
Configure Release mode by following the post "Enabling debugging in Release mode in Visual Studio".
- cpp
void merge_csv(string PythonModuleName, string PythonFunctionName, string csv_dir)
{
    Py_Initialize();
    // ... omitted
    PyObject *pModule = NULL;
    pModule = PyImport_ImportModule(PythonModuleName.c_str());
    if (!pModule)
    {
        PyErr_Print();  // print the Python traceback to stderr
        cout << "Python import failed!" << endl;
        return;
    }
    // ... omitted
    Py_Finalize();
}
- python
import os
import sys
print(sys.path)
sys.path.append('C:\\Users\\Junjie\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python310\\site-packages')
print(sys.path)
import pandas as pd

def merge_csv(csv_dir):
    # ... omitted
    return
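After the configuration and steps above, the program runs normally. The steps were:
- Install the packages in the Visual Studio Python environment
- Build the Release configuration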
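The `sys.path.append` call in the script above adds the directory unconditionally. A slightly more defensive variant is sketched below; `add_site_dir` is a hypothetical helper, not part of the original script, and it only adds a directory that exists and is not already on the search path.

```python
import os
import sys


def add_site_dir(path, search_path=None):
    """Append `path` to the module search path if it exists and is not listed yet.

    Returns True when the path was added, False otherwise.
    """
    if search_path is None:
        search_path = sys.path
    if not os.path.isdir(path):
        return False  # nothing to add on machines without this directory
    if path in search_path:
        return False  # avoid duplicate entries on repeated calls
    search_path.append(path)
    return True
```

The return value makes it possible to log a clear message before `import pandas` when the expected directory is missing, instead of only seeing ModuleNotFoundError later.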
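Optimization
Hard-coding the Visual Studio install path in the Python script is clearly still not acceptable, because the path will not necessarily be valid when the tool is handed over to someone else. This gave me an idea: since the only thing sys.path.append() added was the Visual Studio package install directory, would it work just as well if the packages in the pip install directory were kept at the same versions as the Visual Studio ones? The idea came from the blog post "A detailed explanation of sys.path.append(“…”) in Python scripts", which says:
When we import a module with import xxx, by default the Python interpreter searches the current directory, the installed built-in modules, and third-party modules.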
package | pip | VS IDE |
---|---|---|
numpy | 1.22.4 | 1.25.2 |
pandas | 1.4.3 | 2.0.3 |
- Optimization steps:
  - Uninstall the numpy and pandas installed by pip
  - Install the same versions that VS uses
  - Remove the extra sys.path.append(xxx) from the Python script
C:\Windows\System32>pip uninstall pandas -y
C:\Windows\System32>pip uninstall numpy -y
C:\Windows\System32>pip install numpy==1.25.2
C:\Windows\System32>pip install pandas==2.0.3
C:\Windows\System32>pip show numpy
C:\Windows\System32>pip show pandas
The program runs just as well now, which confirms the guess above.
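Inconvenience 1
At this point C++ and Python are working together, but the whole configuration process is fairly complicated. For a simple Python script, the original Python installation directory should be enough. With packages such as pandas, however, you may run into version mismatches and Debug/Release issues, which makes debugging inconvenient.
Of course, a virtual environment could be used to set up the C++ application's development environment and keep it separate from the global one. I have not looked into that yet.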
- Consider using virtual environments to better control the Python environment, packages, and paths for your C++ project separately from any global installs.
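Inconvenience 2
Packaging. After the feature worked, packaging the whole mixed-language setup stumped me. A pure Python script can be bundled with the packages it needs into an executable using the pyinstaller command. For this kind of mixed project, however, there does not seem to be a single command that does the same, which is inconvenient for tool-style software. When I share such a tool with others, they may have to install a pile of packages before they can use it. The steps another user would need to take are roughly:
- Install the specified version of Python
- Install the specified versions of the packages (numpy-1.25.2 and pandas-2.0.3 in this example)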
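Until packaging is solved, the two steps above can at least be verified automatically on the other person's machine. The following is a minimal sketch: `find_problems` and the two constants are hypothetical names, and the pinned versions are simply the ones used in this article.

```python
import sys
from importlib import metadata

# Pins taken from this article; adjust them when the environment changes.
REQUIRED_PYTHON = (3, 10)
REQUIRED_PACKAGES = {"numpy": "1.25.2", "pandas": "2.0.3"}


def find_problems(python_version=None, get_version=metadata.version):
    """Return a list of human-readable mismatches; an empty list means all good."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    problems = []
    if tuple(python_version) != REQUIRED_PYTHON:
        wanted = ".".join(map(str, REQUIRED_PYTHON))
        found = ".".join(map(str, python_version))
        problems.append(f"Python {wanted} expected, found {found}")
    for name, wanted in REQUIRED_PACKAGES.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} {wanted} expected, not installed")
            continue
        if found != wanted:
            problems.append(f"{name} {wanted} expected, found {found}")
    return problems


if __name__ == "__main__":
    for line in find_problems() or ["environment matches the pinned versions"]:
        print(line)
```

The `get_version` parameter exists only so the check can be exercised without the packages being installed; in normal use the default `importlib.metadata.version` is enough.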
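Next steps
To get mixed C++/Python development together with reasonable packaging, it looks like I will need to study how to configure the C++ project with a virtual environment.
References
numpy and Python version mismatch - ImportError: Unable to import required dependencies: numpy
PYTHONPATH variable is empty
Step 5: Install packages in your Python environment
Pandas: how to find the installed version
Numpy Troubleshooting
Mixed programming: calling Python from C++ in Visual Studio
Enabling debugging in Release mode in Visual Studio
A detailed explanation of sys.path.append(“…”) in Python scripts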