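Preface
Recently I have been tinkering with some C++ code that involves matrix operations. After looking at the math libraries recommended online, MKL seemed like a good choice, so let's try it out in VS2013.
As usual, here are the reference links first:
Performance comparison of blas, cblas, openblas, atlas, lapack, and mkl
Compiling and Linking Intel® Math Kernel Library with Microsoft* Visual C++*
Configuring Intel MKL in Visual Studio 2013
Notes on installing and configuring Intel MKL in VS
Getting Started with Intel® Math Kernel Library 2017 for Windows
Developer Reference for Intel® Math Kernel Library 2017 - C
Multiplying Matrices Using dgemm
Official MKL developer documentation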
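Installation
Download
MKL installer shared on a cloud drive: link: http://pan.baidu.com/s/1qYRRIKs password: x9db
During installation you still need to request a serial number from the official site; otherwise you only get a trial. My serial number (scratch to reveal):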
33RM-RDRJWB75 |
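After that, just keep clicking Next. Once installation finishes, you will have the directory C:\Program Files (x86)\IntelSWTools
My directory has a few more entries than usual, mainly because I ran an update later. You can see both compilers_and_libraries_2017.0.109 and compilers_and_libraries_2017.2.187. They share the prefix compilers_and_libraries_2017, and the suffix appears to identify the update and build of each release (2017.2.187 being the newer one) rather than a date.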
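Configuration
Configuration mainly follows the official tutorial, which offers two methods, Automatically and Manually. Here we try the automatic method, which takes only a few steps.
- Create a new C++ project and a source file
- Right-click the project (test1 here) -> Properties -> Intel Performance Libraries -> Use Intel MKL, and select Parallel
- Under C/C++ -> Code Generation -> Runtime Library, select Multi-threaded (/MT). This means the static .lib libraries are linked. Dynamic linking requires adding quite a few more lib files, which I will cover when the need arises. For details, see the difference between dynamic and static libraries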
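Testing
We use the code from the official tutorial, Multiplying Matrices Using dgemm, directly.
The example performs matrix multiplication.
The function it calls is cblas_dgemm. Page 111 of the official documentation gives its parameter list: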
void cblas_dgemm (const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE transa,
                  const CBLAS_TRANSPOSE transb, const MKL_INT m, const MKL_INT n,
                  const MKL_INT k, const double alpha, const double *a,
                  const MKL_INT lda, const double *b, const MKL_INT ldb,
                  const double beta, double *c, const MKL_INT ldc);
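In mathematical terms, the routine with this signature computes a scaled matrix product and accumulates it into C. A compact way to write it, using the usual BLAS op notation (the symbols here are only for illustration):

```latex
C \leftarrow \alpha \,\mathrm{op}(A)\,\mathrm{op}(B) + \beta\, C,
\qquad
\mathrm{op}(X) \in \{\, X,\; X^{T},\; X^{H} \,\}
```

Here op(A) is an m x k matrix, op(B) is a k x n matrix, and C is an m x n matrix.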
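The meaning of each parameter is explained in detail on page 112. Briefly:
Layout: whether the two-dimensional matrices are stored row-major (CblasRowMajor) or column-major (CblasColMajor)
transa: the operation applied to the first input matrix before it is multiplied by the second. There are three options: CblasNoTrans uses the matrix as is, CblasTrans uses its transpose, and CblasConjTrans uses its conjugate transpose
transb: same as transa, but applied to the second input matrix
m: the number of rows of matrix A (after the transa operation) and of C
n: the number of columns of matrix B (after the transb operation) and of C; this follows from multiplying an m*k matrix by a k*n matrix, which gives an m*n result
k: the number of columns of matrix A, which equals the number of rows of matrix B
alpha: the scaling factor applied to the product A*B
a, lda, b, ldb: the input arrays and their leading dimensions. The required sizes fall into four cases depending on Layout and on transa/transb; see the documentation for details
beta: the scaling factor applied to the existing contents of C
c: the array holding matrix C; on return it is overwritten with the result, stored row-major or column-major according to Layout
ldc: the leading dimension of c, which must be at least max(1, n) for row-major layout and at least max(1, m) for column-major layout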
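To make the roles of m, n, k and the leading dimensions concrete, here is a minimal sketch of the same computation in plain C, covering only the row-major, no-transpose case. It is not how MKL implements dgemm (MKL is heavily optimized), and ref_dgemm_rowmajor is a hypothetical name introduced here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of MKL: a naive reference for what
 * cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, ...) computes,
 * i.e. C = alpha*A*B + beta*C with all three matrices stored row by row. */
void ref_dgemm_rowmajor(int m, int n, int k,
                        double alpha, const double *a, int lda,
                        const double *b, int ldb,
                        double beta, double *c, int ldc)
{
    int i, j, p;

    /* In row-major storage the leading dimension is the distance (in
     * elements) between the starts of consecutive rows, so it must be
     * at least as long as one row. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            double *out = &c[(size_t)i * ldc + j];
            double sum = 0.0;

            /* Dot product of row i of A with column j of B. */
            for (p = 0; p < k; p++) {
                sum += a[(size_t)i * lda + p] * b[(size_t)p * ldb + j];
            }

            /* BLAS convention: when beta is zero, C is not read at all. */
            *out = (beta == 0.0) ? alpha * sum : alpha * sum + beta * (*out);
        }
    }
}
```

For tightly packed row-major matrices, as in the official example, the call would pass lda = k, ldb = n and ldc = n. The result should agree with cblas_dgemm up to floating-point rounding, since MKL may sum the terms in a different order.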
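Usage follows the usual C/C++ steps: declare the variables (the matrices are declared as pointers), allocate memory with mkl_malloc, initialize the matrices in for loops, call cblas_dgemm to do the computation, print the results, and release the memory with mkl_free.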
/* C source code is found in dgemm_example.c */

#define min(x,y) (((x) < (y)) ? (x) : (y))

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

int main()
{
    double *A, *B, *C;
    int m, n, k, i, j;
    double alpha, beta;

    printf("\n This example computes real matrix C=alpha*A*B+beta*C using \n"
           " Intel(R) MKL function dgemm, where A, B, and C are matrices and \n"
           " alpha and beta are double precision scalars\n\n");

    m = 2000, k = 200, n = 1000;
    printf(" Initializing data for matrix multiplication C=A*B for matrix \n"
           " A(%ix%i) and matrix B(%ix%i)\n\n", m, k, k, n);
    alpha = 1.0; beta = 0.0;

    printf(" Allocating memory for matrices aligned on 64-byte boundary for better \n"
           " performance \n\n");
    A = (double *)mkl_malloc(m*k*sizeof(double), 64);
    B = (double *)mkl_malloc(k*n*sizeof(double), 64);
    C = (double *)mkl_malloc(m*n*sizeof(double), 64);
    if (A == NULL || B == NULL || C == NULL) {
        printf("\n ERROR: Can't allocate memory for matrices. Aborting... \n\n");
        mkl_free(A);
        mkl_free(B);
        mkl_free(C);
        return 1;
    }

    printf(" Intializing matrix data \n\n");
    for (i = 0; i < (m*k); i++) {
        A[i] = (double)(i + 1);
    }
    for (i = 0; i < (k*n); i++) {
        B[i] = (double)(-i - 1);
    }
    for (i = 0; i < (m*n); i++) {
        C[i] = 0.0;
    }

    printf(" Computing matrix product using Intel(R) MKL dgemm function via CBLAS interface \n\n");
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, A, k, B, n, beta, C, n);
    printf("\n Computations completed.\n\n");

    printf(" Top left corner of matrix A: \n");
    for (i = 0; i<min(m, 6); i++) {
        for (j = 0; j<min(k, 6); j++) {
            printf("%12.0f", A[j + i*k]);
        }
        printf("\n");
    }

    printf("\n Top left corner of matrix B: \n");
    for (i = 0; i<min(k, 6); i++) {
        for (j = 0; j<min(n, 6); j++) {
            printf("%12.0f", B[j + i*n]);
        }
        printf("\n");
    }

    printf("\n Top left corner of matrix C: \n");
    for (i = 0; i<min(m, 6); i++) {
        for (j = 0; j<min(n, 6); j++) {
            printf("%12.5G", C[j + i*n]);
        }
        printf("\n");
    }

    printf("\n Deallocating memory \n\n");
    mkl_free(A);
    mkl_free(B);
    mkl_free(C);

    printf(" Example completed. \n\n");
    return 0;
}
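Before running, it is a good idea to check whether #include "mkl.h" triggers IntelliSense suggestions, or whether it is underlined in red with an error such as the library file not being found.
Output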
This example computes real matrix C=alpha*A*B+beta*C using
Intel(R) MKL function dgemm, where A, B, and C are matrices and
alpha and beta are double precision scalars
Initializing data for matrix multiplication C=A*B for matrix
A(2000x200) and matrix B(200x1000)
Allocating memory for matrices aligned on 64-byte boundary for better
performance
Intializing matrix data
Computing matrix product using Intel(R) MKL dgemm function via CBLAS int
Computations completed.
Top left corner of matrix A:
1 2 3 4 5 6
201 202 203 204 205 206
401 402 403 404 405 406
601 602 603 604 605 606
801 802 803 804 805 806
1001 1002 1003 1004 1005 1006
Top left corner of matrix B:
-1 -2 -3 -4 -5 -6
-1001 -1002 -1003 -1004 -1005 -1006
-2001 -2002 -2003 -2004 -2005 -2006
-3001 -3002 -3003 -3004 -3005 -3006
-4001 -4002 -4003 -4004 -4005 -4006
-5001 -5002 -5003 -5004 -5005 -5006
Top left corner of matrix C:
-2.6666E+009-2.6666E+009-2.6667E+009-2.6667E+009-2.6667E+009-2.6667E+009
-6.6467E+009-6.6467E+009-6.6468E+009-6.6468E+009-6.6469E+009 -6.647E+009
-1.0627E+010-1.0627E+010-1.0627E+010-1.0627E+010-1.0627E+010-1.0627E+010
-1.4607E+010-1.4607E+010-1.4607E+010-1.4607E+010-1.4607E+010-1.4607E+010
-1.8587E+010-1.8587E+010-1.8587E+010-1.8587E+010-1.8588E+010-1.8588E+010
-2.2567E+010-2.2567E+010-2.2567E+010-2.2567E+010-2.2568E+010-2.2568E+010
Deallocating memory
Example completed.
Press any key to continue . . .
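As a sanity check on this output, individual entries of C can be recomputed from the initialization formulas in the example (A[i] = i + 1 and B[i] = -i - 1, both row-major). The sketch below is plain C with no MKL dependency; expected_c is a hypothetical helper name introduced here for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: entry (i, j) of C = A*B for the example's data,
 * where A is m x k with A[i*k + p] = i*k + p + 1 and
 * B is k x n with B[p*n + j] = -(p*n + j) - 1 (row-major storage). */
double expected_c(int i, int j, int k, int n)
{
    double sum = 0.0;
    int p;

    assert(i >= 0 && j >= 0 && k > 0 && n > 0);

    for (p = 0; p < k; p++) {
        double a = (double)(i * k + p + 1);     /* A(i, p) */
        double b = (double)(-(p * n + j) - 1);  /* B(p, j) */
        sum += a * b;
    }
    return sum;
}
```

With k = 200 and n = 1000, expected_c(0, 0, 200, 1000) evaluates to -2666620100, which %12.5G prints as -2.6666E+009, and expected_c(1, 0, 200, 1000) evaluates to -6646660100, printed as -6.6467E+009. Both match the first column of the output above.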
Next steps
The next post will look at the matrix operations that MKL provides.