@CrazyHenry 2018-04-20T14:02:11.000000Z Word count 6850, views 3030

Installing faiss: trying Intel MKL

hhhhfaiss


1. Download Intel MKL from the official website

    Version: MKL 2017.0.098 (2017 Initial Release)

2. Install

    # you may have to set the LD_LIBRARY_PATH=$MKLROOT/lib/intel64 at runtime.
    # If at runtime you get the error:
    # Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.
    # You may add set
    # LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so:$MKLROOT/lib/intel64/libmkl_sequential.so
    # at runtime as well.
    echo 'export LD_LIBRARY_PATH="$MKLROOT/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="$MKLROOT/lib/intel64/libmkl_core.so:$MKLROOT/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc

With the concrete path filled in:

    # unpack the archive
    bash install.sh
    # choose option 3: user-level install
    # afterwards it still seems to need an install as root if there is no license file
    # MKLROOT = /home/liyingmin/intel/compilers_and_libraries/linux/mkl
    echo 'export LD_LIBRARY_PATH="/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc
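
Before appending anything to ~/.bashrc it may be worth confirming that the two libraries named in LD_PRELOAD really exist under the chosen MKL root. The following is only a sketch: the helper name `check_mkl_libs` is made up for illustration, and the default `MKLROOT` assumes the user-level install landed under `$HOME/intel` as in this note.

```shell
# Assumed default: a user-level install under $HOME/intel, as in this note.
# Override MKLROOT in the environment if your install lives elsewhere.
MKLROOT="${MKLROOT:-$HOME/intel/compilers_and_libraries/linux/mkl}"

# check_mkl_libs DIR (hypothetical helper): print every library from the
# LD_PRELOAD line that is missing under DIR/lib/intel64, and return
# non-zero if at least one is missing.
check_mkl_libs() {
    libdir="$1/lib/intel64"
    missing=0
    for lib in libmkl_core.so libmkl_sequential.so; do
        if [ ! -f "$libdir/$lib" ]; then
            echo "missing: $libdir/$lib"
            missing=1
        fi
    done
    return "$missing"
}

if check_mkl_libs "$MKLROOT"; then
    echo "MKL libraries found under $MKLROOT/lib/intel64"
else
    echo "fix MKLROOT before exporting LD_LIBRARY_PATH / LD_PRELOAD"
fi
```

If the check reports missing files, the exports above would point at a directory the dynamic loader cannot use, so it is cheaper to correct the path first.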

3. Rebuild with make

    make uninstall  # may need to switch to root
    make clean
    make
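
After the rebuild, one way to see whether a binary picked up MKL rather than OpenBLAS is to look at its shared-library dependencies. This is a rough sketch: `mkl_deps` is a hypothetical helper, and the binary path assumes the demo has been built inside the faiss source tree.

```shell
# mkl_deps (hypothetical helper): read `ldd` output on stdin and print
# the name of every dependency whose line mentions libmkl.
mkl_deps() {
    awk '/libmkl/ { print $1 }'
}

# Only runs when the demo binary and ldd are both available.
if [ -x demos/demo_ivfpq_indexing ] && command -v ldd >/dev/null 2>&1; then
    ldd demos/demo_ivfpq_indexing | mkl_deps
fi
```

An empty result would suggest the build is still linked against another BLAS, in which case makefile.inc is the first place to look.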

4. Test

    A basic usage example is in
    demos/demo_ivfpq_indexing
    it makes a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.

It really is much faster!
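
To put a number on the speedup instead of eyeballing it, the demo can be timed before and after switching BLAS. A minimal sketch, assuming the demo binary has been built at `demos/demo_ivfpq_indexing`; the helper name `elapsed_seconds` is made up for illustration and only has whole-second resolution.

```shell
# elapsed_seconds CMD [ARGS...] (hypothetical helper): run a command with
# its output discarded, print how many whole seconds it took, and return
# the command's own exit status.
elapsed_seconds() {
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo $((end - start))
    return "$status"
}

# Only runs when the demo binary exists in the current directory tree.
if [ -x demos/demo_ivfpq_indexing ]; then
    echo "demo took $(elapsed_seconds demos/demo_ivfpq_indexing)s"
fi
```

Running this once with the OpenBLAS build and once with the MKL build gives two figures that can be compared against the 20s and 2.5s quoted above.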

5. Install on the GPU machine

First, edit makefile.inc

Uncomment the MKL section and comment out the OpenBLAS one

Change the path: MKLROOT = /home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl

    # Copyright (c) 2015-present, Facebook, Inc.
    # All rights reserved.
    #
    # This source code is licensed under the BSD+Patents license found in the
    # LICENSE file in the root directory of this source tree.
    # -*- makefile -*-
    # tested on CentOS 7, Ubuntu 16 and Ubuntu 14, see below to adjust flags to distribution.
    CC=gcc
    CXX=g++
    CFLAGS=-fPIC -m64 -Wall -g -O3 -mavx -msse4 -mpopcnt -fopenmp -Wno-sign-compare -fopenmp
    CXXFLAGS=$(CFLAGS) -std=c++11
    LDFLAGS=-g -fPIC -fopenmp
    # common linux flags
    SHAREDEXT=so
    SHAREDFLAGS=-shared
    FAISSSHAREDFLAGS=-shared
    ##########################################################################
    # Uncomment one of the 4 BLAS/Lapack implementation options
    # below. They are sorted from fastest to slowest (in our
    # experiments).
    ##########################################################################
    #
    # 1. Intel MKL
    #
    # This is the fastest BLAS implementation we tested. Unfortunately it
    # is not open-source and determining the correct linking flags is a
    # nightmare. See
    #
    # https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
    #
    # The latest tested version is MKL 2017.0.098 (2017 Initial Release) and can
    # be downloaded here:
    #
    # https://registrationcenter.intel.com/en/forms/?productid=2558&licensetype=2
    #
    # The following settings are working if MKL is installed on its default folder:
    MKLROOT=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/
    BLASLDFLAGS=-Wl,--no-as-needed -L$(MKLROOT)/lib/intel64 -lmkl_intel_ilp64 \
    -lmkl_core -lmkl_gnu_thread -ldl -lpthread
    BLASCFLAGS=-DFINTEGER=long
    # you may have to set the LD_LIBRARY_PATH=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64 at runtime.
    # If at runtime you get the error:
    # Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.
    # You may add set
    # LD_PRELOAD=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so
    # at runtime as well.
    #
    # 2. Openblas
    #
    # The library contains both BLAS and Lapack. About 30% slower than MKL. Please see
    # https://github.com/facebookresearch/faiss/wiki/Troubleshooting#slow-brute-force-search-with-openblas
    # to fix performance problems with OpenBLAS
    #BLASCFLAGS=-DFINTEGER=int
    # This is for Centos:
    #BLASLDFLAGS?=/usr/lib64/libopenblas.so.0
    # for Ubuntu 16:
    # sudo apt-get install libopenblas-dev python-numpy python-dev
    # BLASLDFLAGS?=/usr/lib/libopenblas.so.0
    # for Ubuntu 14:
    # sudo apt-get install libopenblas-dev liblapack3 python-numpy python-dev
    # BLASLDFLAGS?=/usr/lib/libopenblas.so.0 /usr/lib/lapack/liblapack.so.3.0
    #
    # 3. Atlas
    #
    # Automatically tuned linear algebra package. As the name indicates,
    # it is tuned automatically for a given architecture, and in Linux
    # distributions, the architecture is typically indicated by the
    # directory name, eg. atlas-sse3 = optimized for SSE3 architecture.
    #
    # BLASCFLAGS=-DFINTEGER=int
    # BLASLDFLAGS=/usr/lib64/atlas-sse3/libptf77blas.so.3 /usr/lib64/atlas-sse3/liblapack.so
    #
    # 4. reference implementation
    #
    # This is just a compiled version of the reference BLAS
    # implementation, that is not optimized at all.
    #
    # BLASCFLAGS=-DFINTEGER=int
    # BLASLDFLAGS=/usr/lib64/libblas.so.3 /usr/lib64/liblapack.so.3.2
    #
    ##########################################################################
    # SWIG and Python flags
    ##########################################################################
    # SWIG executable. This should be at least version 3.x
    SWIGEXEC=swig
    # The Python include directories for a given python executable can
    # typically be found with
    #
    # python -c "import distutils.sysconfig; print distutils.sysconfig.get_python_inc()"
    # python -c "import numpy ; print numpy.get_include()"
    #
    # or, for Python 3, with
    #
    # python3 -c "import distutils.sysconfig; print(distutils.sysconfig.get_python_inc())"
    # python3 -c "import numpy ; print(numpy.get_include())"
    #
    PYTHONCFLAGS=-I/usr/include/python2.7/ -I/usr/lib64/python2.7/site-packages/numpy/core/include/
    ###########################################################################
    # Cuda GPU flags
    ###########################################################################
    # root of the cuda 8 installation
    CUDAROOT=/usr/local/cuda-8.0/
    CUDACFLAGS=-I$(CUDAROOT)/include
    NVCC=$(CUDAROOT)/bin/nvcc
    NVCCFLAGS= $(CUDAFLAGS) \
    -I $(CUDAROOT)/targets/x86_64-linux/include/ \
    -Xcompiler -fPIC \
    -Xcudafe --diag_suppress=unrecognized_attribute \
    -gencode arch=compute_35,code="compute_35" \
    -gencode arch=compute_52,code="compute_52" \
    -gencode arch=compute_60,code="compute_60" \
    --std c++11 -lineinfo \
    -ccbin $(CXX) -DFAISS_USE_FLOAT16
    # BLAS LD flags for nvcc (used to generate an executable)
    # if BLASLDFLAGS contains several flags, each one may
    # need to be prepended with -Xlinker
    BLASLDFLAGSNVCC=-Xlinker $(BLASLDFLAGS)
    # Same, but to generate a .so
    BLASLDFLAGSSONVCC=-Xlinker $(BLASLDFLAGS)

Then copy the MKL installer to the GPU machine, install it, and rebuild:

    scp l_mkl_2017.0.098.tgz yingmin.li@yz-gpu023.hogpu.cc:/home/users/yingmin.li/temp
    # entering the serial number (3VGW-N6PJ7GCN) is enough; choose option 3, user-level install
    # MKLROOT = /home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl
    echo 'export LD_LIBRARY_PATH="/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc
    make clean
    make

Test:

    A basic usage example is in
    demos/demo_ivfpq_indexing
    it makes a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.

On the GPU machine it seems to have become slower!

I suspect someone else is using the GPU resources:

    nvidia-smi             # check GPU utilization
    watch -n 1 nvidia-smi  # refresh every 1s
    free -m                # memory
    top                    # CPU usage
    htop                   # CPU usage: https://linux.cn/article-3141-1.html
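
Rather than reading the full nvidia-smi table by eye, the query mode can list only the GPUs that are currently busy. A sketch under assumptions: `busy_gpus` is a hypothetical helper, and the 10 percent threshold is an arbitrary choice for illustration.

```shell
# busy_gpus THRESHOLD (hypothetical helper): read "index, utilization"
# CSV lines on stdin and print the index of every GPU whose utilization
# is strictly above THRESHOLD percent.
busy_gpus() {
    awk -F', *' -v limit="$1" '$2 + 0 > limit + 0 { print $1 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # csv,noheader,nounits yields lines such as "0, 35"
    nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits | busy_gpus 10
else
    echo "nvidia-smi not found on this machine"
fi
```

If this prints any index while the demo is not running, another job is probably occupying that GPU, which would be consistent with the slowdown observed above.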