Description
What type of issue is this?
- Bug in the code or other problem
- Inadequate/incorrect documentation
- Feature request
If this is a bug report, please use the following template.
Otherwise, please delete the rest of the template.
Where does this bug appear?
Check all that apply:
- MacOS
- Linux
- Cray
- GCC
- Clang
- Intel compiler
- MPICH and derivatives (MVAPICH2, Intel MPI, Cray MPI, etc.)
- Open-MPI
Operating system
What is the output of uname -a?
Linux l0 4.18.0-17-generic #18~18.04.1-Ubuntu SMP Fri Mar 15 15:27:12 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Compiler
gcc
What is the output of ${COMPILER} -v or ${COMPILER} --version?
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
PRK build information
Please attach or inline make.defs.
#
# This file shows the GCC toolchain options for PRKs using
# OpenMP, MPI and/or Fortran coarrays only.
#
# Base compilers and language options
#
VERSION=-7
# C99 is required in some implementations.
CC=gcc${VERSION} -std=c11 -pthread
#EXTRA_CLIBS=-lrt
# All of the Fortran code is written for the 2008 standard and requires preprocessing.
FC=gfortran${VERSION} -std=f2008 -cpp
# C++11 may not be required but does no harm here.
CXX=g++${VERSION} -std=gnu++17 -pthread
#
# Compiler flags
#
# -mtune=native is appropriate for most cases.
# -march=native is appropriate only if you do not need portable binaries.
DEFAULT_OPT_FLAGS=-O3 -mtune=native -ffast-math
#DEFAULT_OPT_FLAGS=-O0
DEFAULT_OPT_FLAGS+=-g3
#DEFAULT_OPT_FLAGS+=-fsanitize=undefined
#DEFAULT_OPT_FLAGS+=-fsanitize=undefined,leak
#DEFAULT_OPT_FLAGS+=-fsanitize=address
#DEFAULT_OPT_FLAGS+=-fsanitize=thread
# If you are compiling for KNL on a Xeon login node, use the following:
# DEFAULT_OPT_FLAGS=-g -O3 -march=knl
# See https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html for details.
#
#DEFAULT_OPT_FLAGS+=-fopt-info-vec-missed
DEFAULT_OPT_FLAGS+=-Wall #-Werror
DEFAULT_OPT_FLAGS+=-Wno-ignored-attributes -Wno-deprecated-declarations
#DEFAULT_OPT_FLAGS+=-mavx -mfma
#
# OpenMP flags
#
OPENMPFLAG=-fopenmp
OPENMPSIMDFLAG=-fopenmp-simd
OFFLOADFLAG=-foffload="-O3 -v"
ORNLACCFLAG=-fopenacc
#
# OpenCL flags
#
# MacOS
OPENCLFLAG=-framework OpenCL
# Linux
#OPENCLDIR=/etc/alternatives/opencl-intel-tools
#OPENCLFLAG=-I${OPENCLDIR} -L${OPENCLDIR}/lib64 -lOpenCL
OPENCLFLAG+=-Wno-ignored-attributes -Wno-deprecated-declarations
METALFLAG=-framework MetalPerformanceShaders
#
# SYCL flags
#
# triSYCL
# https://github.com/triSYCL/triSYCL is header-only so just clone in Cxx11 directory...
SYCLDIR=./triSYCL
SYCLCXX=${CXX} -std=c++17 ${OPENMPFLAG}
SYCLFLAG=-I$(SYCLDIR)/include
# ProGTX
# https://github.com/ProGTX/sycl-gtx
#SYCLDIR=${HOME}/Work/OpenCL/sycl-gtx
#SYCLCXX=${CXX} ${OPENMPFLAG}
#SYCLFLAG=-DUSE_SYCL -I${SYCLDIR}/sycl-gtx/include -L${SYCLDIR}/build/sycl-gtx -lsycl-gtx ${OPENCLFLAG}
METALFLAG=-framework MetalPerformanceShaders
#
# OCCA
#
#OCCADIR=${HOME}/prk-repo/Cxx11/occa
#
# Cilk
#
#CILKFLAG=-fcilkplus
#
# TBB
#
TBBDIR=/usr/local/Cellar/tbb/2019_U5_1
TBBFLAG=-I${TBBDIR}/include -L${TBBDIR}/lib -ltbb
#
# Parallel STL, Boost, etc.
#
BOOSTFLAG=-I/usr/local/Cellar/boost/1.69.0_2/include
RANGEFLAG=-DUSE_BOOST_IRANGE ${BOOSTFLAG}
#RANGEFLAG=-DUSE_RANGES_TS -I./range-v3/include
PSTLFLAG=${OPENMPSIMDFLAG} ${TBBFLAG} ${RANGEFLAG}
#PSTLFLAG=${OPENMPSIMDFLAG} ${TBBFLAG} -DUSE_INTEL_PSTL -I./pstl/include ${RANGEFLAG}
KOKKOSDIR=/opt/kokkos/gcc
KOKKOSFLAG=-I${KOKKOSDIR}/include -L${KOKKOSDIR}/lib -lkokkos ${OPENMPFLAG}
RAJADIR=/opt/raja/gcc
RAJAFLAG=-I${RAJADIR}/include -L${RAJADIR}/lib -lRAJA ${OPENMPFLAG} ${TBBFLAG}
THRUSTDIR=/Users/jrhammon/Work/NVIDIA/thrust
THRUSTFLAG=-I${THRUSTDIR} ${RANGEFLAG}
#
# SYCL flags
#
# triSYCL
# https://github.com/triSYCL/triSYCL is header-only so just clone in Cxx11 directory...
SYCLDIR=./triSYCL
SYCLCXX=${CXX} -O3 -Wall -std=c++17 ${OPENMPFLAG}
SYCLFLAG=-I${SYCLDIR}/include ${BOOSTFLAG} -DTRISYCL
# ProGTX
# https://github.com/ProGTX/sycl-gtx
#SYCLDIR=${HOME}/Work/OpenCL/sycl-gtx
#SYCLCXX=${CXX} ${OPENMPFLAG}
#SYCLFLAG=-I${SYCLDIR}/sycl-gtx/include -L${SYCLDIR}/build/sycl-gtx -lsycl-gtx ${OPENCLFLAG}
SYCLFLAG+=${RANGEFLAG}
#
# SYCL flags
#
# triSYCL
# https://github.com/triSYCL/triSYCL is header-only so just clone in Cxx11 directory...
SYCLDIR=./triSYCL
SYCLCXX=${CXX} -std=c++17 ${OPENMPFLAG}
SYCLFLAG=-I${SYCLDIR}/include ${BOOSTFLAG}
# ProGTX
# https://github.com/ProGTX/sycl-gtx
#SYCLDIR=${HOME}/Work/OpenCL/sycl-gtx
#SYCLCXX=${CXX} ${OPENMPFLAG}
#SYCLFLAG=-DUSE_SYCL -I${SYCLDIR}/sycl-gtx/include -L${SYCLDIR}/build/sycl-gtx -lsycl-gtx ${OPENCLFLAG}
#
# CBLAS for C++ DGEMM
#
BLASFLAG=-DACCELERATE -framework Accelerate
CBLASFLAG=-DACCELERATE -framework Accelerate -flax-vector-conversions
#
# CUDA flags
#
# Mac w/ CUDA emulation via https://github.com/hughperkins/coriander
#NVCC=/opt/llvm/cocl/bin/cocl
# Linux w/ NVIDIA CUDA
NVCC=nvcc
CUDAFLAGS=-g -O3 -std=c++11 -arch=sm_50
# https://github.com/tensorflow/tensorflow/issues/1066#issuecomment-200574233
CUDAFLAGS+=-D_MWAITXINTRIN_H_INCLUDED
#
# ISPC
#
ISPC=ispc
ISPCFLAG=-O3 --target=host --opt=fast-math
#
# MPI
#
# We assume you have installed an implementation of MPI-3 that is in your path.
MPICC=mpicc -std=c99
#
# Fortran 2008 coarrays
#
# see https://github.com/ParRes/Kernels/blob/master/FORTRAN/README.md for details
# single-node
COARRAYFLAG=-fcoarray=single -lcaf_single
# multi-node
# COARRAYFLAG=-fcoarray=lib -lcaf_mpi
MEMKINDDIR=/home/parallels/PRK/deps
MEMKINDFLAGS=-I${MEMKINDDIR}/include -L${MEMKINDDIR}/lib -lmemkind -Wl,-rpath=${MEMKINDDIR}/lib
Output showing problem
When the sparse OpenMP benchmark is run with ./sparse 12 2 11 10, the program tries to write data to memory that has not been allocated. To see this, comment out lines 262 and 263 of sparse.c, and then, outside the for loop, on line 265, add printf("nent: %llu elm: %llu \n", nent, elm+4);. This shows that the length of col_index, nent, is 171966464, while the program tries to write to index 171966467. Because the program is parallel, there is a strong chance that each thread is writing outside the boundaries of its own array. Furthermore, if the program is compiled with icc --check-pointers=rw on Linux, the solution fails to validate.
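The arithmetic behind the numbers above can be sketched in C as follows. This is only an illustration under assumptions, not the code from sparse.c: it assumes the arguments 12 2 11 10 mean 12 threads, 2 iterations, a log2 grid size of 11 and a stencil radius of 10, and that the index array holds 4*radius+1 entries per grid point, which reproduces the reported nent of 171966464. The helper names sparse_nent, index_in_bounds and store_checked are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Number of column-index entries for a 2^lsize x 2^lsize grid.
   Assumption: each grid point contributes 4*radius+1 entries. */
static uint64_t sparse_nent(int lsize, int radius)
{
    uint64_t size = (uint64_t)1 << lsize;             /* points per grid side */
    uint64_t size2 = size * size;                     /* total grid points    */
    uint64_t stencil_size = 4 * (uint64_t)radius + 1; /* entries per point    */
    return size2 * stencil_size;
}

/* Returns 1 if idx is a valid index into an array of nent elements;
   otherwise prints a diagnostic to stderr and returns 0. */
static int index_in_bounds(uint64_t idx, uint64_t nent)
{
    if (idx < nent) {
        return 1;
    }
    fprintf(stderr, "out-of-bounds write: index %llu, length %llu\n",
            (unsigned long long)idx, (unsigned long long)nent);
    return 0;
}

/* Hypothetical debugging wrapper: stores value at a[idx] and aborts
   via assert if idx lies outside the array. */
static void store_checked(uint64_t *a, uint64_t nent, uint64_t idx,
                          uint64_t value)
{
    assert(index_in_bounds(idx, nent));
    a[idx] = value;
}
```

With these helpers, sparse_nent(11, 10) gives the array length, and index_in_bounds(171966467, sparse_nent(11, 10)) reports the failing index, since the last valid index is one less than the length. Routing the stores through a wrapper like store_checked in a debug build would stop at the first offending write instead of silently corrupting memory.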
If the output is short, please inline it here.
Otherwise, please pipe it to a plain text file and attach that file.
Note that you may need to use $command 2>&1 $log to capture the error messages.
Please do not attach screenshots of your terminal.