Native FORTRAN, C and C++ Compilers for Linux, Mac OS X and Windows
The packaging of PGI licenses changed with the release of PGI 2016 version 16.10. Alfasoft and PGI recommend PGI Professional Edition, which includes PGI Fortran, C and C++ compilers and tools for x86-64 and OpenPOWER multicore CPUs and NVIDIA Tesla GPUs, including all OpenACC, OpenMP and CUDA Fortran features. PGI Professional Edition is a perpetual license offered with technical support and frequent PGI updates that include the latest PGI feature enhancements, performance improvements and bug fixes. The PGI Professional Edition is for HPC experts who need cutting edge compilers and support for production software development.
On Intel Haswell CPUs with OpenMP, PGI delivers multicore performance more than 50% faster than the latest GCC compilers. That’s like buying a cluster with 50% more compute nodes. PGI compilers deliver world-class multicore CPU performance and accelerator programming features that can dramatically increase the performance of applications on GPU accelerators.
Accelerate Your Code with OpenACC
Is your application tens or hundreds of thousands of lines of Fortran, C and C++ code? With OpenACC directives, you don’t have to parallelize all of it at once. You can identify hot loops and code regions using the PGPROF profiler, then incrementally parallelize and tune them one by one. OpenACC code remains 100% standard-compliant and portable to other compilers and platforms, and enables parallel processing on CPUs and GPUs using identical source code.
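As a minimal sketch of what annotating a single hot loop can look like, the C++ function below adds one OpenACC directive to an otherwise ordinary loop. The function name saxpy and the choice of loop are illustrative assumptions, not taken from any PGI example. A compiler without OpenACC support ignores the pragma and runs the loop serially, which is what keeps the source portable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical hot loop: y = a * x + y over two equal-length vectors.
// The directive asks an OpenACC compiler to parallelize the loop and
// states which arrays are read (copyin) and which are updated (copy).
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    const float* xp = x.data();
    float* yp = y.data();

    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The rest of the application stays untouched; further loops can be annotated one at a time as profiling identifies them.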
PGI Supports All Major HPC Platforms
HPC servers are quickly expanding beyond multicore x86 CPUs to OpenPOWER, ARM and GPU accelerators. PGI Fortran, C and C++ compilers and OpenACC are designed to deliver high performance on all of these processors. PGI compilers for x86 and GPUs are available now, including OpenACC parallelization across all cores of a multicore CPU or a GPU. The Beta Evaluation Program for PGI compilers on OpenPOWER CPUs coupled to NVIDIA Tesla GPUs is coming this summer. PGI and OpenACC deliver the performance you need today, and the flexibility you need tomorrow. PGI compilers can take you there.
Performance Profiling and Optimization
PGPROF is a powerful and easy-to-use interactive performance profiler for parallel programs written with OpenMP or OpenACC directives, or using CUDA. Use PGPROF to visualize and analyze the performance of your Fortran, C and C++ programs. PGPROF can correlate execution time with procedures, source code and instructions, allowing you to quickly see where and how execution time is spent. Through resource utilization data and compiler feedback information, PGPROF provides features that will help you understand why parts of your program have high execution times and how you can modify your source code or compiler options to improve performance. PGPROF is included with all PGI products.
A Fortran-friendly Debugger
PGDBG is a graphical debugger for Fortran, C and C++ that supports debugging of serial and parallel programs, including MPI, OpenMP and hybrid MPI/OpenMP applications. PGDBG can debug programs on SMP workstations, servers, distributed-memory clusters and hybrid clusters where each node contains multiple multicore x86 processors. PGDBG allows you to control threads or processes individually or in groups, and allows you to examine state down to the register level. PGDBG is also included with all PGI products.
- PGFORTRAN™ native OpenMP, OpenACC and auto-parallel Fortran 2003 compiler with CUDA extensions
- PGCC® OpenMP, OpenACC and auto-parallel ANSI and K&R C11 compiler
- PGC++® OpenMP, OpenACC and auto-parallel GNU 4.8 g++ compatible C++14 compiler with CUDA-x86 extensions (not available on Windows)
- PGDBG® OpenMP and MPI parallel graphical debugger*
- PGPROF® OpenMP and OpenACC parallel graphical performance profiler
- Full 64-bit support on multi-core OpenPOWER and x86
- Full support for OpenMP 3.1 on up to 256 cores
- Comprehensive OpenACC 2.5 support
- PGI Unified Binary™ technology combines into a single executable or object file code optimized for multiple x86 processors, NVIDIA GPUs or AMD GPUs
- Complete uniform development environment across x86 processor-based systems running Linux, OS X or Windows and OpenPOWER processor-based systems running Linux
- Comprehensive set of compiler optimizations including one pass interprocedural analysis (IPA)*, interprocedural optimization of libraries*, profile feedback optimization*, dependence analysis and global optimization, function inlining including library functions, vectorization, invariant conditional removal, loop interchange, loop splitting, loop unrolling, loop fusion and more.
- Support for 64-bit reals and integers (-r8/-i8 compilation flags)
- Memory hierarchy and memory allocation optimizations including huge pages support
- Auto-parallelization of loops specifically optimized for multi-core processors
- Concurrent subroutine call support
- Highly tuned Intel MMX and SSE intrinsics library routines (C/C++ only)
- Tuning for non-uniform memory access (NUMA) architectures
- Process/CPU affinity support in SMP/OpenMP applications
- Support for creating shared objects on Linux, dynamic libraries on macOS and DLLs on Windows
- Integrated cpp pre-processing
- Cray/DEC/IBM extensions (including Cray POINTERs & DEC STRUCTURES/UNIONS); support for SGI-compatible DOACROSS in Fortran
- Full support for Common Compiler Feedback Format compiler optimization listings
- User modules support simplifies switching between multiple compiler environments/versions
- C/C++ plug-in for Eclipse*
- Bundled precompiled libraries including ScaLAPACK (Linux & macOS), Open MPI (Linux only), MPICH (macOS) and MS-MPI library (Windows only)
- Includes optimized 64-bit OpenBLAS (LAPACK/BLAS) and 32-bit LAPACK math libraries
- Supports multi-threaded execution with Intel Math Kernel Libraries (MKL) 10.1 and later on x86
- UNIX-compatible build/edit environment for Windows, including the BASH shell, vi editor, make, tar, gzip, sed, grep, awk, and over 100 other shell commands!
- Interoperable with TotalView* (Linux only) and Allinea DDT*.
- Interoperable with gcc, g77, g++ and gdb
- Unconditional 30 day money back guarantee
Features marked with an asterisk (*) are not currently supported on OpenPOWER.
PGI 17.10 Now Available
The newest update to PGI Fortran, C and C++ compilers & tools for scientists and engineers. Includes Volta support, CUDA Unified Memory with OpenACC, OpenMP 4.5 CPU support, C++14 lambda and capture support within OpenACC and more.
Tesla V100 GPU Support
PGI OpenACC and CUDA Fortran now support Tesla V100. Based on the new NVIDIA Volta GV100 GPU, Tesla V100 offers more memory bandwidth, more streaming multiprocessors, next generation NVLink and new microarchitectural features that add up to better performance and programmability. For OpenACC and CUDA Fortran programmers, Tesla V100 offers improved hardware support and performance for CUDA Unified Memory features on both x86-64 and OpenPOWER processor-based systems.
OpenACC for CUDA Unified Memory
PGI compilers now leverage Pascal and Volta GPU hardware features, NVLink and CUDA Unified Memory to simplify OpenACC programming on GPU-accelerated x86-64 and OpenPOWER processor-based servers. When OpenACC allocatable data is placed in CUDA Unified Memory, no explicit data movement or data directives are needed. This simplifies GPU acceleration of applications that make extensive use of allocatable data, and allows you to focus on parallelization and scalability of your algorithms. See the OpenACC and CUDA Unified Memory PGInsider post for details.
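The sketch below illustrates the idea in C++: a heap-allocated array is used inside OpenACC loops with no data clauses or data directives. It assumes a build that places dynamic allocations in CUDA Unified Memory (to our understanding, the PGI option for this is -ta=tesla:managed, but check the release documentation). Without such an option a compiler may require explicit data clauses for the pointer. The function name scale_and_sum is hypothetical.

```cpp
#include <cassert>
#include <memory>

// Fills a heap-allocated array with 0..n-1, scales it, and returns the sum.
// No OpenACC data clauses appear: this sketch ASSUMES the allocation lives
// in CUDA Unified Memory, so the runtime migrates pages on demand.
// Compiled without OpenACC, the pragmas are ignored and the loops run serially.
double scale_and_sum(int n, double factor) {
    assert(n > 0);
    std::unique_ptr<double[]> data(new double[n]);
    double* p = data.get();

    for (int i = 0; i < n; ++i) {
        p[i] = static_cast<double>(i);  // initialized on the host
    }

    #pragma acc parallel loop
    for (int i = 0; i < n; ++i) {
        p[i] *= factor;
    }

    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += p[i];
    }
    return sum;
}
```

For example, scale_and_sum(4, 2.0) scales the values 0, 1, 2, 3 by two and adds them up.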
Automatic Deep Copy of Fortran Derived Types
Automatic deep copy of Fortran derived types allows you to port applications with modern deeply nested data structures to Tesla GPUs using OpenACC. The PGI 17.7 compilers allow you to list aggregate Fortran data objects in OpenACC COPY, COPYIN, COPYOUT and UPDATE directives to move them between host and device memory including traversal and management of pointer-based objects within the aggregate data object.
The updated PGI C++ compiler includes incremental C++17 features, and is supported as a CUDA 9.0 NVCC host compiler on both Linux/x86-64 and Linux/OpenPOWER platforms. It delivers an average 20% performance improvement on the LCALS loops benchmarks with no abstraction penalty, now supports lambdas with capture in OpenACC GPU-accelerated compute regions and is now interoperable with GNU 6.3.
Use C++14 Lambdas with Capture in OpenACC Regions
C++ lambda expressions provide a convenient way to define anonymous function objects at the location where they are invoked or passed as arguments. The auto type specifier can be applied to lambda parameters to create a polymorphic lambda-expression. Starting with the PGI 17.7 release, you can now use lambdas in OpenACC compute regions in your C++ programs. Using lambdas with OpenACC is useful for a variety of reasons. One example is to drive code generation customized to different programming models or platforms. C++14 has opened up doors for more and more lambda use cases, especially for polymorphic lambdas, and all of those capabilities are now usable in your OpenACC programs.
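A minimal sketch of a capturing, polymorphic lambda called from an OpenACC loop is shown below. The function name apply_affine and the variable names are illustrative assumptions. The code needs a C++14 compiler; without OpenACC support the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>

// Applies v[i] = scale * v[i] + offset to every element.
// The lambda captures scale and offset by value and takes an `auto`
// parameter, which makes it a C++14 polymorphic lambda.
void apply_affine(std::vector<double>& v, double scale, double offset) {
    assert(!v.empty());
    const int n = static_cast<int>(v.size());
    double* p = v.data();

    auto affine = [scale, offset](auto x) { return scale * x + offset; };

    // The lambda is invoked inside the OpenACC compute region.
    #pragma acc parallel loop copy(p[0:n])
    for (int i = 0; i < n; ++i) {
        p[i] = affine(p[i]);
    }
}
```

Because the loop body only sees a callable, the same loop can be reused with different lambdas, which is one way to customize generated code per platform or programming model.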
PGI Compilers Are Now Interoperable With the cuSOLVER Library
You can now call optimized cuSolverDN routines from CUDA Fortran and OpenACC Fortran using the PGI-supplied interface module and the PGI-compiled version of the cuSOLVER library bundled with PGI 17.7. The same cuSOLVER library is callable from PGI OpenACC C/C++ because it is built with the PGI compilers and is compatible with, and uses, the PGI OpenMP runtime. Read more about Using the cuSOLVER Library from CUDA Fortran.
PGI Unified Binary for Tesla and Multicore
Use OpenACC to build applications for both GPU acceleration and parallel execution across all the cores of a multicore server. When you run the application on a GPU-enabled system, the OpenACC regions will offload and execute on the GPU. When the same application executable is run on a system without GPUs installed, the OpenACC regions will be executed in parallel across all CPU cores in the system. If you develop commercial or production applications, now you can accelerate your code with OpenACC and deploy a single binary usable on any system, with or without GPUs.
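As a sketch of the kind of source that suits this model, the function below contains a single OpenACC reduction loop and nothing specific to either target. The assumption is that the build requests both targets (to our understanding, with an option along the lines of -ta=tesla,multicore, but check the release documentation), so the same executable offloads the loop when a GPU is present and spreads it across CPU cores otherwise. The function name sum_of_squares is hypothetical.

```cpp
#include <cassert>

// Returns 1^2 + 2^2 + ... + n^2.
// The source does not mention GPUs or CPU cores: where the loop runs is
// decided at build time and run time, not here. Compiled without OpenACC,
// the pragma is ignored and the loop runs serially.
long sum_of_squares(int n) {
    assert(n >= 0);
    long total = 0;

    #pragma acc parallel loop reduction(+:total)
    for (int i = 1; i <= n; ++i) {
        total += static_cast<long>(i) * i;
    }
    return total;
}
```

The result is the same on every target, for example 385 for n = 10, which is what makes shipping one binary practical.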
New Profiling Features for OpenACC and CUDA Unified Memory
The PGI Profiler adds new OpenACC profiling features including support on multicore CPUs with or without attached GPUs, and a new summary view that shows time spent in each OpenACC construct. New CUDA Unified Memory features include correlating CPU page faults with the source code lines where the associated data was allocated, support for new CUDA Unified Memory page thrashing, throttling and remote map events, NVLINK support and more. Read about the new CUDA Unified Memory profiling features in CUDA 9 Features Revealed.
Host Processor: 64-bit OpenPOWER, 64-bit AMD64, 64-bit Intel 64 or 32-bit x86 processor-based workstation or server with one or more single core or multi-core microprocessors.
Accelerator (optional): NVIDIA CUDA-enabled GPU with compute capability 2.0 or later. AMD Radeon HD 7700, 7800 and 7900 series, and R7 series GPUs (Cape Verde, Tahiti or Spectre).
- OpenPOWER Linux: Ubuntu 14.04, 14.10 and Red Hat Enterprise Linux 7.3 beta.
- x86 Linux: CentOS 5 or newer, SUSE 11 or newer, SUSE Linux Enterprise Server (SLES) 11 or newer, OpenSUSE 10.2 or newer, Red Hat Enterprise Linux 5 or newer, Fedora Core 6 or newer or Ubuntu 12.04 or newer. Fully interoperable with versions of Linux using kernel revision 2.6 and glibc 2.5 or newer.
- Apple OS X version 10.7 Lion or newer (64-bit and 32-bit) and Xcode 4.2 or newer. Radeon accelerators are not supported on OS X.
- Microsoft Windows 10, Windows 8.1, Windows 8, Server 2012, Windows 7, and Server 2008 R2. Both 64-bit and 32-bit versions are supported where available.
Building 64-bit executables requires a 64-bit operating system.
Please Note: 32-bit development is deprecated with the PGI 2016 release and will no longer be available with the PGI 2017 release.
Memory: 16 MB or more.
Hard Disk: 1.5 GB during installation, 700 MB to hold installed software.
Peripherals: Mouse or compatible pointing device for use of optional graphical user interfaces.
The PGI Support Service entitles the subscriber to new licenses for new releases. Typically, support is valid for one year from date of purchase. New license purchases include 30 days of support service. If you did not purchase support when you purchased your license, or if your support has expired, you can qualify for the current release by bringing your support current.
PGI Support includes the following:
- Ongoing technical support by electronic mail. Support requests may also be sent by fax to +1-503-682-2637, or online using the PGI technical support request form.
- Release updates for licensed product(s) at no additional cost, except for any administrative fee that may apply.
- Full license fee credits on product upgrades, except for any administrative fee that may apply. A "product upgrade" means exchanging one product license for a more expensive product license; it is not the same as the version or "release" update referenced above.
- Full license fee credits on user-count upgrades, except for any administrative fee that may apply.
- Support subscriber-only forums and other services available at www.pgroup.com
- Support subscriber-only special offers, discounts and promotions.
PGI is offered in three different editions:
PGI Community Edition – PGI's unsupported version of the PGI compilers, including essentially all features of our previous PGI Accelerator Fortran/C/C++ Workstation product, but with a license-to-use that is limited to 1 year from the date of release. CE is now available for Linux, OpenPOWER and Mac. A Windows version may be available at a later date.
PGI Professional Edition – PGI's recommended compiler for professional users. A perpetual license to current and all previous releases of the PGI Fortran, C and C++ compilers and tools for multicore CPUs and NVIDIA Tesla GPUs, including all OpenACC, OpenMP and CUDA Fortran features. Enables development of performance-portable HPC applications with uniform source code across the most widely used parallel processors and systems. The PGI Professional Edition is for HPC experts who need cutting edge compilers and support for production software development.
PGI Enterprise – For large organizations that want a site-wide license and Premier Services, PGI also offers this annual, unlimited seat-count license.
The price of an Academic license for PGI Professional Edition is discounted by 50% compared with a commercial license.
All PGI Professional Edition licenses are permanent and perpetual—they never stop working with the PGI release version for which they were created. FlexNet-managed licenses also work with all earlier PGI release versions back to 7.2. Except as noted below, all PGI permanent licenses are specific to individual PGI products running on a single operating system (e.g. PGI Fortran/C/C++ Workstation for Linux).
Node-locked licenses restrict use to a particular host and one user at a time. They are very useful when a number of users want to share a PGI product. Node-locked licenses for x86 systems require that the license service (lmgrd) run on the same machine as the compilers. Executables can run on any other compatible machine. Node-locked licenses for OpenPOWER use a proprietary licensing scheme.
Floating licenses allow the license service to run on a machine different from the machines running the compilers. Floating licenses allow a mix of system types running different operating systems. The maximum number of concurrent users is determined by counting usage across all of the systems running the PGI compilers. Floating licenses usually require only a single license server. Users with floating licenses can "borrow" seats for out-of-office compiling.