loki.backend.cufgen module

cufgen(ir, depth=0, conservative=False, linewidth=132)

Generate CUDA Fortran code from one or many IR objects/trees.

Implemented by extending FortranCodegen to support CUDA Fortran-specific syntax. Refer to the CUDA Fortran Programming Guide for more information.

Supported subset of the CUDA Fortran specifications:

  • variable qualifiers e.g. attributes(device)

  • chevron syntax to launch kernels e.g. call kernel<<<grid,block[,bytes][,streamid]>>>(arg1,arg2,...)

Natively supported (via FortranCodegen):

  • subroutine/function qualifiers e.g. attributes(global) via loki.Subroutine.prefix

  • kernel loop directives via loki.ir.Pragma

class CudaFortranCodegen(depth=0, indent='  ', linewidth=90, conservative=True)

Bases: FortranCodegen

Tree visitor that extends FortranCodegen with CUDA Fortran (CUF) language variations.

visit_CallStatement(o, **kwargs)
Format call statement as

CALL <name><chevron>(<args>) where the chevron is the launch configuration for device offloading, resulting in something like call kernel<<<grid,block[,bytes][,streamid]>>>(arg1,arg2,…)
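The shape of the generated launch statement can be illustrated with a minimal, standalone sketch. The helper format_kernel_launch below is hypothetical and is not part of Loki; it only assembles strings and does not use Loki's IR or CudaFortranCodegen. It also assumes the chevron entries are positional, so it fills in 0 for bytes when only a stream is given; that default is a choice of this sketch, not something the documentation above states.

```python
def format_kernel_launch(name, grid, block, args=(), nbytes=None, stream=None):
    """Assemble 'call <name><<<grid,block[,bytes][,streamid]>>>(<args>)'.

    Hypothetical helper for illustration only; not Loki's implementation.
    """
    config = [str(grid), str(block)]
    # Assumption: entries are positional, so a stream needs a bytes entry before it.
    if nbytes is not None or stream is not None:
        config.append("0" if nbytes is None else str(nbytes))
    if stream is not None:
        config.append(str(stream))
    return f"call {name}<<<{','.join(config)}>>>({','.join(args)})"


basic = format_kernel_launch("kernel", "grid", "block", ("arg1", "arg2"))
streamed = format_kernel_launch("kernel", "grid", "block", ("arg1",), stream="istream")
print(basic)     # call kernel<<<grid,block>>>(arg1,arg2)
print(streamed)  # call kernel<<<grid,block,0,istream>>>(arg1)
```

In Loki itself the launch configuration comes from the call statement node being visited rather than from keyword arguments as in this sketch.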

visit_SymbolAttributes(o, **kwargs)
Format declaration attributes as

<typename>[(<spec>)] [, <attributes>]
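As a rough illustration of the pattern above, the following standalone sketch joins a type name, an optional spec, and a list of attributes into a declaration prefix. The function format_declaration_attributes and its parameters are hypothetical; how Loki derives these pieces from its symbol attributes is not shown here.

```python
def format_declaration_attributes(typename, spec=None, attributes=()):
    """Build '<typename>[(<spec>)] [, <attributes>]' from plain strings.

    Hypothetical helper for illustration only; not Loki's implementation.
    """
    # The spec, when present, is attached to the type name in parentheses.
    head = f"{typename}({spec})" if spec else typename
    # Attributes follow as a comma-separated list.
    return ", ".join([head, *attributes])


plain = format_declaration_attributes("INTEGER")
device_array = format_declaration_attributes(
    "REAL", spec="KIND=8", attributes=("DIMENSION(n)", "DEVICE")
)
print(plain)         # INTEGER
print(device_array)  # REAL(KIND=8), DIMENSION(n), DEVICE
```

The DEVICE entry in the second call stands in for a CUDA Fortran variable qualifier appearing alongside standard Fortran attributes.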