Univerzitní knihovna OU - Úplné zobrazení záznamu

	.
	0 (hodnocen0 x )
	EB
	ONLINE
	Sato, Mitsuhisa
	XcalableMP PGAS Programming Language : From Programming Model to Applications
	Singapore : Springer Singapore Pte. Limited, 2020
	1 online resource (265 pages)
Externí odkaz	Plný text PDF
	* Návod pro vzdálený přístup


	ISBN 9789811576836 (electronic bk.)
	ISBN 9789811576829
	Print version: Sato, Mitsuhisa XcalableMP PGAS Programming Language Singapore : Springer Singapore Pte. Limited,c2020 ISBN 9789811576829
	Intro -- Preface -- Contents -- XcalableMP Programming Model and Language -- 1 Introduction -- 1.1 Target Hardware -- 1.2 Execution Model -- 1.3 Data Model -- 1.4 Programming Models -- 1.4.1 Partitioned Global Address Space -- 1.4.2 Global-View Programming Model -- 1.4.3 Local-View Programming Model -- 1.4.4 Mixture of Global View and Local View -- 1.5 Base Languages -- 1.5.1 Array Section in XcalableMP C -- 1.5.2 Array Assignment Statement in XcalableMP C -- 1.6 Interoperability -- 2 Data Mapping -- 2.1 nodes Directive -- 2.2 template Directive -- 2.3 distribute Directive -- 2.3.1 Block Distribution -- 2.3.2 Cyclic Distribution -- 2.3.3 Block-Cyclic Distribution -- 2.3.4 Gblock Distribution -- 2.3.5 Distribution of Multi-Dimensional Templates -- 2.4 align Directive -- 2.5 Dynamic Allocation of Distributed Array -- 2.6 template_fix Construct -- 3 Work Mapping -- 3.1 task and tasks Construct -- 3.1.1 task Construct -- 3.1.2 tasks Construct -- 3.2 loop Construct -- 3.2.1 Reduction Computation -- 3.2.2 Parallelizing Nested Loop -- 3.3 array Construct -- 4 Data Communication -- 4.1 shadow Directive and reflect Construct -- 4.1.1 Declaring Shadow -- 4.1.2 Updating Shadow -- 4.2 gmove Construct -- 4.2.1 Collective Mode -- 4.2.2 In Mode -- 4.2.3 Out Mode -- 4.3 barrier Construct -- 4.4 reduction Construct -- 4.5 bcast Construct -- 4.6 wait_async Construct -- 4.7 reduce_shadow Construct -- 5 Local-View Programming -- 5.1 Introduction -- 5.2 Coarray Declaration -- 5.3 Put Communication -- 5.4 Get Communication -- 5.5 Synchronization -- 5.5.1 Sync All -- 5.5.2 Sync Images -- 5.5.3 Sync Memory -- 6 Procedure Interface -- 7 XMPT Tool Interface -- 7.1 Overview -- 7.2 Specification -- 7.2.1 Initialization -- 7.2.2 Events -- References -- Implementation and Performance Evaluation of Omni Compiler -- 1 Overview -- 2 Implementation -- 2.1 Operation Flow.
	2.2 Example of Code Translation -- 2.2.1 Distributed Array -- 2.2.2 Loop Statement -- 2.2.3 Communication -- 3 Installation -- 3.1 Overview -- 3.2 Get Source Code -- 3.2.1 From GitHub -- 3.2.2 From Our Website -- 3.3 Software Dependency -- 3.4 General Installation -- 3.4.1 Build and Install -- 3.4.2 Set PATH -- 3.5 Optional Installation -- 3.5.1 OpenACC -- 3.5.2 XcalableACC -- 3.5.3 One-Sided Library -- 4 Creation of Execution Binary -- 4.1 Compile -- 4.2 Execution -- 4.2.1 XcalableMP and XcalableACC -- 4.2.2 OpenACC -- 4.3 Cooperation with Profiler -- 4.3.1 Scalasca -- 4.3.2 tlog -- 5 Performance Evaluation -- 5.1 Experimental Environment -- 5.2 EP STREAM Triad -- 5.2.1 Design -- 5.2.2 Implementation -- 5.2.3 Evaluation -- 5.3 High-Performance Linpack -- 5.3.1 Design -- 5.3.2 Implementation -- 5.3.3 Evaluation -- 5.4 Global Fast Fourier Transform -- 5.4.1 Design -- 5.4.2 Implementation -- 5.4.3 Evaluation -- 5.5 RandomAccess -- 5.5.1 Design -- 5.5.2 Implementation -- 5.5.3 Evaluation -- 5.6 Discussion -- 6 Conclusion -- References -- Coarrays in the Context of XcalableMP -- 1 Introduction -- 2 Requirements from Language Specifications -- 2.1 Images Mapped to XMP Nodes -- 2.2 Allocation of Coarrays -- 2.3 Communication -- 2.4 Synchronization -- 2.5 Subarrays and Data Contiguity -- 2.6 Coarray C Language Specifications -- 3 Implementation -- 3.1 Omni XMP Compiler Framework -- 3.2 Allocation and Registration -- 3.2.1 Three Methods of Memory Management -- 3.2.2 Initial Allocation for Static Coarrays -- 3.2.3 Runtime Allocation for Allocatable Coarrays -- 3.3 PUT/GET Communication -- 3.3.1 Determining the Possibility of DMA -- 3.3.2 Buffering Communication Methods -- 3.3.3 Non-blocking PUT Communication -- 3.3.4 Optimization of GET Communication -- 3.4 Runtime Libraries -- 3.4.1 Fortran Wrapper -- 3.4.2 Upper-layer Runtime (ULR) Library.
	3.4.3 Lower-layer Runtime (LLR) Library -- 3.4.4 Communication Libraries -- 4 Evaluation -- 4.1 Fundamental Performance -- 4.2 Non-blocking Communication -- 4.3 Application Program -- 4.3.1 Coarray Version of the Himeno Benchmark -- 4.3.2 Measurement Result -- 4.3.3 Productivity -- 5 Related Work -- 6 Conclusion -- References -- XcalableACC: An Integration of XcalableMP and OpenACC -- 1 Introduction -- 1.1 Hardware Model -- 1.2 Programming Model -- 1.2.1 XMP Extensions -- 1.2.2 OpenACC Extensions -- 1.3 Execution Model -- 1.4 Data Model -- 2 XcalableACC Language -- 2.1 Data Mapping -- Example -- 2.2 Work Mapping -- Restriction -- Example 1 -- Example 2 -- 2.3 Data Communication and Synchronization -- Example -- 2.4 Coarrays -- Restriction -- Example -- 2.5 Handling Multiple Accelerators -- 2.5.1 devices Directive -- Example -- 2.5.2 on_device Clause -- 2.5.3 layout Clause -- Example -- 2.5.4 shadow Clause -- Example -- 2.5.5 barrier_device Construct -- Example -- 3 Omni XcalableACC Compiler -- 4 Performance of Lattice QCD Application -- 4.1 Overview of Lattice QCD -- 4.2 Implementation -- 5 Performance Evaluation -- 5.1 Result -- 5.2 Discussion -- 6 Productivity Improvement -- 6.1 Requirement for Productive Parallel Language -- 6.2 Quantitative Evaluation by Delta Source Lines of Codes -- 6.3 Discussion -- References -- Mixed-Language Programming with XcalableMP -- 1 Background -- 2 Translation by Omni Compiler -- 3 Functions for Mixed-Language -- 3.1 Function to Call MPI Program from XMP Program -- 3.2 Function to Call XMP Program from MPI Program -- 3.3 Function to Call XMP Program from Python Program -- 3.3.1 From Parallel Python Program -- 3.3.2 From Sequential Python Program -- 4 Application to Order/Degree Problem -- 4.1 What Is Order/Degree Program -- 4.2 Implementation -- 4.3 Evaluation -- 5 Conclusion -- References.
	3.1 Overview -- 3.2 YML -- 3.3 OmniRPC-MPI -- 4 Application Development in the mSPMD Programming Environment -- 4.1 Task Generator -- 4.2 Workflow Development -- 4.3 Workflow Execution -- 5 Experiments -- 6 Eigen Solver on the mSPMD Programming Model -- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model -- 6.2 Experiments -- 7 Fault-Tolerance Features in the mSPMD Programming Model -- 7.1 Overview and Implementation -- 7.2 Experiments -- 8 Runtime Correctness Check for the mSPMD Programming Model -- 8.1 Overview and Implementation -- 8.2 Experiments -- 9 Summary -- References -- XcalableMP 2.0 and Future Directions -- 1 Introduction -- 2 XcalableMP on Fugaku -- 2.1 Performance of XcalableMP Global View Programming -- 2.2 Performance of XcalableMP Local View Programming -- 3 Global Task Parallel Programming -- 3.1 OpenMP and XMP Tasklet Directive -- 3.2 A Proposal for Global Task Parallel Programming -- 3.3 Prototype Design of Code Transformation -- 3.4 Preliminary Performance -- 3.5 Communication Optimization for Manycore Clusters -- 4 Retrospectives and Challenges for Future PGAS Models -- 4.1 Low-Level Communication Layer for PGAS Model -- 4.2 XcalableMP as a DSL for Stencil Applications -- 4.3 XcalableMP API: Compiler-Free Approach -- 4.4 Global Task Parallel Programming Model for Accelerators -- References.
	Three-Dimensional Fluid Code with XcalableMP -- 1 Introduction -- 2 Global-View Programming Model -- 2.1 Domain Decomposition Methods -- 2.2 Performance on the K Computer -- 2.2.1 Comparison with Hand-Coded MPI Program -- 2.2.2 Optimization for SIMD -- 2.2.3 Optimization for Allocatable Arrays -- 3 Local-View Programming Model -- 3.1 Communications Using Coarray -- 3.2 Performance on the K Computer -- 4 Summary -- References -- Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP -- 1 Introduction -- 2 Nuclear Fusion Simulation Code -- 2.1 Gyrokinetic PIC Simulation -- 2.2 GTC -- 3 Implementation of GTC-P by Hybrid-view Programming -- 3.1 Hybrid-View Programming Model -- 3.2 Implementation Based on the XMP-Localview Model: XMP-localview -- 3.3 Implementation Based on the XMP-Hybridview Model: XMP-Hybridview -- 4 Performance Evaluation -- 4.1 Experimental Setting -- 4.2 Results -- 4.3 Productivity and Performance -- 5 Related Research -- 6 Conclusion -- References -- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP -- 1 Introduction -- 2 X-ray Fluorescence Holography -- 2.1 Reconstruction of Atomic Images -- 2.2 Analysis Procedure of XFH -- 3 Parallelization -- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP -- 3.2 Parallelization of Reconstruction of Three-dimensional Atomic Images by XcalableMP -- 4 Performance Evaluation -- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images -- 4.2 Performance Results of Reconstruction of Three-dimensional Atomic Images -- 4.3 Comparison of Parallelization with MPI -- 5 Conclusion -- References -- Multi-SPMD Programming Model with YML and XcalableMP -- 1 Introduction -- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing -- 3 Multi-SPMD Programming Model.
	* elektronické knihy
	001894923
	express
	(Au-PeEL)EBL6403583
	(MiAaPQ)EBC6403583
	(OCoLC)1231606603

Anotace

Annotation