My first time at the EuroPython conference
At EuroPython 2017 in Rimini, I presented a talk titled “Big Data Analytics at the MPCDF: GPU Crystallography with Python”, based on my current work at the Max Planck Computing and Data Facility.
In close collaboration with scientists from different MPG institutes, the Max Planck Computing and Data Facility (MPCDF) is engaged in the development and optimisation of algorithms and applications for high-performance computing, as well as in the design and implementation of solutions for data-intensive projects. Python is now used at the MPCDF in the emerging area of “atom probe crystallography”, which builds on atom probe tomography (APT): a Fourier spectral analysis in 3D reciprocal space can be simulated in order to reveal both composition and crystallographic structure at the atomic scale for billions of APT experimental data sets. The Python data ecosystem has proved to be well suited to this, as it has grown beyond the confines of single machines to embrace scalability.

The talk describes our approach to scaling across multiple GPUs, and the role of our visualisation methods. Our data analysis workflow relies on PyNX, an open-source, GPU-accelerated Python library which provides fast parallel computation of scattering. The code is well suited for GPU computing, using both the PyCUDA and PyOpenCL libraries.

Exploratory data analysis and performance tests are initially carried out in Jupyter notebooks with Python packages such as pandas and matplotlib/plotly. In the production stage, interactive visualisation is done with standard scientific tools such as ParaView, an open-source 3D visualisation program, for which Python modules are used to generate the visualisation components as VTK files.
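To give a rough idea of what a Fourier analysis in reciprocal space computes, here is a minimal sketch in plain NumPy. It is not the PyNX API and not the code from the talk: the function name `scattering_intensity`, the toy cubic lattice and the scan along one reciprocal axis are all assumptions made for illustration. It evaluates the direct sum of phase factors over atom positions, which is the kind of embarrassingly parallel sum that a GPU library can accelerate.

```python
import numpy as np

def scattering_intensity(positions, q_points):
    """Return |sum_j exp(2*pi*i * q . r_j)|^2 for every point q.

    positions: (n_atoms, 3) array of atom coordinates (hypothetical toy data).
    q_points:  (n_q, 3) array of reciprocal-space coordinates.
    """
    # phase[k, j] = 2*pi * (q_k . r_j)
    phase = 2.0 * np.pi * (q_points @ positions.T)
    amplitude = np.exp(1j * phase).sum(axis=1)
    return np.abs(amplitude) ** 2

# Toy "data set": a simple cubic lattice of 6 x 6 x 6 atoms with unit spacing.
n = 6
grid = np.arange(n, dtype=float)
positions = np.array(np.meshgrid(grid, grid, grid, indexing="ij")).reshape(3, -1).T

# Scan along one axis of reciprocal space, from h = 0 to h = 2.
h = np.linspace(0.0, 2.0, 81)
q_points = np.column_stack([h, np.zeros_like(h), np.zeros_like(h)])

intensity = scattering_intensity(positions, q_points)
```

For this toy lattice the intensity peaks at integer values of `h`, where all atoms scatter in phase, and drops to zero in between, for example at `h = 0.5`. The memory cost of the phase matrix grows with the number of atoms times the number of reciprocal-space points, which is one reason why large data sets call for chunked or GPU-based evaluation.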
Here, I would like to share the slides of my talk with you.