ALCF summer students gain hands-on HPC experience
October 4, 2022 – As part of the ALCF Summer Student Program, more than 30 undergraduate and graduate students worked alongside staff mentors to gain real-world experience with projects in supercomputing, data science, and AI.
Every summer, the Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science user facility located at Argonne National Laboratory, welcomes a new group of students to undertake real-world scientific computing projects, providing valuable opportunities to work with research teams and learn new skills.
“It’s important to provide students with educational and professional opportunities to take the next steps, gain confidence, and have new experiences working on impactful research projects outside of the classroom,” says Michael Papka, director of the ALCF and professor of computer science at the University of Illinois Chicago. “Our student summer program gives them the chance to see the possibilities in their careers.”
This year’s ALCF summer student class, which included more than 30 students ranging from undergraduates to doctoral students, took on projects using artificial intelligence (AI) to analyze bird songs, visualize large scientific datasets, advance high-energy physics research, and more. In the summaries below, five of the students describe what they worked on this summer and where they think the experience will take them next.
AI analysis of bird audio
Saumya Singh, an AI graduate student at Northwestern University, is interested in researching self-supervised learning and reinforcement learning in AI and natural language processing. This summer, she worked with mentors Michael Papka and Argonne computer scientist Nicola Ferrier on a project that used AI to analyze bird song audio collected by microphones in forests to provide information about their ecosystems.
Singh was drawn to this project because of its environmental significance and what it can reveal about forest ecosystems. “Birds or animals are great predictors of the environment they live in,” she says.
For analysis, her project used a new self-supervised learning algorithm pioneered by Facebook AI Research, meaning the algorithm did not require researchers to provide labels.
“The main thing that I think is going to help me is self-supervised learning, because the main problem we have for one of the data science projects is labeling the pre-processing data, so that will be great if we can solve the problem,” Singh says. “I can apply it to several other projects.”
Having previously worked with images and text, Singh used this project to gain experience with sound, large datasets, and new algorithms. “All these new techniques that I worked on,” says Singh, “seemed to be really fruitful for me to continue on this career path in data science and machine learning.”
Command Line Interface, Python Concurrency, and AI Models
Alan Wang, a computer science student at the University of Illinois, wanted to work at the ALCF because of the powerful supercomputers and software tools it makes available for research. Although he is primarily interested in systems security, Wang’s research at the ALCF has focused on facility operations, the Python programming language, and AI.
This summer, he worked on three projects with ALCF mentors Paul Rich, George Rojas, and Bill Allcock: a command-line interface project to make it easier for system administrators to search home directories across the entire ALCF; a Python concurrency project comparing the speed and performance of different concurrency libraries; and a project running AI models built with the open-source machine learning frameworks PyTorch and TensorFlow on the ALCF AI Testbed’s Cerebras and SambaNova systems.
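The article doesn’t detail how the concurrency comparison was set up; as a hypothetical sketch of the kind of measurement such a project might make, the following times the same CPU-bound task under two of Python’s standard concurrency libraries (the function and parameters are illustrative, not from Wang’s project):

```python
# Hypothetical sketch: timing a CPU-bound task under two of Python's
# standard concurrency libraries (threads vs. processes).
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def busy_work(n):
    # CPU-bound workload: sum of squares, no I/O.
    return sum(i * i for i in range(n))

def benchmark(executor_cls, workers=4, n=200_000):
    """Run the same task on `workers` workers and return (elapsed, results)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(busy_work, [n] * workers))
    return time.perf_counter() - start, results

if __name__ == "__main__":
    t_threads, r1 = benchmark(ThreadPoolExecutor)
    t_procs, r2 = benchmark(ProcessPoolExecutor)
    assert r1 == r2  # both strategies compute the same answers
    # For CPU-bound work, Python's GIL usually keeps threads near serial
    # speed, while separate processes can run truly in parallel.
    print(f"threads:   {t_threads:.3f}s")
    print(f"processes: {t_procs:.3f}s")
```

For I/O-bound tasks the comparison typically flips, which is why such studies benchmark several libraries against several workload types.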
Wang says one of the biggest things he took away from this summer was learning more about using Python. He started the internship with about five years of Python experience, saying, “I thought I had it all down, but not even close. So I learned a lot about Python and was exposed to its use in many different environments.” Wang also discovered new software tools, such as the Emacs text editor, and worked with AI for the first time.
“I was surprised at how interconnected AI was with systems, so knowing both sides and having an AI background will also be extremely helpful to me in the future,” Wang says.
Benchmarking Graph Neural Networks for Science on AI Accelerators
Ryien Hosseini’s work with the ALCF team was at the intersection of neural network algorithms and high-performance computing. “My projects have used computing resources to see how far we can push these algorithms, known as graph neural networks, for various scientific applications,” he says.
Hosseini, a graduate student in electrical and computer engineering at the University of Michigan, was interested in working at the ALCF because of the research-oriented nature of the internship and to gain access to the institution’s powerful computing resources. This summer, with ALCF mentors Filippo Simini and Venkat Vishwanath, he co-authored a workshop paper evaluating the performance of graph neural networks on NVIDIA GPUs (graphics processing units) and worked on another project examining the performance of graph neural networks on specialized hardware platforms.
Additionally, Hosseini contributed to an effort that uses chemical docking for drug discovery. The project builds on previous work: instead of just using neural networks to select molecules, the team now uses “the neural network as a pre-filter to choose a high percentage of candidates, and then those will go into a classical, non-machine-learning algorithm, which is more efficient at arriving at these final numerical estimates,” says Hosseini.
“I feel like I learned a lot, both from thinking about high-level research ideas and algorithms, and then really getting into the thick of it and doing the programming to implement these algorithms,” says Hosseini, who will apply for doctoral programs in the fall. “Having this structured and rigorous research experience was really helpful.”
High-quality visualizations for large scientific datasets
Alina Kanayinkal is interested in computer graphics, especially the computing side of animation. During her summer at the ALCF, she worked with the Message Passing Interface, or MPI (a communications protocol for parallel computer programming), and image rendering, continuing the work she started as a student assistant to Tommy Marrinan, an Argonne scientist who teaches at the University of St. Thomas.
For her summer project, Kanayinkal focused on creating a workflow for rendering high-quality visualizations of large-scale datasets. Her research aims to take advantage of cinematic rendering tools (similar to those used by Pixar and DreamWorks) to create visualizations of scientific datasets that are too large or time-consuming to render on a single computer. While the workflow is generic enough for many types of scientific data, Kanayinkal worked with data from a coupled fluid and particle flow simulation for studying cancer cell transport, as well as a molecular dynamics simulation for studying material friction. The ultimate goal of her work is to develop an easier and less time-consuming way to create these visualizations.
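The article doesn’t describe the internals of the rendering workflow; as an illustrative sketch (names and numbers hypothetical), the following shows the row-partitioning step a distributed renderer typically performs so that each MPI rank renders only its own slice of a large frame:

```python
# Hypothetical sketch: dividing image rows among MPI ranks so each
# process renders only its slice of a large frame. A real workflow
# would use an MPI library (e.g., mpi4py); only the partitioning
# logic, which is the easy part to get wrong, is shown here.
def rows_for_rank(total_rows, num_ranks, rank):
    """Contiguous block of rows for one rank, spreading any
    remainder over the first few ranks."""
    base, extra = divmod(total_rows, num_ranks)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return range(start, start + count)

# Example: a 1080-row frame split across 16 ranks.
parts = [rows_for_rank(1080, 16, r) for r in range(16)]
assert sum(len(p) for p in parts) == 1080  # every row assigned exactly once
```

After each rank renders its rows, the slices would be gathered (e.g., with an MPI gather) and written out, for instance to an OpenEXR file.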
Kanayinkal says one of her biggest takeaways from her summer at the ALCF was realizing that research is “not a big, scary thing. It’s a big thing, but it’s not so big that it’s overwhelming.” She also became more comfortable with learning on the fly, for example picking up MPI and the OpenEXR format for imaging applications.
Going forward, she continues to work with Marrinan and to pick projects she likes working on, saying, “If it’s something you like and you’re frustrated, you’ll just take a five-minute break, then come back and continue working on it rather than just saying, ‘Forget it. I’m going to do something else.’”
Hyperparameter Optimization and Scaling Studies for ML Models in Physics Research
As a student at the University of Notre Dame, Sirak Negash worked with machine learning (ML) to help analyze data from particle physics experiments. This inspired him to pursue machine learning studies, especially for high energy physics. He first applied for a summer research assistant position to gain more experience in physics research. “I was pleasantly surprised when I was approached for a position at the ALCF that involved working with an ML model in physics,” he says.
Together with ALCF mentor Sam Foreman, Negash worked on determining the impact of different hyperparameter configurations on model performance and training cost for lattice quantum chromodynamics simulations (simulations of the strong interactions between quarks and gluons).
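The specific hyperparameters and models aren’t given in the article; as a hypothetical illustration of the bookkeeping in such a study, this sketches a grid sweep that records a toy cost and quality metric for each configuration (the parameter names and the stand-in “training” function are invented for the example):

```python
# Hypothetical sketch of a hyperparameter scan: sweep a grid of
# configurations and record a cost/quality metric for each.
from itertools import product

def run_training(lattice_size, learning_rate, batch_size):
    """Stand-in for a real training run: returns a made-up
    (cost, loss) pair so the sweep's bookkeeping can be shown."""
    cost = lattice_size ** 2 * batch_size        # toy training cost
    loss = 1.0 / (learning_rate * batch_size)    # toy model quality
    return cost, loss

grid = {
    "lattice_size": [8, 16],
    "learning_rate": [1e-3, 1e-2],
    "batch_size": [32, 64],
}

results = []
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    cost, loss = run_training(**config)
    results.append((config, cost, loss))

# Rank configurations by loss first, then by training cost.
results.sort(key=lambda r: (r[2], r[1]))
best_config, best_cost, best_loss = results[0]
print(best_config, best_cost, best_loss)
```

A real study at this scale would replace `run_training` with actual training jobs and typically track how cost grows as the lattice size increases, which is the scaling question Negash describes below.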
“I was able to perform a series of detailed studies on the impact of network volume scaling on training cost when running on the ALCF’s Theta supercomputer,” Negash says.
The effort has been useful to the ALCF because future research in quantum chromodynamics “can benefit greatly from an understanding of how the performance of these simulations scales with increasing lattice size,” he says.
After spending his summer at the ALCF, Negash says he “developed a new appreciation for science beyond the classroom and even beyond a physical lab, and the lessons and skills that I learned through this ML research opportunity sparked in me the desire to pursue a career in data analysis.”
Source: Emily Stevens, ALCF