Like an invisible yet masterful weaver, Illinois Tech Associate Professor of Applied Mathematics Lulu Kang quietly works behind the scenes to enable scientists and engineers to design and build the best systems possible. As a statistician, she develops efficient data-collection and data-analysis methodologies and theories to create statistical models for complicated engineering and scientific systems.
Government agencies and industry partners alike have funded Kang’s research in this area since 2011. She is currently one of three principal investigators on a $117,888 grant from the National Science Foundation to study and improve four different systems: organ transplantation, semiconductor wafer production, thermal spray coating, and crystal growth processes. These systems are defined as quantitative-qualitative (Q-Q) systems, since the data collected are of both types. For instance, the quality of a product can be categorized as “good” or “bad,” while many quantitative measurements are also collected to characterize the product’s quality.
“We can control, optimize, and monitor such systems,” says Kang. “To achieve that we collect data from the system and develop a surrogate statistical model for our analysis schemes.”
The first benefit of this model is that it allows her to determine how her team will collect data in order to solve a problem. “The second,” she says, “is that when we collect the data, it is very rigorous and has less noise.”
When data are collected in a way that minimizes noise (variations such as environmental fluctuations and process variability), the resulting model is more precise. “A computer simulation is deterministic, so every time you choose a data setup or setting for operational parameters, you will always get the same response,” she says.
When the project concludes later this year, Kang anticipates that the team will have developed a best practices framework for the modeling and quality improvement of Q-Q systems. As associate director of Illinois Tech’s Master of Data Science program, she also notes that case studies from the project will be useful as training aids for students.
Another area of Kang’s research is uncertainty quantification, the science of quantifying and examining uncertainty in computational simulation systems. Kang develops statistical surrogate models from the computer simulations she and her collaborators construct. Such surrogate models allow investigators to quickly understand a system’s strengths and weaknesses, generate more simulation results at lower cost, and optimize the system more efficiently.
Looking beyond the NSF grant, Kang would like to apply the methodologies and theories she has honed in biomedicine and engineering to big data sets, an area that presents new challenges. While in theory larger data sets may yield a more accurate model, Kang notes that “data sets can be inherently biased. If you are extracting information from very big data sets, you have to make sure your sampling is reasonable and meaningful. Is it sufficient to build an accurate model? This is crucial in this age of big data.”