According to research by a team at Concordia University in Montreal, two-thirds of the vulnerabilities found in container images can be eliminated by regularly updating software components, while minimizing the number of libraries can also reduce the attack surface area in some cases.
The research, which focused on containerized applications used in high-performance computing (HPC) environments for neuroimage processing, analyzed 44 container images with vulnerability scanners and found that the median image had more than 320 vulnerabilities. Containers based on lightweight Linux distributions, such as Alpine Linux, had far fewer vulnerabilities, which suggests that vulnerabilities can be reduced by minimizing the volume of code.
While the researchers focused on containerized applications for analyzing images of the brain, the vulnerability problem is not particular to that discipline or to data science packages. It is not specific to any particular data analysis software or OS distribution; it is a very general problem, and the vulnerabilities have no single origin.
To reduce the number of vulnerabilities in their software, users of Docker and Singularity containers should update the packages included in their images, which is a proven approach.
The researchers also urged other scientists and data specialists to become more proactive about container security.
Software updates are generally discouraged in neuroimaging, as in other disciplines, because they can affect analysis results by introducing numerical perturbations in the computations. From an IT security perspective, however, this position is not viable, and it could endanger the entire Big Data processing infrastructure, starting with the HPC centers.
The research team used a script to determine the package manager for a given image and then ran the manager’s update function to install the most recent software versions. Both the original and the updated images were scanned with a variety of vulnerability scanners, including Anchore, Vuls, and Clair for Docker images, and the Singularity Container Tools for Singularity images.
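The detection step described above can be sketched roughly as follows. This is a minimal illustration and not the researchers' actual script: the `UPDATE_COMMANDS` table and the function names are hypothetical, and it assumes the code runs inside the container, where the package manager's executable is on the PATH.

```python
import shutil

# Hypothetical table: package manager executable -> a common update invocation.
# The study's real script may use different managers, flags, or ordering.
UPDATE_COMMANDS = {
    "apt-get": "apt-get update && apt-get -y upgrade",  # Debian, Ubuntu
    "apk": "apk update && apk upgrade",                 # Alpine Linux
    "dnf": "dnf -y upgrade",                            # Fedora, newer RHEL/CentOS
    "yum": "yum -y update",                             # older RHEL/CentOS
}


def detect_package_manager(which=shutil.which):
    """Return the first known package manager found on PATH, or None.

    `which` is injectable so the lookup can be exercised without a container.
    """
    for name in UPDATE_COMMANDS:
        if which(name) is not None:
            return name
    return None


def update_command(manager):
    """Return the shell command that upgrades all installed packages."""
    return UPDATE_COMMANDS[manager]
```

Inside an Alpine-based image, `detect_package_manager()` would be expected to find `apk`, and the resulting command could then be run in the image before rescanning it.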
The number of vulnerabilities discovered ranged from 1,700 in one image to almost zero in others. The average number of vulnerabilities per image stood at 460, while the median image had 321. According to the research, the number depended fairly linearly on the number of packages, with about 1.7 security issues discovered per software component on average.
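As a rough illustration of these figures, the sketch below applies the reported rate of about 1.7 issues per package as a simple linear estimate, and shows how a few heavily affected images can pull the average above the median. The function name and the sample counts are made up for illustration; they are not data from the study.

```python
from statistics import mean, median

# Average rate reported in the article; treating it as an exact slope
# is a simplifying assumption.
VULNS_PER_PACKAGE = 1.7


def estimate_vulnerabilities(package_count):
    """Rough linear estimate of vulnerabilities for an image."""
    return VULNS_PER_PACKAGE * package_count


# Invented per-image counts (NOT the study's data), skewed by one large image.
sample_counts = [0, 50, 300, 320, 1700]

print(round(estimate_vulnerabilities(200)))  # estimate for a 200-package image
print(mean(sample_counts), median(sample_counts))
```

With a skewed distribution like this one, the mean sits well above the median, which is the same pattern as the reported average of 460 against a median of 321.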
Minimizing the number of packages often reduced the number of vulnerabilities, although the impact was uneven: doing so had no effect when an image contained few extraneous packages. Using the Alpine Linux distribution, by contrast, typically reduced the attack surface area.
Container images based on Alpine Linux are an exception, since they have fewer vulnerabilities overall.
Data scientists often worry that updates to their enterprise software will change or break their analyses, so they avoid updating the software components in an image. The researchers nonetheless urge them to use updated software.
In addition, data scientists and the users of scientific software should make their analyses more robust to changes, which can ensure that software updates don’t affect the results of data analysis.
To read more, please check eScan Blog