Basics of Linux
Modern web, cloud, high performance computing, and most data science applications run on operating systems (OS) other than Microsoft Windows. To do data-intensive science, you need some familiarity with Linux. We’ve scheduled several sections during Container Camp for working on Linux systems using CyVerse’s Atmosphere Cloud, which runs Linux OS virtual machines.
The good news comes in two parts. First, whether you know it or not, you probably already use Linux or a Unix-like platform on a daily basis. Do you have an Android or iOS phone? Android is built on the Linux kernel, and iOS is a Unix-like system. If you own a macOS (formerly Mac OS X) device, you already enjoy many of the benefits of a Linux-like OS, including access to a terminal. Second, users generally describe the Linux experience as satisfying, and many report no regrets about moving from Windows to Linux.
Over 87% of the personal computer market still relies on Microsoft Windows. However, the landscape changes completely for mobile apps (99% Linux or Unix-like [Android, iOS], <0.1% Windows), web (66% Linux, 32% Windows), and cloud or HPC (100% Linux). Microsoft is acutely aware of this disparity and is actively working to bring Linux and open source into its ecosystem, including through its acquisition of GitHub (and the changes that followed) and the release of Windows Subsystem for Linux (WSL) 2.
Common Linux Operating Systems
The most common operating systems you’ll see used for data science are:
Alpine - small and lightweight, useful in container applications
CentOS - stable, reliable, most commonly used on web and cloud servers
Debian - lightweight, utilitarian, stable
Ubuntu - utilitarian, user friendly, most popular distribution, based on Debian
Red Hat Enterprise Linux (RHEL) - built from open source software; you pay for customer support
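If you are not sure which of these distributions a machine is running, most modern distributions describe themselves in a small text file, usually `/etc/os-release`. The following is a minimal Python sketch, not an official tool: the helper names `parse_os_release` and `describe_distribution` are made up for illustration, and the script assumes the file exists at its usual location.

```python
from pathlib import Path


def parse_os_release(text: str) -> dict:
    """Parse the KEY=value lines of an os-release file into a dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


def describe_distribution(path: str = "/etc/os-release") -> str:
    """Return a short description of the running distribution, if available."""
    os_release = Path(path)
    if not os_release.is_file():
        return "no os-release file found (probably not a Linux system)"
    info = parse_os_release(os_release.read_text(encoding="utf-8"))
    return info.get("PRETTY_NAME") or info.get("NAME", "unknown distribution")


if __name__ == "__main__":
    print(describe_distribution())
```

On a Linux virtual machine this should print the distribution's human-readable name; on other systems it falls back to a short message instead of failing.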
Windows Subsystem for Linux
The so-called “WSL” is a complete Linux subsystem that runs under Windows 10. Microsoft recently announced WSL 2.
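From inside a terminal it is not always obvious whether you are on a native Linux machine or inside WSL. A common heuristic, shown here as a rough sketch rather than a guaranteed test, is to look for the word "microsoft" in the kernel release string. The function name `looks_like_wsl` is hypothetical, and the check assumes WSL kernels keep this naming pattern.

```python
import platform


def looks_like_wsl(kernel_release: str) -> bool:
    """Heuristic: WSL kernel release strings usually contain 'microsoft'."""
    return "microsoft" in kernel_release.lower()


if __name__ == "__main__":
    # platform.uname().release is the kernel release reported by the OS.
    release = platform.uname().release
    print(f"kernel release: {release}")
    if looks_like_wsl(release):
        print("this looks like WSL")
    else:
        print("this does not look like WSL")
```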
Windows-Linux Dual Boot
Not ready to take the Linux plunge yet? Why not set up a Windows-Linux dual boot?