When a budding data scientist decides to buy a new laptop for college, they inevitably come to their friends, classmates, or Stack Overflow to ask the question, “What operating system should I use for data science?” This question is as contentious among programmers as “Should I use tabs or spaces?” If you ask me, there is not a single _best_ answer to this question. The answer for each person is based on individual preferences, experience, and external limitations.
For some people, the answer may be Windows. For example, are you limited to Windows because of workplace requirements? Do you not have the budget to switch to a new Apple machine? Do you not have the time or knowledge to install a Linux OS on your PC? Are Windows and Microsoft Office suite what you are most comfortable with? Or perhaps you don’t want to have to replace all the Windows software for which you have spent a small fortune purchasing licenses.
Windows _can_ handle data science, especially if you aren’t going much further than installing Python, Anaconda, and R with some common packages. If you are in your first year of college, this setup will work beautifully. But the more complex your work becomes, the more drawbacks you will encounter with Windows. You may run into speed issues, woefully lacking compatibility, and debugging that is much harder than it should be, and almost every Stack Overflow post gives you instructions that _only_ work in Unix/Linux.
To get past the limitations of programming in Windows, you could dual-boot Linux and Windows on the same machine and switch between the partitions as your tasks require. But you may not be comfortable with this process, and oh, what a pain it is to reboot every time you switch tasks. So next you may consider spinning up virtual machines in Windows. These are often difficult to set up, difficult to reproduce, and full of compatibility issues despite their supposed isolation from the Windows OS.
If you want to vastly improve your development experience on Windows, I suggest you forgo the aforementioned options and instead consider Windows Subsystem for Linux (WSL). Not only will you get better performance with fewer issues, but you will also be training yourself to program in an environment that is very close to what you would use on a full Linux system or macOS. Switching to one of these systems later, or transferring your code to one of them, should be easy.
However, this option might be too much for you if you are just starting out and are completely new to using the command line. WSL 2 currently does not support Linux GUI apps, though Microsoft plans to support them in the future. So while you can use Anaconda 100% via the command line, you will miss out on helpful GUIs like Anaconda Navigator. Eventually you will want to graduate to working without the GUI, if not for the speed, then for the incredible feeling of having family and friends believe you are a super hacker. But that transition will come with time, and there is nothing wrong with learning to walk before you run.
The original release of WSL was a compatibility layer on top of Windows that let you run Linux executables on Windows 10. That was a good start, but still nowhere close to offering full GNU/Linux support. The real magic came with Microsoft’s 2019 release of WSL 2. This new architecture runs its own Linux kernel instead of a compatibility layer. So you get faster performance, full system call compatibility, the ability to run a lot more apps (like Docker) on the Linux kernel, and updates that arrive without waiting for Microsoft to “translate” the changes for WSL.
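If you ever need a script to know whether it is running inside WSL, one common heuristic is to look at the kernel release string, which in WSL builds typically contains the word "microsoft". The sketch below is only that, a heuristic: the function name `looks_like_wsl` and the sample release strings in the comments are my own illustrations, not an official Microsoft API.

```python
import platform


def looks_like_wsl(kernel_release: str) -> bool:
    """Heuristic: WSL kernels usually report 'microsoft' in their release string.

    This is an assumption based on commonly seen release strings
    (for example '...-microsoft-standard-WSL2'), not a documented guarantee.
    """
    return "microsoft" in kernel_release.lower()


if __name__ == "__main__":
    # platform.uname().release is the kernel release reported by the OS.
    release = platform.uname().release
    print(f"kernel release: {release}")
    print(f"looks like WSL: {looks_like_wsl(release)}")
```

Keeping the check as a pure function of the release string makes it easy to test on any machine, including one that is not running WSL at all.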
Basically, with WSL 2 you will be able to work smoothly in a fully functional Linux environment while on Windows! There are also features that blend the two systems quite elegantly. You can have the Linux environment open Windows programs like your browser or IDE. And you can browse files from both the Windows and WSL filesystems using the normal File Explorer or the command line. In some IDEs, you can use the Linux kernel to run and debug your code, even though the IDE software is installed in Windows.
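To make the shared filesystem a little more concrete: inside WSL, Windows drives are usually mounted under `/mnt`, so `C:\Users\...` shows up as `/mnt/c/Users/...`. WSL ships a `wslpath` command for this translation, but here is a minimal Python sketch of the same idea. It assumes the default `/mnt/<drive letter>` mount layout, and `windows_to_wsl` is a hypothetical helper name of mine, not part of WSL.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str, mount_root: str = "/mnt") -> str:
    """Translate an absolute Windows drive path to its assumed WSL mount path."""
    win = PureWindowsPath(path)
    drive = win.drive  # e.g. "C:" for a drive-letter path
    # Only absolute drive-letter paths are handled; UNC shares are rejected.
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {path!r}")
    # win.parts[0] is the anchor ("C:\"); the rest are folder and file names.
    return str(PurePosixPath(mount_root, drive[0].lower(), *win.parts[1:]))


if __name__ == "__main__":
    print(windows_to_wsl(r"C:\Users\me\projects\data.csv"))
```

If your WSL configuration mounts drives somewhere other than `/mnt`, pass a different `mount_root`, or simply rely on `wslpath`, which reads the real configuration.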
Follow Microsoft’s instructions on how to install WSL 2. You must be on Windows 10, updated to version 2004, build 19041 or higher. Getting this update can be a little tricky. For some users (myself included), checking for updates and installing the latest available one still does not get you to the required version. If this happens, you can manually install the required build by using the Windows Update Assistant as directed.
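If you already have Python installed on Windows, you can sanity-check the build requirement above before you start. The sketch below assumes that `platform.version()` on Windows 10 returns a string shaped like `10.0.19041`, with the build number in the third field. The helper names `build_number` and `meets_wsl2_requirement` are mine, and the threshold comes straight from the requirement quoted above.

```python
import platform

# From the requirement above: version 2004, build 19041 or higher.
REQUIRED_BUILD = 19041


def build_number(version: str) -> int:
    """Extract the build from a 'major.minor.build' version string."""
    parts = version.split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected version format: {version!r}")
    return int(parts[2])


def meets_wsl2_requirement(version: str) -> bool:
    """Return True if the build number is at least REQUIRED_BUILD."""
    return build_number(version) >= REQUIRED_BUILD


if __name__ == "__main__":
    if platform.system() == "Windows":
        version = platform.version()
        print(f"Windows version {version}: OK for WSL 2? {meets_wsl2_requirement(version)}")
    else:
        print("Not running on Windows; nothing to check.")
```

Run this from a regular Windows Python prompt, not from inside WSL, since inside WSL the `platform` module reports the Linux kernel instead.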