> Why is it so hard to install PyTorch, or CUDA, or libraries like FlashAttention or DeepSpeed that build against PyTorch and CUDA?
This is so true! On Windows (and WSL) it is also exacerbated by some packages requiring the use of compilers bundled with outdated Visual Studio versions, some of which are only available by manually crafting download paths. I can't wait for a better dev experience.
Stuff like that drove me away from Ruby entirely (because of Rails), which is a shame. I see videos of people chugging along with Ruby and loving it, and it looks like a fun language, but when the only way I can get a dev environment set up for Rails is a DigitalOcean droplet, I've lost all interest. It would always fail at compiling something for Rails. I would have loved to partake in the Rails hype back in 2012, but over the years the install / setup process was always a nightmare.
I went with Python because I never had this issue. Now, with any AI / CUDA stuff, it's a bit of a nightmare, to the point where you use someone's setup shell script instead of trying to use pip at all.
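For what it's worth, the first thing I do before another round of pip trial-and-error is a tiny sanity check like the one below (just my own habit, nothing official); it prints the handful of versions that have to line up before anything GPU-related works:

    # Minimal diagnostic sketch: print the versions that have to agree
    # (the PyTorch build, the CUDA it was compiled against, and whether
    # the driver on this machine can actually run it).
    import torch

    print("torch version:", torch.__version__)        # e.g. 2.3.1+cu121
    print("built against CUDA:", torch.version.cuda)  # None for CPU-only builds
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))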
Let's be honest here - whilst some experiences are better/worse than others, there doesn't seem to be a dependency management system that isn't (at least half) broken.
I use Go a lot, and the journey has been:
- No dependency management
- Glide
- Dep
- I forget the name of the precursor - I just remembered, VGo
- Modules
We still have proxying, vendoring, versioning problems
Python: VirtualEnv
Rust: Cargo
Java: Maven and Gradle
Ruby: Gems
Even OS dependency management is painful - yum, apt (which was a major positive when I switched to Debian-based systems), pkg (BSD people), homebrew (semi-official?)
Dependency management in the wild is a major headache. Go (which I only mention because it's what I'm most familiar with) did away with some compilation dependency issues by shipping binaries with no runtime dependencies - it doesn't matter which version of Linux you built your binary on, it will run on any Linux of the same arch, none of that "wrong libc" 'fun'. But you still have issues with two different people building the same binary needing extra dependency management, and vendoring brings its own caching problems: is the version in the cache up to date, and will updating one dependency break everything? What fun.
NuGet for C# has always been fantastic, and I like Cargo, though sometimes waiting for freaking ever for things to build does kill me on the inside a little bit. I do wish Go had a better package-manager trajectory; I can only hope they continue to work on it. There were a few years when I refused to work on any Go projects because setup was a nightmare.
NuGet itself was problematic for a long time, e.g. not providing any control over transitive dependencies (like pip at the time). You had to use https://fsprojects.github.io/Paket/ if you wanted safe and consistent resolution. NuGet has since got its act together and it's not as flawed now.
I agree. Here are some things that I (a science researcher and professor) like about R and CRAN:
1. There are a lot of build checks for problems involving mismatches between documentation and code, failed test suites, etc. These tests are run on the current R release, the previous release, and the development version, and they are re-run on a routine basis. So you can visit the CRAN site and tell at a glance whether a package has problems.
2. There is a convention in the community that code ought to be well-documented and well-tested. These tend not to be afterthoughts.
3. If the author of package x makes changes, then all CRAN packages that use x will be tested (via the test suite) for new problems. This (again, because of the convention of having good tests) prevents lots of ripple-effect problems.
4. Many CRAN packages come with so-called vignettes, which are essays that tend to supply a lot of useful information that does not quite fit into manpages for the functions in the package.
5. Many CRAN packages are paired with journal/textbook publications, which explain the methodologies, applications, limitations, etc in great detail.
6. CRAN has no problem rejecting packages, or removing packages that have problems that have gone unaddressed.
7. R resolves dependencies for the user and, since packages are pre-built for various machine/OS types, installing packages is usually a quick operation.
PS. Julia is also very good on package management and testing. However, it lacks a central repository like CRAN and does not seem to have as strong a culture of pairing code with user-level documentation.
Do I understand right that this issue is specific to Windows? I've never heard of the issues you describe while working on Linux. I've seen people struggle a bit with macOS due to brew having different versions of some library or other, mostly when self-compiling Ruby.
The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.
I had issues on Mac, Windows and Linux... It was obnoxious. It led me to adopt a very simple rule: if I cannot get your framework / programming language up and running in under 10 minutes (barring compilation time / download speeds), I am not going to use your tools / language. I shouldn't be struggling with the most basic of hello worlds in your language / framework. I don't struggle with any of the other languages I already use, so why should I struggle with a new one?
On Linux, good luck if you're using anything besides the officially NVIDIA-supported Ubuntu version. Just 24.04 instead of 22.04 brings regular random breakages and issues, and running on Arch Linux is just endless pain.
Have you tried conda? Since the integration of mamba, its solver is fast and the breadth of packages is impressive. Also, if you have to support Windows and Python with native extensions, conda is a godsend.
It is not fast. Mamba and micromamba are still much faster than conda, and yet they lack basic features that conda provides. Everyone has been dropping conda like a hot potato since the licensing changes in 2024.
I would recommend learning a little bit of C compilation and build systems. Ruby/Rails is about as polished as you could get for a very popular project. Maybe libyaml will be a problem once in a while if you're compiling Ruby from scratch, but otherwise this normally works without a hassle. And those skills will apply everywhere else. As long as we have C libraries, this is about as good as it gets, regardless of the language/runtime.
Have you tried JRuby? It might be a bit too large for your droplet, but it has the java versions of most gems and you can produce cross-platform jars using warbler.
And yet operating distributed systems built on it is a world of pain. Elasticsearch, I am looking at you. Even with modern hardware resources, the limitations of running and scaling on top of the JVM make it an expensive, frustrating endeavor.
In addition to elasticsearch's metrics, there's like 4 JVM metrics I have to watch constantly on all my clusters to make sure the JVM and its GC is happy.
An in-house app that uses JDBC, is easy to develop, and needs to be cross-platform (Windows, Linux, AIX, AS/400). The speed picks up as it runs, usually handling 3000-5000 eps over UDP on decade-old hardware.
I'm surprised to hear that. Ruby was the first language in my life/career where I felt good about the dependency management and packaging solution. Even when I was a novice, I don't remember running into any problems that weren't obviously my fault (for example, installing the Ruby library for PostgreSQL before I had installed the Postgres libraries on the OS).
Meanwhile, I didn't feel like Python had reached the bare minimum for package management until Pipenv came on the scene. It wasn't until Poetry (in 2019? 2020?) that I felt like the ecosystem had reached what Ruby had back in 2010 or 2011 when bundler had become mostly stable.
Bundler has always been the best package manager of any language that I've used, but dealing with gem extensions can still be a pain. I've had lots of fun bugs where an extension worked in dev but not prod because of differences in library versions. I ended up creating a docker image for development that matched our production environment and that pretty much solved those problems.
That's one of the reasons I prefer a dev environment (either a physical install or a VM) that matches prod. Barring that, I would go with a build system (container-based?) that can run locally. Otherwise it's painful.
This is the right direction for Python packaging, especially for GPU-heavy workflows. Two concrete things I'm excited about: 1) curated, compatibility-tested indices per accelerator (CUDA/ROCm/CPU) so teams stop bikeshedding over torch/cu* matrixes, and 2) making metadata queryable so clients can resolve up front and install in parallel. If pyx can reduce the 'pip trial-and-error' loop for ML by shipping narrower, hardware-targeted artifacts (e.g., SM/arch-specific builds) and predictable hashes, that alone saves hours per environment. Also +1 to keeping tools OSS and monetizing the hosted service—clear separation builds trust. Curious: will pyx expose dependency graph and reverse-dependency endpoints (e.g., "what breaks if X→Y?") and SBOM/signing attestation for supply-chain checks?
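To make the "resolve up front" point concrete: even today you can pull per-release artifact metadata (filenames, hashes) from PyPI's JSON API and see which wheels exist before downloading anything. pyx presumably exposes something richer, but I don't know its actual API, so this sketch just uses the existing https://pypi.org/pypi/<name>/json endpoint with an illustrative package name:

    # Rough sketch of the "queryable metadata" idea using PyPI's existing JSON API.
    # The package name is illustrative; pyx's own endpoints may look nothing like this.
    import json
    import urllib.request

    name = "torch"
    with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json") as resp:
        meta = json.load(resp)

    # List the artifacts for the latest release with their hashes, so a client
    # could pick a compatible wheel up front instead of failing one install at a time.
    for artifact in meta["urls"]:
        print(artifact["filename"], artifact["digests"]["sha256"][:12])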
Given that WSL is pretty much just Linux, I don't see what relevance Visual Studio compiler versions have to it. WSL binaries are always built using Linux toolchains.
At the same time, even on Windows, libc has been stable since Win10 - that's 10 years now. Which is to say, any binary compiled by VC++ 2015 or later is C-ABI-compatible with any other such binary. The only reasons why someone might need a specific compiler version is if they are relying on some language features not supported by older ones, or because they're trying to pass C++ types across the ABI boundary, which is a fairly rare case.
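As a small illustration of what that C-ABI stability buys you (a sketch on my part, assuming a Windows 10+ machine): Python can call straight into the Universal CRT via ctypes without caring which VC++ release built either side, because only the C calling convention and the exported symbols matter.

    # Sketch: call a plain C function exported by ucrtbase.dll (the Universal CRT
    # shipped with Windows 10+). Which MSVC version built the caller is irrelevant;
    # the stable C ABI is the only contract in play. Windows-only, illustrative.
    import ctypes

    ucrt = ctypes.CDLL("ucrtbase")
    ucrt.abs.argtypes = [ctypes.c_int]
    ucrt.abs.restype = ctypes.c_int
    print(ucrt.abs(-42))  # -> 42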
If you have to use, e.g., CUDA Toolkit 11.8, then you need a specific version of VS and its build tools for CUDA's VS integration to work. I don't know why exactly that is and I wish I didn't have to deal with it.
In my experience, Anaconda (including Miniconda, Micromamba, IntelPython, et al.) is still the default choice in scientific computing and machine learning.
It's useful because it also packages a lot of other deps like CUDA drivers, DB drivers, git, openssl, etc. When you don't have admin rights, it's really handy to be able to install them, and there's no other equivalent in the Python world. That being said, the fact that conda (and its derivatives) do not follow any of the PEPs about package management is driving me insane. The ergonomics are bad as well, with defaults like auto-activation of the base env, a dependency solver that was bad for the longest time (fixed now), weird linking of shared libs, etc.
I agree. Pixi solves all of those issues and is fully open source, including the packages from conda-forge.
Too bad there is now so much confusion between Anaconda (the distribution that requires a license) and the FOSS pieces like conda-forge. Try explaining that to your legacy IT or procurement department -.-
When was that ever a part of the definition? It was part of the early Unix culture, sure, but even many contemporary OSes didn't ship with compilers, which were a separate (and often very expensive!) piece of software.
OTOH today most Linux distros don't install any dev tools by default on a clean install. And, ironically, a clean install of Windows has .NET, which includes a C# compiler.