We have too many dependencies. The programming world is on another roller coaster ride down to dependency hell, if you’ll believe the skeptics. I don’t think so. I think we are living in the gully of “The Dependency Valley,” with not quite enough dependencies to come out the other side.
I can see why 3rd party packages get a bad rap. Bright-eyed and bushy-tailed programmers fresh out of crash courses on “How To Become A Rockstar Ninja Node Developer in 4 Weeks” wear silly smiles while forcing module after module down the throats of their ever-bloated projects. They do this because they understand the benefits of libraries without seeing the very serious drawbacks.
This isn’t about “kids these days” or throwing NPM under the bus. Although recent grads and shiny technologies do represent a monumental new wave of dependency tomfoolery, they are just the latest in a long line of transgressions. For every new technology and crop of devs, there have been dependency issues. For the Microsoft stack, “DLL Hell” was prevalent when conflicts arose between Dynamically Linked Libraries. Developers who wanted to download a 3rd party library went to CodePlex and literally downloaded a .zip file. Check the Wayback Machine if you don’t believe me.
But way back then you typically didn’t use a ton of 3rd party libraries. Maybe 90% of your libraries were written by button-down-shirted CS grads who Skyped with their superiors in Redmond and received paychecks with Mr. Gates’ signature at the bottom. You trusted them. Maybe the trust was unwarranted (defects are found in open source projects more quickly than in closed source, according to some), but at least you knew who had hands in the soup. The code they wrote lived in every PC’s Global Assembly Cache and was shared among all programs, allowing you to get the current date and round numbers. In short, the source of these libraries was reliable. WinForms developers the world over rejoiced at the simplicity.
I won’t comment too much on the history of Unix or Java devs, but as far as I can divine for Java, the .jar is the .dll equivalent, and Oracle is your Microsoft. And the same is true of pretty much every major platform; you used what was built in and ad hocked the rest.
But let’s take off the rose-tinted glasses for a moment. Updates to Windows came in terms of months or years, leaving impatient developers to create the likes of Ruby and Node and all the fun, fast, new tools while Windows slogged along. To the Microsoft suits, the story of “move fast and break things” was something you told as a haunted horror story around a campfire. The lawyers and the shareholders couldn’t afford for things to break, so months of testing were included in every update cycle. If that sounds like a slow and painful way to operate, it is.
So now we’ve made it to the other end of the spectrum. The current trend is to shy away from monolithic tools, the “putting your eggs in one basket” approach, and instead rely on a litany of specific single-responsibility packages. The conversation often goes like this:
Greg: “Ok, we’ll run this on Ubuntu, and we need Node.js just to get started, and of course we need NPM to manage our packages.”
Greg: “And we need Gulp as our build tool.”
Steve: “That makes sense.”
Greg: “Ok, Steve, here’s a list of dependencies that Gulp has. There are 741 nested dependencies we took on just from that.”
Steve: “You are literally the devil.”
I agree, Steve. Not to mention, Greg is the one that keeps stealing your Venti Teavana® Mango Black Tea Lemonades from the fridge.
This is a serious problem. I use Gulp. I love Gulp. But jeez, we haven’t even gotten to the part where we put dependencies into our actual project; Gulp is just the build tool. And we’re already up to 741 dependencies, any of which could be unpublished like left-pad at any time. Or there’s a versioning conflict. Or a bug. Or a server is down. Or one was written by a malicious hacker or, more likely, is a buggy package written by an inexperienced dev looking to build her GitHub history.
I just can’t get over that every time you write

$ npm install -g gulp

you are taking on 741 dependencies. We are placing a lot of trust in these people. Is that smart?
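You don’t have to take my word for the count, either. When you install locally, npm records every package it pulled in inside package-lock.json, and for lockfileVersion 2 and up that file lists each installed package under a top-level “packages” key, with the empty-string entry standing for your own project. A minimal sketch in Python (the exact count you get will vary with the Gulp version and the state of the registry):

```python
import json

def count_locked_packages(lock):
    """Count installed packages recorded in a parsed npm lockfile
    (lockfileVersion 2+). Every installed package appears under the
    top-level "packages" key; the empty-string key is the root project
    itself, so we skip it."""
    return sum(1 for key in lock.get("packages", {}) if key != "")

# Usage: run `npm install gulp` in an empty project, then:
#   count_locked_packages(json.load(open("package-lock.json")))
```

Run it against a fresh project that depends only on Gulp and the number that comes back is the real size of what you just signed up for.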
Gulp is used by thousands of developers, so I doubt it in particular suffers from these issues, but the truth is that most packages trend toward the end of the long tail in popularity. That very specific package you are using to provide a Python wrapper for a banking API might depend on a dozen other packages, any of which could be handling banking data.
Even Microsoft has seen the benefits of many specific libraries instead of a few monolithic ones. They are actively looking to break their DLLs into smaller pieces and have even entered the open source fray with abandon. They now have 27 pages of public repositories on GitHub (which includes the big nut, .NET itself). Pull requests are treated seriously and all code is thoroughly tested, but development cycles are more like six weeks than six months. Microsoft has become more agile and has reaped the rewards, but the consequence is that you now have a ton of smaller packages loaded into your project.
Even the oasis of Microsoft is not enough for those seeking asylum from a long list of dependencies.
But let’s be realistic. Packages, modules, dependencies, libraries or whatever you choose to call them aren’t going away. Users expect a lot from products these days, and the developers who efficiently provide that are going to win, period. Users don’t care about dependencies. They only care that their app works and works well, and while you may pull your hair out from time to time, you are delivering the most polished pile of dependencies you can shovel together.
I predict it won’t be like this forever. I predict we are living in what I call “The Dependency Valley.” Like a mad scientist ahead of his time, I’m about to say something that sounds a little harebrained.
I think we need more dependencies.
Or more accurately, we need smarter dependencies. This is where AI and machine learning will come to our rescue. I predict that within three years our package management layer won’t just blindly fetch the most up-to-date package that meets all of the dependency’s criteria; it will actively search for packages that fulfill your needs and automatically include the pieces you need.
I mean, this isn’t a completely new idea. There’s the Bing Code Search extension for Visual Studio that helps developers by bringing in samples of code to save you a manual search that inevitably lands on Stack Overflow. It sort of fills in the gaps, but ultimately you are still putting code in your own project – not calling out to a library. And, you are making a manual decision about what to include and how.
But the fundamentals are there: they are contextually looking at your code and sussing out what you are trying to do, and then making a somewhat informed decision about how to better help you do that. I believe that in the future, when you name your variable bankInfoDTO and type .send, you will get a ton of libraries offering different APIs to send that bank info.
In fact, in the future you may use a library that you call into with your bank info; let’s call it SuperBankerSender. You chose this library because the API was great. Then SuperBankerSender updates its API from 1.9.5 to 2.0, and of course the interface changes.
This is where the fun of having a smart package manager layer comes in. In fact, let’s call it the Package Fabric from now on, because it’s gonna be a whole new thing. It figures out that all these new projects are sending data through SuperBankerSender 2.0. It figures out that data sent over that version has 20% fewer errors, or is 50% quicker, or hits some other metric that it determines makes this version “better.” So it modifies your code to send data over the new interface OR it transforms the data from your old code to work with the new interface.
That’s scary, right? Sure, but if it worked flawlessly, it would be awesome.
And to add on to that, it wouldn’t just be one AI running. It would be many, and they would be evolving. The AI would mutate, then be tested, then mutate again. I’m not crazy; this is already happening in other areas of data science, like simulations that evolve virtual bipedal creatures to walk.
This is your cue to say “Yes, but there’s a clear outcome of distance traveled that was created as a goal for those bipedal creatures. Also, I snorted out my coffee and now my nose burns.”
Yes, but with reinforcement learning, computers can now create their own tactics to overcome obstacles. For example, here’s a computer overcoming Flappy Bird. It was told the end goal of “don’t let the bird die,” and the “how” was created by the learning algorithm. If you’ve ever played Flappy Bird, we can agree that this computer does it better than you.
If you’re still skeptical, let’s get a little more geeky. As this scholarly paper reports, researchers threw different Atari games at a neural network and gave it this info: “Hey, here’s a video feed. Here are the keys you can press. This is what it looks like when you win. Go.”
And then they didn’t touch it. It learned over time how to win, and it won well. Now imagine the same thing happening with your package manager. “Hey, you already pretty much know what an error state looks like. Go ahead and try different package versions on your own from time to time (don’t use my production version) and modify the data as needed to get to the most stable point. Ok, thanks.”
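That reward-driven loop is less magical than it sounds. Here’s a toy sketch of it in Python, a tabular Q-learner in a five-position corridor (the world, the reward, and all the hyperparameters are my own invention, not from the Atari paper): the agent is told only “reaching the end is good” and discovers on its own, by trial and error, that moving right gets it there.

```python
import random

# Toy trial-and-error loop: the agent knows nothing about the corridor,
# only that arriving at the last position pays a reward of 1.

N_STATES = 5            # positions 0..4; reaching 4 ends the episode
ACTIONS = [-1, +1]      # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit what we know, sometimes explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s2 == N_STATES - 1 else 0.0
        best_next = max(q[(s2, act)] for act in ACTIONS)
        # standard Q-learning update: nudge toward reward + discounted future
        q[(s, a)] += ALPHA * (reward + GAMMA * best_next - q[(s, a)])
        s = s2

# After training, the greedy choice at each non-terminal state
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

Nobody ever told it “go right”; the preference emerged from the reward signal alone. Swap the corridor for “try different package versions” and the reward for “fewest error states,” and you have the crude outline of the experiment above.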
Eventually, floating around in this Package Fabric will be the sum of all open source code people have written. Some will be verified, some fast, some robust, some popular. The Package Fabric will find the right packages you need to “get ‘er done” for you, and the tradeoffs of security, speed, robustness, and all of that will be managed for you. Oh sure, you’ll have a configuration file to override stuff, but eventually you won’t need to touch it. You’ll be living in this wonderful world where you can use any open source code seamlessly, almost as if you or your coworker wrote it. It will be nearly transparent. The code singularity will nearly be complete.