I recently encountered a discussion in a developer chatroom about how to find good Ruby gems to use for projects, and how to choose between them.
The (currently-defunct, hopefully only temporarily) website ruby-toolbox.com was mentioned, as well as awesome-ruby.com, and some conversation ensued about how to properly choose between similar dependencies for a given software requirement and why it's important.
In this article, I'll summarize some key ideas and heuristics on the topic of dependency management. I hope this encourages the broader community to put a little more thought into how to navigate these choices appropriately, and in turn leads to a better software ecosystem for everyone.
These concepts apply at many layers in a software project, not just
Risks and costs of dependencies
It's important to acknowledge that in software, dependencies are liabilities. There are a variety of ways that code dependencies can lead to additional risks and costs for a software project, and this is especially true on the ever-changing internet.
Code that you ship as part of your software stack that is written and maintained by 3rd parties can expose you to security vulnerabilities which you might not have otherwise incurred if you had exposed the code to a solid code quality and security review before acceptance. As your 3rd party software ages, the likelihood of new vulnerabilities being discovered and exploited increases, and the larger the codebase, the more surface area for
Just recently, news broke that one of the largest credit reporting companies' data on most American adults was breached due to a security issue in the underlying web application framework. The vulnerability had been patched and disclosed upstream by the time the breach occurred, but the application maintainers failed to deploy the upstream fix soon enough to prevent it from being exploited.
Unpatched known vulnerabilities are just one potential security risk of 3rd party dependencies. There can also be unknown (0-day) vulnerabilities discovered and exploited by an attacker before there is
Lack of Maintenance, AKA 'Abandonware'
Sometimes, for any number of reasons, a code dependency that was once well-groomed falls into disrepair through neglect or abandonment by its primary maintainers. In some cases, the open source community will take action, stepping in to create a substitute for continuity's sake, but in other cases, no such replacement springs up, and a tool will simply receive no more
It's common for Ruby gems that were designed to solve problems now handled by a framework like Rails to be considered unnecessary by their authors and left behind with no mention of a migration strategy. Having critical functionality coupled to such dependencies can often block a project from undergoing otherwise straightforward maintenance.
Breaking Changes and Instability
Any code that's maintained outside of your control is subject to its own release cycle, which might at any time include changes that break your expectations (and the code you've written to depend on how it once behaved). If you carefully test every update to a dependency version, you can probably prevent these issues from impacting your production apps. But the time and energy invested in quality control for those changes is sometimes prohibitive, or more likely simply a deterrent from allowing the versions to change at all, which circles back to the security and abandonment risks already mentioned. This article on the topic provides some perspective on the matter:
Think about what you are admitting to have created: an environment where hundreds (if not thousands) of people have the power, maliciously or innocently, to put you out of business (if only for a short period of time), without any accountability or recourse.
It's common these days for new tools to become popular long before they reach a "stable v1 release". Until then, the authors make no promise to preserve stability or compatibility between release versions, and your dependent code could break any time you install a new copy of that tool. Large software projects such as operating systems and development frameworks typically prioritize stability on purpose, for their users' benefit. That can feed back into the problems mentioned above: software in use can fall out of normal maintenance or contain security problems before a plan is formed to break the dependency on those tools.
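As a rough illustration of how version constraints limit exposure to breaking releases, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The helper names `allowed_versions` and `pre_stable?` and the release history are made up for this example, and the pre-1.0 check assumes the upstream project follows semantic versioning, which not every gem does.

```ruby
require "rubygems"

# Returns the entries of `versions` that the requirement string permits.
# `allowed_versions` is a hypothetical helper, not part of RubyGems.
def allowed_versions(requirement, versions)
  req = Gem::Requirement.new(requirement)
  versions.select { |v| req.satisfied_by?(Gem::Version.new(v)) }
end

# True when a version is below 1.0.0, i.e. before any stability promise
# (assuming the project follows semantic versioning).
def pre_stable?(version)
  Gem::Version.new(version) < Gem::Version.new("1.0.0")
end

released = %w[2.1.0 2.4.3 3.0.0] # hypothetical release history

# "~> 2.1" accepts 2.x updates from 2.1 onward but not the next major version.
puts allowed_versions("~> 2.1", released).inspect  # => ["2.1.0", "2.4.3"]

# An exact pin never changes, which is safe today but invites staleness.
puts allowed_versions("= 2.1.0", released).inspect # => ["2.1.0"]

puts pre_stable?("0.9.2")                          # => true
```

The same requirement strings are what a Gemfile accepts, so the trade-off is visible in one place: a looser constraint picks up fixes sooner, a tighter one defers both the fixes and the breakage.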
Over time, the problem that code is designed to solve may be solved better in other
The same thing can happen to any single solution to a particular problem in a changing online environment. There are plenty of old dependencies out there for supporting protocols, patterns, and practices that just aren't great solutions anymore, and it's not ideal to keep that code around once its time has passed.
The YAGNI dependency pitfall
There are two variations of this mistake. One is choosing to depend on something complex to solve a problem that's simple. This often leads to heavy, slow code which is overly coupled to dependencies that may not even be in use. The other is choosing to depend on something at
Before choosing a tool to depend on, it's generally wise to simply ask "would it be any harder to just do it ourselves?" I find it's surprisingly common for Rails developers to look for a gem for every new requirement or problem they encounter. But for a lot of common problems, the web framework already provides a solution, or at least the tools to easily construct one, and sometimes it's really just a few lines of plain Ruby code. Shipping 3rd party code in these cases is sometimes more work than it saves, and can be a liability in terms of future breaking changes, security issues, difficulty transitioning to an alternative solution, etc.
Just last week, I observed a conversation in Slack about how implementing 'soft delete' functionality yourself with Rails is almost as easy as installing and using one of the popular gems that provide that feature (and, in some cases, a lot of other unneeded features). So the question becomes: why would anyone want to add yet another 3rd party's code when the benefit is so minimal?
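To give a sense of how small a do-it-yourself 'soft delete' can be, here is a minimal sketch in plain Ruby so that it runs without any gems. The `SoftDeletable` module, the `Note` class, and the `active` helper are all hypothetical names for illustration; in a Rails app the same idea would typically be a nullable `deleted_at` timestamp column plus a query scope, and the details would depend on your models.

```ruby
# Mixin that marks a record as deleted instead of destroying it.
module SoftDeletable
  attr_reader :deleted_at

  # Records the deletion time; calling it again keeps the first timestamp.
  def soft_delete(now = Time.now)
    @deleted_at ||= now
    self
  end

  # Clears the deletion marker so the record counts as active again.
  def restore
    @deleted_at = nil
    self
  end

  def deleted?
    !@deleted_at.nil?
  end
end

# Hypothetical record type used only for this illustration.
class Note
  include SoftDeletable
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

# In-memory equivalent of a "not deleted" query scope.
def active(records)
  records.reject(&:deleted?)
end

notes = [Note.new("keep"), Note.new("archive")]
notes.last.soft_delete
puts active(notes).map(&:title).inspect # => ["keep"]
```

A real implementation would also need to decide how associations, uniqueness checks, and default queries treat deleted rows, which is where the popular gems earn their keep if you need those behaviors.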
Sometimes it's just a matter of picking a tool that fits your use case as closely as possible. There are elaborate framework solutions with all sorts of rich functionality and which have evolved over years to handle innumerable rare edge
The popularity contest pitfall
I noticed with ruby-toolbox in particular, it's easy to compare
Of course, it's also possible for something to become popular for reasons unrelated to its quality, such as SEO, having a cute name, being backed by somebody famous, a lot of
The convenience pitfall
Sometimes when comparing tools to solve a given problem, your first priority is to get it working as quickly and easily as possible. This is an important criterion to consider before choosing between alternative approaches, but its importance usually should not outweigh longer term considerations like cost of maintenance, or cost of replacing it with something else if needed. As one of my mentors likes to say: "It's not about how fast you can get married, but how fast you can get a divorce." It's ideal when possible to favor lightweight dependencies that are easy to get rid of if something better comes along, or just for rapid experimentation. Just don't mistake "easy installation" for "simple and easy to replace if needed."
How to evaluate the risks and costs
All of the pitfalls mentioned above come from a failure to balance the costs, risks, and benefits of including a new dependency, but each stems from
Here's a rough list of ideas you could consider using as a starting point for your organization to iterate upon and adopt into policy.
Heuristics for comparing alternative options
- Start with "can we solve this problem cheaply and effectively without depending on anything?" -- remember that the concepts of dependency and coupling are applicable from the code unit level all the way through the stack of tools including organizational level ones.
- If you've determined that depending on something outside of your own control is probably advantageous, try to develop a clear understanding of what those advantages actually are, as well as what costs and risks need to be counterbalanced.
- Does this solution potentially include any unacceptable risks? Normally this goes without saying, but in some contexts it's worth giving it more serious thought to be extra sure, such as with security tools, tools which must be granted highly privileged access, or anything that's highly mission-critical.
- Does this solution have a high up-front cost? This might come in the form of actual fees, complex initial setup, steep learning curve, or some other tradeoff like sacrificing functionality or speed with the initial implementation before refinements can be made.
- Does this solution have a high ongoing cost? Again, maybe a direct money expense, as with subscriptions, or it could be a rapid maintenance cycle with frequent breaking changes or deprecations, a high chance of requiring external support to use it effectively, potential for expensive service disruptions, difficulty onboarding new team members to use it (see learning curve above), substantial performance impact, general bugginess, etc.
- When this solution is no longer the ideal one, does it make transitioning away from it prohibitively costly? Usually, the bigger and heavier the dependency is, the greater this cost becomes. The more tightly integrated it will be, the more highly coupled you will be to it, which can multiply this factor greatly. In cases where there is no direct alternative, the cost might be re-implementing the equivalent functionality in-house, or taking over maintenance of it upstream. In the worst cases, none of those escape routes are options and you will be bound to this choice indefinitely, including all the opportunity costs that
- Does this solution provide benefits beyond those of simply fulfilling the requirements in-house? This could mean shipping earlier, outsourcing maintenance, handling important edge cases more effectively, better performance or reliability, or free extra features that are desirable.
- Will this solution's immediate benefits pay for its adoption more quickly than alternatives?
- Do this solution's ongoing benefits outweigh the costs and savings of the alternatives?
- Does this solution closely approximate the requirements that are being addressed? Does it include a lot of extra stuff? Is it missing any requirements or nice-to-haves that might lead to seeking an alternative in the future?
- Can this solution be trusted to be reliably maintained upstream for long enough that transitioning to an alternative will either be unnecessary, or be affordable and feasible when the time comes?
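The cost and benefit questions in the list above can be roughed out with simple arithmetic. The sketch below is one possible way to do that, not a prescribed method: the `Option` struct, its field names, and every number are hypothetical, expressed in arbitrary "effort units" per period.

```ruby
# One candidate solution, described by rough estimates (all hypothetical).
Option = Struct.new(:name, :upfront_cost, :ongoing_cost,
                    :ongoing_benefit, :exit_cost, keyword_init: true) do
  # Net value after `periods`, including the cost of moving away at the end.
  def net_value(periods)
    (ongoing_benefit - ongoing_cost) * periods - upfront_cost - exit_cost
  end

  # Number of periods until net per-period benefit covers the upfront cost,
  # or nil if the option never pays for itself.
  def payback_period
    per_period = ongoing_benefit - ongoing_cost
    return nil unless per_period.positive?

    (upfront_cost.to_f / per_period).ceil
  end
end

gem_option = Option.new(name: "third-party gem", upfront_cost: 2,
                        ongoing_cost: 3, ongoing_benefit: 4, exit_cost: 10)
in_house   = Option.new(name: "in-house", upfront_cost: 8,
                        ongoing_cost: 1, ongoing_benefit: 4, exit_cost: 0)

puts gem_option.payback_period # => 2
puts in_house.payback_period   # => 3
puts gem_option.net_value(12)  # => 0
puts in_house.net_value(12)    # => 28
```

With these made-up numbers the gem pays for its adoption sooner, but the in-house option comes out ahead over twelve periods once ongoing and exit costs are counted. The point is only that the immediate and long-term questions can give different answers.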
No dependency beats no dependency
There are circumstances where avoiding the dependency altogether is the best choice. You may be able to sidestep all of the aforementioned issues by simply implementing the desired functionality within your own project. This is often a good option when the functionality needed is simple, straightforward to implement, and there are no major benefits to entrusting maintenance of that functionality to outside parties.
If you have the option of not having a dependency at all, taking it can bring a lot of advantages and mitigate many of the risks. Always err in the direction of avoiding a dependency relationship when that is a reasonable choice.
When to take the risk
Some benefits are worth adopting a dependency for, such as the vetting and bug-fixing that occurs with popular open source solutions. That community maintenance can be particularly important for security features or for complex functionality which is deeply integrated with, or tied to, other evolving dependencies such as a framework. Overall popularity can be a useful indicator of how valuable this benefit is likely to be, but shouldn't be considered definitive. The age of a project can be another useful indicator. Something that's been well supported for an extended period is at least unlikely to be a flash-in-the-pan project that will be abandoned suddenly, and long-term support can lead to a high degree of maturity and stability.
What if it's too late?
Sometimes you may find yourself in a situation where you already have a project heavily coupled to some external dependency which is making maintenance difficult. What strategies are available for moving forward?
- Break the dependency
- Adopt it (fork and maintain)
- Absorb it (copy-paste into your codebase and maintain that code as part of your application)
- Get lucky and find a drop-in substitute that is less troublesome
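Whichever of the strategies above you pick, it tends to be easier when the application reaches the dependency through a single seam. The sketch below shows one common way to create such a seam. It is an illustration rather than a prescription: `PayloadCodec` and `JsonBackend` are hypothetical names, and Ruby's standard `json` library merely stands in for whatever troublesome dependency you are trying to isolate.

```ruby
require "json"

# The only place in the application that knows which library does the work.
module PayloadCodec
  # Default backend wrapping the library we might later want to replace.
  class JsonBackend
    def encode(data)
      JSON.generate(data)
    end

    def decode(text)
      JSON.parse(text)
    end
  end

  class << self
    # Assign any object that responds to `encode` and `decode` to swap
    # implementations without touching the callers.
    attr_writer :backend

    def backend
      @backend ||= JsonBackend.new
    end

    def encode(data)
      backend.encode(data)
    end

    def decode(text)
      backend.decode(text)
    end
  end
end

puts PayloadCodec.encode({ "id" => 1 }) # => {"id":1}
```

Once callers use only `PayloadCodec.encode` and `PayloadCodec.decode`, breaking the dependency, forking it, absorbing it, or swapping in a substitute becomes a change to one backend class instead of a search through the whole codebase.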
Be mindful about choosing dependencies based on costs, risks, and benefits. Be mindful about what kinds of things constitute dependencies which you might not have considered that way before. Be wary of the common dependency selection pitfalls that can lead to downtime, extra cost overhead, time-consuming maintenance, or roadblocks to future evolution. Pay special attention to security risks and favor dependency options which avoid, minimize, or mitigate them, especially the "no dependency" option, whenever it makes sense.