Dependency Management Guidelines for Rails Teams

  

I recently encountered a discussion in a developer chatroom about how to find good ruby gems to use for projects, and how to choose between them.

The (currently-defunct, hopefully only temporarily) website ruby-toolbox.com was mentioned, as well as awesome-ruby.com, and some conversation ensued about how to properly choose between similar dependencies for a given software requirement and why it's important.

In this article, I'll summarize some key ideas and heuristics on the topic of dependency management. I hope this encourages the broader community to put a little more thought into how to navigate these choices appropriately, and leads to a better software ecosystem for everyone.

These concepts apply at many layers in a software project, not just rubygems. In theory, we should be able to consider these principles and patterns in the context of any type of dependency, from method signatures to cloud infrastructure tools to cross-organizational communications.

Risks and costs of dependencies

It's important to acknowledge that in software, dependencies are liabilities. There are a variety of ways that code dependencies can lead to additional risks and costs for a software project, and this is especially true on the ever-changing internet.

Security Risks

Code that you ship as part of your software stack that is written and maintained by 3rd parties can expose you to security vulnerabilities which you might not have otherwise incurred if you had exposed the code to a solid code quality and security review before acceptance. As your 3rd party software ages, the likelihood of new vulnerabilities being discovered and exploited increases, and the larger the codebase, the more surface area for attack it generally has, which multiplies the risk.

Just recently, news broke that one of the largest credit reporting companies' data on most American adults was breached due to a security issue in the underlying web application framework. The vulnerability had been patched and disclosed upstream by the time the breach occurred, but the application maintainers failed to deploy the upstream fix soon enough to prevent it from being exploited.

Unpatched known vulnerabilities are just one potential security risk of 3rd party dependencies. There can also be unknown (0-day) vulnerabilities discovered and exploited by an attacker before there is any opportunity to respond. Any 3rd party tool might also be accidentally misused in a way that creates security holes in the application. In some cases, the package may provide some kind of disclaimer about this type of risk, but in others, it might be presumed that the implementer is savvy enough to account for their own threat model effectively. It's also possible for a dependency to include intentionally malicious code, which might easily get included in popular packages and widely distributed before those affected take notice.
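As a rough illustration of the "unpatched known vulnerability" case, the sketch below compares locked gem versions against the first release that contains a fix. The gem names, versions, and the advisory table are all made up for the example; a real project would rely on a maintained advisory source (tools such as bundler-audit exist for this) rather than a hand-written hash. The only library calls used are `Gem::Version` comparisons from RubyGems.

```ruby
require "rubygems"

# Hypothetical data: gem names, versions, and patched releases are made up
# purely for illustration and do not describe any real advisory.
LOCKED_VERSIONS = {
  "example_framework" => "2.3.1",
  "example_parser"    => "1.8.0"
}.freeze

FIRST_PATCHED_RELEASE = {
  "example_framework" => "2.3.4",
  "example_parser"    => "1.7.2"
}.freeze

# Returns the names of locked gems that are older than their first patched release.
def unpatched(locked, patched)
  locked.select do |name, version|
    fix = patched[name]
    fix && Gem::Version.new(version) < Gem::Version.new(fix)
  end.keys
end

puts unpatched(LOCKED_VERSIONS, FIRST_PATCHED_RELEASE).inspect
```

The comparison itself is trivial; the hard part in practice is running a check like this regularly and deploying the fix once it flags something.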

Lack of Maintenance, AKA 'Abandonware'

Sometimes, for any number of reasons, a code dependency that was once well-groomed falls into disrepair through neglect or abandonment by its primary maintainers. In some cases, the open source community will take action, stepping in to create a substitute for continuity's sake, but in other cases, no such replacement springs up, and a tool will simply receive no more updates, or perhaps only infrequent ones. Sometimes software gets intentionally deprecated, such as with web APIs that are replaced with more modern versions, and it's left to the users to migrate their own dependent code away from the old version within a limited timeline before it gets fully shuttered.

It's common for ruby gems that were designed to solve problems that are now handled by a framework like rails to be considered no longer necessary by their authors, left behind with no mention of a migration strategy. Oftentimes having critical functionality coupled to dependencies can block a project from undergoing otherwise straightforward maintenance.

Breaking Changes and Instability

Any code that's maintained outside of your control is subject to its own release cycle, which might at any time include changes that break your expectations (and the code you've written to depend on how it once behaved). If you carefully test every update to your dependency versions, you can probably prevent these issues from impacting your production apps, but the time and energy that quality control demands is sometimes prohibitive. More often, it simply deters teams from allowing the versions to change at all, which circles back to the security and abandonment risks already mentioned. This article on the topic provides some perspective on the matter:

"Think about what you are admitting to have created: an environment where hundreds (if not thousands) of people have the power, maliciously or innocently, to put you out of business (if only for a short period of time), without any accountability or recourse."

It's common these days for new tools to become popular long before they reach a "stable v1 release", which implies that the authors make no promise to preserve stability or compatibility between release versions, and your dependent code could break anytime you install a new copy of that tool. At the other extreme, large software projects such as operating systems and development frameworks deliberately prioritize stability for their users' benefit. That can feed back into the problems mentioned above: software in use falls out of normal maintenance or accumulates security problems before anyone forms a plan to break the dependency on it.
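The tension in this section, between accepting upstream changes and refusing them, shows up concretely in version constraints. The sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show what a few constraint styles allow; `allows?` is a helper name invented for this example. In a Gemfile, the same constraint strings are passed as the second argument to `gem`.

```ruby
require "rubygems"

# True when the given version string satisfies the constraint string.
def allows?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.1" accepts any 2.x release at or above 2.1, but not 3.0.
puts allows?("~> 2.1", "2.9.0")   # true
puts allows?("~> 2.1", "3.0.0")   # false

# For a pre-1.0 tool, a tighter constraint limits updates to patch releases.
puts allows?("~> 0.4.2", "0.4.9") # true
puts allows?("~> 0.4.2", "0.5.0") # false

# An exact pin never moves, so it also never picks up upstream fixes.
puts allows?("= 2.1.0", "2.1.1")  # false
```

A looser constraint trades stability for easier access to fixes, and an exact pin does the reverse. Neither removes the need to test updates.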

Obsolescence

Over time, the problem that code is designed to solve may be solved better in other ways, or may cease to be a problem in need of solving. This happens continuously in the vast and complex ecosystem that we call the internet. For instance, it's no longer important for us to support SSL v2 or v3, because the rest of the internet has already deemed those protocols too weak to add any value when newer and stronger protocols like TLS v1.3 are widely available.

The same thing can happen to any single solution to a particular problem in a changing online environment. There are plenty of old dependencies out there for supporting protocols, patterns, and practices that just aren't great solutions anymore, and it's not ideal to keep that code around once its time has passed.

Common pitfalls

The YAGNI dependency pitfall

There are two variations of this mistake. One is choosing to depend on something complex to solve a problem that's simple. This often leads to heavy, slow code which is overly coupled to dependencies that may not even be in use. The other is choosing to depend on something at all, when solving the problem without a dependency would be straightforward and sustainable.

Before choosing a tool to depend on, it's generally wise to simply ask "would it be any harder to just do it ourselves?" I find it's surprisingly common for rails developers to look for a gem for every new requirement or problem that they encounter, but for a lot of common problems, the web framework already provides a solution, or at least the tools to easily construct one, or sometimes it's really just a few lines of plain ruby code. Shipping 3rd party code in these kinds of cases is sometimes more work than it saves, and can be a liability in terms of breaking changes in the future, security issues, difficulty transitioning to an alternative solution, etc.

Just last week, I observed a conversation in Slack about how implementing 'soft delete' functionality is almost as easy to DIY leveraging rails as it is to install and use one of the popular gems that provide that feature (and in some cases a lot of other unneeded features)... so the question becomes, why would anyone ever want to add yet another 3rd party's code when the benefit is so minimal?
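To give a sense of how little code the basic idea needs, here is a plain-Ruby sketch of soft delete: a record is flagged with a timestamp instead of being removed, and "visible" records are the ones without the flag. `SoftDeletable`, `Post`, and `deleted_at` are names chosen for this example, and a `Struct` stands in for a model so the snippet runs without Rails. In a Rails app, the equivalent would presumably be a nullable `deleted_at` column plus a scope such as `where(deleted_at: nil)`.

```ruby
# Plain-Ruby sketch of soft delete. All names here are hypothetical.
module SoftDeletable
  # Flag the record as deleted; calling it twice keeps the first timestamp.
  def soft_delete(now = Time.now)
    self.deleted_at ||= now
    self
  end

  # Clear the flag so the record is visible again.
  def restore
    self.deleted_at = nil
    self
  end

  def deleted?
    !deleted_at.nil?
  end
end

# A Struct stands in for a model that has a deleted_at attribute.
Post = Struct.new(:title, :deleted_at) do
  include SoftDeletable
end

posts = [Post.new("Hello"), Post.new("Draft")]
posts.last.soft_delete

visible = posts.reject(&:deleted?)
puts visible.map(&:title).inspect
```

This covers only the naive case. Concerns like cascading to associated records or unique indexes are where the popular gems earn their keep, if you actually need them.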

Sometimes it's just a matter of picking a tool that fits your use case as closely as possible. There are elaborate framework solutions with all sorts of rich functionality and which have evolved over years to handle innumerable rare edge cases, when all you really need sometimes is a naive and simple approach to cover your project's needs.

The popularity contest pitfall

With ruby-toolbox in particular, I noticed that it's easy to compare options based on metrics like GitHub stars and download stats, and it's common for devs to presume that more popular implies better. However, sometimes a tool looks popular because it's been around for a long time accumulating users, whereas a newer, better tool might not have the same stats yet even though it's a superior alternative.

Of course, it's also possible for something to become popular for reasons unrelated to its quality, such as SEO, having a cute name, being backed by somebody famous, a lot of discussion (whether positive or negative) about a link on Hacker News, or being distributed automatically by a botnet. It's also common for these visible stats to be misinterpreted as a direct reflection of popularity, when the counters indiscriminately include every automated install. A ruby gem that a large corporation uses internally for some minor task might find its way into a high-scale CI pipeline, racking up hundreds or thousands of installs per day. These stats are still useful for quick comparison, but don't rely on the numbers alone.

The convenience pitfall

Sometimes when comparing tools to solve a given problem, your first priority is to get it working as quickly and easily as possible. This is an important criterion to consider before choosing between alternative approaches, but its importance usually should not outweigh longer term considerations like cost of maintenance, or cost of replacing it with something else if needed. As one of my mentors likes to say: "It's not about how fast you can get married, but how fast you can get a divorce." It's ideal when possible to favor lightweight dependencies that are easy to get rid of if something better comes along, or just for rapid experimentation. Just don't mistake "easy installation" for "simple and easy to replace if needed."
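One common way to keep the "divorce" cheap is to route every call to a dependency through a small wrapper that your own code owns. The sketch below is a minimal example of that idea, not a prescription: `AppJson` is a hypothetical module name, and Ruby's standard `json` library stands in for whatever third-party gem you might be wrapping.

```ruby
require "json"

# The only place in the app that knows which JSON library is in use.
# AppJson is a hypothetical name; stdlib JSON stands in for a third-party gem.
module AppJson
  def self.encode(object)
    JSON.generate(object)
  end

  def self.decode(string)
    JSON.parse(string)
  end
end

# Application code depends on AppJson, never on the library directly.
payload = AppJson.encode({ "id" => 1, "tags" => ["a", "b"] })
puts payload
puts AppJson.decode(payload)["tags"].length
```

If the underlying library ever has to be replaced, the change is confined to the two method bodies rather than scattered across every call site. The wrapper only helps if the rest of the codebase actually goes through it.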

How to evaluate the risks and costs

All of the pitfalls mentioned above come from a failure to balance the costs, risks, and benefits of including a new dependency, but each stems from a valid criterion to consider before deciding. I can't give a scientific formula for making the best choice every time, but I can suggest some general patterns for improving your odds of avoiding some costly mistakes, based on my own experiences.

Here's a rough list of ideas you could consider using as a starting point for your organization to iterate upon and adopt into policy.

Heuristics for comparing alternative options

  • Start with "can we solve this problem cheaply and effectively without depending on anything?" -- remember that the concepts of dependency and coupling are applicable from the code unit level all the way through the stack of tools including organizational level ones.
  • If you've determined that depending on something outside of your own control is probably advantageous, try to develop a clear understanding of what those advantages actually are, as well as what costs and risks need to be counterbalanced.
  • Does this solution potentially include any unacceptable risks? Normally this goes without saying but in some contexts, it's worth giving some more serious thought to be extra sure, such as with security tools, tools which must be granted highly privileged access, or anything that's highly mission-critical.
  • Does this solution have a high up-front cost? This might come in the form of actual fees, complex initial setup, steep learning curve, or some other tradeoff like sacrificing functionality or speed with the initial implementation before refinements can be made.
  • Does this solution have a high ongoing cost? Again, maybe a direct money expense, as with subscriptions, or it could be a rapid maintenance cycle with frequent breaking changes or deprecations, a high chance of requiring external support to use it effectively, potential for expensive service disruptions, difficulty onboarding new team members to use it (see learning curve above), substantial performance impact, general bugginess, etc.
  • When this solution is no longer the ideal one, does it make transitioning away from it prohibitively costly? Usually, the bigger and heavier the dependency is, the greater this cost becomes. The more tightly integrated it will be, the more highly coupled you will be to it, which can multiply this factor greatly. In cases where there is no direct alternative, the cost might be re-implementing the equivalent functionality in-house, or taking over maintenance of it upstream. In the worst cases, none of those escape routes are options and you will be bound to this choice indefinitely, including all the opportunity costs that entails.
  • Does this solution provide benefits beyond those of simply fulfilling the requirements in-house? This could mean shipping earlier, outsourcing maintenance, handling important edge cases more effectively, better performance or reliability, or free extra features that are desirable.
  • Will this solution's immediate benefits pay for its adoption more quickly than alternatives?
  • Do this solution's ongoing benefits outweigh the costs/savings of other alternatives?
  • Does this solution closely approximate the requirements that are being addressed? Does it include a lot of extra stuff? Is it missing any requirements or nice-to-haves that might lead to seeking an alternative in the future?
  • Can this solution be trusted to be reliably maintained upstream for a sufficient timeline that transitioning to an alternative will be unnecessary or affordable and feasible to do when the time comes?
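For teams that want to turn the questions above into a lightweight policy artifact, one option is to record the answers per candidate so they can be compared side by side. The sketch below is only one possible shape for that record: `Evaluation`, its field names, and the sample answers are all invented for illustration, and it deliberately produces a list of concerns rather than a numeric score, since there is no formula here.

```ruby
# Hypothetical record of a team's answers to a subset of the questions above.
Evaluation = Struct.new(:name, :unacceptable_risk, :high_upfront_cost,
                        :high_ongoing_cost, :costly_exit, :fits_requirements,
                        :maintained_upstream, keyword_init: true) do
  # Any unacceptable risk rules the option out regardless of other answers.
  def blocked?
    unacceptable_risk
  end

  # Names of the cost and risk questions that were answered unfavorably.
  def concerns
    list = []
    list << :high_upfront_cost if high_upfront_cost
    list << :high_ongoing_cost if high_ongoing_cost
    list << :costly_exit if costly_exit
    list << :poor_fit unless fits_requirements
    list << :uncertain_maintenance unless maintained_upstream
    list
  end
end

# Sample answers, made up for the example.
gem_option = Evaluation.new(
  name: "third-party gem", unacceptable_risk: false, high_upfront_cost: false,
  high_ongoing_cost: true, costly_exit: true, fits_requirements: false,
  maintained_upstream: true
)
in_house = Evaluation.new(
  name: "in-house", unacceptable_risk: false, high_upfront_cost: true,
  high_ongoing_cost: false, costly_exit: false, fits_requirements: true,
  maintained_upstream: true
)

[gem_option, in_house].each do |option|
  summary = option.blocked? ? "blocked" : option.concerns.inspect
  puts "#{option.name}: #{summary}"
end
```

The value is in forcing each question to be answered explicitly for every option, including the "do it ourselves" option.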

No dependency beats no dependency

There are circumstances where avoiding the dependency altogether is the best choice. You may be able to sidestep all of the aforementioned issues by simply implementing the desired functionality within your own project. This is often a good option when the functionality needed is simple, straightforward to implement, and there are no major benefits to entrusting maintenance of that functionality to outside parties.

If you have the option of not having a dependency at all, then there can be a lot of advantages, as well as mitigated risks. Always err in the direction of avoiding a dependency relationship when it is a reasonable choice.

When to take the risk

Some benefits are worth adopting a dependency for, such as the vetting and bug-fixing that occurs with popular open source solutions. That community maintenance can be particularly important for security features or complex functionality which is deeply integrated or tied to other evolving dependencies such as a framework. Overall popularity can be a useful indicator for how valuable this benefit is likely to be, but shouldn't be considered definitive. Age of a project can be another useful indicator. Sustained support over an extended period is a clue that it's at least not a flash-in-the-pan project that will be abandoned suddenly, and that longevity tends to bring a high degree of maturity and stability.

What if it's too late?

Sometimes you may find yourself in a situation where you already have a project heavily coupled to some external dependency which is making maintenance difficult. What strategies are available for moving forward?

  • Break the dependency
  • Adopt it (fork and maintain)
  • Absorb it (copy-paste into your codebase and maintain that code as part of your application)
  • Get lucky and find a drop-in substitute that is less troublesome

Conclusion

Be mindful about choosing dependencies based on costs, risks, and benefits. Be mindful about what kinds of things constitute dependencies which you might not have considered that way before. Be wary of the common dependency selection pitfalls that can lead to downtime, extra cost overhead, time-consuming maintenance, or roadblocks to future evolution. Pay special attention to security risks and favor dependency options which avoid, minimize, or mitigate them, especially the "no dependency" option, whenever it makes sense.

Brandon Dees

 
Rietta Inc. Rails Developer who emphasizes security, code quality, best practices, and mentorship.