Will the Open Source Movement Backfire?
By Bill Franks, Dec 13, 2018
While open source software has been around for decades, its adoption level and breadth of focus have exploded in recent years. Most companies now make regular use of open source software - and that usage is increasing. However, there is a potential downside to the open source explosion that may be heading our way. Is it possible that the open source movement will backfire on us? This blog will explore that possibility.
OPEN SOURCE STARTED FOCUSED
In the early days of open source software, there was heavy focus on a smaller number of large projects that were broadly adopted. Web browsers and web server software (Apache) were two early and popular open source focal points. Both of these areas are now dominated by open source. There is no turning back from open source in areas like these since the software is an integral part of so many consumers’ and companies’ processes.
This isn’t to say that a wide range of open source projects didn’t exist years ago. But only a small set of open source projects made it into the hands of a broad swath of the population. This focus made it easier to get contributors to keep the software up to date and to add new features.
OPEN SOURCE TODAY IS WIDELY SPREAD
Open source is no longer focused only on broad functionality that the masses require. There are now also open source projects focused on targeted, niche areas. Within the analytics world, for example, we have R and Python, which cover a broad range of analytic capabilities. But we also have niche projects that address only certain aspects of the analytics process. These include projects like D3 for visualization and scikit-learn for machine learning. On the data infrastructure side, there is a wide array of options available, ranging from Hadoop to Spark to PostgreSQL.
The point is that there are now multiple open source projects focused on the same subject areas. Instead of a single analytics-focused open source project, there are dozens or hundreds. Companies are also adopting several of these projects and incorporating them into their internal processes. This works fine as long as the community keeps the software up to date. However, it can become a big problem if some of the projects end up abandoned or mostly ignored by the developer communities that “own” them.
HOW OPEN SOURCE MIGHT BACKFIRE
There are only so many top-notch programmers who know a given subject area well and who also have time they are willing to devote to contributing to open source projects. As more and more projects open up, the pool of available contributors is stretched further and further, and contributors are forced to focus on the few projects they are most passionate about.
This is where the potential problem comes in. A few years down the road, as many of the long-standing Apache HTTP Server contributors retire, will younger programmers have any interest in taking up an “old, unexciting” project like Apache HTTP Server? They may instead opt to contribute to sexier options like TensorFlow in the AI space. This means that even mature, widely deployed open source software could struggle to maintain support over time.
Worse, will the new projects mature effectively? With developers hopping from project to project, open source could become like the nightclub scene. The hot club this month with all the crowds could be dead in six months as a new club takes over the market. The new club gets all the attention for a while until yet another club takes over.
For example, Hadoop was among the hottest, sexiest open source projects out there for a few years. There was no problem getting world class contributors to help Hadoop grow and expand very quickly. Today, however, Hadoop is becoming widely regarded as yesterday’s news. Much of the contributor community may start to move on to other, sexier projects. This does not bode well for Hadoop’s ability to maintain enterprise readiness in the long term. Keep in mind that Hadoop is simply an illustrative example of the broader issue. I’m not trying to pick on Hadoop.
HEDGING YOUR BETS
Not long ago, many organizations were hesitant to implement open source tools at all. Further, when open source was implemented, it was at a strategic, enterprise level. Today, most organizations have embraced open source and are allowing groups such as analytics organizations to implement open source tools specific to their domain. If that hot new open source analytical tool doesn’t take off as expected, an organization will be stuck with orphaned software that has no support structures in place.
Much of the open source software being implemented today is far less mature and far more niche than implementations of the past. Are we heading for a future filled with a lot of half-completed, unsupported, low quality open source projects in place? Possibly so.
The action to take is to be careful and deliberate in your pursuit of open source toolsets. Don’t just install every up-and-coming project and start building critical processes with it. Focus on implementing mature projects with a wide contributor base and make sure the project looks like it will be here for the long haul. Then, track and monitor the state of any implemented open source tool carefully so that there is time to react if the project looks to be heading for trouble. Finally, build your processes in such a way (like a Lego kit!) that it is easy to swap out components without disrupting the entire chain.
The last thing anyone wants is to be caught with a mission-critical piece of software that suddenly needs to be replaced. The software may be free, but fixing the problem will be far from it.
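To make the Lego kit idea a bit more concrete, here is a minimal sketch in Python of one way a process could be built so that a component can be swapped out. Everything in it is an assumption made for illustration: the Scorer interface, the two stand-in scorer classes, the run_pipeline function, and the 2.0 threshold are all hypothetical, and in a real process each scorer class would wrap calls to an actual open source library.

```python
from typing import List, Protocol, Sequence


class Scorer(Protocol):
    """Hypothetical interface: the only thing the rest of the process knows about."""

    def score(self, rows: Sequence[Sequence[float]]) -> List[float]:
        ...


class MeanScorer:
    """Stand-in for a wrapper around one open source tool (assumed for illustration)."""

    def score(self, rows: Sequence[Sequence[float]]) -> List[float]:
        # Score each row by its average value.
        return [sum(row) / len(row) for row in rows]


class MaxScorer:
    """Stand-in for a replacement tool that exposes the same interface."""

    def score(self, rows: Sequence[Sequence[float]]) -> List[float]:
        # Score each row by its largest value.
        return [max(row) for row in rows]


def run_pipeline(scorer: Scorer, rows: Sequence[Sequence[float]]) -> List[bool]:
    """Flag rows whose score exceeds a fixed, hypothetical threshold of 2.0.

    The pipeline depends only on the Scorer interface, so replacing the
    underlying tool means writing one new wrapper class, not rewriting the chain.
    """
    return [value > 2.0 for value in scorer.score(rows)]


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
    print(run_pipeline(MeanScorer(), data))
    print(run_pipeline(MaxScorer(), data))
```

The design choice here is that the downstream steps never import the open source tool directly. If the project behind one wrapper is abandoned, only that wrapper has to be replaced.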
About the author
Bill Franks is IIA’s Chief Analytics Officer, where he provides perspective on trends in the analytics and big data space and helps clients understand how IIA can support their efforts and improve analytics performance. His focus is on translating complex analytics into terms that business users can understand and working with organizations to implement their analytics effectively. His work has spanned many industries for companies ranging from Fortune 100 companies to small non-profits.
Franks is the author of the book Taming The Big Data Tidal Wave (John Wiley & Sons, Inc., April, 2012). In the book, he applies his two decades of experience working with clients on large-scale analytics initiatives to outline what it takes to succeed in today’s world of big data and analytics. Franks’ second book The Analytics Revolution (John Wiley & Sons, Inc., September, 2014) lays out how to move beyond using analytics to find important insights in data (both big and small) and into operationalizing those insights at scale to truly impact a business. He is an active speaker who has presented at dozens of events in recent years. His blog, Analytics Matters, addresses the transformation required to make analytics a core component of business decisions.
Franks earned a Bachelor’s degree in Applied Statistics from Virginia Tech and a Master’s degree in Applied Statistics from North Carolina State University. More information is available at www.bill-franks.com.