The 10 hottest open-source software tools of 2024

Here’s a look at 10 open-source software tools—including software for building AI applications or managing huge volumes of data—that are already widely used or gaining popularity.


Open to new ideas

Open source software tools continue to grow in popularity due to the many advantages they offer, including lower initial software and hardware costs, lower total cost of ownership, no vendor lock-in, simpler license management, and support from active communities.

In the following slides, as part of the CRN 2024 Year In Review project, we take a look at some of the most popular open-source software products that caught our eye this year. Some of these have been around for a while and are already in widespread use, while others are relatively new – a couple just debuted in the last year or so – but are showing early signs of momentum.

Not surprisingly, the wave of AI and generative AI application development is a major driver of open-source software adoption. Some of the products on this list are in the software development space or help address the need to manage the huge volumes of data that power AI systems.

These products are available under open-source licenses such as the MIT License, the Apache License 2.0, the GNU GPL, and others. Some are products developed by startups that have received financial investment from Y Combinator, the startup accelerator and venture capital firm.


Airbyte

Airbyte is a fast-growing data movement and integration platform for ETL/ELT data pipelines that connect applications, APIs, databases and files to data warehouses, data lakes and other destinations. Airbyte can also be used to move unstructured and semi-structured data into vector databases and large language model frameworks for AI applications.

The Airbyte Open Source core is already used by over 40,000 companies. The software is available under several open-source licenses, including the MIT License and the Elastic 2.0 License.

Airbyte’s namesake developer, headquartered in San Francisco, also offers a number of commercial products and services around the platform. The company launched a partnership program in May, including a certification course, to help technology service providers and resellers work with Airbyte software.


Apache DataFusion

The Apache Software Foundation describes DataFusion as “a fast and extensible query engine for building high-quality, data-centric systems” such as databases, dataframe libraries, machine learning, and streaming applications.

DataFusion can serve as an embedded or custom SQL engine and as a foundation for building new systems with a focus on high-throughput, low-latency analytic, streaming, and transactional workloads.

DataFusion leverages the technology capabilities of Apache Arrow, a language-independent framework for building data analytics applications that process columnar data, and the Rust programming language.

In June, the Apache Software Foundation, which has been developing DataFusion since 2019 as part of the Apache Arrow project, said DataFusion is now designated as a top-level project “to provide a more focused governance capability for continued growth.”

DataFusion is available for download from the Apache Software Foundation website, GitHub and other sites under the Apache 2.0 License. The latest source version is 41.0.0.


Danswer

Danswer provides an open-source AI assistant and enterprise search application that connects all of a company’s tools, applications and documents, making it easier to find information across an organization, according to the company’s website.

Danswer says that one way to think about its software is as ChatGPT – but with access to an organization’s own information, data and documents – and so no hallucinations. The software already offers more than 40 turnkey integrations, such as with Slack and Google Docs, “with more being built every day,” according to the company.

Danswer software is self-hosted either in a company’s data center or on a cloud platform.

Founded in 2023, Danswer is backed by Y Combinator. The software is released under the MIT License and can be obtained from the company and GitHub.


DuckDB

DuckDB is a high-performance, in-process database that is designed to support online analytical processing (OLAP) query workloads.

The relational (table-oriented) database supports SQL and uses a columnar-vectorized query execution engine that can process large batches of values in a single operation as a vector, according to the Database of Databases website. The database is designed to run embedded in a host process – there is no database server to install.
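To make the idea of vectorized execution concrete, here is a minimal sketch in plain Python. It is not DuckDB code and does not use DuckDB's API; the toy table, the column names and the `total_revenue` function are hypothetical and chosen only for illustration. Each column is stored as its own array, and the aggregate is computed over fixed-size batches (vectors) of values rather than one row at a time.

```python
from array import array

# Hypothetical toy table stored column-wise: one array per column.
prices = array("d", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
quantities = array("l", [1, 2, 3, 4, 5, 6])


def batches(column, size):
    """Yield consecutive fixed-size slices (vectors) of a column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def total_revenue(price_col, quantity_col, batch_size=4):
    """Compute SUM(price * quantity), consuming one vector per step."""
    total = 0.0
    for p_vec, q_vec in zip(batches(price_col, batch_size),
                            batches(quantity_col, batch_size)):
        # Each step handles a whole batch of values, not a single row.
        total += sum(p * q for p, q in zip(p_vec, q_vec))
    return total


print(total_revenue(prices, quantities))
```

A real engine runs tight, compiled loops over each vector, which is where the performance benefit comes from; the Python loop above only mirrors the control flow.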

DuckDB was originally developed at Centrum Wiskunde & Informatica, the national research institute for mathematics and computer science in the Netherlands, in 2018.

DuckDB and its core extensions are open source under the MIT license, and the entire source code is freely available on GitHub. DuckDB version 1.0.0 was launched in June and is available via the DuckDB.org website and GitHub.

One reason DuckDB has attracted attention is that the startup MotherDuck has developed cloud analytics software that runs on DuckDB.


Grafana observability tools

Grafana is an open-source observability and data visualization platform used to collect and visualize metrics, traces, and logs from many data sources. It is commonly used as a component in IT/OT monitoring systems.

Grafana is developed by Grafana Labs and is available under the AGPL-3.0 open-source license. In April, the company debuted Grafana 11.0 with a new Explore Metrics root cause analysis feature, improved visualizations, simpler alerts, and support for additional data sources.

In addition to its flagship software, Grafana Labs develops other open-source software, including Grafana Loki, a multi-tenant log aggregation system; Grafana Tempo, back-end software for large-scale distributed tracing; and Grafana Mimir, a scalable back-end metrics storage and analysis tool. Grafana Labs also sells commercial enterprise editions of its software.


LangChain

LangChain is an open-source orchestration framework for developing generative AI applications powered by large language models (LLMs) that connect to external data sources, according to the Python.Langchain.com website and a description on IBM’s website.

Companies and organizations can get more value from GenAI if they have a way to load their own proprietary data into LLMs, a potentially difficult task due to the complexity of data preparation and LLM tuning and data security concerns.
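One common way to connect an LLM to proprietary data without retuning the model is to retrieve the most relevant text at query time and place it in the prompt. The sketch below shows that pattern in plain Python. It does not use LangChain's API, it stops short of calling a model, and the documents, the word-overlap scoring and the function names are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical in-house documents; a real system would load these from
# connectors to the organization's own tools and file stores.
DOCUMENTS = {
    "vacation-policy": "Employees accrue 1.5 vacation days per month of service.",
    "expense-policy": "Submit expense reports within 30 days of purchase.",
    "onboarding": "New hires receive a laptop on their first day.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents.items(),
        key=lambda item: len(question_words & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(
        f"[{name}] {text}" for name, text in retrieve(question, documents)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


print(build_prompt("How many vacation days do employees accrue?", DOCUMENTS))
```

Production systems typically replace the word-overlap ranking with embedding-based search over a vector database, but the overall shape (retrieve, then prompt) stays the same.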

LangChain simplifies every stage of the LLM application lifecycle, including application development and deployment to production. Specific tools include LangGraph for creating stateful agents, LangSmith for chain inspection and monitoring, and open-source building blocks, components, and third-party integrations.

Specific LangChain tools are available on GitHub, including the framework itself under the MIT License.


MindsDB

MindsDB is an open-source virtual database and development platform that automates workflows that connect real-time data to AI systems. The software makes it easy to build, train and deploy machine learning models using SQL queries.

MindsDB, the software developer, was founded in 2017 and is headquartered in San Francisco. The company says its mission with its open-source software is to democratize machine learning, according to the company’s website. With this goal in mind, in September 2023 the company launched the MindsDB AI Collective, a network of AI startups and developers that promotes open-source machine learning and AI projects and provides investor connections, technical support and talent.

The company is one of many open-source tech startups funded by Y Combinator, including several on this list.

MindsDB software is available under the open source MIT License, while MindsDB Core, the core software component, specifically uses the Elastic v2 License.


OpenFoundry

The OpenFoundry platform provides developer infrastructure for open-source AI projects. The technology helps engineers build, deploy and scale their open-source AI “stack” 10x faster, and deliver open-source, AI-powered products faster, according to the company’s website.

OpenFoundry was just launched this year by CEO Tyler Lehman, formerly a product manager at Meta, and CTO Arthur Chi, a software engineer at Slack. The company is another open-source technology startup funded by Y Combinator.

The OpenFoundry page on the Y Combinator website touts the startup as an open-source alternative to machine learning and data science platform Hugging Face. OpenFoundry is available on GitHub under the MIT license.


OpenZiti

OpenZiti is a free and open-source project focused on bringing zero-trust networking principles directly into any application, according to the www.openziti.io website. The platform provides all the components needed to implement a zero-trust overlay network, along with the tools developers need to integrate zero trust into their applications.

The OpenZiti Project “believes that the principles of zero trust should not stop at your network, those ideas belong in your application,” according to the site.

OpenZiti is available under the Apache 2.0 license and can be downloaded from the OpenZiti.io website and GitHub.

OpenZiti components include The Fabric, a scalable overlay network mesh with built-in intelligent routing; The Edge, components that provide secure entry points into the overlay network; SDKs that allow developers to integrate zero-trust principles into applications; and Tunneling, a bridge for built-in zero-trust applications.


Twenty

Startup Twenty is pursuing the audacious task of developing an open-source, SaaS-based CRM application that is designed to provide a modern alternative to application giant Salesforce.

On its website, Twenty says its software provides an operating system for managing customer data, along with all the features of a leading CRM system, including tasks and workflow views with “kanbans views.”

The app is still in early “alpha” development, but is available (under the GNU Affero General Public License) from the company and GitHub for those who want to check it out.

The latest iteration, version 0.32.0, was released on November 3 with a number of additions and improvements, including stronger search, a webhooks filter and webhooks multi-object filtering, advanced settings and a new settings layout, a soft delete function, and a new array field type to store undefined values.

Founded in 2023 and headquartered in San Francisco, Twenty received funding from Y Combinator.