Most Software is Built on the Shoulders of Giants
Many programming languages are defined by open standards, and most have open source implementations too. For example, C++ is defined by an ISO standard, and g++, the C++ compiler in the GNU Compiler Collection (GCC), is a free and open source implementation. Python is fully open source, and so are the packages that most of data science and machine learning are built with (e.g., NumPy, Pandas, PyTorch, TensorFlow, Hugging Face Transformers).
The transformer architecture, which dominates the natural language processing industry, is also public knowledge. Linux, chosen by a significant portion of software developers and used to host most production environments, is free and open source. The most popular version control system, Git, is also free and open source.
There is no need to reinvent the wheel every time something new is built, and locking tools behind paywalls would significantly hinder the development of computer science as a whole.
The Role of Corporations and Communities in Open Source
A large share of open source projects are created by corporations (e.g., TensorFlow by Google, PyTorch by Meta AI) or by publicly funded institutions (e.g., Scala, developed at École Polytechnique Fédérale de Lausanne, a Swiss federal institute of technology). However, there is also a significant portion of community-driven free and open source software projects.
Some open source projects receive substantial financial backing, like Git and Python, while others rely mostly on community support, such as GIMP.
Why Corporations Invest in Open Source
One might ask, why would a corporation invest in an open source tool? It’s a valid question with a simple answer: they rely heavily on these tools.
Python is financed by a multitude of tech giants who use it.
Git is sponsored by, among others, Microsoft-owned GitHub, as the platform is entirely reliant on Git.
Why Individuals Contribute to Open Source
Why would an individual contribute to open source?
Some do it out of passion.
Some contribute out of gratitude.
Some just want to fix or add a feature for their personal needs.
Our Contribution at Lightning ERP
At Lightning ERP, we recently made a contribution to the LlamaIndex project. We did this both because it lacked a feature we needed and out of gratitude. Our project uses a lot of free and open source tools, LlamaIndex included, and we believe there is no better way to give back to the community than by sharing our work with others.
Our contribution is very simple: we needed a vector storage solution that would be as lightweight as possible. Since LlamaIndex is designed primarily for RAG (retrieval-augmented generation) pipelines, most of the vector stores it supported were cloud-based or overly complex systems. The few available options didn't satisfy our needs, so we created our own vector store based on Hnswlib, which we then shared.
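To give a rough idea of what "lightweight" means here, the sketch below shows a tiny in-memory vector store in plain Python. It is only an illustration under stated assumptions: the class name TinyVectorStore and its add and query methods are hypothetical, it is not the code we contributed, and it does not use the LlamaIndex or Hnswlib APIs. It also swaps in exact brute-force cosine search, whereas Hnswlib performs approximate nearest-neighbour search over an HNSW graph, which scales far better.

```python
import math
from typing import Dict, List, Tuple


class TinyVectorStore:
    """Hypothetical in-memory vector store (not the LlamaIndex or Hnswlib API)."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._vectors: Dict[str, List[float]] = {}

    def _normalize(self, vector: List[float]) -> List[float]:
        # Scale to unit length so cosine similarity becomes a plain dot product.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            raise ValueError("cannot use a zero vector")
        return [x / norm for x in vector]

    def add(self, doc_id: str, vector: List[float]) -> None:
        # Adding an existing doc_id overwrites the stored vector.
        self._vectors[doc_id] = self._normalize(vector)

    def query(self, vector: List[float], k: int = 1) -> List[Tuple[str, float]]:
        # Exact search: score every stored vector, then keep the k best.
        unit = self._normalize(vector)
        scored = [
            (doc_id, sum(a * b for a, b in zip(unit, stored)))
            for doc_id, stored in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
store = TinyVectorStore(dim=3)
store.add("invoice", [1.0, 0.0, 0.0])
store.add("order", [0.0, 1.0, 0.0])
store.add("receipt", [0.9, 0.1, 0.0])

results = store.query([1.0, 0.0, 0.0], k=2)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

The query returns the two stored vectors closest in direction to the query vector. A store like this needs no server and no external service, which is the property we were after. The brute-force scan is the part that an index such as Hnswlib replaces once the number of vectors grows.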
Artur Stopa