GitHub chooses Piql for perpetual data storage

GitHub is the world’s largest software repository, with 37 million users and more than 100 million repositories. A leader in the technology industry, GitHub wanted a trustworthy solution for its perpetual storage needs.

CHALLENGE

GitHub has a clearer view than most of the speed at which technology evolves. Software and hardware can be made obsolete by a newer version in a matter of months, which could jeopardize future access to valuable source code. Source code is the foundation for the future development of computer science, which in many ways is the foundation of the digital world as we know it. With so much of the world now digital and so much of our heritage increasingly digital, software and source code are a core part of the story.

Open source is particularly important, as it is the basis on which most software has been built. GitHub is a major advocate of open source and places a high value on open source repositories.

Information that is born digital can be difficult to keep alive. Modern data storage options are designed for the short term, and data can become inaccessible after just a few years. GitHub realised that its existing archival processes were not sufficient to protect its valuable code assets. To learn about both the technical and the contextual processes needed to safeguard digital information for decades and even centuries, and to guarantee future access to that information, GitHub engaged a panel of experts.

These experts, regarded as the best of the best in their respective areas of digital archiving, include the Long Now Foundation, the Internet Archive, Software Heritage Preservation, Stanford Library and Microsoft Research. When it’s a matter of securing the world’s code heritage, nothing is left to chance, and only state-of-the-art solutions were considered for keeping source code secure and accessible for hundreds of years.

We’re excited to partner with Piql to help preserve open source software for future generations. Piql’s custom film and archiving technologies will allow us to store terabytes of data on a durable medium designed to last for over 1,000 years. We’re delighted that every active public GitHub repository will be written to this film, and safeguarded in the Arctic World Archive in Svalbard, for the centuries and generations to come.

Kyle Daigle, Director of Special Projects at GitHub

SOLUTION

With a key focus on perpetuity, GitHub engaged Piql for its unique and unmatched technology, designed to withstand technological obsolescence over a time span of more than 1,000 years. GitHub also wanted a secure, sustainable storage facility in a secondary location outside the United States. Piql’s unique approach to archiving data, built on principles of open source and future access, offered many benefits to the technology giant. With authenticity measures, no need for data migration, and vendor independence, piqlFilm can do what no other technology can: offer perpetual storage whilst being completely self-contained, so that the data can be read back both by machines and by the human eye, guaranteeing future access to the original data regardless of how much time passes.

In addition, storage in the Arctic World Archive (AWA), a safe, resilient and remote repository of digital world memory, aligned perfectly with GitHub’s objectives. Data stored here can last for over 1,000 years, with ensured readback access regardless of future technology.

In the initial deposit, GitHub stored 6,000 of its most significant repositories in AWA for perpetuity, capturing the evolution of technology and software. This collection includes the source code for the Linux and Android operating systems; the programming languages Python, Ruby, and Rust; web platforms Node, V8, React, and Angular; cryptocurrencies Bitcoin and Ethereum; AI tools TensorFlow and FastAI; and many more. It is only a snapshot, but a very important record of the world history of computer science.

In its second AWA deposit, GitHub stored a snapshot of every active public repository, featuring millions of individual contributions. These two deposits collectively provide an overview of the state of open source software development and use in the world today.

FUTURE

Piql and GitHub are continuing their collaboration through the GitHub Archive Program, co-designing new elements of perpetual storage as part of the Arctic Code Vault project.