What Database Engine Does Gini Use?

There's a lot of hype and confusion swirling around database engine technologies today. The hype is often caused by the following factors:

  • For-profit corporations pumping propaganda into the tech media sphere trying to sell us something.
  • The tribe mentality that often emerges around technologies as engineers invest their time and energy into learning them. This can cause engineers to develop emotional and financial attachments to technologies for subjective, personal reasons.
  • Technologies are continuously evolving with new features and new ways to overcome rapidly evolving technical challenges. The rapid pace of innovation can be inherently confusing.

We took all these factors into account as we considered the best database engine for the Gini BlockGrid.

From a technical perspective, there are many database engines based on a variety of storage models. There are NoSQL database engines, including key-value stores (e.g., LMDB, LevelDB, RocksDB, Redis), wide-column stores (e.g., Cassandra), document stores (e.g., MongoDB, CouchDB, DynamoDB), and graph databases (e.g., Neo4j, HyperGraphDB). Then there are SQL database engines (e.g., MySQL, SQL Server, Postgres, SQLite, MariaDB, Oracle).

To understand and appreciate why a particular database engine should be used for a particular use-case, it's important to understand your domain, your goals, and the fundamental differences between each database engine. Without a deep understanding of these factors, it's too easy to be distracted by all the hype and confusion.

For Gini, we have very clear requirements and a deep understanding of what a cryptocurrency database must accomplish. Virtually everybody on the Gini team has years of experience implementing databases for high-performance, mission-critical applications, especially for payment processing services. So, we bring a lot of insight and experience to this question.

At a high level, our primary goals were the following:

  • Cryptocurrency transactions must be Atomic, Consistent, Isolated and Durable ("ACID"). We've seen many cryptocurrency projects (including many of the largest projects) ignore this fundamental principle. To maximize theoretical speed and simplicity, other projects often use NoSQL database engines that are fast on paper but do not perform truly ACID operations. This can create all kinds of problems over the long run, including out-of-sync data, DB and file system corruption, and cascading failures and outages, all of which can cost time and money and leave a cryptocurrency vulnerable to hackers.
  • High Enough Throughput. Virtually all modern SQL and NoSQL database engines can perform tens of thousands of read/write operations per second. This is more than sufficient for every payment network on Earth today. For example, Visa's global credit card payment network operates at between 14,000 and 20,000 transactions per second, depending upon how it's measured, which is relatively low compared to the throughput of virtually all modern database engines. There's no point in saying, "My DB engine is faster than yours!" if all that theoretical speed is not accompanied by truly ACID operation, especially if all that speed is not necessary anyway.
  • Small Memory Footprint. A database engine that is a memory hog puts a strain on users' computers, which hurts the user experience and can limit broad-based adoption.
  • Embedded Design. A DB engine that can be embedded as a component within your application is generally much faster than one that must communicate between totally separate programs running on your computer. So, embeddable DB engines are ideal in this case.
  • Reliable Interface for Haskell. Given that Haskell is the ideal language for building a secure, fast and scalable cryptocurrency, the DB engine must enable us to write native Haskell code that communicates directly with the DB engine. We don't want to depend on SQL or pseudo-SQL code in our application logic because that slows down DB query operations and creates friction in our development process due to frequent language context-switching.
  • Proven History of Reliability. The DB engine must not be the new kid on the block. It must have at least 10 years of independently verifiable performance metrics from real-world, mission-critical applications.
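To make the ACID requirement concrete, here is a minimal sketch of an atomic transfer against an embedded SQLite database. It is written in Python purely for brevity and is not Gini's Haskell code; the `accounts` table, the `transfer` function and the account names are hypothetical and exist only for this illustration.

```python
import sqlite3

def transfer(conn, sender, receiver, amount):
    """Move `amount` between two accounts atomically; return True on success."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, receiver),
            )
            # An overdraft violates the CHECK constraint and raises
            # IntegrityError, so the credit above is rolled back as well.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Embedded design: the engine runs inside this process, no server required.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 20)],
    )

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # overdraft, rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits the receiver before the debit fails, yet neither change survives. That all-or-nothing behavior is the "Atomic" in ACID, and the `CHECK` constraint is one simple way to express the "Consistent" part.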

In contrast, the following DB engine features, which are often touted as essential in centralized server architectures, are not needed for a truly decentralized cryptocurrency like Gini:

  • High Concurrency. For a truly decentralized cryptocurrency, at the DB engine level, there are usually only one or a few users accessing the DB at any given moment. This is because, in a decentralized cryptocurrency, each node has its own copy of the blockchain. That means the DB engine running on the computer where the blockchain lives only needs to handle one or a few concurrent connections, which is trivial for any modern DB engine. This is a very different requirement from centralized payment systems, whose DB engines must handle hundreds or thousands of concurrent DB connections. (Note: A cryptocurrency full node does need to manage many concurrent connections during the consensus protocol process, but that's an application logic requirement, not a DB engine requirement. They are totally separate processes with totally separate requirements.)
  • Big Data Capabilities. People often talk about Big Data without actually defining what that means. DB engines like MongoDB are wonderful Big Data DB engines, but they're totally unnecessary for any decentralized blockchain. This is because "Big Data" typically means a database of at least 100 terabytes. For context, Bitcoin's database was only about 200 gigabytes in early 2019 and it was already too big, because a DB that size begins to cause inherent network centralization as an ever smaller number of users are able to download the entire blockchain. Thus, any cryptocurrency team that wants to achieve a truly decentralized architecture must build pruning, snapshotting and multiple node types into their architecture. Regardless, a blockchain is useless long before it ever reaches the minimum "Big Data" size of 100 terabytes.

Scalability

There are two general kinds of database scalability: vertical scalability (i.e., how many transactions per second a single DB engine can process) and horizontal scalability (i.e., how many transactions per second a cluster of DB engines can process). Given that we are talking about a decentralized network, the concept of "vertical scaling" is not very relevant for the reasons stated in the "High Enough Throughput" point above. So, we will focus on horizontal scaling below.

Horizontal scalability for any truly decentralized cryptocurrency is never going to be constrained by any modern DB engine. For example, SQLite can perform at least 50,000 insert operations per second on commodity hardware when the inserts are batched inside a transaction, which is fast enough for any large-scale payment network today. (But it's not fast enough for the IoT- and AI-driven payment networks of the future, which is why we are also working on adding more horizontal scaling features to Gini.) Here are a few points regarding horizontal scalability.

  • Regardless of DB engine speed/throughput, a truly decentralized cryptocurrency will always be constrained by the consensus protocol due to the requirement of syndicating blocks across the network to the minimum quorum of nodes. Here, the speed of light and network conditions are the constraints. Thus, most of the work related to achieving horizontal scalability needs to be performed in a blockchain node's application logic, regardless of the underlying DB engine.

  • The only realistic way to achieve horizontal scalability for a cryptocurrency is through a combination of workload-sharding, payment channel tunneling, and network topology segmentation. (DB sharding alone does nothing when the DB engine is not really the bottleneck.) Of course, these techniques introduce some centralization into the network, which is a tradeoff that some stakeholders might accept for particular market segments, e.g., IoT, high-volume financial services, and data science applications where high-speed/throughput are required.
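The point about the speed of light can be checked with back-of-the-envelope arithmetic. The sketch below is only an order-of-magnitude estimate under stated assumptions: two nodes on opposite sides of the Earth, a signal travelling through optical fiber at roughly two-thirds the speed of light, and the 50,000 inserts per second figure cited above. It ignores routing, queuing and the multiple message rounds a real consensus protocol needs, all of which make the network side slower still.

```python
# All constants are rough assumptions chosen for illustration.
SPEED_OF_LIGHT_KM_S = 299_792         # speed of light in vacuum
FIBER_FRACTION = 2 / 3                # typical signal speed in optical fiber
HALF_EARTH_CIRCUMFERENCE_KM = 20_000  # two nodes on opposite sides of the planet
DB_INSERTS_PER_SECOND = 50_000        # SQLite insert figure cited in the text

def one_way_delay_ms(distance_km, fraction_of_c=FIBER_FRACTION):
    """Lower bound on the one-way propagation delay, in milliseconds."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * fraction_of_c) * 1000

network_ms = one_way_delay_ms(HALF_EARTH_CIRCUMFERENCE_KM)
db_ms = 1000 / DB_INSERTS_PER_SECOND  # average time per insert
ratio = network_ms / db_ms

print(f"one-way network delay: ~{network_ms:.0f} ms")
print(f"time per DB insert:    ~{db_ms:.2f} ms")
print(f"network / DB ratio:    ~{ratio:,.0f}x")
```

Under these assumptions a single network hop costs roughly 100 ms, while a single insert costs about 0.02 ms, so the network is slower by a factor in the thousands before any consensus overhead is counted.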

NoSQL DB engines like MongoDB are most valuable when all the nodes in a cluster are running within the same centralized network infrastructure and without the overhead of a decentralized consensus protocol; otherwise, they are not significantly better or worse than a SQL-based DB engine like Postgres or SQLite. In fact, for truly decentralized cryptocurrencies like Gini, the choice is mostly based on personal preference and tooling, e.g., whether a person prefers collections/objects vs. tables/rows and JSON vs. SQL, and whether a language provides interfaces that reduce the complexity of designing and optimizing DB queries.

For all those reasons, for the Gini BlockGrid, we have chosen the rock-solid and fast SQLite database engine, which has been proven and reliable for nearly two decades in some of the most demanding computing environments on Earth.
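For readers who want a rough feel for SQLite's insert throughput on their own machine, the following sketch times a batch of inserts wrapped in a single transaction. SQLite's own documentation notes that insert rates of this order depend on grouping many inserts into one transaction, since each commit is comparatively expensive. The `payments` table and the `insert_batched` helper are hypothetical, the example uses Python rather than Haskell for brevity, and it runs against an in-memory database, so on-disk numbers will differ.

```python
import sqlite3
import time

def insert_batched(conn, rows):
    """Insert all rows inside one transaction; return elapsed seconds."""
    start = time.perf_counter()
    with conn:  # a single commit for the whole batch
        conn.executemany(
            "INSERT INTO payments (sender, receiver, amount) VALUES (?, ?, ?)",
            rows,
        )
    return time.perf_counter() - start

conn = sqlite3.connect(":memory:")  # in-memory: no disk sync cost is measured
conn.execute(
    "CREATE TABLE payments ("
    " id INTEGER PRIMARY KEY,"
    " sender TEXT NOT NULL,"
    " receiver TEXT NOT NULL,"
    " amount INTEGER NOT NULL)"
)

# Synthetic payments, generated only for this measurement.
rows = [(f"user{i}", f"user{i + 1}", i % 100 + 1) for i in range(50_000)]

elapsed = insert_batched(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(f"{count} rows in {elapsed:.3f} s ({count / elapsed:,.0f} inserts/s)")
```

The absolute numbers depend heavily on hardware, storage and durability settings, so treat the output as a sanity check rather than a benchmark.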

