Google opens up the Spanner database with a PostgreSQL interface
Search engine and cloud computing juggernaut Google is hosting its Google Cloud Next ’21 conference this week, and one of the most interesting things the company has unveiled is a set of software layers that give its globally distributed relational database, Spanner, the look and feel of the popular open source PostgreSQL relational database.
Making the Cloud Spanner implementation of Spanner (meaning not the version that Google itself uses for internal workloads such as ad serving and data analytics, but the one that is exposed through Google Cloud and sold as a service) look like PostgreSQL immediately opens Cloud Spanner to a wider variety of customers, especially among businesses. It is not clear whether the internal version of Spanner that Google uses for its own work also supports PostgreSQL in the same way, but it would make sense, as PostgreSQL is quickly becoming the interface of choice for developers choosing the open source relational database that sits behind the applications they create.
The days when MySQL was the default choice, and the interface that many NoSQL and NewSQL databases were coded to when they were first created a decade or more ago, are fading away, and that has much to do with Oracle’s $7.4 billion acquisition of Sun Microsystems in early 2010. Two years earlier, Sun had paid $1 billion to buy MySQL, which was pretty much the leading open source relational database and a threat to all relational database vendors; the major vendors at risk were Oracle, Microsoft, and IBM. Once Oracle took control of MySQL and the database was forked as MariaDB, attention turned to PostgreSQL as developers searched for a new Switzerland on which to build their applications. And so the rise of PostgreSQL was pretty much guaranteed.
There is still a lot of MySQL in the world, which is why Google launched its Cloud SQL implementation of MySQL on Google Cloud in 2011. Google then released managed service implementations of PostgreSQL and Microsoft SQL Server under the Cloud SQL brand. In the case of the PostgreSQL variant, it is real PostgreSQL under the hood, meaning the service has the same scalability limits that PostgreSQL customers often struggle with, but it does have the merit of providing 100 percent compatibility with the open source PostgreSQL database.
This is not the case with the PostgreSQL layer on Cloud Spanner, and it is probably not the case with any PostgreSQL layer that Google supports internally on the real Spanner database, if it does indeed support PostgreSQL there.
In a blog post, Justin Makeig, product manager for Cloud Spanner, said that customers should not expect universal compatibility with PostgreSQL functionality from the PostgreSQL layer on top of Cloud Spanner.
“This draft is the first of a much larger long-term investment to make Spanner more open and accessible,” Makeig explained. “Initially, the PostgreSQL Spanner interface supported a basic subset of the functionality offered by PostgreSQL. By design, they align with current Spanner features that power a wide variety of critical applications in production today. Queries and schemas that use the PostgreSQL interface will have the same semantics as other PostgreSQL environments. 100% compatibility with PostgreSQL is not the goal. We focused on familiarity and portability, providing easier access to the consistency and availability of Spanner at scale without reducing deployment flexibility.”
This is an important distinction, but to be fair, Google is doing more than adding support for the PostgreSQL wire protocol to its database, something that several databases, including the Spanner-inspired CockroachDB, have supported since their first day.
The integration of PostgreSQL into Cloud Spanner is thorough; it is not just a translation overlay. At the database schema level, the PostgreSQL interface for Cloud Spanner supports native PostgreSQL data types and its Data Definition Language (DDL), which is the syntax for creating users, tables, and indexes for databases. The result is that a schema written for the PostgreSQL interface for Cloud Spanner will port to and run on any real PostgreSQL database, which means customers are not trapped in Google Cloud if they use this service in production and later want to move. But customers should be careful. Spanner features such as table interleaving have been added to the PostgreSQL layer because they are important in Spanner, and a schema that relies on them can leave you stuck. Makeig says Google has tried to minimize those exceptions so customers don’t feel locked in. (They’ll decide how locked in they feel, and maybe afterwards it’ll be too late if they’re not careful.)
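To make the portability caveat concrete, here is a minimal sketch of the kind of check a team might run before assuming a schema will move to stock PostgreSQL. The table and column names are hypothetical, and we are assuming that the interleaving extension is spelled INTERLEAVE IN PARENT, as it is in Spanner’s own DDL; neither detail comes from Google’s announcement.

```python
import re
from typing import List

# Hypothetical schema for illustration; these tables are not from Google's post.
SCHEMA = """
CREATE TABLE singers (
    singer_id BIGINT PRIMARY KEY,
    name      VARCHAR(256)
);
CREATE TABLE albums (
    singer_id BIGINT,
    album_id  BIGINT,
    title     TEXT,
    PRIMARY KEY (singer_id, album_id)
) INTERLEAVE IN PARENT singers ON DELETE CASCADE;
"""

# Assumed spelling of the Spanner-only interleaving clause.
SPANNER_ONLY_CLAUSES = [
    re.compile(r"\bINTERLEAVE\s+IN\s+PARENT\b", re.IGNORECASE),
]


def non_portable_statements(ddl: str) -> List[str]:
    """Return the DDL statements that use a Spanner-only clause.

    Splitting on ';' is naive (it ignores string literals and comments),
    which is acceptable for a sketch but not for a real migration tool.
    """
    statements = [s.strip() for s in ddl.split(";") if s.strip()]
    return [
        s for s in statements
        if any(clause.search(s) for clause in SPANNER_ONLY_CLAUSES)
    ]


flagged = non_portable_statements(SCHEMA)
for statement in flagged:
    # Print only the first line of each flagged statement.
    print("Not portable to stock PostgreSQL:", statement.splitlines()[0])
```

In this sketch only the second table is flagged, because the first uses nothing beyond ordinary PostgreSQL DDL.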
The PostgreSQL interface for Cloud Spanner compiles PostgreSQL queries down to Spanner’s native distributed query processing and storage primitives. It does not merely support the PostgreSQL wire protocol, which is what allows clients and a myriad of third-party analytics tools to interact with a PostgreSQL database. This last element is the important thing for the business customers that Google is trying to attract to its cloud. Companies have applications built on PostgreSQL itself or on third-party data analysis and dashboard applications that rely on the PostgreSQL protocol, and with this layer supported, PostgreSQL databases and their applications and ancillary programs can all be moved to Google Cloud. This is mainly what it is about.
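For readers wondering what supporting the wire protocol involves, here is a minimal sketch of the first message a client sends under version 3.0 of the PostgreSQL frontend/backend protocol: a length-prefixed startup packet carrying the protocol version and a list of parameters. The user and database names are hypothetical, and the sketch says nothing about how Cloud Spanner exposes its endpoint; in practice an application would rely on an existing PostgreSQL driver rather than build packets by hand.

```python
import struct

# Protocol 3.0 is encoded as major version 3 in the high 16 bits: 196608.
PROTOCOL_VERSION_3_0 = 3 << 16


def build_startup_message(params: dict) -> bytes:
    """Build a PostgreSQL protocol 3.0 StartupMessage.

    Layout: int32 total length (including itself), int32 protocol version,
    then NUL-terminated name/value string pairs, then a final NUL byte.
    """
    body = struct.pack("!I", PROTOCOL_VERSION_3_0)
    for name, value in params.items():
        body += name.encode("utf-8") + b"\x00"
        body += value.encode("utf-8") + b"\x00"
    body += b"\x00"  # terminator after the last pair
    return struct.pack("!I", len(body) + 4) + body


# Hypothetical connection parameters, for illustration only.
message = build_startup_message({"user": "app_user", "database": "orders"})
print(len(message), message.hex())
```

Any server that answers this message, and the ones that follow it, the way PostgreSQL does will look like PostgreSQL to existing drivers and tools, which is why protocol support alone is enough for many databases to claim compatibility.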
That said, Google also wants to attract new developers who code new applications on its eponymous cloud and the cloud variant of its Spanner database, and the best way to do that is to support the PostgreSQL interface. This way, these applications can start on the Cloud SQL for PostgreSQL service until they exceed its fairly limited scale, at which point they can move to Cloud Spanner. (The limit is not one that Google artificially imposes; it is an inherent limit of open source databases like MySQL and PostgreSQL.) For big database jobs, it is no coincidence that companies stick to Oracle, DB2, or SQL Server and deploy them, quite heavily, on big NUMA servers. Horizontal scaling for a true relational database that can do OLTP and OLAP at the same time is not trivial, which is why Google created Spanner in the first place, to do a better job than its Megastore and Bigtable NoSQL databases.
Google is previewing the PostgreSQL interface now and says it adds very little overhead compared to native Cloud Spanner and offers the same 99.999% uptime guarantee along with the same consistency and data security.
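To put five nines in perspective, the arithmetic below converts an availability percentage into allowed downtime per year. This is only a back-of-the-envelope calculation assuming a 365-day year; how Google actually measures and enforces its guarantee is not described in the announcement.

```python
def downtime_minutes_per_year(availability_percent: float,
                              days_per_year: float = 365.0) -> float:
    """Convert an availability percentage into minutes of downtime per year."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days_per_year * 24 * 60


five_nines = downtime_minutes_per_year(99.999)
print(f"{five_nines:.2f} minutes per year")  # roughly 5.26 minutes
```

By the same arithmetic, a 99.99 percent guarantee would allow a little under 53 minutes of downtime a year, ten times as much.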
We still think Google should go all the way and open source Spanner, like it did with Kubernetes, but it will be a very cold day before that happens. Spanner is the very data glue that holds Google together, and it is far more important today than the MapReduce data analysis method and the Google File System that underpins it were when MapReduce was unveiled in 2004, inspiring the creation of Hadoop.