Enterprise Database

A Fully Dockerized MySQL to YugabyteDB Migration Strategy Using pgloader


Many teams began their journey with relational databases on the simple and popular MySQL. As business use cases have grown beyond read optimization and toward more performant, full-fledged, read/write-optimized OLTP systems, a widespread migration from MySQL to Postgres has followed.

Along with this, the transition from monolithic to cloud-native has also paved the way for distributed SQL systems that allow for read/write functionality in every node of the database cluster (while maintaining ACID-compliance across all nodes) and cloud-agnostic deployments of these nodes across geographic zones and regions. This is the future of the database, a future where reliability, accessibility, and scalability are built into the product. The future of the database is YugabyteDB.

From MySQL to YugabyteDB, fast!

We will migrate a MySQL database to YugabyteDB using pgloader, a reliable tool for migrating from MySQL (and even SQL Server) to Postgres. We will first migrate the MySQL database to a Dockerized Postgres instance using Dockerized pgloader.

Once the MySQL database has been migrated to Postgres, we will then use the ysql_dump utility that comes with every installation of YugabyteDB to dump the Postgres database into a YugabyteDB-friendly format. This is one of the very useful traits of ysql_dump: it ensures that your Postgres dump can be fully restored in a YugabyteDB instance.

After getting the dump, we will restore this dump in the blank YugabyteDB database that we’ve created beforehand, thereby completing the migration from MySQL to YugabyteDB!



1. Get the Postgres Docker container

docker run -e POSTGRES_HOST_AUTH_METHOD=trust -p 5432:5432 -d postgres:11

2. Create the MySQL database counterpart in Dockerized Postgres
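
The command for this step is not shown in the post. A minimal sketch, assuming the Postgres container from step 1 can be addressed by a name or id, is to create the empty target database with psql inside that container. The container name pg-staging and the database name appdb are hypothetical placeholders, and the helper only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: PG_CONTAINER and TARGET_DB are hypothetical placeholders.
PG_CONTAINER="${PG_CONTAINER:-pg-staging}"
TARGET_DB="${TARGET_DB:-appdb}"

# Build the docker/psql command that creates the empty target database.
create_db_cmd() {
  # $1 = container name or id, $2 = database name
  printf 'docker exec %s psql -U postgres -c "CREATE DATABASE %s;"\n' "$1" "$2"
}

# Print the command for review instead of running it.
create_db_cmd "$PG_CONTAINER" "$TARGET_DB"
```

Once the printed command looks right, it can be executed by piping it to sh, for example: create_db_cmd "$PG_CONTAINER" "$TARGET_DB" | sh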


3. Run Dockerized pgloader to load from MySQL to Dockerized Postgres

docker run --rm --name pgloader dimitri/pgloader:latest pgloader --debug mysql://<user name>:<password>@<ip address of MySQL DB server>:3306/<source database name> postgresql://postgres@<ip address of Dockerized Postgres>:5432/<destination database name>

*If a user error is encountered, make sure the user and IP address combination indicated in the error exists in the MySQL source and has access to the databases to be migrated.
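
As an illustration of that fix, the sketch below prints the SQL that would create the missing account and grant it read access. The user name, client IP, and database name are hypothetical, and SELECT is an assumed minimum; your source may need additional privileges.

```shell
#!/bin/sh
# Sketch only: user, host, and database values are hypothetical.
# Emits SQL that creates a MySQL account for the host pgloader connects from
# and grants it read access to the source database.
mysql_grant_sql() {
  # $1 = user, $2 = client host/IP as reported in the error, $3 = database
  printf "CREATE USER IF NOT EXISTS '%s'@'%s' IDENTIFIED BY '<password>';\n" "$1" "$2"
  printf "GRANT SELECT ON \`%s\`.* TO '%s'@'%s';\n" "$3" "$1" "$2"
}

# Print the statements for review.
mysql_grant_sql migrator 172.17.0.3 appdb
```

After replacing the password placeholder, the printed statements can be run on the MySQL server by an administrative user, for example through the mysql client.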

4. Since pgloader creates a Postgres schema named after the database and puts the tables there, we can move the tables to the “public” schema by running the following in psql against the destination database

DO $$
DECLARE
     l_old_schema NAME = '<schema name>';
     l_new_schema NAME = 'public';
     l_sql TEXT;
BEGIN
     -- Build one ALTER TABLE statement per ordinary table in the old schema
     FOR l_sql IN
          SELECT format('ALTER TABLE %I.%I SET SCHEMA %I', n.nspname, c.relname, l_new_schema)
          FROM pg_class c
               JOIN pg_namespace n ON n.oid = c.relnamespace
          WHERE n.nspname = l_old_schema AND
                c.relkind = 'r'
     LOOP
          RAISE NOTICE 'applying %', l_sql;
          EXECUTE l_sql;
     END LOOP;
END
$$;

5. In this example, we will be using Dockerized Yugabyte as the destination (also applies to other form factors)

a. 1-node cluster with no persistence: 

docker run -d --name yugabyte  -p7000:7000 -p9000:9000 -p5433:5433 -p9042:9042 yugabytedb/yugabyte:latest bin/yugabyted start --daemon=false

b. With persistence:

docker run -d --name yugabyte  -p7000:7000 -p9000:9000 -p5433:5433 -p9042:9042 -v ~/yb_data:/home/yugabyte/var yugabytedb/yugabyte:latest bin/yugabyted start --daemon=false

6. Go inside the Yugabyte container

a. To access the interactive terminal of the container:

docker exec -it <yugabyte container id> /bin/bash

b. Go to the bin directory:

cd /home/yugabyte/postgres/bin

c. Make sure the destination database exists in YugabyteDB (run this in ysqlsh):

CREATE DATABASE <destination yugabytedb name>;

d. Dump the database in the Postgres container:

./ysql_dump -h <ip address of Postgres container> -U postgres -d <database name of postgres db> -p 5432 -f <dump name>.sql

e. Restore the Postgres dump in the blank database in the YugabyteDB instance:

./ysqlsh -p 5433 -d <database name of destination yugabyte db> -f <dump name>.sql
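
Before declaring the migration complete, it can help to confirm that every table arrived with the same number of rows. The sketch below is one way to do that, assuming you have already captured per-table counts from both sides into two text files; the file layout (one "table_name row_count" pair per line) and the function name are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: compares per-table row counts captured from the Postgres source
# and the YugabyteDB destination. The file layout is an assumption:
# each file holds one "table_name row_count" pair per line.
compare_counts() {
  # $1 = counts from the source, $2 = counts from the destination
  awk '
    FILENAME == ARGV[1] { src[$1] = $2; next }   # remember source counts
    {
      seen[$1] = 1
      if (!($1 in src))        { print "extra in destination: " $1; bad = 1 }
      else if (src[$1] != $2)  { print "mismatch: " $1 " " src[$1] " vs " $2; bad = 1 }
    }
    END {
      for (t in src) if (!(t in seen)) { print "missing in destination: " t; bad = 1 }
      exit (bad ? 1 : 0)
    }
  ' "$1" "$2"
}

# Tiny self-contained demonstration with made-up counts.
src_file=$(mktemp); dst_file=$(mktemp)
printf 'customers 120\norders 450\n' > "$src_file"
printf 'customers 120\norders 449\n' > "$dst_file"
compare_counts "$src_file" "$dst_file" || echo "row counts differ"
rm -f "$src_file" "$dst_file"
```

One way to produce each file is to loop over the table names and run SELECT count(*) with psql -At against the Postgres container and with ysqlsh -At against YugabyteDB.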


And there you have it! You have successfully migrated your MySQL database to the future of the database. You have migrated to YugabyteDB!


Exist is your data solutions partner of choice!

Explore the next level of your digital transformation journey with big data and analytics. Let’s look at opportunities to better maximize your ROI by turning your data into actionable intelligence. Connect with us today, and we’ll proudly collaborate with you!


The Future of the Database: YugabyteDB


The journey to application modernization brought about by the cloud-native renaissance continues, and the enterprises that embrace it are enjoying real benefits. Speed, scalability, resiliency, and agility may seem to be mere industry buzzwords, but in reality they translate to better application deployment, performance, and availability, which further translate to what really matters: happy customers.

This has given rise to a matching need for databases to deliver the same speed, scalability, resiliency, and agility. Traditional databases implement access to the database cluster through a single master node, an approach that has proven untenable in commercial environments where the need to scale users, not just locally but across regions and geographies, has become pressing and ubiquitous.

This is where the gap is filled by YugabyteDB.


What is YugabyteDB?


YugabyteDB is a transactional, distributed SQL database that was designed primarily to possess the virtues of the cloud-native philosophy. Its creators wanted a chiefly OLTP database that was fast, easy to add more nodes to, able to tolerate node failures, upgradable without incurring any downtime, and deployable in all form factors (public/private cloud, VMs, and on-prem).

Being a distributed SQL database, it has automatic distribution of data across nodes in a cluster, automatic replication of data in a strongly consistent manner, support for distributed query execution so clients do not need to know about the underlying distribution of data, and support for distributed ACID transactions.

It is a multi-API database that exposes the following APIs (more will be added in the future): 

  • YSQL – an ANSI SQL, fully relational API that reuses the PostgreSQL 11.2 query layer and is wire-compatible with PostgreSQL
  • YCQL – a semi-relational SQL API that is based on the Cassandra Query Language

In CAP terms, it is a Consistent and Partition-tolerant (CP) database. If a network partition leaves a node unable to communicate with the other nodes and establish majority membership, the system prioritizes data consistency over availability: that node stops accepting writes, while the nodes that still form the majority remain unaffected.
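
The majority rule can be made concrete with a little arithmetic. The sketch below assumes a standard majority quorum over a replication factor RF (a general formulation, not something stated in this post): a write needs floor(RF/2) + 1 replicas, so floor((RF - 1)/2) replicas can be lost or partitioned away.

```shell
#!/bin/sh
# Sketch only: majority-quorum arithmetic for a replication factor RF.

# Replicas that must agree before a write is accepted.
majority() { echo $(( $1 / 2 + 1 )); }

# Replicas that can be lost while a majority still remains.
tolerated() { echo $(( ($1 - 1) / 2 )); }

for rf in 3 5 7; do
  echo "RF=$rf majority=$(majority "$rf") tolerated_failures=$(tolerated "$rf")"
done
```

With RF=3, for example, two replicas form the majority, so a single isolated node is the one that stops accepting writes.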

It is completely open source, released under the Apache 2.0 license.


What are the key benefits of YugabyteDB?

The following are some of the benefits that are immediately enjoyed “out-of-the-box”:

  • No single point of failure given all nodes are equal.
  • Distributed transactions across any number of nodes.
  • Scale write throughput linearly across multiple nodes and/or geographic regions.
  • Low-latency reads and high-throughput writes.
  • Strongly consistent, zero data loss writes.
  • Cloud-neutral deployments with a Kubernetes-native database.
  • Automatic failover and native repair.
  • 100% Apache 2.0 open source even for enterprise features.

In other words, you get a cloud-native, transactional, distributed SQL database system that allows you to read and write on every node in the cluster (with ACID assurance), distribute your application load across many nodes in many regions and geographies, read and write data fast, deploy anywhere, and be highly available—all in open source!


Use Cases

YugabyteDB is perfect for: [Figure: Use Cases of YugabyteDB]

Just this morning, social media personality James Deakin posted on his Facebook wall about a particular bank whose “app feels like it’s running on windows 95” (his own words). He ended up closing his account because of the poor overall customer experience, driven by the subpar performance of the bank’s client-facing Internet applications, along with other concerns.

YugabyteDB is perfect for client-facing, transactional Internet applications.

Want to know more about the Yuggernaut of Distributed SQL? Contact us.


Introducing PostgrEX: How to Fulfill Your Database SLAs Without Having to Sell a Kidney


In a past blog post, I gave the definition of software as being enterprise-grade in the following manner:

A piece of software is enterprise-grade when it caters to the needs of not a single individual, nor a select group of individuals, but the whole organization. When applied to database management systems, an enterprise database is an implementation of database software that serves the organization by managing their humongous collection of data. It must be robust enough to handle queries from hundreds to tens of thousands of users at a time. It must also have a host of features that are geared towards improving the productivity and efficiency of the organization, such as multi-processing, parallel queries, and clustering, to name a few.

To tease it out a little bit further, I would like to propose that a database implementation is “enterprise” when it possesses the following attributes:

1.    A database engine that has proven itself in a multitude of business applications globally over a span of decades

2.    Able to meet strict SLAs (at least 5 nines) through high availability and failover mechanisms

3.    Monitoring

4.    Backup and Recovery Management

5.    Connection Pooling
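
To put the “5 nines” in attribute 2 into perspective, the sketch below converts an availability percentage into the downtime it allows per year. It assumes a 365-day year and is plain arithmetic, not a figure taken from any specific SLA.

```shell
#!/bin/sh
# Sketch only: allowed downtime per year for a given availability percentage,
# assuming a 365-day year.
downtime_minutes_per_year() {
  # $1 = availability in percent, e.g. 99.999
  awk -v a="$1" 'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}

downtime_minutes_per_year 99.999   # five nines
downtime_minutes_per_year 99.9     # three nines
```

Five nines leaves only a little over five minutes of downtime per year, which is why failover has to be automated rather than manual.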

Traditionally, enterprise database implementations have been costly investments and organizations have been willing to pay the price given the criticality of data to any business endeavor. But given the current economic climate brought on by the COVID-19 pandemic, along with the perennial need for businesses to streamline costs in order to divert savings into the core business, many are asking: Is there a better, more cost-efficient way of implementing a database solution without sacrificing enterprise-ness?

The answer is: most certainly! Let me introduce you to PostgrEX.

What is PostgrEX?

PostgrEX is shorthand for Postgres EXIST Enterprise Xpertise.

It is an enterprise-grade database platform built on top of a purely open-source technology stack and is part of EXIST Software Labs Inc.’s Data Solutions.

What are the components of PostgrEX?

1.    Scoping and sizing of DB hardware

We will recommend the hardware specifications (memory, CPU, storage, networking, etc.) that will be optimal for your business requirements based on the current and projected data growth, the total number of users, total concurrent users, largest table size, largest query size, etc.

2.    Installation

We will install the database system, along with the high availability/failover, monitoring, backup/recovery, and connection pooling components.

3.    Optimization

We will optimize the database configuration settings for the best possible performance given the hardware available.

4.    High Availability/Failover/Disaster Recovery

We will set up replication between the Postgres database servers at the main site (streaming replication, WAL shipping, or a combination of both), and we can also set up replication to a DR site.

We will also set up and configure Patroni, etcd, and HAProxy as part of the failover mechanism of the system.

5.    Monitoring

We will install, set up, and configure pgCluu as the default DB cluster monitoring tool.

6.    Backup and Recovery

We will install, set up, and configure Barman as the default DB backup and recovery management tool.

7.    Connection Pooling

We will install, set up, and configure pgBouncer as the default DB connection pooling tool.

8.    Query Optimization

We can also provide query optimization services to your developers in order to ensure tip-top application performance.

9.    Migration to Postgres

We can migrate your existing SQL Server, MySQL, and Oracle databases to Postgres CE.

What are the technologies used by PostgrEX?

1.    Database

Postgres, or PostgreSQL, is arguably the best open-source object-relational database management system available today. It was DB-Engines’ “DBMS of the Year” for two years straight (2017 and 2018), and has proven itself in mission-critical applications across all industry verticals.

See: Why use PostgreSQL for your Business?

2.    High Availability and Failover

Patroni – an open-source Python application that handles Postgres configuration and is ideal for HA applications. See Patroni documentation.

etcd – a fault-tolerant, distributed key-value store that is used to store the state of the Postgres cluster. See etcd documentation.

HAProxy – provides a single endpoint to which you can connect the application. It forwards the connection to whichever node is currently the master. It does this using a REST endpoint provided by Patroni. Patroni ensures that, at any given time, only the master Postgres node will appear as online, forcing HAProxy to connect to the correct node. See HAProxy documentation.
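
The health check described above can be reproduced by hand. The sketch below assumes Patroni's REST API is listening on its default port 8008 and uses its /primary endpoint, which answers with HTTP 200 only on the current leader; the host names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: interpret Patroni's REST health check the way a load balancer would.

# Map the HTTP status returned by GET /primary to a role.
role_from_status() {
  case "$1" in
    200) echo primary ;;
    *)   echo not-primary ;;
  esac
}

# Query one node and report whether it is currently the primary.
check_node() {
  # $1 = host, $2 = REST port (defaults to 8008)
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://$1:${2:-8008}/primary")
  echo "$1 $(role_from_status "$code")"
}
```

Running check_node pg-node-1 and check_node pg-node-2 against a healthy cluster should report exactly one primary, which is the node HAProxy sends connections to.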

3.    Monitoring

pgCluu – a lightweight, open-source Postgres monitoring and auditing tool. See pgCluu documentation.

4.    Backup and Recovery

Barman – an open-source backup and recovery management tool. See Barman documentation.

5.    Connection Pooling

pgBouncer – a lightweight, open-source connection pooler for Postgres. See pgBouncer documentation.

Moving Forward with PostgrEX

Is your organization ready to face the challenges of an uncertain future? Having enough money in the bank is certainly a top priority and doing away with unnecessary and exorbitantly-priced database license costs is one way of doing this.

With PostgrEX, your business applications can still enjoy industry-recognized, top-level, enterprise database excellence through the use of expertly-configured, purely open-source technologies. This means you get to keep your kidney to live and fight another day—and many other days!

Contact us for more information.

Download our datasheet now!