Happy birthday pg_chameleon

It is two years today since I started working on pg_chameleon. Back in 2016 this commit changed the project’s license from GPL v2 to the 2-clause BSD, and changed the project’s scope, which became a MySQL to PostgreSQL replica system.

Since then I have learned many lessons, made several mistakes and worked out solutions, which resulted in a decent tool for bridging two different universes.

Writing a replica system is a very complex task. Even when the replica is designed to work within the same database engine, the issues are hidden, nasty and treacherous.

When bridging two different databases, the number of issues to address skyrockets.

Looking back, I still can’t believe what I made, considering that I worked on the project mostly during my daily commute, but hey, here we are.

And nowadays, judging from the pip downloads, it seems that the project is gaining popularity.

I also got some contributors willing to improve the project. I apologize to them for not having merged their changes yet. The reason is that I want to write the guidelines for contributing to the project first.

So what’s next?

First, I want to thank Percona for this blog post on how to use pg_chameleon.

With version 2.0, drafted during PGConf.EU 2017, the tool became more usable, more reliable and simpler to manage. I haven’t started working on version 2.1 yet, but these are some of the weak spots in pg_chameleon 2.0 that I want to fix.

init_replica speed. This problem affects large databases, because the procedure runs in a single process. I want the initialisation to be able to use multiple parallel jobs.
Not null fields. Currently any field added after the init_replica becomes nullable in PostgreSQL. I’ll try to keep the NOT NULL constraint in sync as well.
On error resync. Currently the tables that generate errors are pulled out of the replica. I want to make it possible to tell the daemon to resync such a table automatically in case of error.
Auto start after a synchronisation. Currently, after init_replica, refresh_schema or sync_tables, the replica process has to be started manually. I want to make it possible to start the replica automatically on completion.
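
To give a rough idea of what "multiple parallel jobs" could mean for the initial copy, here is a minimal Python sketch. It is not pg_chameleon code and says nothing about how version 2.1 will actually be written: copy_tables_in_parallel, copy_table and max_jobs are hypothetical names invented for this illustration, and copy_table stands for whatever function copies a single table from MySQL to PostgreSQL. Threads are used only to keep the example short; a real implementation would probably need one database connection per job and might prefer separate processes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def copy_tables_in_parallel(tables, copy_table, max_jobs=4):
    """Run copy_table(table) for every table, using up to max_jobs workers.

    copy_table is a hypothetical callable supplied by the caller.
    Returns (copied, failed): copied maps each table name to the value
    returned by copy_table, failed maps each table name to the exception
    raised while copying it.
    """
    copied, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # Submit one job per table and remember which future belongs to which table.
        futures = {pool.submit(copy_table, table): table for table in tables}
        for future in as_completed(futures):
            table = futures[future]
            try:
                copied[table] = future.result()
            except Exception as exc:
                # One failing table is recorded instead of aborting the other jobs.
                failed[table] = exc
    return copied, failed
```

Collecting the failures separately, instead of stopping at the first error, leaves the caller free to decide what to do with the problematic tables, for example copying them again later.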

Last but not least, I want to thank my former employer Transferwise for giving me the challenge and the opportunity to create a tool which is helping people to join two database technologies.

Happy birthday pg_chameleon!