Database Migration: A TypeScript-guided journey from MongoDB to PostgreSQL

By Matheus Castiglioni

This TypeScript-guided transition from MongoDB to PostgreSQL shows how developers can adapt a backend architecture so that the database layer can be swapped on the fly with no breaking changes. Businesses gain improved scalability, performance and data integrity without disrupting production, fostering uninterrupted development and growth.

How to smoothly transition between databases without maintenance mode, production interruptions or outages

In the world of backend development, adaptability and scalability are paramount. This article delves into a real-world scenario where a Node.js backend, powered by TypeScript and Fastify, smoothly transitioned from MongoDB to PostgreSQL with no maintenance mode, production interruptions or outages.

Feel free to take a look at the simple final code examples.

Setting the stage: The MongoDB era

Our journey begins with a robust backend API, leveraging MongoDB as a data storage layer. TypeScript’s embrace of interfaces and contracts allowed for a structured and reliable codebase. The use of design patterns—specifically, the repository pattern for data abstraction and the service pattern for encapsulating business logic—ensured a well-organised architecture.

MongoDB was originally chosen as the storage layer for several reasons:

  • Flexible schema: MongoDB uses a flexible schema design (NoSQL) where documents in a collection don’t need to have a uniform structure. This flexibility is advantageous when dealing with evolving data models.

  • Scalability: It’s designed to scale horizontally by adding more machines to the database, allowing it to handle large volumes of data and high traffic loads.

  • Performance: Its ability to store related data in a single document and support for embedded data models can enhance query performance. Additionally, it supports indexing for faster queries.

  • JSON-like documents: MongoDB stores data in a format similar to JSON, which is often more natural for developers and can simplify data integration with applications.

  • Replication and high availability: It offers built-in replication, ensuring data redundancy and high availability in case of server failures.

  • Automatic sharding: MongoDB can automatically partition data across multiple servers, allowing for efficient distribution and scaling of data.

However, it’s important to note that while MongoDB has these advantages, it might not be the best choice for every use case. Factors such as specific data requirements, transactional needs and the nature of the application should be considered when choosing a storage layer.

The need for change: Introducing PostgreSQL

As the project evolved, it became evident that the system’s requirements were shifting towards a relational model. Referential integrity, data consistency and strong relationships between entities were the key factors in replacing MongoDB. A relational database was clearly needed, and PostgreSQL was the option the team chose, for many reasons:

  • Advanced features: It offers a wide range of advanced features, including support for JSON and other semi-structured data types, full-text search, window functions, common table expressions, and more.

  • Extensibility: PostgreSQL allows users to define their own data types, indexes and functions, enabling custom extensions suited to specific needs.

  • Community and support: PostgreSQL has a strong, active community that provides continuous support, updates, and a vast array of documentation and resources.

  • Performance: It’s known for its performance and scalability. With proper optimisation and configuration, PostgreSQL can handle large amounts of data and complex queries efficiently.

  • Open Source: Being open source means it’s free to use, and the community actively contributes to its development, ensuring continuous improvements and updates.

  • ACID compliance: PostgreSQL ensures data integrity and consistency by adhering to ACID (Atomicity, Consistency, Isolation, Durability) principles, crucial for many applications.

  • Cross-platform compatibility: It’s available on various platforms like Windows, Linux, macOS and others, making it versatile and widely accessible.

  • Security: PostgreSQL offers robust security features, including SSL support, data encryption, access controls and authentication methods, making it suitable for applications handling sensitive information.

  • Maturity and stability: With a long history of development and refinement, PostgreSQL is mature and considered highly stable for mission-critical applications.

  • Ecosystem and tools: It has a rich ecosystem of tools, libraries, and extensions that enhance its functionality and ease of use, catering to different application requirements.

Choosing PostgreSQL over other relational databases often depends on specific project needs, but its rich feature set, performance, reliability and strong community support make it a compelling choice for many applications.

The refactoring process: From MongoDB to PostgreSQL

Initially, a repository interface was created to act as a contract between the storage layer and the application:

type Nullable<T> = T | null
type WithNoId<T> = Omit<T, 'id'>

// Interface for repositories
export interface Repository<T> {
  create(product: WithNoId<T>): Promise<T>
  getById(id: string): Promise<Nullable<T>>
  // Other database operations...
}

An infrastructure repository was then implemented to connect the application to the MongoDB service:

type Product = {
  id: string
  name: string
}

// MongoDB repository implementation
export class ProductRepository implements Repository<Product> {
  async create(product: WithNoId<Product>): Promise<Product> {
    // MongoDB persistence logic...
  }

  async getById(id: string): Promise<Nullable<Product>> {
    // MongoDB retrieval logic...
  }

  // Other MongoDB operations...
}

With PostgreSQL in sight, a new repository was created alongside the old one, meticulously adhering to the same interface as the MongoDB repository.

type Product = {
  id: string
  name: string
}

// PostgreSQL repository implementation
export class ProductRepository implements Repository<Product> {
  constructor(private readonly database: Client) {}

  async create(product: WithNoId<Product>): Promise<Product> {
    // PostgreSQL persistence logic...
  }

  async getById(id: string): Promise<Nullable<Product>> {
    // PostgreSQL retrieval logic...
  }

  // Other PostgreSQL operations...
}
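
Because both classes honour the same contract, everything above the repository layer can depend on the interface alone. As a minimal sketch (assuming a hypothetical ProductService that is not shown in the original codebase), the business logic never mentions MongoDB or PostgreSQL and is therefore untouched by the swap:

// Hypothetical service encapsulating business logic behind the repository contract
export class ProductService {
  constructor(private readonly repository: Repository<Product>) {}

  async register(name: string): Promise<Product> {
    // The business rule lives here, unaware of which database backs the repository
    const trimmed = name.trim()
    if (trimmed.length === 0) {
      throw new Error('Name is required to create a new product')
    }
    return this.repository.create({ name: trimmed })
  }

  async findById(id: string): Promise<Nullable<Product>> {
    return this.repository.getById(id)
  }
}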

Unveiling TypeScript’s power: Interfaces & contracts

Central to this transition was TypeScript’s utilisation of interfaces to define clear contracts for repositories. These interfaces served as a blueprint, ensuring standardised interactions with the database layer, abstracted from specific implementations.

// Interface for repositories
export interface Repository<T> {
  create(product: WithNoId<T>): Promise<T>
  getById(id: string): Promise<Nullable<T>>
  // Other database operations...
}


// MongoDB repository implementation
export class ProductRepository implements Repository<Product> {
  async create(product: WithNoId<Product>): Promise<Product> {
    // MongoDB persistence logic...
  }

  async getById(id: string): Promise<Nullable<Product>> {
    // MongoDB retrieval logic...
  }

  // Other MongoDB operations...
}

// PostgreSQL repository implementation
export class ProductRepository implements Repository<Product> {
  constructor(private readonly database: Client) {}

  async create(product: WithNoId<Product>): Promise<Product> {
    // PostgreSQL persistence logic...
  }

  async getById(id: string): Promise<Nullable<Product>> {
    // PostgreSQL retrieval logic...
  }

  // Other PostgreSQL operations...
}

Architectural elegance: Injecting the correct database service using dependency injection

The repository pattern decouples data access logic and keeps the application abstracted from the database engine. Together with the service pattern, it maintained a modular structure, easing the process of swapping database services.

// Product repository depending on the postgres client
import crypto from 'node:crypto'
import SQL from '@nearform/sql'
import { Client } from 'pg'

export class ProductRepository implements Repository<Product> {
  constructor(private readonly database: Client) {}

  async create(product: WithNoId<Product>): Promise<Product> {
    const result = await this.database.query(SQL`INSERT INTO products (id, name) VALUES (${crypto.randomUUID()}, ${product.name}) RETURNING *`)

    return {
      id: result.rows[0].id,
      name: result.rows[0].name
    }
  }

  async getById(id: string): Promise<Nullable<Product>> {
    const result = await this.database.query(SQL`SELECT p.* FROM products p WHERE p.id=${id}`)

    if (result.rows.length > 0) {
      return {
        id: result.rows[0].id,
        name: result.rows[0].name
      }
    }

    return null
  }

  // Other product-specific operations...
}

// Product repository depending on the mongodb client
import crypto from 'node:crypto'
import { Model } from 'mongoose'

export class ProductRepository implements Repository<Product> {
  constructor(private readonly database: Model<Product>) {}

  async create(product: WithNoId<Product>): Promise<Product> {
    return (await this.database.create({ id: crypto.randomUUID(), name: product.name })).toObject()
  }

  async getById(id: string): Promise<Nullable<Product>> {
    return await this.database.findOne({ id })
  }
  // Other product-specific operations...
}

Having leveraged constructor-based dependency injection, the new repository seamlessly replaced the old one within the existing codebase.

Harnessing dependency injection: Ensuring seamless integration

The implementation of dependency injection allowed for the effortless switch between the database layers. By injecting the PostgresProductRepository, the backend continued to operate flawlessly, abstracted from the intricacies of the underlying database technology.

import Fastify, { FastifyReply } from 'fastify'
import Postgrator from 'postgrator'
import { CreateProductInput, GetProductInput, ProductController } from './controllers/product.controller'
import { mongoose } from './infrastructure/mongo.client'
import { postgres } from './infrastructure/postgres.client'
import { ProductRepository as MongoProductRepository } from './repositories/mongo/product.repository'
import { ProductRepository as PostgresProductRepository } from './repositories/postgres/product.repository'
import { Product } from './models/product.model'

(async () => {
  const postgrator = new Postgrator({
    database: 'postgres',
    driver: 'pg',
    migrationPattern: `${__dirname}/database/migrations/*.sql`,
    schemaTable: 'migrations_version',
    execQuery: (query) => postgres.query(query),
  });

  // const repository = new MongoProductRepository(Product)
  const repository = new PostgresProductRepository(postgres)

  const productController = new ProductController(repository)
  const fastify = Fastify({
    logger: true
  })

  try {
    await mongoose.connect('mongodb://root:root@127.0.0.1:27017/journey_from_mongo_to_pg?authSource=admin')
    await postgres.connect()
    await postgrator.migrate()
  } catch (err) {
    throw err
  }

  fastify.post('/products', (request: CreateProductInput, reply: FastifyReply) => productController.createProduct(request, reply))
  fastify.get('/products/:productId', (request: GetProductInput, reply: FastifyReply) => productController.getProduct(request, reply))

  fastify.listen({ port: 3000 }, async (err, address) => {
    if (err) {
      fastify.log.error(err)
      process.exit(1)
    }

    fastify.log.debug(`Server is now listening on ${address}`)
  })
})()

The snippet above sets up a web server using the Fastify framework, connects to both the MongoDB and PostgreSQL databases, runs any pending PostgreSQL migrations through Postgrator, and defines the routes for handling product-related operations:

  1. Create a new product

  2. Get a product by id
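
The Postgrator configuration above expects plain SQL files under database/migrations. The article doesn’t show the schema itself; as an assumption implied by the queries used throughout (a products table with a UUID id and a name), a minimal first migration, named according to Postgrator’s version.action.sql convention (for example a hypothetical 001.do.sql), could look like:

-- database/migrations/001.do.sql (hypothetical; schema assumed from the article's queries)
CREATE TABLE IF NOT EXISTS products (
  id UUID PRIMARY KEY,
  name TEXT NOT NULL
);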

Compile-time assurance: TypeScript’s safety net

Crucially, TypeScript’s compile-time checks acted as a safety net throughout the transition. This inherent capability ensured that any potential issues were caught early, guaranteeing the integrity of the API functionality.

// package.json
{
  "scripts": {
    "type:check": "tsc"
  }
}

npm run type:check
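
Anything that drifts from the contract is rejected before it ships. As an illustration (a deliberately broken, hypothetical repository that is not part of the original codebase), an implementation whose getById returns the wrong type simply refuses to compile:

import crypto from 'node:crypto'

// Deliberately broken: getById no longer honours the Repository<Product> contract
export class BrokenProductRepository implements Repository<Product> {
  async create(product: WithNoId<Product>): Promise<Product> {
    return { id: crypto.randomUUID(), ...product }
  }

  // tsc error: 'getById' is not assignable to the same property in Repository<Product>,
  // because Promise<string> is not assignable to Promise<Nullable<Product>>
  async getById(id: string): Promise<string> {
    return id
  }
}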

Storage migration

This transition involves gradually shifting read operations from MongoDB to PostgreSQL while initially writing data to both databases for redundancy. Eventually, all operations are centralised to PostgreSQL for both reading and writing, completing the migration process.

Writing into MongoDB and PostgreSQL, reading from MongoDB

Initially, the system writes data into both MongoDB and PostgreSQL for redundancy but reads exclusively from MongoDB. One caveat: for reads to keep working once they later move to PostgreSQL, both writes must end up under the same identifier, so in practice the id should be generated once (for example in the controller or a service) and passed down to both repositories rather than generated independently inside each one.

// Controller handling product logic
export type CreateProductInput = FastifyRequest<{ Body: { name: string } }>
export type GetProductInput = FastifyRequest<{ Params: { productId: string }}>

export class ProductController {
  constructor(private readonly postgresRepository: Repository<Product>, private readonly mongoRepository: Repository<Product>) {}

  async createProduct(request: CreateProductInput, reply: FastifyReply) {
    try {
      const product = request.body

      if (product?.name?.trim()?.length > 0) {
        // Save product using both repositories
        await this.mongoRepository.create(product)
        return reply.send({ data: await this.postgresRepository.create(product) })
      }

      return reply.code(400).send({ message: 'Name is required to create a new product' })
    } catch (err) {
      throw err
    }
  }

  async getProduct(request: GetProductInput, reply: FastifyReply) {
    try {
      const productId = request.params.productId
      // Read product using mongo repository
      const product = await this.mongoRepository.getById(productId)

      if (!product) {
        return reply.code(404).send({ message: `No product found for id ${productId}` })
      }

      return reply.send({ data: product })
    } catch (err) {
      throw err
    }
  }
}

// Define the route for creating a product
fastify.post('/products', (request: CreateProductInput, reply: FastifyReply) => productController.createProduct(request, reply))

// Define the route for reading a product
fastify.get('/products/:productId', (request: GetProductInput, reply: FastifyReply) => productController.getProduct(request, reply))

Keep writing into both, start reading from PostgreSQL

As the transition progresses, maintain writing data into both databases but start reading from PostgreSQL.

// Controller handling product logic
export type CreateProductInput = FastifyRequest<{ Body: { name: string } }>
export type GetProductInput = FastifyRequest<{ Params: { productId: string }}>

export class ProductController {
  constructor(private readonly postgresRepository: Repository<Product>, private readonly mongoRepository: Repository<Product>) {}

  async createProduct(request: CreateProductInput, reply: FastifyReply) {
    try {
      const product = request.body

      if (product?.name?.trim()?.length > 0) {
        // Save product using both repositories
        await this.mongoRepository.create(product)
        return reply.send({ data: await this.postgresRepository.create(product) })
      }

      return reply.code(400).send({ message: 'Name is required to create a new product' })
    } catch (err) {
      throw err
    }
  }

  async getProduct(request: GetProductInput, reply: FastifyReply) {
    try {
      const productId = request.params.productId
      // Read product using postgres repository now
      const product = await this.postgresRepository.getById(productId)

      if (!product) {
        return reply.code(404).send({ message: `No product found for id ${productId}` })
      }

      return reply.send({ data: product })
    } catch (err) {
      throw err
    }
  }
}
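
The two phases above differ only in which repository serves reads. Rather than shipping a new controller for each phase, one option (a sketch, assuming a hypothetical READ_FROM environment variable that is not part of the original code) is a composite repository that dual-writes and routes reads to whichever store the flag selects, turning the cutover into a configuration change instead of a deployment:

// Sketch: a composite repository that writes to both stores and reads from one,
// chosen at runtime through a hypothetical READ_FROM flag
export class MigratingProductRepository implements Repository<Product> {
  constructor(
    private readonly mongo: Repository<Product>,
    private readonly postgres: Repository<Product>
  ) {}

  private get readSource(): Repository<Product> {
    return process.env.READ_FROM === 'postgres' ? this.postgres : this.mongo
  }

  async create(product: WithNoId<Product>): Promise<Product> {
    // Ideally the id is generated once and shared by both writes (see the caveat above)
    await this.mongo.create(product)
    return this.postgres.create(product)
  }

  async getById(id: string): Promise<Nullable<Product>> {
    return this.readSource.getById(id)
  }
}

Because it implements Repository<Product>, the single-repository controller from the next section can be used for the whole migration window.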

Write and read exclusively from PostgreSQL

Finally, stop writing into MongoDB and shift all read operations to PostgreSQL.

export type CreateProductInput = FastifyRequest<{ Body: { name: string } }>
export type GetProductInput = FastifyRequest<{ Params: { productId: string }}>

export class ProductController {
  constructor(private readonly repository: Repository<Product>) {}

  async createProduct(request: CreateProductInput, reply: FastifyReply) {
    try {
      const product = request.body

      if (product?.name?.trim()?.length > 0) {
        return reply.send({ data: await this.repository.create(product) })
      }

      return reply.code(400).send({ message: 'Name is required to create a new product' })
    } catch (err) {
      throw err
    }
  }

  async getProduct(request: GetProductInput, reply: FastifyReply) {
    try {
      const productId = request.params.productId
      const product = await this.repository.getById(productId)

      if (!product) {
        return reply.code(404).send({ message: `No product found for id ${productId}` })
      }

      return reply.send({ data: product })
    } catch (err) {
      throw err
    }
  }
}

Data migration from MongoDB to PostgreSQL using an ad-hoc script

Additionally, if the database holds only a modest number of records and the data model is simple, the data itself can be migrated with a single ad hoc script that reads every MongoDB document and persists it into the corresponding Postgres tables.

Execution steps:
  1. Initialising database connections: The script initialises connections to both MongoDB and PostgreSQL, using the mongoose client for MongoDB and the pg library for PostgreSQL. Connecting to both databases is essential to facilitate data extraction from MongoDB and insertion into PostgreSQL.

  2. Migrating data: Once connected, the script accesses the desired MongoDB collection and iterates through its documents. For each document, it constructs and executes an INSERT query into the corresponding PostgreSQL table. Careful mapping of MongoDB fields to PostgreSQL columns ensures accurate data transfer.

  3. Completion and clean-up: Upon migrating all relevant data, the script finalises its execution, closing the connections to both databases. It affirms the completion of the migration process and terminates the MongoDB and PostgreSQL connections to maintain resource efficiency.

Below, an ad hoc script demonstrates a seamless one-time data migration from MongoDB to PostgreSQL, maintaining data integrity without altering the application code.

import crypto from 'node:crypto'
import SQL from '@nearform/sql'
import { mongoose } from '../infrastructure/mongo.client'
import { postgres } from '../infrastructure/postgres.client'
import { Product } from '../models/product.model'

(async () => {
  console.log('Starting database migration')
  try {
    console.log('Connecting to mongo')
    await mongoose.connect('mongodb://root:root@127.0.0.1:27017/journey_from_mongo_to_pg?authSource=admin')
    console.log('Connecting to postgres')
    await postgres.connect()

    console.log('Getting all products from mongo')
    const products = await Product.find({})

    for (const product of products) {
      console.log(`Inserting product ${product.id} into postgres`)
      // Reuse the document's existing id so references to the product keep working after the switch
      const result = await postgres.query(SQL`INSERT INTO products (id, name) VALUES (${product.id}, ${product.name}) RETURNING *`)

      if (result.rows.length > 0) {
        console.log(`Product ${result.rows[0].id} inserted successfully`)
      } else {
        console.log(`Product with id ${product.id} was not inserted into postgres`)
      }
    }

    console.log('Closing postgres connection')
    await postgres.end()
    console.log('Closing mongo connection')
    await mongoose.disconnect()

    console.log('Database migration finished successfully')
  } catch(err) {
    throw err
  }
})()

This ad hoc script enables a smooth, one-time data migration from MongoDB to PostgreSQL, offering a practical approach to transition between databases without necessitating changes in the application codebase.

Adaptable to varying schemas, it ensures data consistency and integrity throughout the migration process. However, rigorous testing and appropriate backups are paramount before implementing such migrations in production environments.
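
One practical form that testing can take is a verification pass that compares the two stores before the MongoDB write path is removed. Below is a minimal sketch reusing the clients and model from the migration script above; the count query and the per-document spot-check are assumptions rather than code from the original project:

import SQL from '@nearform/sql'
import { mongoose } from '../infrastructure/mongo.client'
import { postgres } from '../infrastructure/postgres.client'
import { Product } from '../models/product.model'

(async () => {
  await mongoose.connect('mongodb://root:root@127.0.0.1:27017/journey_from_mongo_to_pg?authSource=admin')
  await postgres.connect()

  // Cheap first signal: compare record counts
  const mongoCount = await Product.countDocuments({})
  const pgCount = Number((await postgres.query(SQL`SELECT COUNT(*) AS count FROM products`)).rows[0].count)
  console.log(`mongo=${mongoCount} postgres=${pgCount} ${mongoCount === pgCount ? 'OK' : 'MISMATCH'}`)

  // Spot-check every document by id and name
  for (const product of await Product.find({})) {
    const result = await postgres.query(SQL`SELECT p.* FROM products p WHERE p.id=${product.id}`)
    if (result.rows.length === 0 || result.rows[0].name !== product.name) {
      console.log(`Mismatch for product ${product.id}`)
    }
  }

  await postgres.end()
  await mongoose.disconnect()
})()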

Conclusion: The power of TypeScript in adaptability

In conclusion, this journey from MongoDB to PostgreSQL demonstrates the power of TypeScript for building adaptable backend architectures. By leveraging interfaces, design patterns and TypeScript’s features, the team seamlessly navigated a fundamental change in the database layer while ensuring the continued robustness and functionality of the API.