database dev

Building a Semantic Similarity Search API with FastAPI, Sentence-BERT, and PostgreSQL pgvector

kiran sabne — Tue, 21 Oct 2025 19:30:08 GMT

In this article, I share how I built a semantic similarity search API using FastAPI, Sentence-BERT (SBERT), and PostgreSQL with pgvector.
The idea was to test whether we can achieve context-aware text search directly inside Postgres, without adding a separate vector database.
It turned out to be a simple yet powerful setup — perfect for real-world applications and quick POCs alike.

Full Detailed Implementation: Read the complete walkthrough here →

PostgreSQL Indexing: When BRIN Beats B-Tree

kiran sabne — Wed, 15 Oct 2025 19:30:17 GMT

Summary:

If you're dealing with huge tables and struggling with B-Tree index bloat or slow bulk inserts, it's time to look at BRIN (Block Range Indexes).

This article explains:

What BRIN indexes are and how they work internally
How they compare to B-Tree indexes in terms of size, performance, and maintenance
When BRIN is a better choice, especially for:

Time-series or append-only data
Huge, cold partitions
Queries filtering by time or sequential IDs

Key Takeaways:

BRIN indexes are tiny and fast to build, ideal for large tables.
They're not a B-Tree replacement, but a powerful companion in the right scenarios.
Proper data ordering makes or breaks BRIN performance.
Combine BRIN (e.g., on timestamps) with B-Tree (e.g., on IDs) for best results.

Check this post here: https://kiransabne.dev/postgresql-indexing-when-brin-is-a-better-choice-than-b-tree

Mastering MongoDB Locking, Concurrency, and Performance Optimization: A Deep Dive

kiran sabne — Sat, 19 Jul 2025 17:57:48 GMT

Concurrency is a critical aspect of database operations, ensuring multiple clients can read and write data simultaneously without compromising data integrity. MongoDB employs a robust locking and concurrency control system to handle these challenges efficiently. This blog post explores how MongoDB manages locks, explains optimistic and pessimistic locking patterns, and provides monitoring and optimization tips to boost performance in high-concurrency environments.

1. Understanding MongoDB’s Locking Mechanisms

With the WiredTiger storage engine (default since MongoDB 3.2), MongoDB uses document-level locking, allowing concurrent reads and writes at a granular level. However, it still internally tracks lock intents at global, database, and collection levels for coordination. Though developers don't directly manage locks like in traditional RDBMSs, MongoDB uses internal lock modes for resource coordination:

Types of Locks in MongoDB:

Shared (S) Lock:
- Purpose: Allows multiple clients to read a resource concurrently.
- Behavior: Coexists with other shared locks but blocks exclusive locks.
- Example: Multiple clients reading documents from the same collection.
Exclusive (X) Lock:
- Purpose: Grants exclusive write access to a resource.
- Behavior: Prevents any other operation (read or write) on the resource until released.
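- Example: Some administrative operations, such as dropping a collection, take an exclusive lock on that collection; ordinary document writes take intent exclusive (IX) locks instead.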
Intent Shared (IS) Lock:
- Purpose: Signals the intention to acquire shared locks on subordinate resources.
- Behavior: Placed at higher levels (e.g., database) when reading collections.
- Example: Reading a collection places an IS lock on the database.
Intent Exclusive (IX) Lock:
- Purpose: Indicates the intention to acquire exclusive locks.
- Behavior: Applied at higher levels to signal a lower-level exclusive lock.
- Example: A document update places an IX lock on the database.

2. Lock Compatibility Matrix

Understanding lock compatibility is essential for predicting how operations interact.

Held Lock \ Requested Lock	S	X	IS	IX
S	✔️	❌	✔️	❌
X	❌	❌	❌	❌
IS	✔️	❌	✔️	✔️
IX	❌	❌	✔️	✔️
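The matrix above can be transcribed into a small lookup table. This is only a sketch for reasoning about which requests would wait; `COMPATIBLE` and `canGrant` are hypothetical names, not part of any MongoDB API.

```javascript
// Compatibility of lock modes, transcribed from the matrix above.
// COMPATIBLE[held][requested] is true when the request can be granted.
const COMPATIBLE = {
  S:  { S: true,  X: false, IS: true,  IX: false },
  X:  { S: false, X: false, IS: false, IX: false },
  IS: { S: true,  X: false, IS: true,  IX: true  },
  IX: { S: false, X: false, IS: true,  IX: true  },
};

// Hypothetical helper: can `requested` be granted while every mode
// in `heldModes` is currently held on the same resource?
function canGrant(heldModes, requested) {
  return heldModes.every((held) => COMPATIBLE[held][requested]);
}

console.log(canGrant(["IS", "IX"], "IX")); // true
console.log(canGrant(["IS"], "S"));        // true
console.log(canGrant(["IX"], "S"));        // false
```

Intent locks are compatible with each other, which is why many readers and writers can work in the same database or collection at once.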

3. Real-World Implementation of Locking

MongoDB encourages application-level strategies for managing concurrency rather than relying solely on database locks.

Optimistic Locking (Versioning):

Optimistic locking assumes minimal conflict, updating data only if the document version is unchanged.

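async function updateDocument(collection, docId, newData) {
  // Read the current version; fail fast if the document does not exist.
  const document = await collection.findOne({ _id: docId });
  if (!document) {
    throw new Error('Document not found');
  }
  const currentVersion = document.version;

  // The filter matches only if nobody bumped the version since the read.
  const result = await collection.updateOne(
    { _id: docId, version: currentVersion },
    {
      $set: { data: newData },
      $inc: { version: 1 }
    }
  );

  if (result.modifiedCount === 0) {
    throw new Error('Document was modified by another process');
  }
}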
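To see why the version check matters, here is a minimal in-memory sketch of two writers racing on the same document. The `store` map and `updateIfVersion` helper are stand-ins invented for illustration; they imitate the filter-on-version update but do not use the MongoDB driver.

```javascript
// In-memory stand-in for a collection; not the MongoDB driver API.
const store = new Map([["doc1", { data: "v0", version: 1 }]]);

// Compare-and-set: apply the write only if the stored version still matches.
function updateIfVersion(id, expectedVersion, newData) {
  const doc = store.get(id);
  if (!doc || doc.version !== expectedVersion) return false;
  store.set(id, { data: newData, version: doc.version + 1 });
  return true;
}

// Two writers read the same version, then both try to write.
const seenByA = store.get("doc1").version;
const seenByB = store.get("doc1").version;

const aWon = updateIfVersion("doc1", seenByA, "from A"); // true
const bWon = updateIfVersion("doc1", seenByB, "from B"); // false, version moved on

// B recovers by re-reading and retrying with the fresh version.
const bRetry = updateIfVersion("doc1", store.get("doc1").version, "from B");
console.log(aWon, bWon, bRetry, store.get("doc1"));
```

The losing writer is never blocked; it simply finds out that its read is stale and decides whether to retry.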

Pessimistic Locking (Simulated Lock Field):

Pessimistic locking blocks other processes from accessing the resource by setting a lock.

async function acquireLock(collection, docId, lockId) {
  const result = await collection.updateOne(
    { _id: docId, lock: null },
    { $set: { lock: lockId } }
  );
  if (result.modifiedCount === 0) {
    throw new Error('Document is already locked');
  }
}

async function releaseLock(collection, docId, lockId) {
  await collection.updateOne(
    { _id: docId, lock: lockId },
    { $set: { lock: null } }
  );
}
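One weakness of a plain lock field is that a crashed process never releases it. A common mitigation is to store an expiry time next to the lock. The sketch below simulates that idea in memory; `lockedUntil`, `acquireLease`, and `releaseLease` are hypothetical names, and in MongoDB the same condition would have to be part of the `updateOne` filter so that the check and the write stay atomic.

```javascript
// In-memory stand-in for documents; `lockedUntil` is a hypothetical field.
const docs = new Map([["doc1", { lock: null, lockedUntil: 0 }]]);

// Acquire succeeds if the document is unlocked or the previous lock expired.
function acquireLease(id, owner, nowMs, ttlMs) {
  const doc = docs.get(id);
  const free = doc.lock === null || doc.lockedUntil <= nowMs;
  if (!free) return false;
  doc.lock = owner;
  doc.lockedUntil = nowMs + ttlMs;
  return true;
}

// Release only if the caller still owns the lock.
function releaseLease(id, owner) {
  const doc = docs.get(id);
  if (doc.lock !== owner) return false;
  doc.lock = null;
  doc.lockedUntil = 0;
  return true;
}

const t0 = 1000;
const first = acquireLease("doc1", "worker-a", t0, 5000);         // true
const blocked = acquireLease("doc1", "worker-b", t0 + 1000, 5000); // false, still held
const takeover = acquireLease("doc1", "worker-b", t0 + 6000, 5000); // true, lease expired
const staleRelease = releaseLease("doc1", "worker-a");             // false, no longer owner
console.log(first, blocked, takeover, staleRelease);
```

The owner check on release matters: without it, a slow worker whose lease expired could clear a lock that now belongs to someone else.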

4. Monitoring Locks and Diagnosing Issues

MongoDB provides lock monitoring through the serverStatus command.

// View current lock statistics
db.serverStatus().locks

Deadlocks: Monitor deadlockCount to identify deadlocks.
Performance Metrics: Review timeAcquiringMicros and acquireWaitCount to assess lock contention.

db.currentOp({ active: true, waitingForLock: true }) //check active lock waits
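The raw counters are easier to read as an average wait per contended acquisition. The sketch below works on a plain object shaped like the `locks` section of `serverStatus`; the sample numbers are made up, and it assumes the counters have already been converted to ordinary JavaScript numbers. In that output the mode keys `r`, `w`, `R`, and `W` stand for IS, IX, S, and X respectively.

```javascript
// Sample shaped like `db.serverStatus().locks`; the numbers are made up.
const sampleLocks = {
  Global: {
    acquireCount: { r: 1000, w: 400 },
    acquireWaitCount: { w: 4 },
    timeAcquiringMicros: { w: 20000 },
  },
  Collection: { acquireCount: { r: 900, w: 350 } },
};

// Average wait (microseconds) per contended acquisition, per resource and mode.
function averageLockWaits(locks) {
  const rows = [];
  for (const [resource, stats] of Object.entries(locks)) {
    const waits = stats.acquireWaitCount || {};
    const micros = stats.timeAcquiringMicros || {};
    for (const [mode, waitCount] of Object.entries(waits)) {
      if (waitCount > 0) {
        rows.push({ resource, mode, waitCount, avgWaitMicros: (micros[mode] || 0) / waitCount });
      }
    }
  }
  return rows;
}

console.log(averageLockWaits(sampleLocks));
// [{ resource: "Global", mode: "w", waitCount: 4, avgWaitMicros: 5000 }]
```

Resources that never waited produce no rows, so the output lists only the places where contention actually occurred.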

5. Deadlock Prevention Techniques

Ordered Operations: Update documents in a consistent order across transactions.
Timeouts: Bound how long an operation may run using maxTimeMS, so a blocked operation fails instead of waiting indefinitely.

// Limit update time
db.collection.updateOne(
  { _id: ObjectId("507f191e810c19729de860ea") },
  { $set: { status: "active" } },
  { maxTimeMS: 1000 }
)

Retry Logic for Transactions:

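async function retryTransaction(session, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      await session.withTransaction(async () => {
        // Transaction logic here
      });
      return; // committed
    } catch (error) {
      // Retry only transient errors, and only while attempts remain.
      const transient =
        typeof error.hasErrorLabel === 'function' &&
        error.hasErrorLabel('TransientTransactionError');
      if (!transient || attempt === maxRetries) {
        throw error;
      }
    }
  }
}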
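Retrying immediately can make contention worse, because every conflicting client comes back at the same moment. A common refinement is to wait between attempts with exponential backoff and jitter. The helpers below are a sketch of that calculation only; the names and the default delays are assumptions, not driver settings.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt, baseMs = 50, capMs = 2000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Full jitter: pick a random delay in [0, backoffDelayMs) so clients spread out.
// `random` is injectable to keep the function testable.
function jitteredDelayMs(attempt, random = Math.random) {
  return Math.floor(random() * backoffDelayMs(attempt));
}

console.log([0, 1, 2, 3, 4, 5, 6].map((n) => backoffDelayMs(n)));
// [50, 100, 200, 400, 800, 1600, 2000]
```

A retry loop would sleep for `jitteredDelayMs(attempt)` milliseconds before its next attempt.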

6. Lock Optimization Strategies

Query Patterns and Indexing: Minimize lock contention by keeping operations short, using selective queries backed by appropriate indexes.
Field-Level Updates: Minimize lock duration by updating only necessary fields.
Use of Secondary Indexes: Reduce the number of documents scanned during queries.

7. Comparison: Optimistic vs. Pessimistic Locking

Feature	Optimistic Locking	Pessimistic Locking
Best For	Low-contention, read-heavy workloads	High-contention, write-heavy workloads
Concurrency	High	Low
Complexity	Medium	High
Performance Impact	Minimal	Can degrade under contention

8. Key Takeaways

Monitor lock metrics regularly to avoid performance degradation.
Implement versioning or lock fields to manage concurrency.
Use transactions and retries to handle deadlocks.

By mastering MongoDB’s locking and concurrency mechanisms, developers can ensure data integrity while maximizing performance in high-concurrency environments. Happy coding!

Crack SQL Interviews with These PostgreSQL Internals & Real Questions

kiran sabne — Sat, 12 Jul 2025 16:15:18 GMT

If you’re preparing for a SQL, backend or data engineering role, it’s not enough to just know SQL — you’ll need to understand how PostgreSQL works behind the scenes.

In my latest blog, I share:

The real SQL queries I was asked in interviews, from recursive CTEs to rolling sums
Deep-dive questions on PostgreSQL indexing, query planning, partitioning, and sharding
CDC implementation strategies using WAL, Debezium, and native logical replication

💬 I also cover tips on interpreting execution plans and designing scalable systems with Postgres as the core.

👉 Check out the full article here

Sharding PostgreSQL: Techniques for Achieving Horizontal Scalability

kiran sabne — Sat, 05 Jul 2025 07:05:57 GMT

Mastering Sharding in PostgreSQL for Horizontal Scalability

Sharding is a key technique to horizontally scale PostgreSQL databases by distributing data across multiple servers or instances. It enables PostgreSQL to handle massive datasets and high-traffic environments by partitioning data into smaller, more manageable shards. This guide provides a deep dive into PostgreSQL sharding, including its implementation, use cases, benefits, drawbacks, and best practices.

What is Sharding in PostgreSQL?

Sharding refers to distributing rows from a table across multiple databases or servers. Each shard contains a subset of the total data, effectively splitting the workload across multiple nodes.

Key Characteristics of Sharding:

Horizontal Scaling – Unlike partitioning, which divides data within a single server, sharding spreads data across multiple servers.
Independent Nodes – Each shard operates independently, reducing load on individual servers.
Fault Isolation – Failures are localized to specific shards, improving fault tolerance.

Why Use Sharding in PostgreSQL?

Sharding is essential when vertical scaling (adding more CPU, RAM, or storage) is no longer sufficient.

Top Use Cases for Sharding:

Massive Datasets – Tables exceeding billions of rows.
Geographically Distributed Systems – Shard data based on regions or user locations.
High-Traffic Applications – E-commerce, social media, and IoT systems.
Multi-Tenant Applications – Isolate tenant data by sharding based on tenant ID.

How to Implement Sharding in PostgreSQL

PostgreSQL offers several methods to implement sharding, including Foreign Data Wrappers (FDW), Citus, and custom application-level sharding.
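Application-level sharding, the last option mentioned above, usually comes down to a routing function in the application that maps a shard key to a server. The sketch below uses a simple FNV-1a hash as a stand-in for whatever hash a real router would use; the `SHARDS` list and `shardFor` helper are hypothetical.

```javascript
// FNV-1a 32-bit hash over UTF-16 code units; a stand-in hash for illustration.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Hypothetical shard list using placeholder host names.
const SHARDS = ["shard1_host", "shard2_host"];

// Route a tenant to a shard; the same tenant always maps to the same shard.
function shardFor(tenantId) {
  return SHARDS[fnv1a(String(tenantId)) % SHARDS.length];
}

console.log(shardFor("tenant-42"), shardFor("tenant-43"));
```

Plain modulo routing is easy to reason about, but changing the number of shards remaps most keys, which is one reason rebalancing needs planning.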

Sharding with PostgreSQL Foreign Data Wrappers (FDW)

FDW allows PostgreSQL to query tables on remote servers, enabling sharding across multiple instances.

Step 1: Install FDW Extension

CREATE EXTENSION postgres_fdw;

Step 2: Create a Foreign Server

CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'shard1_host', dbname 'db1');

Step 3: Create User Mapping

CREATE USER MAPPING FOR current_user SERVER shard1 OPTIONS (user 'shard_user', password 'password');

Step 4: Import Foreign Schema

CREATE SCHEMA IF NOT EXISTS foreign_schema; -- the target schema must exist first
IMPORT FOREIGN SCHEMA public FROM SERVER shard1 INTO foreign_schema;

Sharding with Citus (PostgreSQL Extension)

Citus is a PostgreSQL extension that transforms PostgreSQL into a distributed database by enabling table sharding across multiple nodes.

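Install and Configure Citus (the exact package name depends on your PostgreSQL and Citus versions; Citus must also be added to shared_preload_libraries before running CREATE EXTENSION citus):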

sudo apt install postgresql-14-citus

Distribute Table Across Nodes:

SELECT create_distributed_table('orders', 'customer_id');

Managing Sharded Tables

Adding New Shards:

CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'shard2_host', dbname 'db2');

Rebalancing Data:

SELECT rebalance_table_shards('orders');

Inspecting Shards (placement and size, via the citus_shards view):

SELECT * FROM citus_shards;

Benefits of PostgreSQL Sharding

Horizontal Scalability – Scale out by adding more servers.
Fault Tolerance – Failures affect only specific shards.
Improved Performance – Distributes workload across multiple servers.
Geographic Distribution – Place shards closer to users for lower latency.

Drawbacks and Limitations of Sharding

Complexity – Sharding introduces architectural complexity.
Cross-Shard Queries – Queries spanning multiple shards can be slower.
Data Rebalancing – Moving data between shards requires careful planning.
Maintenance Overhead – Each shard must be maintained individually.

PostgreSQL Sharding Best Practices

Choose Shard Keys Carefully – Select shard keys that minimize cross-shard queries.
Distribute Evenly – Ensure data is evenly distributed across shards.
Automate Monitoring – Use tools to monitor shard health and performance.
Minimize Cross-Shard Joins – Design queries to avoid joins across multiple shards.
Regularly Rebalance Shards – Prevent uneven growth of certain shards.

Edge Cases to Consider

Hot Shards – Some shards may receive disproportionate traffic.
Shard Failures – Plan for automatic failover and replication.
Schema Changes – Apply schema changes consistently across shards.
Data Migration – Migrating data between shards can impact performance.

Conclusion

Sharding is essential for scaling PostgreSQL databases beyond a single server. By implementing effective sharding strategies, developers and database administrators can build robust, scalable, and fault-tolerant database architectures for large-scale applications.

Mastering Clustering in PostgreSQL for Enhanced Query Performance

kiran sabne — Thu, 30 Jan 2025 17:41:52 GMT

Clustering is a performance optimization technique in PostgreSQL that physically reorders a table based on an index. By storing rows in index order, clustering speeds up range queries on that index and reduces disk I/O, because matching rows sit on neighboring pages. This guide provides a comprehensive overview of PostgreSQL clustering, covering implementation, use cases, benefits, drawbacks, and best practices.

What is Clustering in PostgreSQL?

Clustering in PostgreSQL refers to the physical reordering of table data based on an index. Unlike partitioning, which splits tables into multiple segments, clustering reshuffles the rows of a table to match the order of an indexed column.

Key Characteristics of Clustering:

One-Time Data Reorganization – The table is physically rewritten in index order, but PostgreSQL does not maintain that order for later writes.
Index Dependency – Clustering requires an existing index, typically a B-tree.
Improved Query Performance – Benefits range queries and other scans that read rows in index order.

Why Use Clustering in PostgreSQL?

Clustering is best suited for scenarios involving frequent range scans and ordered queries.

Top Use Cases for Clustering:

Read-Heavy Applications – Improves query performance in read-intensive environments.
Frequent Range Queries – Boosts performance for queries using BETWEEN, <, or > filters.
Index-Driven Workloads – Ideal when queries consistently access data in index order.
Data Warehousing – Enhances performance for analytical queries and batch processing.

How to Implement Clustering in PostgreSQL

Clustering is a one-time, manual operation: rows inserted or updated afterwards are not kept in index order. Re-run the CLUSTER command periodically to restore the ordering.

Basic Clustering Example

CREATE INDEX orders_order_date_idx ON orders (order_date);

CLUSTER orders USING orders_order_date_idx;

Explanation:

An index is created on the order_date column.
The CLUSTER command reorders the orders table based on this index.
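To build intuition for why physical order matters, the following sketch counts how many pages a range query has to touch before and after the rows are sorted. It is a toy model, not PostgreSQL's storage format: the page capacity, the sample values, and the `pagesTouched` helper are all assumptions made for illustration.

```javascript
const ROWS_PER_PAGE = 4; // assumed page capacity for the illustration

// Count how many distinct pages hold rows with lo <= value <= hi.
function pagesTouched(heap, lo, hi) {
  const pages = new Set();
  heap.forEach((value, position) => {
    if (value >= lo && value <= hi) pages.add(Math.floor(position / ROWS_PER_PAGE));
  });
  return pages.size;
}

// Physical row order as inserted (values arrive out of order).
const unclustered = [9, 2, 14, 7, 1, 12, 5, 16, 3, 10, 8, 15, 6, 13, 4, 11];
// CLUSTER rewrites the table in index order; sorting stands in for that here.
const clustered = [...unclustered].sort((a, b) => a - b);

console.log(pagesTouched(unclustered, 5, 8)); // 4
console.log(pagesTouched(clustered, 5, 8));   // 1
```

The same four rows are fetched in both cases, but after reordering they share a single page instead of being spread over four.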

Automating Clustering with Scripts

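Since PostgreSQL does not maintain the clustered order, reclustering has to be scheduled. Wrapping the command in a function gives an external scheduler (for example cron, or the pg_cron extension if it is installed) a single entry point to call.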

CREATE OR REPLACE FUNCTION auto_cluster_orders() RETURNS void AS $$
BEGIN
    CLUSTER orders USING orders_order_date_idx;
END;
$$ LANGUAGE plpgsql;

SELECT auto_cluster_orders();

Managing Clustering

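Check Clustering Status (pg_index.indisclustered marks the index last used by CLUSTER):

SELECT c.relname AS table_name, i.relname AS clustered_index FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid JOIN pg_class i ON i.oid = x.indexrelid WHERE x.indisclustered AND c.relname = 'orders';

Recluster All Previously Clustered Tables:

CLUSTER VERBOSE;

Recluster One Table Using Its Previously Chosen Index:

CLUSTER orders;

Disable Autovacuum (Use With Caution): Turning autovacuum off for a table stops automatic dead-row cleanup and statistics updates, so it is rarely worth it. If you disable it around a bulk load, re-enable it afterwards.

ALTER TABLE orders SET (autovacuum_enabled = false);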

Benefits of PostgreSQL Clustering

Faster Range Queries – Access data more efficiently by aligning rows with the index.
Reduced Disk I/O – Range scans read fewer pages because matching rows are stored next to each other.
Enhanced Analytical Performance – Speeds up analytical workloads and reporting queries.
Improved Cache Efficiency – Frequently accessed data is stored contiguously.

Drawbacks and Limitations of Clustering

Manual Maintenance – Clustering must be periodically re-executed.
Table Locking – CLUSTER takes an ACCESS EXCLUSIVE lock, blocking both reads and writes on the table until it finishes.
Performance Overhead – Frequent inserts or updates may disrupt the clustered order.
Limited Applicability – Only beneficial for tables with frequent range scans.

PostgreSQL Clustering Best Practices

Cluster During Low Traffic – Perform clustering during maintenance windows to avoid downtime.
Prioritize Read-Heavy Tables – Focus clustering efforts on tables with heavy read workloads.
Combine with Partitioning – Use clustering alongside partitioning for large datasets.
Recluster Periodically – Schedule periodic clustering to maintain performance.
Monitor Query Performance – Regularly analyze query plans to identify clustering candidates.

Edge Cases to Consider

Large Tables – Clustering large tables may take significant time and resources.
Frequent Writes – Inserts and updates gradually degrade clustering efficiency.
Partial Indexes – CLUSTER cannot use a partial index; cluster on a full index over the ordering column instead.
Locking Overhead – Avoid clustering during peak traffic to prevent blocking transactions.

Conclusion

Clustering in PostgreSQL is a powerful but underutilized feature that significantly boosts query performance for specific workloads. By understanding its limitations and applying best practices, developers and database administrators can unlock greater efficiency and scalability for PostgreSQL databases.

How to Scale PostgreSQL Databases with Partitioning

kiran sabne — Sun, 05 Jan 2025 17:32:36 GMT

Mastering Partitioning in PostgreSQL for Optimal Database Performance

Partitioning is a crucial technique for scaling and managing large datasets in PostgreSQL. As data grows, performance bottlenecks can arise, making it essential to break down tables into smaller, more efficient segments. This guide explores PostgreSQL partitioning, its implementation, use cases, benefits, and potential pitfalls. Learn how to leverage partitioning to optimize your PostgreSQL database and enhance query performance.

What is Partitioning in PostgreSQL?

Partitioning divides a large table into multiple smaller partitions that store subsets of the data. Although each partition acts as an independent table, PostgreSQL treats them collectively as a single table during queries, enhancing efficiency and scalability.

Key Types of Partitioning in PostgreSQL:

Range Partitioning – Divides data into partitions based on a range of values in a column (e.g., dates).
List Partitioning – Groups data into partitions based on matching specific values.
Hash Partitioning – Distributes data across partitions using a hash function.
Composite Partitioning – Combines two or more partitioning methods.

Why Use Partitioning in PostgreSQL?

Partitioning is essential when dealing with vast amounts of data, ensuring optimal performance and manageability.

Top Use Cases for Partitioning:

Handling Large Datasets – Tables exceeding millions or billions of rows.
Time-Series Data – Ideal for tables storing event logs or time-sensitive information.
Data Archiving – Effortlessly manage historical data by detaching old partitions.
Query Optimization – Speeds up queries by scanning specific partitions.
Indexing Efficiency – Indexes are created per partition, enhancing performance.

How to Implement Partitioning in PostgreSQL

PostgreSQL's declarative table partitioning simplifies implementation, making it more accessible to database administrators and developers.

Range Partitioning Example

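CREATE TABLE orders (
    order_id SERIAL,
    order_date DATE NOT NULL,
    customer_id INT,
    -- A primary key on a partitioned table must include the partition key.
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (order_date);

-- Upper bounds are exclusive, so each year runs up to the next January 1.
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');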
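In PostgreSQL range partitioning, the FROM bound is inclusive and the TO bound is exclusive. The sketch below mimics that routing rule in plain JavaScript to show which partition a given date falls into; the `PARTITIONS` list and `partitionFor` helper are hypothetical and only model the half-open intervals.

```javascript
// Hypothetical partition list with half-open bounds [from, to).
const PARTITIONS = [
  { name: "orders_2023", from: "2023-01-01", to: "2024-01-01" },
  { name: "orders_2024", from: "2024-01-01", to: "2025-01-01" },
];

// ISO dates (YYYY-MM-DD) compare correctly as strings.
function partitionFor(orderDate) {
  const match = PARTITIONS.find((p) => orderDate >= p.from && orderDate < p.to);
  return match ? match.name : null; // null: no partition accepts the row
}

console.log(partitionFor("2023-12-31")); // orders_2023
console.log(partitionFor("2024-01-01")); // orders_2024
console.log(partitionFor("2025-06-01")); // null
```

Because the upper bound is exclusive, a partition that ends at December 31 would leave that day uncovered; ending each range at the next January 1 avoids the gap.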

List Partitioning Example

CREATE TABLE orders_by_region (
    order_id SERIAL,
    region TEXT NOT NULL,
    PRIMARY KEY (order_id, region)
) PARTITION BY LIST (region);

CREATE TABLE orders_us PARTITION OF orders_by_region
    FOR VALUES IN ('US');

CREATE TABLE orders_eu PARTITION OF orders_by_region
    FOR VALUES IN ('EU');

Hash Partitioning Example

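CREATE TABLE hash_example (
    id SERIAL,
    data TEXT
) PARTITION BY HASH (id);

-- With MODULUS 4, every remainder from 0 to 3 needs its own partition.
CREATE TABLE hash_example_0 PARTITION OF hash_example
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);

CREATE TABLE hash_example_1 PARTITION OF hash_example
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);

CREATE TABLE hash_example_2 PARTITION OF hash_example
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);

CREATE TABLE hash_example_3 PARTITION OF hash_example
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);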
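With MODULUS 4, a row can hash to any remainder from 0 to 3, so each remainder needs a partition before inserts are safe. A deployment script could check coverage with something like the sketch below; `missingRemainders` is a hypothetical helper, and the list of defined remainders would come from your own migration metadata.

```javascript
// Return the remainders in 0..modulus-1 that no partition covers yet.
function missingRemainders(modulus, definedRemainders) {
  const defined = new Set(definedRemainders);
  const missing = [];
  for (let r = 0; r < modulus; r++) {
    if (!defined.has(r)) missing.push(r);
  }
  return missing;
}

console.log(missingRemainders(4, [0, 1]));       // [2, 3]
console.log(missingRemainders(4, [0, 1, 2, 3])); // []
```

An empty result means every row has a partition to land in; any remaining value identifies a partition that still has to be created.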

Managing PostgreSQL Partitions

Adding New Partitions:

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

Detaching Partitions:

ALTER TABLE orders DETACH PARTITION orders_2023;

Dropping Partitions:

DROP TABLE orders_2023;

Benefits of PostgreSQL Partitioning

Blazing-Fast Query Performance – Queries run faster by targeting smaller partitions.
Seamless Data Management – Simplifies handling large tables by partitioning.
Efficient Indexing and Vacuuming – Maintains smaller indexes for each partition.
Concurrency Boost – Operations on one partition don't affect others.

Drawbacks and Limitations of Partitioning

Complex Schema Design – Managing partitions can complicate schema development.
Query Overhead – Poor query planning can result in scanning all partitions.
Insert/Write Performance – Determining the correct partition can add overhead.
Imbalance Risk – Uneven data distribution can leave some partitions much larger than others, which may require occasional repartitioning.

PostgreSQL Partitioning Best Practices

Choose Partition Keys Wisely – Opt for columns often filtered in queries.
Favor Time-Based Partitions – Ideal for time-sensitive datasets.
Limit Partition Count – Excessive partitions can slow query planning.
Automate Partition Management – Develop scripts for partition creation and detachment.
Regular Performance Monitoring – Analyze query plans to ensure partitions perform as expected.
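Automating partition management often starts with a script that generates the DDL for upcoming partitions. The following is a minimal sketch that builds the statement for one yearly partition; `yearlyPartitionDdl` is a hypothetical helper, and it assumes the table name comes from trusted configuration rather than user input, since it is interpolated without quoting.

```javascript
// Build the DDL for one yearly partition; the upper bound is exclusive.
// Assumption: `parentTable` is a trusted identifier, not user input.
function yearlyPartitionDdl(parentTable, year) {
  if (!Number.isInteger(year)) throw new TypeError("year must be an integer");
  const name = `${parentTable}_${year}`;
  return (
    `CREATE TABLE IF NOT EXISTS ${name} PARTITION OF ${parentTable}\n` +
    `    FOR VALUES FROM ('${year}-01-01') TO ('${year + 1}-01-01');`
  );
}

console.log(yearlyPartitionDdl("orders", 2026));
```

Running such a script ahead of time, for example from a scheduled job, helps ensure the partition exists before the first row for that year arrives.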

Edge Cases to Watch For

Partition Hotspots – Uneven growth of partitions can create data hotspots.
Missing Partitions – Inserts fail when a row's key falls outside every partition's bounds, unless a DEFAULT partition exists.
Bulk Inserts – Bulk insertions can slow performance if not optimized.
Partition Key Updates – Avoid updating partition keys to prevent row movement across partitions.

Conclusion

Partitioning in PostgreSQL is a game-changer for databases managing extensive datasets. By strategically implementing and managing partitions, developers and DBAs can significantly enhance PostgreSQL performance, making it an essential skill for scaling database systems effectively.

PostgreSQL Concurrency and Locking: A Comprehensive Guide

kiran sabne — Fri, 03 Jan 2025 01:11:38 GMT

Introduction to Locking in PostgreSQL

Locking in PostgreSQL is essential for ensuring data consistency and isolation across concurrent transactions. PostgreSQL uses multi-version concurrency control (MVCC) to allow multiple transactions to access data simultaneously, but certain operations still require explicit locking to prevent conflicts.

Understanding the types of locks, their use cases, and potential performance implications is critical for optimizing database performance and avoiding deadlocks.

Types of Locks in PostgreSQL

PostgreSQL provides a variety of locks to handle different levels of data protection. This guide covers four categories of locks, followed by deadlock handling:

Row-Level Locks
Table-Level Locks
Page-Level Locks
Advisory Locks
Deadlocks and Prevention

Let's dive into each lock type with detailed explanations, real-world examples, and performance implications.

1. Row-Level Locks

Row-level locks allow fine-grained control over individual rows, ensuring minimal impact on other parts of the table.

Types of Row-Level Locks:

FOR UPDATE: Locks the row as if for update; other transactions cannot modify, delete, or lock it until the current transaction ends.
FOR NO KEY UPDATE: A weaker form of FOR UPDATE that does not block FOR KEY SHARE; it is what an UPDATE takes when it leaves key columns unchanged.
FOR SHARE: Blocks UPDATE and DELETE on the row but lets other transactions take FOR SHARE or FOR KEY SHARE locks.
FOR KEY SHARE: The weakest mode; blocks DELETE and key-changing UPDATEs but allows updates to non-key columns.

Real-World Application:

Order Management Systems: When updating the status of an order, acquiring a FOR UPDATE lock ensures no other transaction modifies or deletes the order concurrently.

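BEGIN;
SELECT * FROM orders WHERE order_id = 101 FOR UPDATE;
-- Another transaction trying to update the same row waits until this one ends.
UPDATE orders SET status = 'shipped' WHERE order_id = 101; -- status column is illustrative
COMMIT; -- releases the row lock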

Edge Case:

Deadlocks: Occurs when two transactions hold locks that the other needs, leading to a stalemate.
Performance Implication: Row-level locks scale well, but frequent locking can lead to increased contention and deadlocks.
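A deadlock is a cycle in the "waits for" relationship between transactions. The sketch below finds such a cycle in a simplified wait-for graph. It is an illustration of the concept rather than PostgreSQL's detector, and it assumes each transaction waits on at most one other transaction; `findDeadlock` is a hypothetical helper.

```javascript
// waitsFor maps a transaction to the transaction whose lock it is waiting on.
// Assumption: each transaction waits on at most one other transaction.
function findDeadlock(waitsFor) {
  for (const start of Object.keys(waitsFor)) {
    const seen = [start];
    let current = waitsFor[start];
    while (current !== undefined) {
      if (current === start) return seen; // followed the chain back to the start
      if (seen.includes(current)) break;  // a cycle that does not include `start`
      seen.push(current);
      current = waitsFor[current];
    }
  }
  return null;
}

console.log(findDeadlock({ T1: "T2", T2: "T1" })); // ["T1", "T2"]
console.log(findDeadlock({ T1: "T2", T2: "T3" })); // null
```

A chain that ends at a running transaction eventually resolves on its own; only a closed cycle needs one participant to be aborted.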

2. Table-Level Locks

Table-level locks apply to entire tables, preventing or allowing certain operations to be performed concurrently.

Types of Table-Level Locks:

ACCESS SHARE: Acquired by SELECT statements.
ROW SHARE: Acquired by SELECT ... FOR UPDATE or SELECT ... FOR SHARE.
ROW EXCLUSIVE: Acquired by INSERT, UPDATE, and DELETE.
SHARE UPDATE EXCLUSIVE: Used by VACUUM operations.
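SHARE: Acquired by CREATE INDEX (without CONCURRENTLY); allows concurrent reads but blocks writes.
SHARE ROW EXCLUSIVE: Acquired by CREATE TRIGGER and some forms of ALTER TABLE; blocks writes and allows only one holder at a time.
EXCLUSIVE: Allows only plain reads (ACCESS SHARE) alongside it; blocks writes and SELECT ... FOR UPDATE/SHARE.
ACCESS EXCLUSIVE: Blocks all operations, including SELECT; acquired by DROP TABLE, TRUNCATE, VACUUM FULL, and many forms of ALTER TABLE.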
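Whether two statements block each other depends on which table-level modes conflict. The lookup below is a sketch that transcribes the conflict table from the PostgreSQL documentation, which defines eight modes in total (including SHARE ROW EXCLUSIVE); `CONFLICTS` and `conflicts` are hypothetical names used for illustration.

```javascript
// Modes ordered as in the PostgreSQL documentation, weakest to strongest.
const MODES = [
  "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
  "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
];

// For each mode, the modes it conflicts with (the relation is symmetric).
const CONFLICTS = {
  "ACCESS SHARE": ["ACCESS EXCLUSIVE"],
  "ROW SHARE": ["EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "ROW EXCLUSIVE": ["SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE UPDATE EXCLUSIVE": ["SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE": ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE ROW EXCLUSIVE": ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "EXCLUSIVE": ["ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "ACCESS EXCLUSIVE": MODES,
};

// True when a request for `requested` must wait behind a held `held` lock.
function conflicts(held, requested) {
  return CONFLICTS[held].includes(requested);
}

console.log(conflicts("ROW EXCLUSIVE", "ROW EXCLUSIVE"));   // false: writers coexist
console.log(conflicts("ACCESS SHARE", "EXCLUSIVE"));        // false: plain reads continue
console.log(conflicts("ACCESS SHARE", "ACCESS EXCLUSIVE")); // true: even SELECT blocks DDL
```

The last line is the one that surprises people in production: a long-running SELECT can hold up a migration, which in turn queues every later query behind it.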

Real-World Application:

Schema Migrations: Many ALTER TABLE forms take an ACCESS EXCLUSIVE lock, which blocks both reads and writes on the table until the migration transaction ends.

BEGIN;
LOCK TABLE orders IN ACCESS EXCLUSIVE MODE;
-- Blocks all other access to orders until COMMIT or ROLLBACK.
COMMIT;

Edge Case:

Performance Implication: Table locks can lead to high contention in multi-user environments.
Deadlocks: High risk when combined with row locks.

3. Page-Level Locks

Page-level locks are used internally by PostgreSQL during index and table access operations.

Page Locks: Implicitly managed by PostgreSQL and not directly accessible to users.
Use Case: Prevents data corruption during index writes.

Real-World Application:

Index Maintenance: During large data insertions, page locks ensure index consistency.

4. Advisory Locks

Advisory locks provide application-level locking mechanisms that are independent of the standard SQL locks.

Session-level: Locks held until the session ends.
Transaction-level: Locks held until the transaction commits or rolls back.

Real-World Application:

Distributed Systems Coordination: Advisory locks help coordinate processes accessing shared resources.

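SELECT pg_advisory_lock(12345);
-- Session-level: held until it is explicitly unlocked or the session ends.
SELECT pg_advisory_unlock(12345);
-- For a lock released automatically at transaction end, use pg_advisory_xact_lock(12345).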

Performance Implication:

Lightweight but requires careful management to avoid deadlocks.

Optimistic vs. Pessimistic Locking

Optimistic Locking:

Assumes conflicts are rare and detects them at write time instead of locking rows up front.
Implementation: Use a version number or timestamp column, and treat an update that affects zero rows as a conflict.

UPDATE products SET price = 200, updated_at = now() WHERE product_id = 1 AND updated_at = '2025-01-01 10:00:00';

Real-World Application:

E-commerce: Prevents overwriting of product information by checking for updates before committing changes.

Pessimistic Locking:

Acquires locks at the beginning of a transaction to prevent other transactions from modifying the data.
Implementation:

BEGIN;
SELECT * FROM products WHERE product_id = 1 FOR UPDATE;
UPDATE products SET price = 200 WHERE product_id = 1;
COMMIT;

Real-World Application:

Banking Systems: Ensures account balances are not modified by multiple transactions simultaneously.

Deadlocks and Prevention

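Detecting Lock Waits and Deadlocks: PostgreSQL detects deadlocks itself (after deadlock_timeout) and aborts one of the transactions involved. The query below lists sessions currently waiting on a lock: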

SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock';

Log Monitoring:

Deadlocks are logged in the PostgreSQL log.
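Location: Set by log_directory (pg_log by default before PostgreSQL 10, log afterwards); packaged installs often log elsewhere, such as /var/log/postgresql.
Command:

grep -i 'deadlock detected' /var/log/postgresql/postgresql.log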

Real-World Application:

High-Transaction Systems: Continuously monitor and resolve deadlocks to prevent transaction failures.

Preventing Deadlocks:

Order Transactions Consistently: Always access tables and rows in the same order.
Keep Transactions Short: Minimize the duration of locks.
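Use NOWAIT or SKIP LOCKED: NOWAIT raises an error instead of waiting for a locked row; SKIP LOCKED silently skips rows that are already locked.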

SELECT * FROM orders FOR UPDATE NOWAIT;
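The first rule, consistent ordering, can be enforced in application code by sorting the keys before locking them. This is a minimal sketch with a hypothetical `lockOrder` helper and numeric ids; in SQL, adding ORDER BY on the key to a SELECT ... FOR UPDATE serves the same purpose.

```javascript
// Return the ids in the order every transaction should lock them:
// deduplicated and sorted ascending.
function lockOrder(ids) {
  return [...new Set(ids)].sort((a, b) => a - b);
}

// Two transfers touching the same accounts in opposite directions
// still lock account 7 before account 42.
console.log(lockOrder([42, 7])); // [7, 42]
console.log(lockOrder([7, 42])); // [7, 42]
```

When every transaction takes its locks in the same global order, no cycle of waiters can form between them.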

Comparison of Lock Types

Lock Type	Scope	Blocks Read	Blocks Write	Use Case
Row-Level (FOR UPDATE)	Row	No	Yes	Row updates and deletions
Table-Level (EXCLUSIVE)	Table	No (plain SELECT still allowed)	Yes	Blocking all writers while plain reads continue
Advisory Locks	Application	No	No	Application-level coordination
Access Share	Table	No	No	`SELECT` statements
Row Exclusive	Table	No	No (conflicts only with SHARE and stronger modes)	`INSERT`, `UPDATE`, `DELETE`
Access Exclusive	Table	Yes	Yes	Full table modifications

Conclusion

Locking in PostgreSQL is a powerful mechanism that, when used correctly, can ensure data integrity and consistency. By understanding the various types of locks, their appropriate use cases, and how to manage deadlocks, developers can design efficient and resilient database applications. Optimistic and pessimistic locking strategies provide additional tools to handle concurrency effectively.