Aug 22, 2012

PostgreSQL: Advanced Concepts

This a continuation of my PostgreSQL series (an installation guide for Ubuntu and an overview of some basic concepts). Here I will briefly outline some more concepts about the database.

You can also check out my notes on designing relational databases

  • The purpose of an index is to allow the RDBMS to locate the row efficiently. However maintaining an index requires some overhead, so it is not worthwhile to index all fields in a table. In addition, since the index is synchronized when the table is updated it is recommended to remove indexes that are seldom or never used.

  • Although a multi-column index can be specified (up to 32 columns), it is generally not recommended to do so unless in certain circumstances. For instance, a multi-column index is useful if you are constantly making queries that references the two or more fields. This is because multi-column indexes are larger and slower than querying a single column index.

  • Unique indexes with NULL values are not considered equal in PostgreSQL (you will need to specify the NOT NULL clause to restrict NULL values for this field)

  • Partial indexes allow you to index only the field values that you search frequently, and not the whole table. It offers some benefits as your index will not be as big as if you indexed the whole table, but lost out if you every need to query something outside of the index parameters.

  •  Data consistency is maintained by following the MVCC approach (MultiVersion Concurrency Control). Each transaction will see a snapshot of the data as it was, regardless of the current state of the underlying data. This isolates the transaction, preventing it from seeing inconsistent data that is produced from concurrent transactions.

    MVCC's major advantage is that locks for read transactions do not conflict with locks for write transactions, and vice versa.

  • You can explicitly lock tables or rows when the MVCC approach does not produce desirable behavior. However by explicitly locking resources you increase the risk of producing deadlocks in the system.

  • Messages generated by the server are assigned a five-character error code that follows the SQL standard for SQLSTATE codes. This standard states that the first two characters denote the class of the message, while the last three characters denote the specific condition.

  • The followin class codes have been defined:

    CODEDESCRIPTOR
    00Successful completion
    01Warning
    02No Data (also serves as a warning)
    03SQL Statement not yet complete
    08Connection exception
    09Triggered action exception
    0AFeature not supported
    0BInvalid transaction initiation
    0FLocator Exception
    0LInvalid Grantor
    0PInvalid Role specification
    20Case not found
    21Cardinality Exception
    22Data Exception
    23Integrity Constraint Violation
    24Invalid Cursor state
    25Invalid Transaction State
    26Invalid SQL Statement Name
    27Triggered Data Change Violation
    28Invalid Authorization Specification
    2BDependant Privileges Descriptors still exist
    2DInvalid Transaction Termination
    2FSQL Routine Exception
    34Invalid Cursor Name
    38External Routine Exception
    39External Routine Invocation Exception
    3BSavepoint Exception
    3DInvalid Catalog name
    40Transaction rollback
    42Syntax Error or Access Role Violation
    44WITH CHECK OPTION Violation
    53Insufficient Resources
    54Program Limit Exceeded
    55Object not in prerequisite state
    57Operator Intervention
    58System Error (external to PostgreSQL)
    F0Configuration File Error
    HVForeign Data Wrapper Error
    P0PL/pgSQL Error
    XXInternal Error


  • Unfortunately you cannot raise exceptions directly within the PostgreSQL server. You could use a procedural language within the server (such as PL/pgSQL), but this will limit the portability and compatibility of your code with other RDBMS systems.

  • In PL/pgSQL, you would use the RAISE statement to report exceptions and messages.
    RAISE [level] 'message' [, value] USING ERRCODE = 'SQLSTATE';
    where:
    level = DEBUG, LOG, INFO, NOTICE, WARNING, and EXCEPTION
    message = String (% denotes a placeholder for an optional value
    value = field value
    SQLSTATE = Error code conforming to the SQLSTATE convention

  • A SQL query will undergo the following process in order to generate a result:
    1. A connection from the application to the server is established. The application then sends a query to the server and waits for a result
    2. The parser checks the query for correct syntax and creates a query tree.
    3. The rewrite system takes the query tree and looks for any rules (stored in the system catalog) to apply to the tree. It performs the transformations given in the rule bodies.
    4. The planner/optimizer takes the rewritten query tree and creates a query plan. It does this by creating all possible paths leading to the same result, and then calculates the estimated cost of each plan. It will then select the cheapest path and prepare a query plan tree for the executor.
    5. The executor recursively steps through the plan tree and retrieves rows in the way represented by the plan. When it is complete it hands the result back to the application.

References:

No comments:

Post a Comment

Thanks for contributing!! Try to keep on topic and please avoid flame wars!!