Monday 15 January 2018

Points to consider for creating new tables

Below are the points which can be consider while creating new table in the database.

  1. Comments ( Always add comments to columns and tables)


a.       I always find comments incredibly helpful, they could answer some of the business / content questions.
b.      You can add table and column comments.
c.       Add comments at the table level at the very least to indicate the contents of the table and the relationship between the two (other than simply knowing parent-> child by way of the Foreign Key).
d.      Column comments are useful, as long as they do not simply restate the column name
e.      There are undoubtedly some terms for which there is a Glossary of Terms etc. in which case repeating the definition given there is also of little value.
f.        However if the contents may not be immediately obvious / there is no pre-defined term it is definitely useful.


  1. Column naming convention

    1. While I like the underscores (they are natural for database developers )
    2. As a general translation from camel case java attribute names, we have generally ended up without any underscores.
    3. Maintain consistency with your database tables.

  1. NOT NULL Columns

    1. You have only defined the primary key columns as not null (because you have to)
    2. There are more columns that cannot be null
      1. i.e The foreign key column for example
    1. If a column cannot be null, we should have a not null constraint on the column. This will prevent an application error from going undetected if a null value is inadvertently inserted.

  1. TIMESTAMP/DATE columns

    1. Timestamps are 'interesting' beasts in oracle, when you consider timezone and some arithmetic functions and implicit type conversions that take place
    2. When to use Date and when to use Timestamp?
                                                               i.      If you have milliseconds, then clearly you have to use timestamps
                                                             ii.      If they are simply to the second precision (or worse a given day), then using a DATE column makes more sense

  1. Surrogate Primary Keys and Compound Keys / Unique Constraints

    1. We seem to have a general pattern of using surrogate Primary Keys everywhere
                   i.      i.e. we generate a unique numeric value as a unique key
    1. Sometimes there is a perfectly natural primary key, for example if you have a table defining enumerations, you might as well use the enumeration value as the primary key, there is no need to define a separate PKEY column as well
    2. The tables as they stand with the surrogate PKEY give no indication as to which columns should be unique
                                     i.    i.e. could we have multiple records for the same PKEY+SOME_COLOUMN



            6 .Performance Considerations

 Indexes on Foreign Key Columns
a.       You are undoubtedly going to wish to query the child records for a given parent record. You are not allowing for this, so querying child table for a given parent table record will require a full table scan.

  1. Caching on Sequences

    1. Create sequences  to generate values for the PKEYs
    2. Is there is no order dependency on the PKEY column values then should allow caching of the sequence values.
    3. This improves the efficiency significantly, but at the cost of potentially having gaps in the sequence (which you could always have if a transaction rolls back) and 'out of order' inserts in the case of databases with multiple (RAC) instances.

  1.  Partitioning Considerations 
Expected Data Volumes
    • How many records do we expect to create per day/week/month?

Is the data write time critical?
    • is the creation of the new record  in a time critical operation?

Do we expect to read the data far more than we write the data?
    • i.e. can we sacrifice some performance when writing the data in order to optimise the data read

Query profile
    • Do we expect to read 'ALL current' records on a frequent basis?
    • Or are all queries likely to be related to an instrument or relatively small set of instruments?

Data Lifecycle
    • Do we have to retain the data indefinitely?
    • Or can we remove data after a set time period?
      • Would this time period be related to the some date etc. or to the date at which the record is no longer valid



No comments:

Post a Comment

Spring boot with CORS

CORS (Cross-Origin Resource Sharing) errors occur when a web application running in a browser requests a resource from a different domain or...