Databasing

Databasing Concept:

Databasing is “the act of analyzing a real world entity relationship, and writing the DDL to implement a database” (Source: Stack Exchange, verbs – Word for the act of databasing – English Language & Usage Stack Exchange). Hence a database is a way to organize rows and columns into tables.

Types of Tables 

As databasing is built on the process of organizing rows and columns, tables play a central role. There are different types of tables depending on how their rows and columns are structured. KnowledgeShop identifies five types of tables:

Heap-organized table: a table with rows stored in no particular order. This is a standard Oracle table; the term “heap” is used to differentiate it from an index-organized table or external table;

Index-organized table (IOT), or index-only table: a table that stores its data in a B-tree index structure;

External table: a table that is not stored within the Oracle database; instead, data is loaded from a file via an access driver (normally ORACLE_LOADER) when the table is accessed;

Materialized query tables (MQT): tables whose definition is based on the result of a query. The data consists of precomputed results from the tables that you specify in the materialized query table definition (Source: IBM, Materialized query tables – IBM Documentation);

Multidimensional clustering (MDC) tables: tables that cluster data in multiple dimensions, which can improve performance in many cases.
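The materialized query table idea above can be sketched in a few lines of Python. This is only an emulation under stated assumptions: SQLite (used here through Python's standard sqlite3 module) has no materialized query tables, so the precomputed result is stored in an ordinary table and rebuilt on demand. The names sales, sales_summary and refresh_sales_summary are hypothetical and do not come from the IBM or Oracle documentation.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 70.0)],
)


def refresh_sales_summary(conn):
    """Rebuild the stored (precomputed) query result from the base table."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )


refresh_sales_summary(conn)

# Reading the summary does not re-run the aggregation over sales.
summary = dict(conn.execute("SELECT region, total FROM sales_summary"))
```

In this sketch, rows added to sales after a refresh are not reflected in sales_summary until refresh_sales_summary is called again, which is the usual trade-off of storing precomputed results.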

While there is some commonality, IBM classifies tables slightly differently. The IBM table types include:

(1.0) Base tables: These types of tables hold persistent data. There are different kinds of base tables, including regular tables. Regular tables with indexes are the “general purpose” table choice.

(1.1) Multidimensional clustering (MDC) tables:  These types of tables are implemented as tables that are physically clustered on more than one key, or dimension, at the same time. Multidimensional clustering tables are not supported in a Db2 pureScale® environment.

(1.2) Insert time clustering (ITC) tables: These types of tables are conceptually and physically similar to MDC tables, but rather than being clustered by one or more user-specified dimensions, rows are clustered by the time they are inserted into the table. ITC tables can be partitioned tables. ITC tables are not supported in a Db2 pureScale environment.

(1.3) Range-clustered tables (RCT): These types of tables are implemented as sequential clusters of data that provide fast, direct access. Each record in the table has a predetermined record ID (RID), which is an internal identifier used to locate a record in a table. Range-clustered tables are not supported in a Db2 pureScale environment.

(2.0) Partitioned tables: These types of tables use a data organization scheme in which table data is divided across multiple storage objects, called data partitions or ranges, according to values in one or more table partitioning key columns of the table. Partitioned tables can contain large amounts of data and simplify the rolling in and rolling out of table data.

(3.0) Temporal tables: These types of tables are used to associate time-based state information to your data. Data in tables that do not use temporal support represents the present, while data in temporal tables is valid for a period defined by the database system, customer applications, or both.

(4.0) Temporary tables: These types of tables are used as temporary work tables for various database operations. Declared temporary tables (DGTTs) do not appear in the system catalog, which makes them not persistent for use by, and not able to be shared with other applications. (Source: IBM, Types of tables – IBM Documentation)
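To make the partitioned table idea in (2.0) more concrete, here is a minimal Python sketch of range partitioning. It is not how Db2 implements data partitions; it only illustrates routing rows to a partition by the value of a partitioning key and removing a whole range at once ("rolling out"). The quarterly boundaries and the names lower_bounds, partition_for, insert_row and roll_out are assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical quarterly partitions: partition i holds rows whose key is
# >= lower_bounds[i] and < lower_bounds[i + 1] (the last one is open-ended).
lower_bounds = [date(2023, 1, 1), date(2023, 4, 1),
                date(2023, 7, 1), date(2023, 10, 1)]
partitions = {i: [] for i in range(len(lower_bounds))}


def partition_for(order_date):
    """Return the index of the partition whose range contains order_date."""
    idx = bisect_right(lower_bounds, order_date) - 1
    if idx < 0:
        raise ValueError(f"{order_date} is below the first partition range")
    return idx


def insert_row(row):
    """Route a row to its partition using the partitioning key column."""
    partitions[partition_for(row["order_date"])].append(row)


def roll_out(idx):
    """Detach all rows of one partition in a single step and return them."""
    rows = partitions[idx]
    partitions[idx] = []
    return rows


insert_row({"order_date": date(2023, 2, 15), "item": "A"})
insert_row({"order_date": date(2023, 4, 1), "item": "B"})
insert_row({"order_date": date(2023, 12, 31), "item": "C"})
```

Because each range lives in its own storage object, old data can be removed by detaching one partition (roll_out(0) in this sketch) instead of deleting rows one by one.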

Database Type:

A database can be a relational database or a column-oriented database. A relational database is a type of database that stores and provides access to data points that are related to one another. Here is a simple example of two tables a small business might use to process orders for its products.

The first table is a customer info table, so each record includes a customer’s name, address, shipping and billing information, phone number, and other contact information. Each bit of information (each attribute) is in its own column, and the database assigns a unique ID (a key) to each row. In the second table—a customer order table—each record includes the ID of the customer that placed the order, the product ordered, the quantity, the selected size and color, and so on—but not the customer’s name or contact information. (Source:  What Is a Relational Database | Oracle Canada)
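The two tables described above can be sketched with Python's standard sqlite3 module. The table names, column names and sample values (customers, orders, "Ada Example" and so on) are assumptions chosen for illustration; the source only describes the tables in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Customer info table: one attribute per column, a unique ID (key) per row.
# Order table: stores the customer's ID, not the name or contact details.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    phone       TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    product     TEXT NOT NULL,
    quantity    INTEGER NOT NULL
);
""")

cur = conn.execute(
    "INSERT INTO customers (name, phone) VALUES (?, ?)",
    ("Ada Example", "555-0100"),
)
customer_id = cur.lastrowid  # the key the database assigned to this row

conn.execute(
    "INSERT INTO orders (customer_id, product, quantity) VALUES (?, ?, ?)",
    (customer_id, "T-shirt", 2),
)

# The shared key relates the two tables: a join recovers the customer's
# name for each order without duplicating it in the orders table.
rows = conn.execute(
    "SELECT c.name, o.product, o.quantity "
    "FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id"
).fetchall()
```

Keeping the contact details only in the customer table means a changed phone number is updated in one place, while every order still points to the right customer through the ID.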

On the other hand, column-oriented databases store large volumes of data across machines, with few or no relationships among the data. Known column-oriented databases include HBase, Cassandra, and Hypertable. For instance, unlike a relational database, in HBase a column is specific to the row that contains it. When we start adding rows, we add columns to store data at the same time.
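The point that a column belongs to the row that contains it can be illustrated with a small in-memory Python sketch. This is a stand-in data structure, not the HBase client API; the put helper, the row keys and the "family:qualifier" column names are assumptions made for illustration.

```python
from collections import defaultdict

# row key -> {column name -> value}. There is no table-wide column list:
# each row carries only the columns that were written to it.
table = defaultdict(dict)


def put(row_key, column, value):
    """Write one cell; the column is created for this row only."""
    table[row_key][column] = value


put("user1", "info:name", "Ada")
put("user1", "info:email", "ada@example.com")
put("user2", "info:name", "Grace")
put("user2", "prefs:theme", "dark")

# Each row ends up with its own set of columns.
columns_by_row = {key: sorted(cells) for key, cells in table.items()}
```

Here user1 has an email column and user2 has a theme column, and neither row stores an empty placeholder for the column it lacks, in contrast to a relational table where every row shares the same declared columns.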

Again, depending on the size and complexity of the raw data to be extracted, the operation can be manual, fully automated, or a mix of both. Two families of database technology are commonly used: SQL (Structured Query Language) and NoSQL (commonly read as “not only SQL”). Examples of NoSQL databases include Riak, Redis, memcached, memcachedb, membase, and Voldemort (Source: NoSQL – KnowledgeShop (fizalihsan.github.io)).

Information Processing

Extracting and transforming are the next processes after recording in the dynamics of generating wizening. In the process of creating a database (databasing), records are organized into rows, each identifying an entity, and columns, each holding an attribute of that entity, to generate meaning.

Information processing theory originally focused on the individual mind and cognitive processes. However, the theory can be extended beyond individuals. In fact, our focus is at the macro level.

After databasing, there are four primary stages of information processing:

  1. Extraction – Extraction occurs as needed: it can be scheduled to meet a recurring need, or unscheduled and happen irregularly. The extraction can be based on internal records or sourced from outside the organization.

  2. Storage – Storing is a crucial part of the entire information processing workflow so that future users can access the information when needed. It is also important for learning as the organization can learn from its past experiences through the stored information.

  3. Transformation – This may include filtering, analysis, expansion, or compression that helps discern patterns and trends. Transforming can also include extracting or deriving results from newly connected information.

  4. Informing/Transmission – This is the process whereby the information is distributed to others. This may include reporting or presenting to relevant stakeholders (Source: idea is taken from Research.com, What is Information Processing Theory? Stages, Models & Limitations | Research.com). Informing can follow a push model, where the owner of the information sends it to users, or a pull model, where users retrieve the information from the owner.
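The four stages above can be sketched as a small Python pipeline. Everything in it is a hypothetical illustration: the record layout, the function names extract, save and transform, and the Report class are assumptions, and a real organization would use a database and a reporting tool rather than in-memory lists.

```python
from statistics import mean

raw_records = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": 20},
    {"region": "west", "amount": None},  # incomplete record
    {"region": "west", "amount": 6},
]


def extract(source):
    """Stage 1: take usable records from an internal or external source."""
    return [r for r in source if r.get("amount") is not None]


store = []


def save(records):
    """Stage 2: keep records so that future users can access them."""
    store.extend(records)


def transform(records):
    """Stage 3: derive a pattern, here the average amount per region."""
    by_region = {}
    for r in records:
        by_region.setdefault(r["region"], []).append(r["amount"])
    return {region: mean(values) for region, values in by_region.items()}


class Report:
    """Stage 4: inform users by push (publish) or by pull (fetch)."""

    def __init__(self):
        self.subscribers = []
        self.latest = None

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, summary):
        # Push model: the owner sends the information to each subscriber.
        self.latest = summary
        for callback in self.subscribers:
            callback(summary)

    def fetch(self):
        # Pull model: a user asks the owner for the latest information.
        return self.latest


save(extract(raw_records))
summary = transform(store)

report = Report()
inbox = []                      # stands in for a stakeholder's mailbox
report.subscribe(inbox.append)
report.publish(summary)         # push
pulled = report.fetch()         # pull
```

The same summary reaches the user either way; the difference between push and pull in this sketch is only who initiates the transfer.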