By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: (Ⅱ) A Complete Guide to Core Data Warehouse Design Standards: From Layers, Types to Lifecycle | HackerNoon
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > Computing > (Ⅱ) A Complete Guide to Core Data Warehouse Design Standards: From Layers, Types to Lifecycle | HackerNoon
Computing

(Ⅱ) A Complete Guide to Core Data Warehouse Design Standards: From Layers, Types to Lifecycle | HackerNoon

News Room
Last updated: 2025/09/15 at 9:36 PM
News Room Published 15 September 2025
Share
SHARE

“Emerging Data Lakehouse Design and Practice Handbook · Data Lakehouse Modeling and Naming Standards (2025)” consists of four progressive guides, structured along the main line of Model Architecture – Common Standards – Layering Standards – Naming Standards. Together, they systematically build an evolvable, governable, and shareable modern data lakehouse.

The first article,(I) Principles of Data Model Architecture: Four Layers and Seven Stages, the “First Cornerstone” of Data Lake and Warehouse Modeling, introduced the four-layer (ODS–DW–APP, with DWD/DWM/DWS inside DW) data layering architecture. Around four core principles—domain partitioning, high cohesion & low coupling, common logic sinking, and cost-performance balance—it laid a unified and extensible foundation for dimensional modeling in a lakehouse architecture.

This article is the second in the series. It details the common design standards followed in data warehouses, including hierarchy invocation standards, data type standards, string handling, and other warehouse design specifications.

The remaining two articles will further analyze detailed specifications for each data warehouse layer and the unified naming system within this framework, helping enterprises complete the full chain from data ingestion to value realization with one methodology. Stay tuned for the full series.

1. Hierarchy Invocation Standards: Controlling Data Warehouse Flow and Reference Principles

🚀 Business data flow design and layering reference essentials

Stable business should follow the standard flow design: ODS → DWD → DWS → APP. For unstable or exploratory demands, ODS → DWD → APP or ODS → DWD → DWM → APP models may be used.

Beyond ensuring data flow rationality, reference principles must also be followed:

  • Normal flow: ODS → DWD → DWM → DWS → APP. If ODS → DWD → DWS → APP occurs, it means the domain is incomplete. DWD data should be landed in DWM. For rarely used tables, DWD → DWS is allowed.

  • Avoid having a DWS wide table reference both DWD and the DWM of the same domain.

  • Within the same domain, generating DWM from DWM should be avoided to maintain ETL efficiency.

  • DWM, DWS, and APP must not directly use ODS tables. Only DWD can reference ODS.

  • Reverse dependencies are forbidden, e.g., DWM tables depending on DWS tables.

Example:

2. Data Type Standards: Unifying Data Warehouse Type Settings

🔍 Precise type definitions for various data

Different data types must be standardized and strictly enforced:

  1. Amount: double or decimal(28,6) for precision, clarify whether the unit is cents or yuan.

  2. String: string.

  3. ID fields: bigint.

  4. Time: string.

  5. Status: string.

3. Data Redundancy Standards: Reasonable Control of Wide Table Redundancy

🤔 Considering frequency, latency, and duplication rate

Redundant fields in wide tables must ensure:

  1. Redundant fields are frequently used, referenced downstream by at least 3.

  2. Redundant fields should not introduce significant data latency.

  3. Redundant fields should not overly duplicate existing fields; generally, not exceeding 60%. Otherwise, use join or extend the original table.

4. NULL Field Handling Standards: Strategies for Dimensions and Metrics

❓ Why set NULL values this way

  • Dimension fields: set to -1.

  • Metric fields: set to 0.

5. Metric Caliber Standards: Ensuring Consistency of Metrics

🧩 Specific methods for metric collation and management

Consistency must be ensured within domains, without ambiguity.

Through layering, unified data outputs are provided to ensure consistent external metric definitions, avoiding “same metric, different definition.”

1) Metric Collation: Inconsistent definitions increase data usage costs, leading to disputes and repeated verification. In governance, all metrics collected from requirements are further refined to clarify their definitions. If two metrics share a name but differ in definition, determine whether to merge. If both must exist, their names must clearly distinguish them.

2) Metric Management Metric management includes atomic metrics and derived metrics.

Atomic metrics:

  • Assign to production line, business unit, domain, and business process.
  • Define source data within the business process.
  • Record English/Chinese name, description.
  • Fill in metric function.
  • System auto-generates definition expressions.
  • System auto-generates SQL from the definition and source table.

Derived metrics:

  • Built on atomic metrics with additional dimensions or qualifiers.

6. Data Table Handling Standards: Characteristics of Different Table Types

⚡ Differences between incremental, full, snapshot, and zipper tables

1) Incremental Table

New data since the last extraction.

  1. Records changes only, not totals.

  2. Reports changes only; no change, no report.

  3. One partition per day.

2) Full Table

All the latest state data each day.

  1. Reports regardless of change.

  2. Each report includes all data (changed + unchanged).

  3. Only one partition.

3) Snapshot Table

Partitioned daily, recording full data as of that date.

  1. Reports regardless of change.

  2. Each report includes all data (changed + unchanged).

  3. One partition per day.

4) Zipper Table

Full data is up to date.

  1. Records all changes of an entity from start to current.

  2. Each report includes the historical final state at present.

  3. Current record stores last change before now.

  4. Only one partition.

7. Table Lifecycle Management: Strategies Based on Data History and Table Type

⏳ Lifecycle matrix via historical data grading and table type categorization

1) Historical Data Grading

Four levels: P0, P1, P2, P3.

  • P0: Critical domain and app data, non-recoverable, e.g., transactions, logs, group KPIs, IPO-related tables.

  • P1: Important business and app data, non-recoverable, e.g., key product data.

  • P2: Important but recoverable business/app data, e.g., intermediate ETL process data.

  • P3: Non-critical, recoverable data, e.g., SNS product reports.

2) Table Types

  1. Event stream table (incremental): non-key or non-duplicate data like logs.
  2. Event mirror table (incremental): process data with primary key and slowly changing attributes, e.g., transactions, order states.
  3. Dimension tables: users, products.
  4. Merge full table: process/dimension data with possible duplicates; merge keeps latest by key, older states in previous partitions.
  5. ETL temporary tables: generated during processing, not recommended to keep, max 7 days.
  6. TT temporary data: data from TT pulls or DbSync, flows to DS layer, ODS keeps original; lifespan default 93 days.
  7. Regular full tables: small business/product data, often pulled directly; long retention, strategy by historical level.

By grading historical data and categorizing table types, a lifecycle management matrix is generated.

  • Previous article: “(Part 1) Data Model Architecture Principles: Four Layers, Seven Levels—the First Building Block of Lakehouse Modeling”
  • Next article preview: Detailed design standards for each data warehouse layer

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article A mega deal on the iPad mini: Take $100 off at Amazon
Next Article From $0.035 to $2? Mutuum Finance (MUTM) Price Prediction Puts It Among the Top Cryptos to Buy Now
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

Africa’s gender funding gap persists; women founders still get just 1 in 10 dollars
Computing
I ditched my flagship for a mid-range phone and don’t regret it
News
Microsoft’s Xbox PC app adds Steam games and access to other stores
News
Fedora Workstation 43 Beta Is Running Well On AMD Strix Halo / Framework Desktop
Computing

You Might also Like

Computing

Africa’s gender funding gap persists; women founders still get just 1 in 10 dollars

5 Min Read
Computing

Fedora Workstation 43 Beta Is Running Well On AMD Strix Halo / Framework Desktop

2 Min Read
Computing

Vivo’s new phone to feature Industry’s largest 7,600mAh battery, likely iQOO Z10 Turbo · TechNode

1 Min Read
Computing

How to plan for Africa’s next technology decade

15 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?