Design a SQL schema for a supply chain database that tracks products, suppliers, orders, inventory levels, and shipments, ensuring data integrity and efficient querying for common supply chain analytics.
technical screen · 5-7 minutes
How to structure your answer
MECE Framework (mutually exclusive, collectively exhaustive): Define the entities (Products, Suppliers, Customers, Orders, OrderItems, Inventory, Shipments, ShipmentItems, Warehouses) so that each fact is stored in exactly one place. Establish relationships with foreign keys: one-to-many links directly, many-to-many links through junction tables such as OrderItems and ShipmentItems. Specify data types and constraints (PRIMARY KEY, NOT NULL, UNIQUE, CHECK). Index frequently queried columns (e.g., product_id, supplier_id, order_date). Optimize for common analytics, which are mostly joins and aggregations: order fulfillment rates, inventory turnover, supplier performance, and shipment tracking. Choose ON DELETE/ON UPDATE behavior (CASCADE or RESTRICT) deliberately for each foreign key. Consider partitioning large tables such as Shipments or Inventory by time or location.
Sample answer
A robust SQL schema for supply chain analytics should follow the MECE principle: every critical entity is covered exactly once, with explicit relationships and constraints. Key tables include Products (product_id PK, name, description, weight), Suppliers (supplier_id PK, name, contact_info), Customers (customer_id PK, name, address), and Warehouses (warehouse_id PK, location, capacity). Because a product can come from several suppliers and a supplier can provide many products, a junction table such as ProductSuppliers (product_id FK, supplier_id FK) links the two. Orders (order_id PK, customer_id FK, order_date, status) and OrderItems (order_item_id PK, order_id FK, product_id FK, quantity, unit_price) capture sales. Inventory (inventory_id PK, product_id FK, warehouse_id FK, quantity_on_hand, last_updated) tracks stock per product and warehouse. Shipments (shipment_id PK, order_id FK, warehouse_id FK, shipment_date, delivery_date, carrier, tracking_number, status) and ShipmentItems (shipment_item_id PK, shipment_id FK, product_id FK, quantity) manage logistics. Foreign keys enforce referential integrity. Indexes on order_date, product_id, supplier_id, and shipment_date speed up queries for lead times, supplier performance, and inventory turnover. Constraints such as NOT NULL and CHECK (e.g., quantity > 0) ensure data quality.
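To make the answer concrete, here is a minimal sketch of part of the schema together with one fulfillment query. It runs the SQL through Python's built-in `sqlite3` module so it can be executed as-is. The column types, the lowercase table names, the sample rows, and the "shipped quantity divided by ordered quantity" definition of fill rate are all assumptions made for illustration, and a production database would use its own dialect's types.

```python
import sqlite3

# Illustrative subset of the tables named in the sample answer.
# Types and sample data are assumptions, not part of the original answer.
SCHEMA = """
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL,
    status      TEXT NOT NULL
);
CREATE TABLE order_items (
    order_item_id INTEGER PRIMARY KEY,
    order_id      INTEGER NOT NULL REFERENCES orders(order_id),
    product_id    INTEGER NOT NULL REFERENCES products(product_id),
    quantity      INTEGER NOT NULL CHECK (quantity > 0),
    unit_price    NUMERIC NOT NULL CHECK (unit_price >= 0)
);
CREATE TABLE shipments (
    shipment_id   INTEGER PRIMARY KEY,
    order_id      INTEGER NOT NULL REFERENCES orders(order_id),
    shipment_date TEXT NOT NULL
);
CREATE TABLE shipment_items (
    shipment_item_id INTEGER PRIMARY KEY,
    shipment_id      INTEGER NOT NULL REFERENCES shipments(shipment_id),
    product_id       INTEGER NOT NULL REFERENCES products(product_id),
    quantity         INTEGER NOT NULL CHECK (quantity > 0)
);
"""

# Fill rate per order: units shipped divided by units ordered.
# Each side is aggregated in its own subquery so the join cannot
# multiply rows and inflate the sums.
FILL_RATE = """
SELECT o.order_id,
       COALESCE(s.shipped, 0) * 1.0 / i.ordered AS fill_rate
FROM orders o
JOIN (SELECT order_id, SUM(quantity) AS ordered
      FROM order_items
      GROUP BY order_id) i ON i.order_id = o.order_id
LEFT JOIN (SELECT sh.order_id, SUM(si.quantity) AS shipped
           FROM shipments sh
           JOIN shipment_items si ON si.shipment_id = sh.shipment_id
           GROUP BY sh.order_id) s ON s.order_id = o.order_id
ORDER BY o.order_id;
"""

def build_demo():
    """Create an in-memory database with two orders, one half shipped."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO products VALUES (1, 'Widget')")
    conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
    conn.executemany(
        "INSERT INTO orders VALUES (?, 1, ?, 'open')",
        [(1, "2024-01-05"), (2, "2024-01-06")],
    )
    conn.executemany(
        "INSERT INTO order_items (order_id, product_id, quantity, unit_price) "
        "VALUES (?, 1, ?, 2.50)",
        [(1, 10), (2, 4)],
    )
    conn.execute("INSERT INTO shipments VALUES (1, 1, '2024-01-07')")
    conn.execute(
        "INSERT INTO shipment_items (shipment_id, product_id, quantity) "
        "VALUES (1, 1, 5)"
    )
    conn.commit()
    return conn

conn = build_demo()
fill_rates = conn.execute(FILL_RATE).fetchall()
print(fill_rates)
```

In this sample data, order 1 has 5 of 10 units shipped and order 2 has nothing shipped yet, so the query reports fill rates of 0.5 and 0.0. Aggregating before joining is worth mentioning in an interview, since joining order items directly to shipment items is a common source of double counting.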
Key points to mention
- **Normalization (3NF):** Discuss how tables are designed to minimize data redundancy and improve data integrity, e.g., keeping supplier details in `Suppliers` rather than repeating them on every row of `Products`.
- **Primary and Foreign Keys:** Emphasize the use of PKs for unique identification and FKs for establishing relationships between tables, ensuring referential integrity.
- **Data Types:** Mention appropriate data types for each column (e.g., `INT`, `VARCHAR`, `DECIMAL`, `DATE`, `BOOLEAN`) to optimize storage and querying; use `DECIMAL` rather than a floating-point type for prices.
- **Indexes:** Explain the importance of indexing frequently queried columns (e.g., `product_id` in `Inventory`, `order_date` in `Orders`) to improve query performance.
- **Enums/Lookup Tables:** Suggest using `ENUM` types (where the database supports them, e.g., MySQL or PostgreSQL) or separate lookup tables for categorical data such as order status or shipment status, for consistency and easier maintenance.
- **Timestamp Columns:** Include `created_at` and `last_updated` columns in relevant tables for auditing and tracking changes.
- **Scalability Considerations:** Briefly touch upon how this schema can be extended (e.g., adding `Returns`, `QualityControl`, `BillsOfMaterial` tables) to accommodate future needs.
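Several of the points above (lookup tables, timestamp columns, and indexes) can be shown in one short sketch. It again uses Python's built-in `sqlite3` module; the `order_statuses` table, the status codes, and the index name `idx_orders_order_date` are hypothetical names chosen for illustration, and the query plan text is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
-- Lookup table: the only status values an order may carry (assumed codes).
CREATE TABLE order_statuses (
    status_code TEXT PRIMARY KEY
);
INSERT INTO order_statuses VALUES ('pending'), ('shipped'), ('delivered');

CREATE TABLE orders (
    order_id     INTEGER PRIMARY KEY,
    order_date   TEXT NOT NULL,
    status_code  TEXT NOT NULL REFERENCES order_statuses(status_code),
    -- Audit columns filled in automatically on insert.
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    last_updated TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Index for date-range reporting queries.
CREATE INDEX idx_orders_order_date ON orders(order_date);
""")

conn.execute(
    "INSERT INTO orders (order_date, status_code) VALUES ('2024-03-01', 'pending')"
)
conn.commit()

# A status that is not in the lookup table violates the foreign key.
try:
    conn.execute(
        "INSERT INTO orders (order_date, status_code) VALUES ('2024-03-02', 'lost')"
    )
    unknown_status_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    unknown_status_rejected = True

# The timestamp default populated created_at without the caller supplying it.
created_at = conn.execute("SELECT created_at FROM orders").fetchone()[0]

# Ask SQLite how it would run a date-range query; the last column of each
# row is a human-readable description of the chosen access path.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT order_id FROM orders WHERE order_date >= '2024-03-01'"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

print(unknown_status_rejected, created_at, plan_text)
```

Note that `last_updated` is only set on insert in this sketch; keeping it current on every update would need a trigger or application logic, which is worth stating if the interviewer asks about auditing.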
Common mistakes to avoid
- ✗ **Denormalization without justification:** Combining too much data into one table, leading to redundancy and update anomalies.
- ✗ **Missing Indexes:** Not creating indexes on frequently used columns, resulting in slow query performance.
- ✗ **Inconsistent Naming Conventions:** Using different naming styles for tables and columns, making the schema harder to understand and maintain.
- ✗ **Lack of Constraints:** Not implementing `NOT NULL`, `UNIQUE`, or `CHECK` constraints, which can lead to invalid data.
- ✗ **Ignoring Scalability:** Designing a rigid schema that cannot easily accommodate new features or data types without significant refactoring.
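The "Lack of Constraints" mistake is easiest to appreciate by watching constraints reject bad data. The following sketch, using Python's built-in `sqlite3` module, assumes a simplified `inventory` table with a uniqueness rule of one row per product and warehouse; that rule, the `is_rejected` helper, and the sample rows are illustrative assumptions rather than part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE warehouses (
    warehouse_id INTEGER PRIMARY KEY,
    location     TEXT NOT NULL
);
CREATE TABLE inventory (
    inventory_id     INTEGER PRIMARY KEY,
    product_id       INTEGER NOT NULL
        REFERENCES products(product_id) ON DELETE RESTRICT,
    warehouse_id     INTEGER NOT NULL
        REFERENCES warehouses(warehouse_id) ON DELETE RESTRICT,
    quantity_on_hand INTEGER NOT NULL CHECK (quantity_on_hand >= 0),
    UNIQUE (product_id, warehouse_id)  -- one stock row per product and warehouse
);
INSERT INTO products VALUES (1, 'Widget');
INSERT INTO warehouses VALUES (1, 'Rotterdam');
INSERT INTO inventory (product_id, warehouse_id, quantity_on_hand)
VALUES (1, 1, 40);
""")

def is_rejected(sql):
    """Hypothetical helper: run one statement, report whether a constraint blocked it."""
    try:
        conn.execute(sql)
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        conn.rollback()
        return True

results = {
    # CHECK: stock can never go negative.
    "negative_stock": is_rejected(
        "UPDATE inventory SET quantity_on_hand = -5 WHERE inventory_id = 1"),
    # UNIQUE: a second row for the same product and warehouse is a duplicate.
    "duplicate_row": is_rejected(
        "INSERT INTO inventory (product_id, warehouse_id, quantity_on_hand) "
        "VALUES (1, 1, 10)"),
    # FOREIGN KEY: stock cannot reference a product that does not exist.
    "unknown_product": is_rejected(
        "INSERT INTO inventory (product_id, warehouse_id, quantity_on_hand) "
        "VALUES (99, 1, 10)"),
    # ON DELETE RESTRICT: a product with stock rows cannot be deleted.
    "delete_in_use": is_rejected(
        "DELETE FROM products WHERE product_id = 1"),
    # A legitimate update passes every constraint.
    "valid_update": is_rejected(
        "UPDATE inventory SET quantity_on_hand = 35 WHERE inventory_id = 1"),
}
print(results)
```

Every invalid statement is rejected and only the valid update goes through. Without these constraints, each of the bad rows would have been stored silently and surfaced later as wrong analytics.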