Database Limitations

Size Limitations

One of the reasons Microsoft is able to give SQL Server Express away for free is because of the limits imposed on certain facets of the server. One of the largest limitations of SQL Server Express is the restriction on individual database sizes. No single database can be over 10GB on modern versions. This restriction has been in place for over a decade and has not increased over time, even as modern data storage has become extremely cheap and plentiful.

Even if an application doesn't require storing 10GB of data immediately, over years of use it is not uncommon for a database to hit the ceiling on its size. When SQL Server Express is bundled with software, this can result in customer support calls and performance degradation for end users.

SQL Server Express

How to Work Around SQL Server Express's Data Limit

Compression

In SQL Server 2016 SP1+, Express versions can utilize data compressionopen in new window on indexes and tables to increase the size of a database on disk. Generally, the additional CPU required to decompress data that is read from disk is a worthwhile trade off for larger tables and indexes.

Design Smart

Commonly, databases are created using generic data types that may provide flexibility, but also come at the cost of an increased storage footprint. By assessing the below options, a database can take up a fraction of its original size without sacrificing any actual data.

This advice is meant to be general and should always be assessed in the context of the specific use case at hand.

Integers

INTopen in new window is a common data type for IDs and whole-number numeric data. With a size of 4 bytes, it isn't very big but can support numbers up to 2,147,483,647. However if it is clear that the value being captured is never going to hit anywhere near that limit, using a SMALLINTopen in new window for values up to 32,767 (2 bytes) or TINYINTopen in new window for values up to 255 (1 byte) can drastically reduce storage space. All of these types are available in SQL 2008+

For example, a table [dbo].[state] may provide a list of all 50 states in the US. Commonly the ID for this table will be an INT data type, but since it is unlikely there will be 200+ new states any time soon, it is safe to use a TINYINT instead and reduce the ID column's footprint to 25% of its original value.

Date and Time

Similarly to integers, often times date and time datatypes are chosen that provide far more detail than necessary. While some use cases necessitate the capturing of milliseconds, many times precision more than a second is just excessive and provides no benefit to an application, yet takes up more disk space. Be sure that each type is not defaulting to a higher precision than is needed, also.

There are many options for date, time, and datetime values, so assess what makes the most sense for each situation and use the smallest available:

Data Type	Accuracy	Size	Compatibility	Example
`DATE`open in new window	One day	3 bytes	2008+	2018-04-02
`DATETIME`open in new window	Rounded to increments of .000, .003, .007 sec	8 bytes	2008+	2018-04-02 12:35:29.123
`DATETIME2`open in new window	100 nanoseconds	6 bytes for precisions < 3; 7 bytes for precisions 3 & 4; all others are 8 bytes	2008+	2018-04-02 12:35:29.1234567
`DATETIMEOFFSET`open in new window	100 nanoseconds	8-10 bytes	2008+	2018-04-02 12:35:29.1234567 +12:15
`SMALLDATETIME`open in new window	One minute	4 bytes	2008+	2018-04-02 12:35:00
`TIME`open in new window	Varies	3-5 bytes	2008+	12:35:29.1234567

Decimals

Similar to dates and integers, making sure only the precision necessary is being stored when dealing with DECIMAL and NUMERICopen in new window (functionally equivalent data types) is very important.

For most cases, a precision of 9 or less will suffice, which means DECIMAL will only take up 5 bytes. However, if more is required, the following sizes will apply:

Precision	Size
1-9	5 bytes
10-19	9 bytes
20-28	13 bytes
29-38	17 bytes

It may be tempting to consider the FLOAT or REALopen in new window types as well, since they can store a precision of 7 digits in 4 bytes and 15 digits in 8 bytes. However, it is integral to remember that these types are floating point numerics and thus contain approximate values. Unless floating point numbers are a specific requirement, using these types can lead to rounding issues and are certainly not worth the risk for space savings.

Strings and VARCHARs

Another easy way to slim down a database is choosing the right character data type format. NVARCHARopen in new window and NCHAR store Unicode data, and are thus must-haves for storing any multilingual or localized data. Conversely, VARCHARopen in new window and CHAR store non-Unicode data and only take 1 byte for every 2 bytes that NVARCHAR takes up. This can effectively cut a table's size in half if it is comprised mostly of a large string column, like a comment or note field.

So, a simple "Hello" in NVARCHAR will be 10 bytes but only 5 in VARCHAR. The difference can be huge for not only performance, but also storage, when strings are properly typed.

Use Multiple Databases

While each individual database has a size restriction, the number of databases in an instance is not limited any more than other versions. This leaves room to partition data horizontally or vertically into several databases.

Horizontal Partitioning

Horizontal partitioning refers to slicing data by putting rows into different tables (and databases, in this case). The tables in each database will be the same, but contain different sets of data based on a logical grouping, like location.

If a database contains data for all customers across the United States, but the data exceeds 10GB, an option would be to create databases for regions, i.e. Northwest, Northeast, etc. and split the data across them.

The downside to this approach is that there are now N number of identical databases to manage when it comes to upgrades, maintenance, and backups.

Vertical Partitioning

In this context, vertical partitioning can be used to describe storing different columns and tables in different databases, as opposed to storing rows in different databases. By utilizing cross-database queries, tables in different databases can be queried and manipulated in almost the exact same way as if they were in the same database.

If an application had a lot of static reference data, i.e. data points on every city and town in the US, those tables could be stored in another database to separate the storage costs of that data from interfering with the actual customer data in the primary database.

Normalize, Normalize, Normalize

Normalizationopen in new window is a huge topic, but it isn't often talked about today in terms of storage benefits, due to the low cost of disk space.

Proper normalization data can greatly reduce the redundancy of data. Often the most redundant data are of the string variety and thus can have a significant impact on the overall size of a database when not properly normalized. Aiming for the third normal formopen in new window is a reasonable goal to achieve most of the size-related benefits that normal forms can offer.

Since there are always exceptions as to when normalized data or denormalized data is most beneficial, it will depend on the case at hand, but defaulting to normalizing data is generally a good strategy.

# Database Limitations

# Size Limitations

# How to Work Around SQL Server Express's Data Limit

# Compression

# Design Smart

# Integers

# Date and Time

# Decimals

# Strings and VARCHARs

# Use Multiple Databases

# Horizontal Partitioning

# Vertical Partitioning

# Normalize, Normalize, Normalize