Table of Contents
MySQL supports SQL data types in several categories: numeric types, date and time types, string (character and byte) types, and spatial types. This chapter provides an overview and more detailed description of the properties of the types in each category, and a summary of the data type storage requirements. The initial overviews are intentionally brief. Consult the more detailed descriptions for additional information about particular data types, such as the permissible formats in which you can specify values.
Data type descriptions use these conventions:
For integer types, M
indicates the
maximum display width. For floating-point and fixed-point types,
M
is the total number of digits that
can be stored (the precision). For string types,
M
is the maximum length. The maximum
permissible value of M
depends on the
data type.
D
applies to floating-point and
fixed-point types and indicates the number of digits following
the decimal point (the scale). The maximum possible value is 30,
but should be no greater than
M
−2.
fsp
applies to the
TIME
,
DATETIME
, and
TIMESTAMP
types and represents
fractional seconds precision; that is, the number of digits
following the decimal point for fractional parts of seconds. The
fsp
value, if given, must be in the
range 0 to 6. A value of 0 signifies that there is no fractional
part. If omitted, the default precision is 0. (This differs from
the standard SQL default of 6, for compatibility with previous
MySQL versions.)
Square brackets ([
and ]
)
indicate optional parts of type definitions.
MySQL supports all standard SQL numeric data types. These types
include the exact numeric data types
(INTEGER
,
SMALLINT
,
DECIMAL
, and
NUMERIC
), as well as the
approximate numeric data types
(FLOAT
,
REAL
, and
DOUBLE PRECISION
). The keyword
INT
is a synonym for
INTEGER
, and the keywords
DEC
and
FIXED
are synonyms for
DECIMAL
. MySQL treats
DOUBLE
as a synonym for
DOUBLE PRECISION
(a nonstandard
extension). MySQL also treats REAL
as a synonym for DOUBLE PRECISION
(a nonstandard variation), unless the
REAL_AS_FLOAT
SQL mode is
enabled.
The BIT
data type stores bit values
and is supported for MyISAM
,
MEMORY
,
InnoDB
, and
NDB
tables.
For information about how MySQL handles assignment of out-of-range values to columns and overflow during expression evaluation, see Section 11.1.7, “Out-of-Range and Overflow Handling”.
For information about storage requirements of the numeric data types, see Section 11.6, “Data Type Storage Requirements”.
For descriptions of functions that operate on numeric values, see Section 12.5, “Numeric Functions and Operators”. The data type used for the result of a calculation on numeric operands depends on the types of the operands and the operations performed on them. For more information, see Section 12.5.1, “Arithmetic Operators”.
For integer data types, M
indicates
the maximum display width. The maximum display width is 255.
Display width is unrelated to the range of values a type can
store, as described in
Section 11.1.6, “Numeric Type Attributes”.
For floating-point and fixed-point data types,
M
is the total number of digits that
can be stored.
If you specify ZEROFILL
for a numeric column,
MySQL automatically adds the UNSIGNED
attribute to the column.
Numeric data types that permit the UNSIGNED
attribute also permit SIGNED
. However, these
data types are signed by default, so the
SIGNED
attribute has no effect.
SERIAL
is an alias for BIGINT
UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE
.
SERIAL DEFAULT VALUE
in the definition of an
integer column is an alias for NOT NULL AUTO_INCREMENT
UNIQUE
.
When you use subtraction between integer values where one is
of type UNSIGNED
, the result is unsigned
unless the
NO_UNSIGNED_SUBTRACTION
SQL
mode is enabled. See Section 12.10, “Cast Functions and Operators”.
A bit-value type. M
indicates the
number of bits per value, from 1 to 64. The default is 1 if
M
is omitted.
TINYINT[(
M
)]
[UNSIGNED] [ZEROFILL]
A very small integer. The signed range is
-128
to 127
. The
unsigned range is 0
to
255
.
These types are synonyms for
TINYINT(1)
. A value of zero
is considered false. Nonzero values are considered true:
mysql>SELECT IF(0, 'true', 'false');
+------------------------+ | IF(0, 'true', 'false') | +------------------------+ | false | +------------------------+ mysql>SELECT IF(1, 'true', 'false');
+------------------------+ | IF(1, 'true', 'false') | +------------------------+ | true | +------------------------+ mysql>SELECT IF(2, 'true', 'false');
+------------------------+ | IF(2, 'true', 'false') | +------------------------+ | true | +------------------------+
However, the values TRUE
and
FALSE
are merely aliases for
1
and 0
, respectively,
as shown here:
mysql>SELECT IF(0 = FALSE, 'true', 'false');
+--------------------------------+ | IF(0 = FALSE, 'true', 'false') | +--------------------------------+ | true | +--------------------------------+ mysql>SELECT IF(1 = TRUE, 'true', 'false');
+-------------------------------+ | IF(1 = TRUE, 'true', 'false') | +-------------------------------+ | true | +-------------------------------+ mysql>SELECT IF(2 = TRUE, 'true', 'false');
+-------------------------------+ | IF(2 = TRUE, 'true', 'false') | +-------------------------------+ | false | +-------------------------------+ mysql>SELECT IF(2 = FALSE, 'true', 'false');
+--------------------------------+ | IF(2 = FALSE, 'true', 'false') | +--------------------------------+ | false | +--------------------------------+
The last two statements display the results shown because
2
is equal to neither
1
nor 0
.
SMALLINT[(
M
)]
[UNSIGNED] [ZEROFILL]
A small integer. The signed range is
-32768
to 32767
. The
unsigned range is 0
to
65535
.
MEDIUMINT[(
M
)]
[UNSIGNED] [ZEROFILL]
A medium-sized integer. The signed range is
-8388608
to 8388607
.
The unsigned range is 0
to
16777215
.
INT[(
M
)]
[UNSIGNED] [ZEROFILL]
A normal-size integer. The signed range is
-2147483648
to
2147483647
. The unsigned range is
0
to 4294967295
.
INTEGER[(
M
)]
[UNSIGNED] [ZEROFILL]
This type is a synonym for
INT
.
BIGINT[(
M
)]
[UNSIGNED] [ZEROFILL]
A large integer. The signed range is
-9223372036854775808
to
9223372036854775807
. The unsigned range
is 0
to
18446744073709551615
.
SERIAL
is an alias for BIGINT
UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE
.
Some things you should be aware of with respect to
BIGINT
columns:
All arithmetic is done using signed
BIGINT
or
DOUBLE
values, so you
should not use unsigned big integers larger than
9223372036854775807
(63 bits) except
with bit functions! If you do that, some of the last
digits in the result may be wrong because of rounding
errors when converting a
BIGINT
value to a
DOUBLE
.
MySQL can handle BIGINT
in the following cases:
When using integers to store large unsigned values
in a BIGINT
column.
In
MIN(
or
col_name
)MAX(
,
where col_name
)col_name
refers to
a BIGINT
column.
When using operators
(+
,
-
,
*
,
and so on) where both operands are integers.
You can always store an exact integer value in a
BIGINT
column by storing
it using a string. In this case, MySQL performs a
string-to-number conversion that involves no
intermediate double-precision representation.
The -
,
+
, and
*
operators use BIGINT
arithmetic when both operands are integer values. This
means that if you multiply two big integers (or results
from functions that return integers), you may get
unexpected results when the result is larger than
9223372036854775807
.
DECIMAL[(
M
[,D
])]
[UNSIGNED] [ZEROFILL]
A packed “exact” fixed-point number.
M
is the total number of digits
(the precision) and D
is the
number of digits after the decimal point (the scale). The
decimal point and (for negative numbers) the
-
sign are not counted in
M
. If
D
is 0, values have no decimal
point or fractional part. The maximum number of digits
(M
) for
DECIMAL
is 65. The maximum
number of supported decimals (D
)
is 30. If D
is omitted, the
default is 0. If M
is omitted,
the default is 10.
UNSIGNED
, if specified, disallows
negative values.
All basic calculations (+, -, *, /
) with
DECIMAL
columns are done with
a precision of 65 digits.
DEC[(
,
M
[,D
])]
[UNSIGNED] [ZEROFILL]NUMERIC[(
,
M
[,D
])]
[UNSIGNED] [ZEROFILL]FIXED[(
M
[,D
])]
[UNSIGNED] [ZEROFILL]
These types are synonyms for
DECIMAL
. The
FIXED
synonym is available
for compatibility with other database systems.
FLOAT[(
M
,D
)]
[UNSIGNED] [ZEROFILL]
A small (single-precision) floating-point number.
Permissible values are -3.402823466E+38
to -1.175494351E-38
,
0
, and 1.175494351E-38
to 3.402823466E+38
. These are the
theoretical limits, based on the IEEE standard. The actual
range might be slightly smaller depending on your hardware
or operating system.
M
is the total number of digits
and D
is the number of digits
following the decimal point. If M
and D
are omitted, values are
stored to the limits permitted by the hardware. A
single-precision floating-point number is accurate to
approximately 7 decimal places.
FLOAT(
is a nonstandard MySQL extension.
M
,D
)
UNSIGNED
, if specified, disallows
negative values.
Using FLOAT
might give you
some unexpected problems because all calculations in MySQL
are done with double precision. See
Section B.4.4.7, “Solving Problems with No Matching Rows”.
FLOAT(
p
)
[UNSIGNED] [ZEROFILL]
A floating-point number. p
represents the precision in bits, but MySQL uses this value
only to determine whether to use
FLOAT
or
DOUBLE
for the resulting data
type. If p
is from 0 to 24, the
data type becomes FLOAT
with
no M
or
D
values. If
p
is from 25 to 53, the data type
becomes DOUBLE
with no
M
or D
values. The range of the resulting column is the same as for
the single-precision FLOAT
or
double-precision DOUBLE
data
types described earlier in this section.
FLOAT(
syntax is provided for ODBC compatibility.
p
)
DOUBLE[(
M
,D
)]
[UNSIGNED] [ZEROFILL]
A normal-size (double-precision) floating-point number.
Permissible values are
-1.7976931348623157E+308
to
-2.2250738585072014E-308
,
0
, and
2.2250738585072014E-308
to
1.7976931348623157E+308
. These are the
theoretical limits, based on the IEEE standard. The actual
range might be slightly smaller depending on your hardware
or operating system.
M
is the total number of digits
and D
is the number of digits
following the decimal point. If M
and D
are omitted, values are
stored to the limits permitted by the hardware. A
double-precision floating-point number is accurate to
approximately 15 decimal places.
DOUBLE(
is a nonstandard MySQL extension.
M
,D
)
UNSIGNED
, if specified, disallows
negative values.
DOUBLE
PRECISION[(
,
M
,D
)]
[UNSIGNED] [ZEROFILL]REAL[(
M
,D
)]
[UNSIGNED] [ZEROFILL]
These types are synonyms for
DOUBLE
. Exception: If the
REAL_AS_FLOAT
SQL mode is
enabled, REAL
is a synonym
for FLOAT
rather than
DOUBLE
.
MySQL supports the SQL standard integer types
INTEGER
(or INT
) and
SMALLINT
. As an extension to the standard,
MySQL also supports the integer types
TINYINT
, MEDIUMINT
, and
BIGINT
. The following table shows the
required storage and range for each integer type.
Table 11.1 Required Storage and Range for Integer Types Supported by MySQL
Type | Storage (Bytes) | Minimum Value Signed | Minimum Value Unsigned | Maximum Value Signed | Maximum Value Unsigned |
---|---|---|---|---|---|
TINYINT |
1 | -128 |
0 |
127 |
255 |
SMALLINT |
2 | -32768 |
0 |
32767 |
65535 |
MEDIUMINT |
3 | -8388608 |
0 |
8388607 |
16777215 |
INT |
4 | -2147483648 |
0 |
2147483647 |
4294967295 |
BIGINT |
8 | -263 |
0 |
263-1 |
264-1 |
The DECIMAL
and NUMERIC
types store exact numeric data values. These types are used when
it is important to preserve exact precision, for example with
monetary data. In MySQL, NUMERIC
is
implemented as DECIMAL
, so the following
remarks about DECIMAL
apply equally to
NUMERIC
.
MySQL stores DECIMAL
values in binary format.
See Section 12.21, “Precision Math”.
In a DECIMAL
column declaration, the
precision and scale can be (and usually is) specified. For
example:
salary DECIMAL(5,2)
In this example, 5
is the precision and
2
is the scale. The precision represents the
number of significant digits that are stored for values, and the
scale represents the number of digits that can be stored
following the decimal point.
Standard SQL requires that DECIMAL(5,2)
be
able to store any value with five digits and two decimals, so
values that can be stored in the salary
column range from -999.99
to
999.99
.
In standard SQL, the syntax
DECIMAL(
is
equivalent to
M
)DECIMAL(
.
Similarly, the syntax M
,0)DECIMAL
is equivalent
to DECIMAL(
,
where the implementation is permitted to decide the value of
M
,0)M
. MySQL supports both of these
variant forms of DECIMAL
syntax. The default
value of M
is 10.
If the scale is 0, DECIMAL
values contain no
decimal point or fractional part.
The maximum number of digits for DECIMAL
is
65, but the actual range for a given DECIMAL
column can be constrained by the precision or scale for a given
column. When such a column is assigned a value with more digits
following the decimal point than are permitted by the specified
scale, the value is converted to that scale. (The precise
behavior is operating system-specific, but generally the effect
is truncation to the permissible number of digits.)
The FLOAT
and DOUBLE
types
represent approximate numeric data values. MySQL uses four bytes
for single-precision values and eight bytes for double-precision
values.
For FLOAT
, the SQL standard permits an
optional specification of the precision (but not the range of
the exponent) in bits following the keyword
FLOAT
in parentheses; ; that is,
FLOAT(
.
MySQL also supports this optional precision specification, but
the precision value in
p
)FLOAT(
is used only to determine storage size. A precision from 0 to 23
results in a 4-byte single-precision p
)FLOAT
column. A precision from 24 to 53 results in an 8-byte
double-precision DOUBLE
column.
MySQL permits a nonstandard syntax:
FLOAT(
or
M
,D
)REAL(
or M
,D
)DOUBLE
PRECISION(
.
Here,
M
,D
)(
means than values can be stored with up to
M
,D
)M
digits in total, of which
D
digits may be after the decimal
point. For example, a column defined as
FLOAT(7,4)
will look like
-999.9999
when displayed. MySQL performs
rounding when storing values, so if you insert
999.00009
into a
FLOAT(7,4)
column, the approximate result is
999.0001
.
Because floating-point values are approximate and not stored as exact values, attempts to treat them as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies. For more information, see Section B.4.4.8, “Problems with Floating-Point Values”
For maximum portability, code requiring storage of approximate
numeric data values should use FLOAT
or
DOUBLE PRECISION
with no specification of
precision or number of digits.
The BIT
data type is used to store bit
values. A type of
BIT(
enables
storage of M
)M
-bit values.
M
can range from 1 to 64.
To specify bit values,
b'
notation
can be used. value
'value
is a binary value
written using zeros and ones. For example,
b'111'
and b'10000000'
represent 7 and 128, respectively. See
Section 9.1.5, “Bit-Value Literals”.
If you assign a value to a
BIT(
column that
is less than M
)M
bits long, the value
is padded on the left with zeros. For example, assigning a value
of b'101'
to a BIT(6)
column is, in effect, the same as assigning
b'000101'
.
NDB Cluster.
The maximum combined size of all BIT
columns used in a given NDB
table
must not exceed 4096 bits.
MySQL supports an extension for optionally specifying the
display width of integer data types in parentheses following the
base keyword for the type. For example,
INT(4)
specifies an
INT
with a display width of four
digits. This optional display width may be used by applications
to display integer values having a width less than the width
specified for the column by left-padding them with spaces. (That
is, this width is present in the metadata returned with result
sets. Whether it is used is up to the application.)
The display width does not constrain the
range of values that can be stored in the column. Nor does it
prevent values wider than the column display width from being
displayed correctly. For example, a column specified as
SMALLINT(3)
has the usual
SMALLINT
range of
-32768
to 32767
, and
values outside the range permitted by three digits are displayed
in full using more than three digits.
When used in conjunction with the optional (nonstandard)
ZEROFILL
attribute, the default padding of
spaces is replaced with zeros. For example, for a column
declared as INT(4) ZEROFILL
, a
value of 5
is retrieved as
0005
.
The ZEROFILL
attribute is ignored for
columns involved in expressions or
UNION
queries.
If you store values larger than the display width in an
integer column that has the ZEROFILL
attribute, you may experience problems when MySQL generates
temporary tables for some complicated joins. In these cases,
MySQL assumes that the data values fit within the column
display width.
All integer types can have an optional (nonstandard)
UNSIGNED
attribute. An unsigned type can be
used to permit only nonnegative numbers in a column or when you
need a larger upper numeric range for the column. For example,
if an INT
column is
UNSIGNED
, the size of the column's range is
the same but its endpoints shift up, from
-2147483648
and 2147483647
to 0
and 4294967295
.
Floating-point and fixed-point types also can be
UNSIGNED
. As with integer types, this
attribute prevents negative values from being stored in the
column. Unlike the integer types, the upper range of column
values remains the same.
If you specify ZEROFILL
for a numeric column,
MySQL automatically adds the UNSIGNED
attribute.
Integer or floating-point data types can have the
AUTO_INCREMENT
attribute. When you insert a
value of NULL
into an indexed
AUTO_INCREMENT
column, the column is set to
the next sequence value. Typically this is
, where
value
+1value
is the largest value for the
column currently in the table.
(AUTO_INCREMENT
sequences begin with
1
.)
Storing 0
into an
AUTO_INCREMENT
column has the same effect as
storing NULL
, unless the
NO_AUTO_VALUE_ON_ZERO
SQL mode
is enabled.
Inserting NULL
to generate
AUTO_INCREMENT
values requires that the
column be declared NOT NULL
. If the column is
declared NULL
, inserting
NULL
stores a NULL
. When
you insert any other value into an
AUTO_INCREMENT
column, the column is set to
that value and the sequence is reset so that the next
automatically generated value follows sequentially from the
inserted value.
Negative values for AUTO_INCREMENT
columns
are not supported.
When MySQL stores a value in a numeric column that is outside the permissible range of the column data type, the result depends on the SQL mode in effect at the time:
If strict SQL mode is enabled, MySQL rejects the out-of-range value with an error, and the insert fails, in accordance with the SQL standard.
If no restrictive modes are enabled, MySQL clips the value to the appropriate endpoint of the column data type range and stores the resulting value instead.
When an out-of-range value is assigned to an integer column, MySQL stores the value representing the corresponding endpoint of the column data type range.
When a floating-point or fixed-point column is assigned a value that exceeds the range implied by the specified (or default) precision and scale, MySQL stores the value representing the corresponding endpoint of that range.
Suppose that a table t1
has this definition:
CREATE TABLE t1 (i1 TINYINT, i2 TINYINT UNSIGNED);
With strict SQL mode enabled, an out of range error occurs:
mysql>SET sql_mode = 'TRADITIONAL';
mysql>INSERT INTO t1 (i1, i2) VALUES(256, 256);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1 mysql>SELECT * FROM t1;
Empty set (0.00 sec)
With strict SQL mode not enabled, clipping with warnings occurs:
mysql>SET sql_mode = '';
mysql>INSERT INTO t1 (i1, i2) VALUES(256, 256);
mysql>SHOW WARNINGS;
+---------+------+---------------------------------------------+ | Level | Code | Message | +---------+------+---------------------------------------------+ | Warning | 1264 | Out of range value for column 'i1' at row 1 | | Warning | 1264 | Out of range value for column 'i2' at row 1 | +---------+------+---------------------------------------------+ mysql>SELECT * FROM t1;
+------+------+ | i1 | i2 | +------+------+ | 127 | 255 | +------+------+
When strict SQL mode is not enabled, column-assignment
conversions that occur due to clipping are reported as warnings
for ALTER TABLE
,
LOAD DATA
,
UPDATE
, and multiple-row
INSERT
statements. In strict
mode, these statements fail, and some or all the values are not
inserted or changed, depending on whether the table is a
transactional table and other factors. For details, see
Section 5.1.10, “Server SQL Modes”.
Overflow during numeric expression evaluation results in an
error. For example, the largest signed
BIGINT
value is
9223372036854775807, so the following expression produces an
error:
mysql> SELECT 9223372036854775807 + 1;
ERROR 1690 (22003): BIGINT value is out of range in '(9223372036854775807 + 1)'
To enable the operation to succeed in this case, convert the value to unsigned;
mysql> SELECT CAST(9223372036854775807 AS UNSIGNED) + 1;
+-------------------------------------------+
| CAST(9223372036854775807 AS UNSIGNED) + 1 |
+-------------------------------------------+
| 9223372036854775808 |
+-------------------------------------------+
Whether overflow occurs depends on the range of the operands, so
another way to handle the preceding expression is to use
exact-value arithmetic because
DECIMAL
values have a larger
range than integers:
mysql> SELECT 9223372036854775807.0 + 1;
+---------------------------+
| 9223372036854775807.0 + 1 |
+---------------------------+
| 9223372036854775808.0 |
+---------------------------+
Subtraction between integer values, where one is of type
UNSIGNED
, produces an unsigned result by
default. If the result would otherwise have been negative, an
error results:
mysql>SET sql_mode = '';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT CAST(0 AS UNSIGNED) - 1;
ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(cast(0 as unsigned) - 1)'
If the NO_UNSIGNED_SUBTRACTION
SQL mode is enabled, the result is negative:
mysql>SET sql_mode = 'NO_UNSIGNED_SUBTRACTION';
mysql>SELECT CAST(0 AS UNSIGNED) - 1;
+-------------------------+ | CAST(0 AS UNSIGNED) - 1 | +-------------------------+ | -1 | +-------------------------+
If the result of such an operation is used to update an
UNSIGNED
integer column, the result is
clipped to the maximum value for the column type, or clipped to
0 if NO_UNSIGNED_SUBTRACTION
is enabled. If strict SQL mode is enabled, an error occurs and
the column remains unchanged.
The date and time data types for representing temporal values are
DATE
,
TIME
,
DATETIME
,
TIMESTAMP
, and
YEAR
. Each temporal type has a
range of valid values, as well as a “zero” value that
may be used when you specify an invalid value that MySQL cannot
represent. The TIMESTAMP
and
DATETIME
types have special
automatic updating behavior, described in
Section 11.2.6, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
For information about storage requirements of the temporal data types, see Section 11.6, “Data Type Storage Requirements”.
For descriptions of functions that operate on temporal values, see Section 12.6, “Date and Time Functions”.
Keep in mind these general considerations when working with date and time types:
MySQL retrieves values for a given date or time type in a standard output format, but it attempts to interpret a variety of formats for input values that you supply (for example, when you specify a value to be assigned to or compared to a date or time type). For a description of the permitted formats for date and time types, see Section 9.1.3, “Date and Time Literals”. It is expected that you supply valid values. Unpredictable results may occur if you use values in other formats.
Although MySQL tries to interpret values in several formats,
date parts must always be given in year-month-day order (for
example, '98-09-04'
), rather than in the
month-day-year or day-month-year orders commonly used
elsewhere (for example, '09-04-98'
,
'04-09-98'
). To convert strings in other
orders to year-month-day order, the
STR_TO_DATE()
function may be
useful.
Dates containing 2-digit year values are ambiguous because the century is unknown. MySQL interprets 2-digit year values using these rules:
Year values in the range 70-99
become
1970-1999
.
Year values in the range 00-69
become
2000-2069
.
See also Section 11.2.9, “2-Digit Years in Dates”.
Conversion of values from one temporal type to another occurs according to the rules in Section 11.2.8, “Conversion Between Date and Time Types”.
MySQL automatically converts a date or time value to a number if the value is used in numeric context and vice versa.
By default, when MySQL encounters a value for a date or time
type that is out of range or otherwise invalid for the type,
it converts the value to the “zero” value for
that type. The exception is that out-of-range
TIME
values are clipped to the
appropriate endpoint of the
TIME
range.
By setting the SQL mode to the appropriate value, you can
specify more exactly what kind of dates you want MySQL to
support. (See Section 5.1.10, “Server SQL Modes”.) You can get MySQL
to accept certain dates, such as
'2009-11-31'
, by enabling the
ALLOW_INVALID_DATES
SQL
mode. This is useful when you want to store a “possibly
wrong” value which the user has specified (for example,
in a web form) in the database for future processing. Under
this mode, MySQL verifies only that the month is in the range
from 1 to 12 and that the day is in the range from 1 to 31.
MySQL permits you to store dates where the day or month and
day are zero in a DATE
or
DATETIME
column. This is useful
for applications that need to store birthdates for which you
may not know the exact date. In this case, you simply store
the date as '2009-00-00'
or
'2009-01-00'
. However, with dates such as
these, you should not expect to get correct results for
functions such as DATE_SUB()
or
DATE_ADD()
that require
complete dates. To disallow zero month or day parts in dates,
enable the NO_ZERO_IN_DATE
SQL mode.
MySQL permits you to store a “zero” value of
'0000-00-00'
as a “dummy
date.” In some cases, this is more convenient than
using NULL
values, and uses less data and
index space. To disallow '0000-00-00'
,
enable the NO_ZERO_DATE
SQL
mode.
“Zero” date or time values used through
Connector/ODBC are converted automatically to
NULL
because ODBC cannot handle such
values.
The following table shows the format of the “zero”
value for each type. The “zero” values are special,
but you can store or refer to them explicitly using the values
shown in the table. You can also do this using the values
'0'
or 0
, which are easier
to write. For temporal types that include a date part
(DATE
,
DATETIME
, and
TIMESTAMP
), use of these values may
produce warning or errors. The precise behavior depends on which,
if any, of the strict and
NO_ZERO_DATE
SQL modes are
enabled; see Section 5.1.10, “Server SQL Modes”.
Data Type | “Zero” Value |
---|---|
DATE |
'0000-00-00' |
TIME |
'00:00:00' |
DATETIME |
'0000-00-00 00:00:00' |
TIMESTAMP |
'0000-00-00 00:00:00' |
YEAR |
0000 |
The date and time data types for representing temporal values
are DATE
,
TIME
,
DATETIME
,
TIMESTAMP
, and
YEAR
.
For the DATE
and
DATETIME
range descriptions,
“supported” means that although earlier values
might work, there is no guarantee.
MySQL permits fractional seconds for
TIME
,
DATETIME
, and
TIMESTAMP
values, with up to
microseconds (6 digits) precision. To define a column that
includes a fractional seconds part, use the syntax
,
where type_name
(fsp
)type_name
is
TIME
,
DATETIME
, or
TIMESTAMP
, and
fsp
is the fractional seconds
precision. For example:
CREATE TABLE t1 (t TIME(3), dt DATETIME(6), ts TIMESTAMP(0));
The fsp
value, if given, must be in
the range 0 to 6. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0. (This
differs from the standard SQL default of 6, for compatibility
with previous MySQL versions.)
Any TIMESTAMP
or
DATETIME
column in a table can
have automatic initialization and updating properties; see
Section 11.2.6, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
A date. The supported range is
'1000-01-01'
to
'9999-12-31'
. MySQL displays
DATE
values in
'
format, but permits assignment of values to
YYYY-MM-DD
'DATE
columns using either
strings or numbers.
A date and time combination. The supported range is
'1000-01-01 00:00:00.000000'
to
'9999-12-31 23:59:59.999999'
. MySQL
displays DATETIME
values in
'
format, but permits assignment of values to
YYYY-MM-DD
hh:mm:ss
[.fraction
]'DATETIME
columns using either
strings or numbers.
An optional fsp
value in the
range from 0 to 6 may be given to specify fractional seconds
precision. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0.
Automatic initialization and updating to the current date
and time for DATETIME
columns
can be specified using DEFAULT
and
ON UPDATE
column definition clauses, as
described in Section 11.2.6, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
A timestamp. The range is '1970-01-01
00:00:01.000000'
UTC to '2038-01-19
03:14:07.999999'
UTC.
TIMESTAMP
values are stored
as the number of seconds since the epoch
('1970-01-01 00:00:00'
UTC). A
TIMESTAMP
cannot represent
the value '1970-01-01 00:00:00'
because
that is equivalent to 0 seconds from the epoch and the value
0 is reserved for representing '0000-00-00
00:00:00'
, the “zero”
TIMESTAMP
value.
An optional fsp
value in the
range from 0 to 6 may be given to specify fractional seconds
precision. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0.
The way the server handles TIMESTAMP
definitions depends on the value of the
explicit_defaults_for_timestamp
system variable (see
Section 5.1.7, “Server System Variables”).
If
explicit_defaults_for_timestamp
is enabled, there is no automatic assignment of the
DEFAULT CURRENT_TIMESTAMP
or ON
UPDATE CURRENT_TIMESTAMP
attributes to any
TIMESTAMP
column. They must
be included explicitly in the column definition. Also, any
TIMESTAMP
not explicitly
declared as NOT NULL
permits
NULL
values.
If
explicit_defaults_for_timestamp
is disabled, the server handles TIMESTAMP
as follows:
Unless specified otherwise, the first
TIMESTAMP
column in a table
is defined to be automatically set to the date and time of
the most recent modification if not explicitly assigned a
value. This makes TIMESTAMP
useful for recording the timestamp of an
INSERT
or
UPDATE
operation. You can
also set any TIMESTAMP
column
to the current date and time by assigning it a
NULL
value, unless it has been defined
with the NULL
attribute to permit
NULL
values.
Automatic initialization and updating to the current date
and time can be specified using DEFAULT
CURRENT_TIMESTAMP
and ON UPDATE
CURRENT_TIMESTAMP
column definition clauses. By
default, the first TIMESTAMP
column has these properties, as previously noted. However,
any TIMESTAMP
column in a
table can be defined to have these properties.
explicit_defaults_for_timestamp
is available as of MySQL 5.6.6. Before 5.6.6, the server
handles TIMESTAMP
as
discussed for
explicit_defaults_for_timestamp
disabled. Those behaviors, while they remain the default,
are nonstandard and are deprecated as of 5.6.6. For
discussion regarding upgrading to an installation with
explicit_defaults_for_timestamp
enabled, see
Section 2.11.3, “Changes in MySQL 5.6”.
A time. The range is '-838:59:59.000000'
to '838:59:59.000000'
. MySQL displays
TIME
values in
'
format, but permits assignment of values to
hh:mm:ss
[.fraction
]'TIME
columns using either
strings or numbers.
An optional fsp
value in the
range from 0 to 6 may be given to specify fractional seconds
precision. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0.
A year in 2-digit or 4-digit format. The default is 4-digit
format. YEAR(2)
or
YEAR(4)
differ in display format, but
have the same range of values. In 4-digit format, values
display as 1901
to
2155
, or 0000
. In
2-digit format, values display as 70
to
69
, representing years from 1970 to 2069.
MySQL displays YEAR
values in
YYYY
or
YY
format, but permits assignment
of values to YEAR
columns
using either strings or numbers.
The 2-digit YEAR(2)
data type has
certain issues that you should consider before choosing to
use it. As of MySQL 5.6.6, YEAR(2)
is
deprecated: 2-digit YEAR(2)
columns in
existing tables are treated as before, but 2-digit
YEAR(2)
columns in new or altered
tables are converted to 4-digit YEAR
columns. For more information, see
Section 11.2.5, “2-Digit YEAR(2) Limitations and Migrating to 4-Digit YEAR”.
For additional information about YEAR
display format and interpretation of input values, see
Section 11.2.4, “The YEAR Type”.
The SUM()
and
AVG()
aggregate functions do not
work with temporal values. (They convert the values to numbers,
losing everything after the first nonnumeric character.) To work
around this problem, convert to numeric units, perform the
aggregate operation, and convert back to a temporal value.
Examples:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col
))) FROMtbl_name
; SELECT FROM_DAYS(SUM(TO_DAYS(date_col
))) FROMtbl_name
;
The MySQL server can be run with the
MAXDB
SQL mode enabled. In
this case, TIMESTAMP
is
identical with DATETIME
. If
this mode is enabled at the time that a table is created,
TIMESTAMP
columns are created
as DATETIME
columns. As a
result, such columns use
DATETIME
display format, have
the same range of values, and there is no automatic
initialization or updating to the current date and time. See
Section 5.1.10, “Server SQL Modes”.
The DATE
, DATETIME
, and
TIMESTAMP
types are related. This section
describes their characteristics, how they are similar, and how
they differ. MySQL recognizes DATE
,
DATETIME
, and TIMESTAMP
values in several formats, described in
Section 9.1.3, “Date and Time Literals”. For the
DATE
and DATETIME
range
descriptions, “supported” means that although
earlier values might work, there is no guarantee.
The DATE
type is used for values with a date
part but no time part. MySQL retrieves and displays
DATE
values in
'
format. The supported range is YYYY-MM-DD
''1000-01-01'
to '9999-12-31'
.
The DATETIME
type is used for values that
contain both date and time parts. MySQL retrieves and displays
DATETIME
values in
'
format. The supported range is
YYYY-MM-DD
hh:mm:ss
''1000-01-01 00:00:00'
to '9999-12-31
23:59:59'
.
The TIMESTAMP
data type is used for values
that contain both date and time parts.
TIMESTAMP
has a range of '1970-01-01
00:00:01'
UTC to '2038-01-19
03:14:07'
UTC.
A DATETIME
or TIMESTAMP
value can include a trailing fractional seconds part in up to
microseconds (6 digits) precision. In particular, as of MySQL
5.6.4, any fractional part in a value inserted into a
DATETIME
or TIMESTAMP
column is stored rather than discarded. With the fractional part
included, the format for these values is
'
,
the range for YYYY-MM-DD
hh:mm:ss
[.fraction
]'DATETIME
values is
'1000-01-01 00:00:00.000000'
to
'9999-12-31 23:59:59.999999'
, and the range
for TIMESTAMP
values is '1970-01-01
00:00:01.000000'
to '2038-01-19
03:14:07.999999'
. The fractional part should always be
separated from the rest of the time by a decimal point; no other
fractional seconds delimiter is recognized. For information
about fractional seconds support in MySQL, see
Section 11.2.7, “Fractional Seconds in Time Values”.
The TIMESTAMP
and DATETIME
data types offer automatic initialization and updating to the
current date and time. For more information, see
Section 11.2.6, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
MySQL converts TIMESTAMP
values from the
current time zone to UTC for storage, and back from UTC to the
current time zone for retrieval. (This does not occur for other
types such as DATETIME
.) By default, the
current time zone for each connection is the server's time. The
time zone can be set on a per-connection basis. As long as the
time zone setting remains constant, you get back the same value
you store. If you store a TIMESTAMP
value,
and then change the time zone and retrieve the value, the
retrieved value is different from the value you stored. This
occurs because the same time zone was not used for conversion in
both directions. The current time zone is available as the value
of the time_zone
system
variable. For more information, see
Section 5.1.12, “MySQL Server Time Zone Support”.
Invalid DATE
, DATETIME
, or
TIMESTAMP
values are converted to the
“zero” value of the appropriate type
('0000-00-00'
or '0000-00-00
00:00:00'
), if the SQL mode permits this conversion.
The precise behavior depends on which if any of strict SQL mode
and the NO_ZERO_DATE
SQL mode
are enabled; see Section 5.1.10, “Server SQL Modes”.
Be aware of certain properties of date value interpretation in MySQL:
MySQL permits a “relaxed” format for values
specified as strings, in which any punctuation character may
be used as the delimiter between date parts or time parts.
In some cases, this syntax can be deceiving. For example, a
value such as '10:11:12'
might look like
a time value because of the :
, but is
interpreted as the year '2010-11-12'
if
used in date context. The value
'10:45:15'
is converted to
'0000-00-00'
because
'45'
is not a valid month.
The only delimiter recognized between a date and time part and a fractional seconds part is the decimal point.
The server requires that month and day values be valid, and
not merely in the range 1 to 12 and 1 to 31, respectively.
With strict mode disabled, invalid dates such as
'2004-04-31'
are converted to
'0000-00-00'
and a warning is generated.
With strict mode enabled, invalid dates generate an error.
To permit such dates, enable
ALLOW_INVALID_DATES
. See
Section 5.1.10, “Server SQL Modes”, for more information.
MySQL does not accept TIMESTAMP
values
that include a zero in the day or month column or values
that are not a valid date. The sole exception to this rule
is the special “zero” value
'0000-00-00 00:00:00'
, if the SQL mode
permits this value. The precise behavior depends on which if
any of strict SQL mode and the
NO_ZERO_DATE
SQL mode are
enabled; see Section 5.1.10, “Server SQL Modes”.
Dates containing 2-digit year values are ambiguous because the century is unknown. MySQL interprets 2-digit year values using these rules:
Year values in the range 00-69
become
2000-2069
.
Year values in the range 70-99
become
1970-1999
.
See also Section 11.2.9, “2-Digit Years in Dates”.
The MySQL server can be run with the
MAXDB
SQL mode enabled. In
this case, TIMESTAMP
is identical with
DATETIME
. If this mode is enabled at the
time that a table is created, TIMESTAMP
columns are created as DATETIME
columns. As
a result, such columns use DATETIME
display
format, have the same range of values, and there is no
automatic initialization or updating to the current date and
time. See Section 5.1.10, “Server SQL Modes”.
MySQL retrieves and displays TIME
values in
'hh:mm:ss'
format (or
'hhh:mm:ss'
format for large hours
values). TIME
values may range from
'-838:59:59'
to
'838:59:59'
. The hours part may be so large
because the TIME
type can be used not only to
represent a time of day (which must be less than 24 hours), but
also elapsed time or a time interval between two events (which
may be much greater than 24 hours, or even negative).
MySQL recognizes TIME
values in several
formats, some of which can include a trailing fractional seconds
part in up to microseconds (6 digits) precision. See
Section 9.1.3, “Date and Time Literals”. For information about
fractional seconds support in MySQL, see
Section 11.2.7, “Fractional Seconds in Time Values”. In particular, as of MySQL
5.6.4, any fractional part in a value inserted into a
TIME
column is stored rather than discarded.
With the fractional part included, the range for
TIME
values is
'-838:59:59.000000'
to
'838:59:59.000000'
.
Be careful about assigning abbreviated values to a
TIME
column. MySQL interprets abbreviated
TIME
values with colons as time of the day.
That is, '11:12'
means
'11:12:00'
, not
'00:11:12'
. MySQL interprets abbreviated
values without colons using the assumption that the two
rightmost digits represent seconds (that is, as elapsed time
rather than as time of day). For example, you might think of
'1112'
and 1112
as meaning
'11:12:00'
(12 minutes after 11 o'clock), but
MySQL interprets them as '00:11:12'
(11
minutes, 12 seconds). Similarly, '12'
and
12
are interpreted as
'00:00:12'
.
The only delimiter recognized between a time part and a fractional seconds part is the decimal point.
By default, values that lie outside the TIME
range but are otherwise valid are clipped to the closest
endpoint of the range. For example,
'-850:00:00'
and
'850:00:00'
are converted to
'-838:59:59'
and
'838:59:59'
. Invalid TIME
values are converted to '00:00:00'
. Note that
because '00:00:00'
is itself a valid
TIME
value, there is no way to tell, from a
value of '00:00:00'
stored in a table,
whether the original value was specified as
'00:00:00'
or whether it was invalid.
For more restrictive treatment of invalid
TIME
values, enable strict SQL mode to cause
errors to occur. See Section 5.1.10, “Server SQL Modes”.
The YEAR
type is a 1-byte type used to
represent year values. It can be declared as
YEAR
with an implicit display width of 4
characters, or as YEAR(4)
or
YEAR(2)
with an explicit display width of 4
or 2 characters.
The 2-digit YEAR(2)
data type has certain
issues that you should consider before choosing to use it. As
of MySQL 5.6.6, YEAR(2)
is deprecated:
2-digit YEAR(2)
columns in existing tables
are treated as before, but 2-digit YEAR(2)
columns in new or altered tables are converted to 4-digit
YEAR
columns. For more information, see
Section 11.2.5, “2-Digit YEAR(2) Limitations and Migrating to 4-Digit YEAR”.
YEAR
/YEAR(4)
and
YEAR(2)
differ in display format, but have
the same range of values. For 4-digit format, MySQL displays
YEAR
values in
YYYY
format, with a range of
1901
to 2155
, and
0000
. For 2-digit format, MySQL displays only
the last (least significant) 2 digits; for example,
70
(1970 or 2070) or 69
(2069).
YEAR
accepts input values in a variety of
formats:
As 4-digit strings in the range '1901'
to
'2155'
.
As 4-digit numbers in the range 1901
to
2155
.
As 1- or 2-digit strings in the range '0'
to '99'
. MySQL converts values in the
ranges '0'
to '69'
and
'70'
to '99'
to
YEAR
values in the ranges
2000
to 2069
and
1970
to 1999
.
As 1- or 2-digit numbers in the range 0
to 99
. MySQL converts values in the
ranges 1
to 69
and
70
to 99
to
YEAR
values in the ranges
2001
to 2069
and
1970
to 1999
.
Inserting a numeric 0
has different
effects for
YEAR
/YEAR(4)
and
YEAR(2)
:
For YEAR
/YEAR(4)
,
the result has a display value of
0000
and an internal value of
0000
. To specify zero and have it be
interpreted as 2000
, specify it as a
string '0'
or
'00'
.
For YEAR(2)
, the result has a display
value of 00
and an internal value of
2000
.
As the result of functions that return a value that is
acceptable in YEAR
context, such as
NOW()
.
If strict SQL mode is not enabled, MySQL converts invalid
YEAR
values to 0000
. In
strict SQL mode, attempting to insert an invalid
YEAR
value produces an error.
See also Section 11.2.9, “2-Digit Years in Dates”.
This section describes problems that can occur when using the
2-digit YEAR(2)
data type and
provides information about converting existing
YEAR(2)
columns to 4-digit
year-valued columns, which can be declared as
YEAR
with an implicit display width of 4
characters, or equivalently as YEAR(4)
with
an explicit display width.
Although the internal range of values for
YEAR
/YEAR(4)
and the deprecated YEAR(2)
type
is the same (1901
to 2155
,
and 0000
), the display width for
YEAR(2)
makes that type
inherently ambiguous because displayed values indicate only the
last two digits of the internal values and omit the century
digits. The result can be a loss of information under certain
circumstances. For this reason, avoid using
YEAR(2)
in your applications and
use
YEAR
/YEAR(4)
wherever you need a year-valued data type. Note that conversion
is required at some point because support for
YEAR
data types with display
values other than 4, most notably
YEAR(2)
, is reduced as of MySQL
5.6.6 and is removed entirely in MySQL 5.7.
Issues with the YEAR(2)
data
type include ambiguity of displayed values, and possible loss
of information when values are dumped and reloaded or
converted to strings.
Displayed YEAR(2)
values
can be ambiguous. It is possible for up to three
YEAR(2)
values that have
different internal values to have the same displayed
value, as the following example demonstrates:
mysql>CREATE TABLE t (y2 YEAR(2), y4 YEAR);
Query OK, 0 rows affected, 1 warning (0.01 sec) mysql>INSERT INTO t (y2) VALUES(1912),(2012),(2112);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>UPDATE t SET y4 = y2;
Query OK, 3 rows affected (0.00 sec) Rows matched: 3 Changed: 3 Warnings: 0 mysql>SELECT * FROM t;
+------+------+ | y2 | y4 | +------+------+ | 12 | 1912 | | 12 | 2012 | | 12 | 2112 | +------+------+ 3 rows in set (0.00 sec)
If you use mysqldump to dump the table
created in the preceding example, the dump file represents
all y2
values using the same 2-digit
representation (12
). If you reload the
table from the dump file, all resulting rows have internal
value 2012
and display value
12
, thus losing the distinctions
between them.
Conversion of a 2-digit or 4-digit
YEAR
data value to string
form uses the data type display width. Suppose that a
YEAR(2)
column and a
YEAR
/YEAR(4)
column both contain the value 1970
.
Assigning each column to a string results in a value of
'70'
or '1970'
,
respectively. That is, loss of information occurs for
conversion from YEAR(2)
to
string.
Values outside the range from 1970
to
2069
are stored incorrectly when
inserted into a YEAR(2)
column in a CSV
table. For
example, inserting 2211
results in a
display value of 11
but an internal
value of 2011
.
To avoid these problems, use the 4-digit
YEAR
or
YEAR(4)
data type rather than
the 2-digit YEAR(2)
data type.
Suggestions regarding migration strategies appear later in
this section.
As of MySQL 5.6.6, support for
YEAR(2)
is diminished:
YEAR(2)
column definitions
for new tables are converted (with an
ER_INVALID_YEAR_COLUMN_LENGTH
warning) to 4-digit YEAR
columns:
mysql>CREATE TABLE t1 (y YEAR(2));
Query OK, 0 rows affected, 1 warning (0.04 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1818 Message: YEAR(2) column type is deprecated. Creating YEAR(4) column instead. 1 row in set (0.00 sec) mysql>SHOW CREATE TABLE t1\G
*************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `y` year(4) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
YEAR(2)
columns in existing
tables remain as YEAR(2)
and are processed in queries as in older versions of
MySQL. However, several programs or statements convert
YEAR(2)
columns to 4-digit
YEAR
columns automatically:
ALTER TABLE
statements
that result in a table rebuild.
REPAIR TABLE
(which
CHECK TABLE
recommends
you use, if it finds a table that contains
YEAR(2)
columns).
mysql_upgrade (which uses
REPAIR TABLE
).
Dumping with mysqldump and reloading the dump file. Unlike the conversions performed by the preceding three items, a dump and reload has the potential to change data values.
A MySQL upgrade usually involves at least one of the last
two items. However, with respect to
YEAR(2)
,
mysql_upgrade is preferable to
mysqldump, which, as noted, can change
data values.
To convert 2-digit YEAR(2)
columns to 4-digit YEAR
columns, you can do so manually at any time without upgrading.
Alternatively, you can upgrade to a version of MySQL with
reduced support for YEAR(2)
(MySQL 5.6.6 or later), then have MySQL convert
YEAR(2)
columns automatically.
In the latter case, avoid upgrading by dumping and reloading
your data because that can change data values. In addition, if
you use replication, there are upgrade considerations you must
take into account.
To convert 2-digit YEAR(2)
columns to 4-digit YEAR
manually, use ALTER TABLE
or
REPAIR TABLE
. Suppose that a
table t1
has this definition:
CREATE TABLE t1 (ycol YEAR(2) NOT NULL DEFAULT '70');
Modify the column using ALTER TABLE
as
follows:
ALTER TABLE t1 FORCE;
The ALTER TABLE
statement
converts the table without changing
YEAR(2)
values. If the server
is a replication master, the ALTER
TABLE
statement replicates to slaves and makes the
corresponding table change on each one.
Another migration method is to perform a binary upgrade:
Upgrade MySQL 5.6.6 or later in place without dumping and
reloading your data. Then run
mysql_upgrade, which uses
REPAIR TABLE
to convert 2-digit
YEAR(2)
columns to 4-digit
YEAR
columns without changing
data values. If the server is a replication master, the
REPAIR TABLE
statements
replicate to slaves and make the corresponding table changes
on each one, unless you invoke
mysql_upgrade with the
--skip-write-binlog
option.
Upgrades to replication servers usually involve upgrading
slaves to a newer version of MySQL, then upgrading the master.
For example, if a master and slave both run MySQL 5.5, a
typical upgrade sequence involves upgrading the slave to 5.6,
then upgrading the master to 5.6. With regard to the different
treatment of YEAR(2)
as of
MySQL 5.6.6, that upgrade sequence results in a problem:
Suppose that the slave has been upgraded but not yet the
master. Then creating a table containing a 2-digit
YEAR(2)
column on the master
results in a table containing a 4-digit
YEAR
column on the slave.
Consequently, the following operations have a different result
on the master and slave, if you use statement-based
replication:
To avoid such problems, use one of these strategies:
Use row-based replication instead of statement-based replication.
Modify all 2-digit YEAR(2)
columns on the master to 4-digit
YEAR
columns before
upgrading. (Use ALTER
TABLE
, as described previously.) That makes it
possible to upgrade normally (slave first, then master)
without introducing any
YEAR(2)
to
YEAR(4)
differences between
the master and slave.
One migration method should be avoided: Do not dump your data
with mysqldump and reload the dump file
after upgrading. That has the potential to change
YEAR(2)
values, as described
previously.
A migration from 2-digit
YEAR(2)
columns to 4-digit
YEAR
columns should also
involve examining application code for the possibility of
changed behavior under conditions such as these:
As of MySQL 5.6.5, TIMESTAMP
and
DATETIME
columns can be
automatically initializated and updated to the current date and
time (that is, the current timestamp). Before 5.6.5, this is
true only for TIMESTAMP
, and for
at most one TIMESTAMP
column per
table. The following notes first describe automatic
initialization and updating for MySQL 5.6.5 and up, then the
differences for versions preceding 5.6.5.
For any TIMESTAMP
or
DATETIME
column in a table, you
can assign the current timestamp as the default value, the
auto-update value, or both:
An auto-initialized column is set to the current timestamp for inserted rows that specify no value for the column.
An auto-updated column is automatically updated to the
current timestamp when the value of any other column in the
row is changed from its current value. An auto-updated
column remains unchanged if all other columns are set to
their current values. To prevent an auto-updated column from
updating when other columns change, explicitly set it to its
current value. To update an auto-updated column even when
other columns do not change, explicitly set it to the value
it should have (for example, set it to
CURRENT_TIMESTAMP
).
In addition, if the
explicit_defaults_for_timestamp
system variable is disabled, you can initialize or update any
TIMESTAMP
(but not
DATETIME
) column to the current date and time
by assigning it a NULL
value, unless it has
been defined with the NULL
attribute to
permit NULL
values.
To specify automatic properties, use the DEFAULT
CURRENT_TIMESTAMP
and ON UPDATE
CURRENT_TIMESTAMP
clauses in column definitions. The
order of the clauses does not matter. If both are present in a
column definition, either can occur first. Any of the synonyms
for CURRENT_TIMESTAMP
have the
same meaning as
CURRENT_TIMESTAMP
. These are
CURRENT_TIMESTAMP()
,
NOW()
,
LOCALTIME
,
LOCALTIME()
,
LOCALTIMESTAMP
, and
LOCALTIMESTAMP()
.
Use of DEFAULT CURRENT_TIMESTAMP
and
ON UPDATE CURRENT_TIMESTAMP
is specific to
TIMESTAMP
and
DATETIME
. The
DEFAULT
clause also can be used to specify a
constant (nonautomatic) default value (for example,
DEFAULT 0
or DEFAULT '2000-01-01
00:00:00'
).
The following examples use DEFAULT 0
, a
default that can produce warnings or errors depending on
whether strict SQL mode or the
NO_ZERO_DATE
SQL mode is
enabled. Be aware that the
TRADITIONAL
SQL mode
includes strict mode and
NO_ZERO_DATE
. See
Section 5.1.10, “Server SQL Modes”.
TIMESTAMP
or
DATETIME
column definitions can
specify the current timestamp for both the default and
auto-update values, for one but not the other, or for neither.
Different columns can have different combinations of automatic
properties. The following rules describe the possibilities:
With both DEFAULT CURRENT_TIMESTAMP
and
ON UPDATE CURRENT_TIMESTAMP
, the column
has the current timestamp for its default value and is
automatically updated to the current timestamp.
CREATE TABLE t1 ( ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, dt DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP );
With a DEFAULT
clause but no ON
UPDATE CURRENT_TIMESTAMP
clause, the column has
the given default value and is not automatically updated to
the current timestamp.
The default depends on whether the
DEFAULT
clause specifies
CURRENT_TIMESTAMP
or a constant value.
With CURRENT_TIMESTAMP
, the default is
the current timestamp.
CREATE TABLE t1 ( ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, dt DATETIME DEFAULT CURRENT_TIMESTAMP );
With a constant, the default is the given value. In this case, the column has no automatic properties at all.
CREATE TABLE t1 ( ts TIMESTAMP DEFAULT 0, dt DATETIME DEFAULT 0 );
With an ON UPDATE CURRENT_TIMESTAMP
clause and a constant DEFAULT
clause, the
column is automatically updated to the current timestamp and
has the given constant default value.
CREATE TABLE t1 ( ts TIMESTAMP DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP, dt DATETIME DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP );
With an ON UPDATE CURRENT_TIMESTAMP
clause but no DEFAULT
clause, the column
is automatically updated to the current timestamp but does
not have the current timestamp for its default value.
The default in this case is type dependent.
TIMESTAMP
has a default of 0
unless defined with the NULL
attribute,
in which case the default is NULL
.
CREATE TABLE t1 ( ts1 TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, -- default 0 ts2 TIMESTAMP NULL ON UPDATE CURRENT_TIMESTAMP -- default NULL );
DATETIME
has a default of
NULL
unless defined with the NOT
NULL
attribute, in which case the default is 0.
CREATE TABLE t1 ( dt1 DATETIME ON UPDATE CURRENT_TIMESTAMP, -- default NULL dt2 DATETIME NOT NULL ON UPDATE CURRENT_TIMESTAMP -- default 0 );
TIMESTAMP
and
DATETIME
columns have no
automatic properties unless they are specified explicitly, with
this exception: If the
explicit_defaults_for_timestamp
system variable is disabled, the first
TIMESTAMP
column has both
DEFAULT CURRENT_TIMESTAMP
and ON
UPDATE CURRENT_TIMESTAMP
if neither is specified
explicitly. To suppress automatic properties for the first
TIMESTAMP
column, use one of
these strategies:
Enable the
explicit_defaults_for_timestamp
system variable. In this case, the DEFAULT
CURRENT_TIMESTAMP
and ON UPDATE
CURRENT_TIMESTAMP
clauses that specify automatic
initialization and updating are available, but are not
assigned to any TIMESTAMP
column unless explicitly included in the column definition.
Alternatively, if
explicit_defaults_for_timestamp
is disabled, do either of the following:
Define the column with a DEFAULT
clause that specifies a constant default value.
Specify the NULL
attribute. This also
causes the column to permit NULL
values, which means that you cannot assign the current
timestamp by setting the column to
NULL
. Assigning
NULL
sets the column to
NULL
, not the current timestamp. To
assign the current timestamp, set the column to
CURRENT_TIMESTAMP
or a
synonym such as NOW()
.
Consider these table definitions:
CREATE TABLE t1 ( ts1 TIMESTAMP DEFAULT 0, ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); CREATE TABLE t2 ( ts1 TIMESTAMP NULL, ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); CREATE TABLE t3 ( ts1 TIMESTAMP NULL DEFAULT 0, ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP);
The tables have these properties:
In each table definition, the first
TIMESTAMP
column has no
automatic initialization or updating.
The tables differ in how the ts1
column
handles NULL
values. For
t1
, ts1
is
NOT NULL
and assigning it a value of
NULL
sets it to the current timestamp.
For t2
and t3
,
ts1
permits NULL
and
assigning it a value of NULL
sets it to
NULL
.
t2
and t3
differ in
the default value for ts1
. For
t2
, ts1
is defined to
permit NULL
, so the default is also
NULL
in the absence of an explicit
DEFAULT
clause. For
t3
, ts1
permits
NULL
but has an explicit default of 0.
If a TIMESTAMP
or
DATETIME
column definition
includes an explicit fractional seconds precision value
anywhere, the same value must be used throughout the column
definition. This is permitted:
CREATE TABLE t1 ( ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6) );
This is not permitted:
CREATE TABLE t1 ( ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(3) );
Before MySQL 5.6.5, support for automatic initialization and updating is more limited:
DEFAULT CURRENT_TIMESTAMP
and ON
UPDATE CURRENT_TIMESTAMP
cannot be used with
DATETIME
columns.
DEFAULT CURRENT_TIMESTAMP
and ON
UPDATE CURRENT_TIMESTAMP
can be used with at most
one TIMESTAMP
column per
table. It is not possible to have the current timestamp be
the default value for one column and the auto-update value
for another column.
You can choose whether to use these properties and which
TIMESTAMP
column should have
them. It need not be the first one in a table that is
automatically initialized or updated to the current timestamp.
To specify automatic initialization or updating for a different
TIMESTAMP
column, you must
suppress the automatic properties for the first one, as
previously described. Then, for the other
TIMESTAMP
column, the rules for
the DEFAULT
and ON UPDATE
clauses are the same as for the first
TIMESTAMP
column, except that if
you omit both clauses, no automatic initialization or updating
occurs.
If the
explicit_defaults_for_timestamp
system variable is disabled,
TIMESTAMP
columns by default are
NOT NULL
, cannot contain
NULL
values, and assigning
NULL
assigns the current timestamp. To permit
a TIMESTAMP
column to contain
NULL
, explicitly declare it with the
NULL
attribute. In this case, the default
value also becomes NULL
unless overridden
with a DEFAULT
clause that specifies a
different default value. DEFAULT NULL
can be
used to explicitly specify NULL
as the
default value. (For a TIMESTAMP
column not declared with the NULL
attribute,
DEFAULT NULL
is invalid.) If a
TIMESTAMP
column permits
NULL
values, assigning
NULL
sets it to NULL
, not
to the current timestamp.
The following table contains several
TIMESTAMP
columns that permit
NULL
values:
CREATE TABLE t ( ts1 TIMESTAMP NULL DEFAULT NULL, ts2 TIMESTAMP NULL DEFAULT 0, ts3 TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP );
A TIMESTAMP
column that permits
NULL
values does not
take on the current timestamp at insert time except under one of
the following conditions:
Its default value is defined as
CURRENT_TIMESTAMP
and no
value is specified for the column
CURRENT_TIMESTAMP
or any of
its synonyms such as NOW()
is
explicitly inserted into the column
In other words, a TIMESTAMP
column defined to permit NULL
values
auto-initializes only if its definition includes
DEFAULT CURRENT_TIMESTAMP
:
CREATE TABLE t (ts TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP);
If the TIMESTAMP
column permits
NULL
values but its definition does not
include DEFAULT CURRENT_TIMESTAMP
, you must
explicitly insert a value corresponding to the current date and
time. Suppose that tables t1
and
t2
have these definitions:
CREATE TABLE t1 (ts TIMESTAMP NULL DEFAULT '0000-00-00 00:00:00'); CREATE TABLE t2 (ts TIMESTAMP NULL DEFAULT NULL);
To set the TIMESTAMP
column in
either table to the current timestamp at insert time, explicitly
assign it that value. For example:
INSERT INTO t2 VALUES (CURRENT_TIMESTAMP); INSERT INTO t1 VALUES (NOW());
If the
explicit_defaults_for_timestamp
system variable is enabled,
TIMESTAMP
columns permit
NULL
values only if declared with the
NULL
attribute. Also,
TIMESTAMP
columns do not permit
assigning NULL
to assign the current
timestamp, whether declared with the NULL
or
NOT NULL
attribute. To assign the current
timestamp, set the column to
CURRENT_TIMESTAMP
or a synonym
such as NOW()
.
MySQL 5.6 has fractional seconds support for
TIME
,
DATETIME
, and
TIMESTAMP
values, with up to
microseconds (6 digits) precision:
To define a column that includes a fractional seconds part,
use the syntax
,
where type_name
(fsp
)type_name
is
TIME
,
DATETIME
, or
TIMESTAMP
, and
fsp
is the fractional seconds
precision. For example:
CREATE TABLE t1 (t TIME(3), dt DATETIME(6));
The fsp
value, if given, must be
in the range 0 to 6. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0.
(This differs from the standard SQL default of 6, for
compatibility with previous MySQL versions.)
Inserting a TIME
,
DATE
, or
TIMESTAMP
value with a
fractional seconds part into a column of the same type but
having fewer fractional digits results in rounding. Consider
a table created and populated as follows:
CREATE TABLE fractest( c1 TIME(2), c2 DATETIME(2), c3 TIMESTAMP(2) ); INSERT INTO fractest VALUES ('17:51:04.777', '2018-09-08 17:51:04.777', '2018-09-08 17:51:04.777');
The temporal values are inserted into the table with rounding:
mysql> SELECT * FROM fractest;
+-------------+------------------------+------------------------+
| c1 | c2 | c3 |
+-------------+------------------------+------------------------+
| 17:51:04.78 | 2018-09-08 17:51:04.78 | 2018-09-08 17:51:04.78 |
+-------------+------------------------+------------------------+
No warning or error is given when such rounding occurs. This
behavior follows the SQL standard, and is not affected by
the server sql_mode
setting.
Functions that take temporal arguments accept values with
fractional seconds. Return values from temporal functions
include fractional seconds as appropriate. For example,
NOW()
with no argument
returns the current date and time with no fractional part,
but takes an optional argument from 0 to 6 to specify that
the return value includes a fractional seconds part of that
many digits.
Syntax for temporal literals produces temporal values:
DATE '
,
str
'TIME '
,
and str
'TIMESTAMP
'
, and the
ODBC-syntax equivalents. The resulting value includes a
trailing fractional seconds part if specified. Previously,
the temporal type keyword was ignored and these constructs
produced the string value. See
Standard SQL and ODBC Date and Time Literals
str
'
In some cases, previously accepted syntax may produce different results. The following items indicate where existing code may need to be changed to avoid problems:
Some expressions produce results that differ from previous
results. Examples: The timestamp
system
variable returns a value that includes a microseconds
fractional part rather than an integer. Functions that
return a result that includes the current time (such as
CURTIME()
,
SYSDATE()
, or
UTC_TIMESTAMP()
) interpret an
argument as an fsp
value and the
return value includes a fractional seconds part of that many
digits. Previously, these functions permitted an argument
but ignored it.
TIME
values are converted to
DATETIME
by adding the time
to the current date. (This means that the date part of the
result differs from the current date if the time value is
outside the range from '00:00:00'
to
'23:59:59'
.) Previously, conversion of
TIME
values to
DATETIME
was unreliable. See
Section 11.2.8, “Conversion Between Date and Time Types”.
TIMESTAMP(
was permitted in old MySQL versions, but
N
)N
was a display width rather than
fractional seconds precision. Support for this behavior was
removed in MySQL 5.5.3, so applications that are reasonably
up to date should not be subject to this issue. Otherwise,
code must be rewritten.
To some extent, you can convert a value from one temporal type
to another. However, there may be some alteration of the value
or loss of information. In all cases, conversion between
temporal types is subject to the range of valid values for the
resulting type. For example, although
DATE
,
DATETIME
, and
TIMESTAMP
values all can be
specified using the same set of formats, the types do not all
have the same range of values.
TIMESTAMP
values cannot be
earlier than 1970
UTC or later than
'2038-01-19 03:14:07'
UTC. This means that a
date such as '1968-01-01'
, while valid as a
DATE
or
DATETIME
value, is not valid as a
TIMESTAMP
value and is converted
to 0
.
Conversion of DATE
values:
Conversion of DATETIME
and
TIMESTAMP
values:
Conversion to a DATE
value
takes fractional seconds into account and rounds the time
part. For example, '1999-12-31
23:59:59.499'
becomes
'1999-12-31'
, whereas
'1999-12-31 23:59:59.500'
becomes
'2000-01-01'
.
Conversion to a TIME
value
discards the date part because the
TIME
type contains no date
information.
For conversion of TIME
values to
other temporal types, the value of
CURRENT_DATE()
is used for the
date part. The TIME
is
interpreted as elapsed time (not time of day) and added to the
date. This means that the date part of the result differs from
the current date if the time value is outside the range from
'00:00:00'
to '23:59:59'
.
Suppose that the current date is
'2012-01-01'
.
TIME
values of
'12:00:00'
, '24:00:00'
,
and '-12:00:00'
, when converted to
DATETIME
or
TIMESTAMP
values, result in
'2012-01-01 12:00:00'
, '2012-01-02
00:00:00'
, and '2011-12-31
12:00:00'
, respectively.
Conversion of TIME
to
DATE
is similar but discards the
time part from the result: '2012-01-01'
,
'2012-01-02'
, and
'2011-12-31'
, respectively.
Explicit conversion can be used to override implicit conversion.
For example, in comparison of
DATE
and
DATETIME
values, the
DATE
value is coerced to the
DATETIME
type by adding a time
part of '00:00:00'
. To perform the comparison
by ignoring the time part of the
DATETIME
value instead, use the
CAST()
function in the following
way:
date_col
= CAST(datetime_col
AS DATE)
Conversion of TIME
and
DATETIME
values to numeric form
(for example, by adding +0
) depends on
whether the value contains a fractional seconds part.
TIME(
or
N
)DATETIME(
is converted to integer when N
)N
is 0
(or omitted) and to a DECIMAL
value with
N
decimal digits when
N
is greater than 0:
mysql>SELECT CURTIME(), CURTIME()+0, CURTIME(3)+0;
+-----------+-------------+--------------+ | CURTIME() | CURTIME()+0 | CURTIME(3)+0 | +-----------+-------------+--------------+ | 09:28:00 | 92800 | 92800.887 | +-----------+-------------+--------------+ mysql>SELECT NOW(), NOW()+0, NOW(3)+0;
+---------------------+----------------+--------------------+ | NOW() | NOW()+0 | NOW(3)+0 | +---------------------+----------------+--------------------+ | 2012-08-15 09:28:00 | 20120815092800 | 20120815092800.889 | +---------------------+----------------+--------------------+
Date values with 2-digit years are ambiguous because the century is unknown. Such values must be interpreted into 4-digit form because MySQL stores years internally using 4 digits.
For DATETIME
,
DATE
, and
TIMESTAMP
types, MySQL interprets
dates specified with ambiguous year values using these rules:
Year values in the range 00-69
become
2000-2069
.
Year values in the range 70-99
become
1970-1999
.
For YEAR
, the rules are the same, with this
exception: A numeric 00
inserted into
YEAR(4)
results in 0000
rather than 2000
. To specify zero for
YEAR(4)
and have it be interpreted as
2000
, specify it as a string
'0'
or '00'
.
Remember that these rules are only heuristics that provide reasonable guesses as to what your data values mean. If the rules used by MySQL do not produce the values you require, you must provide unambiguous input containing 4-digit year values.
ORDER BY
properly sorts
YEAR
values that have 2-digit
years.
Some functions like MIN()
and
MAX()
convert a
YEAR
to a number. This means that
a value with a 2-digit year does not work properly with these
functions. The fix in this case is to convert the
YEAR
to 4-digit year format.
The string data types are CHAR
,
VARCHAR
,
BINARY
,
VARBINARY
,
BLOB
,
TEXT
,
ENUM
, and
SET
.
For information about storage requirements of the string data types, see Section 11.6, “Data Type Storage Requirements”.
For descriptions of functions that operate on string values, see Section 12.7, “String Functions and Operators”.
The string data types are CHAR
,
VARCHAR
,
BINARY
,
VARBINARY
,
BLOB
,
TEXT
,
ENUM
, and
SET
.
In some cases, MySQL may change a string column to a type
different from that given in a CREATE
TABLE
or ALTER TABLE
statement. See Section 13.1.17.7, “Silent Column Specification Changes”.
For definitions of character string columns
(CHAR
,
VARCHAR
, and the
TEXT
types), MySQL interprets
length specifications in character units. For definitions of
binary string columns (BINARY
,
VARBINARY
, and the
BLOB
types), MySQL interprets
length specifications in byte units.
Column definitions for character string data types
CHAR
,
VARCHAR
, the
TEXT
types,
ENUM
,
SET
, and any synonyms) can
specify the column character set and collation:
CHARACTER SET
specifies the character
set. If desired, a collation for the character set can be
specified with the COLLATE
attribute,
along with any other attributes. For example:
CREATE TABLE t ( c1 VARCHAR(20) CHARACTER SET utf8, c2 TEXT CHARACTER SET latin1 COLLATE latin1_general_cs );
This table definition creates a column named
c1
that has a character set of
utf8
with the default collation for that
character set, and a column named c2
that
has a character set of latin1
and a
case-sensitive (_cs
) collation.
The rules for assigning the character set and collation when
either or both of CHARACTER SET
and the
COLLATE
attribute are missing are
described in Section 10.3.5, “Column Character Set and Collation”.
CHARSET
is a synonym for
CHARACTER SET
.
Specifying the CHARACTER SET binary
attribute for a character string data type causes the column
to be created as the corresponding binary string data type:
CHAR
becomes
BINARY
,
VARCHAR
becomes
VARBINARY
, and
TEXT
becomes
BLOB
. For the
ENUM
and
SET
data types, this does not
occur; they are created as declared. Suppose that you
specify a table using this definition:
CREATE TABLE t ( c1 VARCHAR(10) CHARACTER SET binary, c2 TEXT CHARACTER SET binary, c3 ENUM('a','b','c') CHARACTER SET binary );
The resulting table has this definition:
CREATE TABLE t ( c1 VARBINARY(10), c2 BLOB, c3 ENUM('a','b','c') CHARACTER SET binary );
The BINARY
attribute is a nonstandard
MySQL extension that is shorthand for specifying the binary
(_bin
) collation of the column character
set (or of the table default character set if no column
character set is specified). In this case, comparison and
sorting are based on numeric character code values. Suppose
that you specify a table using this definition:
CREATE TABLE t ( c1 VARCHAR(10) CHARACTER SET latin1 BINARY, c2 TEXT BINARY ) CHARACTER SET utf8mb4;
The resulting table has this definition:
CREATE TABLE t ( c1 VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_bin, c2 TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin ) CHARACTER SET utf8mb4;
The ASCII
attribute is shorthand for
CHARACTER SET latin1
.
The UNICODE
attribute is shorthand for
CHARACTER SET ucs2
.
Character column comparison and sorting are based on the
collation assigned to the column. For the
CHAR
,
VARCHAR
,
TEXT
,
ENUM
, and
SET
data types, you can declare a
column with a binary (_bin
) collation or the
BINARY
attribute to cause comparison and
sorting to use the underlying character code values rather than
a lexical ordering.
For additional information about use of character sets in MySQL, see Chapter 10, Character Sets, Collations, Unicode.
[NATIONAL] CHAR[(
M
)]
[CHARACTER SET charset_name
]
[COLLATE
collation_name
]
A fixed-length string that is always right-padded with
spaces to the specified length when stored.
M
represents the column length in
characters. The range of M
is 0
to 255. If M
is omitted, the
length is 1.
Trailing spaces are removed when
CHAR
values are retrieved
unless the
PAD_CHAR_TO_FULL_LENGTH
SQL mode is enabled.
CHAR
is shorthand for
CHARACTER
.
NATIONAL CHAR
(or its
equivalent short form, NCHAR
)
is the standard SQL way to define that a
CHAR
column should use some
predefined character set. MySQL uses utf8
as this predefined character set.
Section 10.3.7, “The National Character Set”.
The CHAR BYTE
data type is an
alias for the BINARY
data
type. This is a compatibility feature.
MySQL permits you to create a column of type
CHAR(0)
. This is useful primarily when
you must be compliant with old applications that depend on
the existence of a column but that do not actually use its
value. CHAR(0)
is also quite nice when
you need a column that can take only two values: A column
that is defined as CHAR(0) NULL
occupies
only one bit and can take only the values
NULL
and ''
(the empty
string).
[NATIONAL] VARCHAR(
M
)
[CHARACTER SET charset_name
]
[COLLATE
collation_name
]
A variable-length string. M
represents the maximum column length in characters. The
range of M
is 0 to 65,535. The
effective maximum length of a
VARCHAR
is subject to the
maximum row size (65,535 bytes, which is shared among all
columns) and the character set used. For example,
utf8
characters can require up to three
bytes per character, so a
VARCHAR
column that uses the
utf8
character set can be declared to be
a maximum of 21,844 characters. See
Section 8.4.7, “Limits on Table Column Count and Row Size”.
MySQL stores VARCHAR
values
as a 1-byte or 2-byte length prefix plus data. The length
prefix indicates the number of bytes in the value. A
VARCHAR
column uses one
length byte if values require no more than 255 bytes, two
length bytes if values may require more than 255 bytes.
MySQL follows the standard SQL specification, and does
not remove trailing spaces from
VARCHAR
values.
VARCHAR
is shorthand for
CHARACTER VARYING
.
NATIONAL VARCHAR
is the
standard SQL way to define that a
VARCHAR
column should use
some predefined character set. MySQL uses
utf8
as this predefined character set.
Section 10.3.7, “The National Character Set”.
NVARCHAR
is shorthand for
NATIONAL VARCHAR
.
The BINARY
type is similar to
the CHAR
type, but stores
binary byte strings rather than nonbinary character strings.
An optional length M
represents
the column length in bytes. If omitted,
M
defaults to 1.
The VARBINARY
type is similar
to the VARCHAR
type, but
stores binary byte strings rather than nonbinary character
strings. M
represents the maximum
column length in bytes.
A BLOB
column with a maximum
length of 255 (28 − 1)
bytes. Each TINYBLOB
value is
stored using a 1-byte length prefix that indicates the
number of bytes in the value.
TINYTEXT
[CHARACTER SET
charset_name
]
[COLLATE
collation_name
]
A TEXT
column with a maximum
length of 255 (28 − 1)
characters. The effective maximum length is less if the
value contains multibyte characters. Each
TINYTEXT
value is stored
using a 1-byte length prefix that indicates the number of
bytes in the value.
A BLOB
column with a maximum
length of 65,535 (216 − 1)
bytes. Each BLOB
value is
stored using a 2-byte length prefix that indicates the
number of bytes in the value.
An optional length M
can be given
for this type. If this is done, MySQL creates the column as
the smallest BLOB
type large
enough to hold values M
bytes
long.
TEXT[(
M
)]
[CHARACTER SET charset_name
]
[COLLATE
collation_name
]
A TEXT
column with a maximum
length of 65,535 (216 − 1)
characters. The effective maximum length is less if the
value contains multibyte characters. Each
TEXT
value is stored using a
2-byte length prefix that indicates the number of bytes in
the value.
An optional length M
can be given
for this type. If this is done, MySQL creates the column as
the smallest TEXT
type large
enough to hold values M
characters long.
A BLOB
column with a maximum
length of 16,777,215 (224 −
1) bytes. Each MEDIUMBLOB
value is stored using a 3-byte length prefix that indicates
the number of bytes in the value.
MEDIUMTEXT
[CHARACTER SET
charset_name
]
[COLLATE
collation_name
]
A TEXT
column with a maximum
length of 16,777,215 (224 −
1) characters. The effective maximum length is less if the
value contains multibyte characters. Each
MEDIUMTEXT
value is stored
using a 3-byte length prefix that indicates the number of
bytes in the value.
A BLOB
column with a maximum
length of 4,294,967,295 or 4GB
(232 − 1) bytes. The
effective maximum length of
LONGBLOB
columns depends on
the configured maximum packet size in the client/server
protocol and available memory. Each
LONGBLOB
value is stored
using a 4-byte length prefix that indicates the number of
bytes in the value.
LONGTEXT
[CHARACTER SET
charset_name
]
[COLLATE
collation_name
]
A TEXT
column with a maximum
length of 4,294,967,295 or 4GB
(232 − 1) characters. The
effective maximum length is less if the value contains
multibyte characters. The effective maximum length of
LONGTEXT
columns also depends on the configured maximum packet size
in the client/server protocol and available memory. Each
LONGTEXT
value is stored using a 4-byte length prefix that indicates
the number of bytes in the value.
ENUM('
value1
','value2
',...)
[CHARACTER SET charset_name
]
[COLLATE
collation_name
]
An enumeration. A string object that can have only one
value, chosen from the list of values
'
,
value1
''
,
value2
'...
, NULL
or the
special ''
error value.
ENUM
values are represented
internally as integers.
An ENUM
column can have a
maximum of 65,535 distinct elements. (The practical limit is
less than 3000.) A table can have no more than 255 unique
element list definitions among its
ENUM
and
SET
columns considered as a
group. For more information on these limits, see
Limits Imposed by .frm File Structure.
SET('
value1
','value2
',...)
[CHARACTER SET charset_name
]
[COLLATE
collation_name
]
A set. A string object that can have zero or more values,
each of which must be chosen from the list of values
'
,
value1
''
,
value2
'...
SET
values are represented internally as integers.
A SET
column can have a
maximum of 64 distinct members. A table can have no more
than 255 unique element list definitions among its
ENUM
and
SET
columns considered as a
group. For more information on this limit, see
Limits Imposed by .frm File Structure.
The CHAR
and VARCHAR
types
are similar, but differ in the way they are stored and
retrieved. They also differ in maximum length and in whether
trailing spaces are retained.
The CHAR
and VARCHAR
types
are declared with a length that indicates the maximum number of
characters you want to store. For example,
CHAR(30)
can hold up to 30 characters.
The length of a CHAR
column is fixed to the
length that you declare when you create the table. The length
can be any value from 0 to 255. When CHAR
values are stored, they are right-padded with spaces to the
specified length. When CHAR
values are
retrieved, trailing spaces are removed unless the
PAD_CHAR_TO_FULL_LENGTH
SQL
mode is enabled.
Values in VARCHAR
columns are variable-length
strings. The length can be specified as a value from 0 to
65,535. The effective maximum length of a
VARCHAR
is subject to the maximum row size
(65,535 bytes, which is shared among all columns) and the
character set used. See Section 8.4.7, “Limits on Table Column Count and Row Size”.
In contrast to CHAR
,
VARCHAR
values are stored as a 1-byte or
2-byte length prefix plus data. The length prefix indicates the
number of bytes in the value. A column uses one length byte if
values require no more than 255 bytes, two length bytes if
values may require more than 255 bytes.
If strict SQL mode is not enabled and you assign a value to a
CHAR
or VARCHAR
column
that exceeds the column's maximum length, the value is truncated
to fit and a warning is generated. For truncation of nonspace
characters, you can cause an error to occur (rather than a
warning) and suppress insertion of the value by using strict SQL
mode. See Section 5.1.10, “Server SQL Modes”.
For VARCHAR
columns, trailing spaces in
excess of the column length are truncated prior to insertion and
a warning is generated, regardless of the SQL mode in use. For
CHAR
columns, truncation of excess trailing
spaces from inserted values is performed silently regardless of
the SQL mode.
VARCHAR
values are not padded when they are
stored. Trailing spaces are retained when values are stored and
retrieved, in conformance with standard SQL.
The following table illustrates the differences between
CHAR
and VARCHAR
by
showing the result of storing various string values into
CHAR(4)
and VARCHAR(4)
columns (assuming that the column uses a single-byte character
set such as latin1
).
Value | CHAR(4) |
Storage Required | VARCHAR(4) |
Storage Required |
---|---|---|---|---|
'' |
' ' |
4 bytes | '' |
1 byte |
'ab' |
'ab ' |
4 bytes | 'ab' |
3 bytes |
'abcd' |
'abcd' |
4 bytes | 'abcd' |
5 bytes |
'abcdefgh' |
'abcd' |
4 bytes | 'abcd' |
5 bytes |
The values shown as stored in the last row of the table apply only when not using strict mode; if MySQL is running in strict mode, values that exceed the column length are not stored, and an error results.
InnoDB
encodes fixed-length fields greater
than or equal to 768 bytes in length as variable-length fields,
which can be stored off-page. For example, a
CHAR(255)
column can exceed 768 bytes if the
maximum byte length of the character set is greater than 3, as
it is with utf8mb4
.
If a given value is stored into the CHAR(4)
and VARCHAR(4)
columns, the values retrieved
from the columns are not always the same because trailing spaces
are removed from CHAR
columns upon retrieval.
The following example illustrates this difference:
mysql>CREATE TABLE vc (v VARCHAR(4), c CHAR(4));
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO vc VALUES ('ab ', 'ab ');
Query OK, 1 row affected (0.00 sec) mysql>SELECT CONCAT('(', v, ')'), CONCAT('(', c, ')') FROM vc;
+---------------------+---------------------+ | CONCAT('(', v, ')') | CONCAT('(', c, ')') | +---------------------+---------------------+ | (ab ) | (ab) | +---------------------+---------------------+ 1 row in set (0.06 sec)
Values in CHAR
and VARCHAR
columns are sorted and compared according to the character set
collation assigned to the column.
All MySQL collations are of type PAD SPACE. This means that all
CHAR
, VARCHAR
, and
TEXT
values are compared without regard to
any trailing spaces. “Comparison” in this context
does not include the LIKE
pattern-matching operator, for which trailing spaces are
significant. For example:
mysql>CREATE TABLE names (myname CHAR(10));
Query OK, 0 rows affected (0.03 sec) mysql>INSERT INTO names VALUES ('Jones');
Query OK, 1 row affected (0.00 sec) mysql>SELECT myname = 'Jones', myname = 'Jones ' FROM names;
+------------------+--------------------+ | myname = 'Jones' | myname = 'Jones ' | +------------------+--------------------+ | 1 | 1 | +------------------+--------------------+ 1 row in set (0.00 sec) mysql>SELECT myname LIKE 'Jones', myname LIKE 'Jones ' FROM names;
+---------------------+-----------------------+ | myname LIKE 'Jones' | myname LIKE 'Jones ' | +---------------------+-----------------------+ | 1 | 0 | +---------------------+-----------------------+ 1 row in set (0.00 sec)
This is true for all MySQL versions, and is not affected by the server SQL mode.
For more information about MySQL character sets and collations, see Chapter 10, Character Sets, Collations, Unicode. For additional information about storage requirements, see Section 11.6, “Data Type Storage Requirements”.
For those cases where trailing pad characters are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting into the column values that differ only
in number of trailing pad characters will result in a
duplicate-key error. For example, if a table contains
'a'
, an attempt to store
'a '
causes a duplicate-key error.
The BINARY
and VARBINARY
types are similar to CHAR
and
VARCHAR
, except that they store
binary strings rather than nonbinary strings. That is, they
store byte strings rather than character strings. This means
they have the binary
character set and
collation, and comparison and sorting are based on the numeric
values of the bytes in the values.
The permissible maximum length is the same for
BINARY
and VARBINARY
as it
is for CHAR
and
VARCHAR
, except that the length
for BINARY
and VARBINARY
is measured in bytes rather than characters.
The BINARY
and VARBINARY
data types are distinct from the CHAR BINARY
and VARCHAR BINARY
data types. For the latter
types, the BINARY
attribute does not cause
the column to be treated as a binary string column. Instead, it
causes the binary (_bin
) collation for the
column character set (or the table default character set if no
column character set is specified) to be used, and the column
itself stores nonbinary character strings rather than binary
byte strings. For example, if the default character set is
latin1
, CHAR(5) BINARY
is
treated as CHAR(5) CHARACTER SET latin1 COLLATE
latin1_bin
. This differs from
BINARY(5)
, which stores 5-byte binary strings
that have the binary
character set and
collation. For information about the differences between the
binary
collation of the
binary
character set and the
_bin
collations of nonbinary character sets,
see Section 10.8.5, “The binary Collation Compared to _bin Collations”.
If strict SQL mode is not enabled and you assign a value to a
BINARY
or VARBINARY
column
that exceeds the column's maximum length, the value is truncated
to fit and a warning is generated. For cases of truncation, to
cause an error to occur (rather than a warning) and suppress
insertion of the value, use strict SQL mode. See
Section 5.1.10, “Server SQL Modes”.
When BINARY
values are stored, they are
right-padded with the pad value to the specified length. The pad
value is 0x00
(the zero byte). Values are
right-padded with 0x00
for inserts, and no
trailing bytes are removed for retrievals. All bytes are
significant in comparisons, including ORDER
BY
and DISTINCT
operations.
0x00
and space differ in comparisons, with
0x00
sorting before space.
Example: For a BINARY(3)
column,
'a '
becomes
'a \0'
when inserted.
'a\0'
becomes 'a\0\0'
when
inserted. Both inserted values remain unchanged for retrievals.
For VARBINARY
, there is no padding for
inserts and no bytes are stripped for retrievals. All bytes are
significant in comparisons, including ORDER
BY
and DISTINCT
operations.
0x00
and space differ in comparisons, with
0x00
sorting before space.
For those cases where trailing pad bytes are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting values into the column that differ only
in number of trailing pad bytes results in a duplicate-key
error. For example, if a table contains 'a'
,
an attempt to store 'a\0'
causes a
duplicate-key error.
You should consider the preceding padding and stripping
characteristics carefully if you plan to use the
BINARY
data type for storing binary data and
you require that the value retrieved be exactly the same as the
value stored. The following example illustrates how
0x00
-padding of BINARY
values affects column value comparisons:
mysql>CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO t SET c = 'a';
Query OK, 1 row affected (0.01 sec) mysql>SELECT HEX(c), c = 'a', c = 'a\0\0' from t;
+--------+---------+-------------+ | HEX(c) | c = 'a' | c = 'a\0\0' | +--------+---------+-------------+ | 610000 | 0 | 1 | +--------+---------+-------------+ 1 row in set (0.09 sec)
If the value retrieved must be the same as the value specified
for storage with no padding, it might be preferable to use
VARBINARY
or one of the
BLOB
data types instead.
A BLOB
is a binary large object that can hold
a variable amount of data. The four BLOB
types are TINYBLOB
, BLOB
,
MEDIUMBLOB
, and LONGBLOB
.
These differ only in the maximum length of the values they can
hold. The four TEXT
types are
TINYTEXT
, TEXT
,
MEDIUMTEXT
, and LONGTEXT
.
These correspond to the four BLOB
types and
have the same maximum lengths and storage requirements. See
Section 11.6, “Data Type Storage Requirements”.
BLOB
values are treated as binary strings
(byte strings). They have the binary
character set and collation, and comparison and sorting are
based on the numeric values of the bytes in column values.
TEXT
values are treated as nonbinary strings
(character strings). They have a character set other than
binary
, and values are sorted and compared
based on the collation of the character set.
If strict SQL mode is not enabled and you assign a value to a
BLOB
or TEXT
column that
exceeds the column's maximum length, the value is truncated to
fit and a warning is generated. For truncation of nonspace
characters, you can cause an error to occur (rather than a
warning) and suppress insertion of the value by using strict SQL
mode. See Section 5.1.10, “Server SQL Modes”.
Truncation of excess trailing spaces from values to be inserted
into TEXT
columns always
generates a warning, regardless of the SQL mode.
For TEXT
and BLOB
columns,
there is no padding on insert and no bytes are stripped on
select.
If a TEXT
column is indexed, index entry
comparisons are space-padded at the end. This means that, if the
index requires unique values, duplicate-key errors will occur
for values that differ only in the number of trailing spaces.
For example, if a table contains 'a'
, an
attempt to store 'a '
causes a
duplicate-key error. This is not true for
BLOB
columns.
In most respects, you can regard a BLOB
column as a VARBINARY
column that
can be as large as you like. Similarly, you can regard a
TEXT
column as a
VARCHAR
column.
BLOB
and TEXT
differ from
VARBINARY
and
VARCHAR
in the following ways:
For indexes on BLOB
and
TEXT
columns, you must specify an index
prefix length. For CHAR
and
VARCHAR
, a prefix length is
optional. See Section 8.3.4, “Column Indexes”.
If you use the BINARY
attribute with a
TEXT
data type, the column is assigned the
binary (_bin
) collation of the column
character set.
LONG
and LONG VARCHAR
map
to the MEDIUMTEXT
data type. This is a
compatibility feature.
MySQL Connector/ODBC defines BLOB
values as
LONGVARBINARY
and TEXT
values as LONGVARCHAR
.
Because BLOB
and TEXT
values can be extremely long, you might encounter some
constraints in using them:
Only the first
max_sort_length
bytes of
the column are used when sorting. The default value of
max_sort_length
is 1024.
You can make more bytes significant in sorting or grouping
by increasing the value of
max_sort_length
at server
startup or runtime. Any client can change the value of its
session max_sort_length
variable:
mysql>SET max_sort_length = 2000;
mysql>SELECT id, comment FROM t
->ORDER BY comment;
Instances of BLOB
or
TEXT
columns in the result of a query
that is processed using a temporary table causes the server
to use a table on disk rather than in memory because the
MEMORY
storage engine does not support
those data types (see
Section 8.4.4, “Internal Temporary Table Use in MySQL”). Use of disk
incurs a performance penalty, so include
BLOB
or TEXT
columns
in the query result only if they are really needed. For
example, avoid using
SELECT *
,
which selects all columns.
The maximum size of a BLOB
or
TEXT
object is determined by its type,
but the largest value you actually can transmit between the
client and server is determined by the amount of available
memory and the size of the communications buffers. You can
change the message buffer size by changing the value of the
max_allowed_packet
variable, but you must do so for both the server and your
client program. For example, both mysql
and mysqldump enable you to change the
client-side
max_allowed_packet
value.
See Section 5.1.1, “Configuring the Server”,
Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.
You may also want to compare the packet sizes and the size
of the data objects you are storing with the storage
requirements, see Section 11.6, “Data Type Storage Requirements”
Each BLOB
or TEXT
value is
represented internally by a separately allocated object. This is
in contrast to all other data types, for which storage is
allocated once per column when the table is opened.
In some cases, it may be desirable to store binary data such as
media files in BLOB
or
TEXT
columns. You may find MySQL's string
handling functions useful for working with such data. See
Section 12.7, “String Functions and Operators”. For security and other
reasons, it is usually preferable to do so using application
code rather than giving application users the
FILE
privilege. You can discuss
specifics for various languages and platforms in the MySQL
Forums (http://forums.mysql.com/).
An ENUM
is a string object with a value
chosen from a list of permitted values that are enumerated
explicitly in the column specification at table creation time.
It has these advantages:
Compact data storage in situations where a column has a
limited set of possible values. The strings you specify as
input values are automatically encoded as numbers. See
Section 11.6, “Data Type Storage Requirements” for the storage
requirements for ENUM
types.
Readable queries and output. The numbers are translated back to the corresponding strings in query results.
and these potential issues to consider:
If you make enumeration values that look like numbers, it is easy to mix up the literal values with their internal index numbers, as explained in Enumeration Limitations.
Using ENUM
columns in ORDER
BY
clauses requires extra care, as explained in
Enumeration Sorting.
An enumeration value must be a quoted string literal. For
example, you can create a table with an
ENUM
column like this:
CREATE TABLE shirts ( name VARCHAR(40), size ENUM('x-small', 'small', 'medium', 'large', 'x-large') ); INSERT INTO shirts (name, size) VALUES ('dress shirt','large'), ('t-shirt','medium'), ('polo shirt','small'); SELECT name, size FROM shirts WHERE size = 'medium'; +---------+--------+ | name | size | +---------+--------+ | t-shirt | medium | +---------+--------+ UPDATE shirts SET size = 'small' WHERE size = 'large'; COMMIT;
Inserting 1 million rows into this table with a value of
'medium'
would require 1 million bytes of
storage, as opposed to 6 million bytes if you stored the
actual string 'medium'
in a
VARCHAR
column.
Each enumeration value has an index:
The elements listed in the column specification are assigned index numbers, beginning with 1.
The index value of the empty string error value is 0. This
means that you can use the following
SELECT
statement to find
rows into which invalid ENUM
values
were assigned:
mysql> SELECT * FROM tbl_name
WHERE enum_col
=0;
The index of the NULL
value is
NULL
.
The term “index” here refers to a position within the list of enumeration values. It has nothing to do with table indexes.
For example, a column specified as ENUM('Mercury',
'Venus', 'Earth')
can have any of the values shown
here. The index of each value is also shown.
Value | Index |
---|---|
NULL |
NULL |
'' |
0 |
'Mercury' |
1 |
'Venus' |
2 |
'Earth' |
3 |
An ENUM
column can have a
maximum of 65,535 distinct elements. (The practical limit is
less than 3000.) A table can have no more than 255 unique
element list definitions among its
ENUM
and
SET
columns considered as a
group. For more information on these limits, see
Limits Imposed by .frm File Structure.
If you retrieve an ENUM
value in a numeric
context, the column value's index is returned. For example,
you can retrieve numeric values from an
ENUM
column like this:
mysql> SELECT enum_col
+0 FROM tbl_name
;
Functions such as SUM()
or
AVG()
that expect a numeric
argument cast the argument to a number if necessary. For
ENUM
values, the index number is used in
the calculation.
Trailing spaces are automatically deleted from
ENUM
member values in the table definition
when a table is created.
When retrieved, values stored into an ENUM
column are displayed using the lettercase that was used in the
column definition. Note that ENUM
columns
can be assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account
when assigning values to the column.
If you store a number into an ENUM
column,
the number is treated as the index into the possible values,
and the value stored is the enumeration member with that
index. (However, this does not work with
LOAD DATA
, which treats all
input as strings.) If the numeric value is quoted, it is still
interpreted as an index if there is no matching string in the
list of enumeration values. For these reasons, it is not
advisable to define an ENUM
column with
enumeration values that look like numbers, because this can
easily become confusing. For example, the following column has
enumeration members with string values of
'0'
, '1'
, and
'2'
, but numeric index values of
1
, 2
, and
3
:
numbers ENUM('0','1','2')
If you store 2
, it is interpreted as an
index value, and becomes '1'
(the value
with index 2). If you store '2'
, it matches
an enumeration value, so it is stored as
'2'
. If you store '3'
,
it does not match any enumeration value, so it is treated as
an index and becomes '2'
(the value with
index 3).
mysql>INSERT INTO t (numbers) VALUES(2),('2'),('3');
mysql>SELECT * FROM t;
+---------+ | numbers | +---------+ | 1 | | 2 | | 2 | +---------+
To determine all possible values for an
ENUM
column, use
SHOW COLUMNS
FROM
and parse the
tbl_name
LIKE
'enum_col
'ENUM
definition in the
Type
column of the output.
In the C API, ENUM
values are returned as
strings. For information about using result set metadata to
distinguish them from other strings, see
Section 23.7.4, “C API Data Structures”.
An enumeration value can also be the empty string
(''
) or NULL
under
certain circumstances:
If you insert an invalid value into an
ENUM
(that is, a string not present in
the list of permitted values), the empty string is
inserted instead as a special error value. This string can
be distinguished from a “normal” empty string
by the fact that this string has the numeric value 0. See
Index Values for Enumeration Literals for details about the
numeric indexes for the enumeration values.
If strict SQL mode is enabled, attempts to insert invalid
ENUM
values result in an error.
If an ENUM
column is declared to permit
NULL
, the NULL
value
is a valid value for the column, and the default value is
NULL
. If an ENUM
column is declared NOT NULL
, its
default value is the first element of the list of
permitted values.
ENUM
values are sorted based on their index
numbers, which depend on the order in which the enumeration
members were listed in the column specification. For example,
'b'
sorts before 'a'
for
ENUM('b', 'a')
. The empty string sorts
before nonempty strings, and NULL
values
sort before all other enumeration values.
To prevent unexpected results when using the ORDER
BY
clause on an ENUM
column, use
one of these techniques:
Specify the ENUM
list in alphabetic
order.
Make sure that the column is sorted lexically rather than
by index number by coding ORDER BY
CAST(
or
col
AS CHAR)ORDER BY
CONCAT(
.
col
)
An enumeration value cannot be an expression, even one that evaluates to a string value.
For example, this CREATE TABLE
statement does not work because the
CONCAT
function cannot be used to construct
an enumeration value:
CREATE TABLE sizes ( size ENUM('small', CONCAT('med','ium'), 'large') );
You also cannot employ a user variable as an enumeration value. This pair of statements do not work:
SET @mysize = 'medium'; CREATE TABLE sizes ( size ENUM('small', @mysize, 'large') );
We strongly recommend that you do not use
numbers as enumeration values, because it does not save on
storage over the appropriate
TINYINT
or
SMALLINT
type, and it is easy
to mix up the strings and the underlying number values (which
might not be the same) if you quote the
ENUM
values incorrectly. If you do use a
number as an enumeration value, always enclose it in quotation
marks. If the quotation marks are omitted, the number is
regarded as an index. See Handling of Enumeration Literals to
see how even a quoted number could be mistakenly used as a
numeric index value.
Duplicate values in the definition cause a warning, or an error if strict SQL mode is enabled.
A SET
is a string object that can have zero
or more values, each of which must be chosen from a list of
permitted values specified when the table is created.
SET
column values that consist of multiple
set members are specified with members separated by commas
(,
). A consequence of this is that
SET
member values should not themselves
contain commas.
For example, a column specified as SET('one', 'two')
NOT NULL
can have any of these values:
'' 'one' 'two' 'one,two'
A SET
column can have a maximum
of 64 distinct members. A table can have no more than 255 unique
element list definitions among its
ENUM
and
SET
columns considered as a
group. For more information on this limit, see
Limits Imposed by .frm File Structure.
Duplicate values in the definition cause a warning, or an error if strict SQL mode is enabled.
Trailing spaces are automatically deleted from
SET
member values in the table definition
when a table is created.
When retrieved, values stored in a SET
column
are displayed using the lettercase that was used in the column
definition. Note that SET
columns can be
assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account when
assigning values to the column.
MySQL stores SET
values numerically, with the
low-order bit of the stored value corresponding to the first set
member. If you retrieve a SET
value in a
numeric context, the value retrieved has bits set corresponding
to the set members that make up the column value. For example,
you can retrieve numeric values from a SET
column like this:
mysql> SELECT set_col
+0 FROM tbl_name
;
If a number is stored into a SET
column, the
bits that are set in the binary representation of the number
determine the set members in the column value. For a column
specified as SET('a','b','c','d')
, the
members have the following decimal and binary values.
SET Member |
Decimal Value | Binary Value |
---|---|---|
'a' |
1 |
0001 |
'b' |
2 |
0010 |
'c' |
4 |
0100 |
'd' |
8 |
1000 |
If you assign a value of 9
to this column,
that is 1001
in binary, so the first and
fourth SET
value members
'a'
and 'd'
are selected
and the resulting value is 'a,d'
.
For a value containing more than one SET
element, it does not matter what order the elements are listed
in when you insert the value. It also does not matter how many
times a given element is listed in the value. When the value is
retrieved later, each element in the value appears once, with
elements listed according to the order in which they were
specified at table creation time. Suppose that a column is
specified as SET('a','b','c','d')
:
mysql> CREATE TABLE myset (col SET('a', 'b', 'c', 'd'));
If you insert the values 'a,d'
,
'd,a'
, 'a,d,d'
,
'a,d,a'
, and 'd,a,d'
:
mysql> INSERT INTO myset (col) VALUES
-> ('a,d'), ('d,a'), ('a,d,a'), ('a,d,d'), ('d,a,d');
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
Then all these values appear as 'a,d'
when
retrieved:
mysql> SELECT col FROM myset;
+------+
| col |
+------+
| a,d |
| a,d |
| a,d |
| a,d |
| a,d |
+------+
5 rows in set (0.04 sec)
If you set a SET
column to an unsupported
value, the value is ignored and a warning is issued:
mysql>INSERT INTO myset (col) VALUES ('a,d,d,s');
Query OK, 1 row affected, 1 warning (0.03 sec) mysql>SHOW WARNINGS;
+---------+------+------------------------------------------+ | Level | Code | Message | +---------+------+------------------------------------------+ | Warning | 1265 | Data truncated for column 'col' at row 1 | +---------+------+------------------------------------------+ 1 row in set (0.04 sec) mysql>SELECT col FROM myset;
+------+ | col | +------+ | a,d | | a,d | | a,d | | a,d | | a,d | | a,d | +------+ 6 rows in set (0.01 sec)
If strict SQL mode is enabled, attempts to insert invalid
SET
values result in an error.
SET
values are sorted numerically.
NULL
values sort before
non-NULL
SET
values.
Functions such as SUM()
or
AVG()
that expect a numeric
argument cast the argument to a number if necessary. For
SET
values, the cast operation causes the
numeric value to be used.
Normally, you search for SET
values using the
FIND_IN_SET()
function or the
LIKE
operator:
mysql>SELECT * FROM
mysql>tbl_name
WHERE FIND_IN_SET('value
',set_col
)>0;SELECT * FROM
tbl_name
WHEREset_col
LIKE '%value
%';
The first statement finds rows where
set_col
contains the
value
set member. The second is
similar, but not the same: It finds rows where
set_col
contains
value
anywhere, even as a substring
of another set member.
The following statements also are permitted:
mysql>SELECT * FROM
mysql>tbl_name
WHEREset_col
& 1;SELECT * FROM
tbl_name
WHEREset_col
= 'val1
,val2
';
The first of these statements looks for values containing the
first set member. The second looks for an exact match. Be
careful with comparisons of the second type. Comparing set
values to
'
returns different results than comparing values to
val1
,val2
''
.
You should specify the values in the same order they are listed
in the column definition.
val2
,val1
'
To determine all possible values for a SET
column, use SHOW COLUMNS FROM
and parse the
tbl_name
LIKE
set_col
SET
definition in the Type
column of the output.
In the C API, SET
values are returned as
strings. For information about using result set metadata to
distinguish them from other strings, see
Section 23.7.4, “C API Data Structures”.
The Open Geospatial Consortium (OGC) is an international consortium of more than 250 companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data.
The Open Geospatial Consortium publishes the OpenGIS® Implementation Standard for Geographic information - Simple Feature Access - Part 2: SQL Option, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC website at http://www.opengeospatial.org/standards/sfs.
Following the OGC specification, MySQL implements spatial extensions as a subset of the SQL with Geometry Types environment. This term refers to an SQL environment that has been extended with a set of geometry types. A geometry-valued SQL column is implemented as a column that has a geometry type. The specification describes a set of SQL geometry types, as well as functions on those types to create and analyze geometry values.
MySQL spatial extensions enable the generation, storage, and analysis of geographic features:
Data types for representing spatial values
Functions for manipulating spatial values
Spatial indexing for improved access times to spatial columns
The spatial data types and functions are available for
MyISAM
,
InnoDB
,
NDB
, and
ARCHIVE
tables. For indexing spatial
columns, MyISAM
supports both
SPATIAL
and non-SPATIAL
indexes. The other storage engines support
non-SPATIAL
indexes, as described in
Section 13.1.13, “CREATE INDEX Statement”.
A geographic feature is anything in the world that has a location. A feature can be:
An entity. For example, a mountain, a pond, a city.
A space. For example, town district, the tropics.
A definable location. For example, a crossroad, as a particular place where two streets intersect.
Some documents use the term geospatial feature to refer to geographic features.
Geometry is another word that denotes a geographic feature. Originally the word geometry meant measurement of the earth. Another meaning comes from cartography, referring to the geometric features that cartographers use to map the world.
The discussion here considers these terms synonymous: geographic feature, geospatial feature, feature, or geometry. The term most commonly used is geometry, defined as a point or an aggregate of points representing anything in the world that has a location.
The following material covers these topics:
The spatial data types implemented in MySQL model
The basis of the spatial extensions in the OpenGIS geometry model
Data formats for representing spatial data
How to use spatial data in MySQL
Use of indexing for spatial data
MySQL differences from the OpenGIS specification
For information about functions that operate on spatial data, see Section 12.16, “Spatial Analysis Functions”.
MySQL does not implement the following GIS features:
Additional Metadata Views
OpenGIS specifications propose several additional metadata
views. For example, a system view named
GEOMETRY_COLUMNS
contains a description of
geometry columns, one row for each geometry column in the
database.
The OpenGIS function Length()
on LineString
and
MultiLineString
should be called in MySQL
as GLength()
The problem is that there is an existing SQL function
Length()
that calculates the
length of string values, and sometimes it is not possible to
distinguish whether the function is called in a textual or
spatial context.
The Open Geospatial Consortium publishes the OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. The Open Geospatial Consortium (OGC) maintains a website at http://www.opengeospatial.org/. The specification is available there at http://www.opengeospatial.org/standards/sfs. It contains additional information relevant to the material here.
If you have questions or concerns about the use of the spatial extensions to MySQL, you can discuss them in the GIS forum: https://forums.mysql.com/list.php?23.
MySQL has spatial data types that correspond to OpenGIS classes. The basis for these types is described in Section 11.4.2, “The OpenGIS Geometry Model”.
Some spatial data types hold single geometry values:
GEOMETRY
POINT
LINESTRING
POLYGON
GEOMETRY
can store geometry values of any
type. The other single-value types (POINT
,
LINESTRING
, and POLYGON
)
restrict their values to a particular geometry type.
The other spatial data types hold collections of values:
MULTIPOINT
MULTILINESTRING
MULTIPOLYGON
GEOMETRYCOLLECTION
GEOMETRYCOLLECTION
can store a collection of
objects of any type. The other collection types
(MULTIPOINT
,
MULTILINESTRING
, and
MULTIPOLYGON
) restrict collection members to
those having a particular geometry type.
Example: To create a table named geom
that
has a column named g
that can store values of
any geometry type, use this statement:
CREATE TABLE geom (g GEOMETRY);
SPATIAL
indexes can be created on
NOT NULL
spatial columns, so if you plan to
index the column, declare it NOT NULL
:
CREATE TABLE geom (g GEOMETRY NOT NULL);
For other examples showing how to use spatial data types in MySQL, see Section 11.4.4, “Creating Spatial Columns”.
The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:
It is associated with a spatial reference system, which describes the coordinate space in which the object is defined.
It belongs to some geometry class.
The geometry classes define a hierarchy as follows:
Geometry
(noninstantiable)
Point
(instantiable)
Curve
(noninstantiable)
LineString
(instantiable)
Line
LinearRing
Surface
(noninstantiable)
Polygon
(instantiable)
GeometryCollection
(instantiable)
MultiPoint
(instantiable)
MultiCurve
(noninstantiable)
MultiLineString
(instantiable)
MultiSurface
(noninstantiable)
MultiPolygon
(instantiable)
It is not possible to create objects in noninstantiable classes. It is possible to create objects in instantiable classes. All classes have properties, and instantiable classes may also have assertions (rules that define valid class instances).
Geometry
is the base class. It is an
abstract class. The instantiable subclasses of
Geometry
are restricted to zero-, one-, and
two-dimensional geometric objects that exist in
two-dimensional coordinate space. All instantiable geometry
classes are defined so that valid instances of a geometry
class are topologically closed (that is, all defined
geometries include their boundary).
The base Geometry
class has subclasses for
Point
, Curve
,
Surface
, and
GeometryCollection
:
Point
represents zero-dimensional
objects.
Curve
represents one-dimensional
objects, and has subclass LineString
,
with sub-subclasses Line
and
LinearRing
.
Surface
is designed for two-dimensional
objects and has subclass Polygon
.
GeometryCollection
has specialized
zero-, one-, and two-dimensional collection classes named
MultiPoint
,
MultiLineString
, and
MultiPolygon
for modeling geometries
corresponding to collections of Points
,
LineStrings
, and
Polygons
, respectively.
MultiCurve
and
MultiSurface
are introduced as abstract
superclasses that generalize the collection interfaces to
handle Curves
and
Surfaces
.
Geometry
, Curve
,
Surface
, MultiCurve
, and
MultiSurface
are defined as noninstantiable
classes. They define a common set of methods for their
subclasses and are included for extensibility.
Point
, LineString
,
Polygon
,
GeometryCollection
,
MultiPoint
,
MultiLineString
, and
MultiPolygon
are instantiable classes.
Geometry
is the root class of the
hierarchy. It is a noninstantiable class but has a number of
properties, described in the following list, that are common
to all geometry values created from any of the
Geometry
subclasses. Particular subclasses
have their own specific properties, described later.
Geometry Properties
A geometry value has the following properties:
Its type. Each geometry belongs to one of the instantiable classes in the hierarchy.
Its SRID, or spatial reference identifier. This value identifies the geometry's associated spatial reference system that describes the coordinate space in which the geometry object is defined.
In MySQL, the SRID value is an integer associated with the geometry value. The maximum usable SRID value is 232−1. If a larger value is given, only the lower 32 bits are used. All computations are done assuming SRID 0, regardless of the actual SRID value. SRID 0 represents an infinite flat Cartesian plane with no units assigned to its axes.
Its coordinates in its spatial reference system, represented as double-precision (8-byte) numbers. All nonempty geometries include at least one pair of (X,Y) coordinates. Empty geometries contain no coordinates.
Coordinates are related to the SRID. For example, in different coordinate systems, the distance between two objects may differ even when objects have the same coordinates, because the distance on the planar coordinate system and the distance on the geodetic system (coordinates on the Earth's surface) are different things.
Its interior, boundary, and exterior.
Every geometry occupies some position in space. The exterior of a geometry is all space not occupied by the geometry. The interior is the space occupied by the geometry. The boundary is the interface between the geometry's interior and exterior.
Its MBR (minimum bounding rectangle), or envelope. This is the bounding geometry, formed by the minimum and maximum (X,Y) coordinates:
((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
Whether the value is
simple or
nonsimple. Geometry
values of types (LineString
,
MultiPoint
,
MultiLineString
) are either simple or
nonsimple. Each type determines its own assertions for
being simple or nonsimple.
Whether the value is
closed or
not closed. Geometry
values of types (LineString
,
MultiString
) are either closed or not
closed. Each type determines its own assertions for being
closed or not closed.
Whether the value is
empty or
nonempty A geometry is
empty if it does not have any points. Exterior, interior,
and boundary of an empty geometry are not defined (that
is, they are represented by a NULL
value). An empty geometry is defined to be always simple
and has an area of 0.
Its dimension. A geometry can have a dimension of −1, 0, 1, or 2:
−1 for an empty geometry.
0 for a geometry with no length and no area.
1 for a geometry with nonzero length and zero area.
2 for a geometry with nonzero area.
Point
objects have a dimension of zero.
LineString
objects have a dimension of
1. Polygon
objects have a dimension of
2. The dimensions of MultiPoint
,
MultiLineString
, and
MultiPolygon
objects are the same as
the dimensions of the elements they consist of.
A Point
is a geometry that represents a
single location in coordinate space.
Point
Examples
Imagine a large-scale map of the world with many cities. A
Point
object could represent each city.
On a city map, a Point
object could
represent a bus stop.
Point
Properties
X-coordinate value.
Y-coordinate value.
Point
is defined as a zero-dimensional
geometry.
The boundary of a Point
is the empty
set.
A Curve
is a one-dimensional geometry,
usually represented by a sequence of points. Particular
subclasses of Curve
define the type of
interpolation between points. Curve
is a
noninstantiable class.
Curve
Properties
A Curve
has the coordinates of its
points.
A Curve
is defined as a one-dimensional
geometry.
A Curve
is simple if it does not pass
through the same point twice, with the exception that a
curve can still be simple if the start and end points are
the same.
A Curve
is closed if its start point is
equal to its endpoint.
The boundary of a closed Curve
is
empty.
The boundary of a nonclosed Curve
consists of its two endpoints.
A Curve
that is simple and closed is a
LinearRing
.
A LineString
is a Curve
with linear interpolation between points.
LineString
Examples
On a world map, LineString
objects
could represent rivers.
In a city map, LineString
objects could
represent streets.
LineString
Properties
A LineString
has coordinates of
segments, defined by each consecutive pair of points.
A LineString
is a
Line
if it consists of exactly two
points.
A LineString
is a
LinearRing
if it is both closed and
simple.
A Surface
is a two-dimensional geometry. It
is a noninstantiable class. Its only instantiable subclass is
Polygon
.
Surface
Properties
A Surface
is defined as a
two-dimensional geometry.
The OpenGIS specification defines a simple
Surface
as a geometry that consists of
a single “patch” that is associated with a
single exterior boundary and zero or more interior
boundaries.
The boundary of a simple Surface
is the
set of closed curves corresponding to its exterior and
interior boundaries.
A Polygon
is a planar
Surface
representing a multisided geometry.
It is defined by a single exterior boundary and zero or more
interior boundaries, where each interior boundary defines a
hole in the Polygon
.
Polygon
Examples
On a region map, Polygon
objects could
represent forests, districts, and so on.
Polygon
Assertions
The boundary of a Polygon
consists of a
set of LinearRing
objects (that is,
LineString
objects that are both simple
and closed) that make up its exterior and interior
boundaries.
A Polygon
has no rings that cross. The
rings in the boundary of a Polygon
may
intersect at a Point
, but only as a
tangent.
A Polygon
has no lines, spikes, or
punctures.
A Polygon
has an interior that is a
connected point set.
A Polygon
may have holes. The exterior
of a Polygon
with holes is not
connected. Each hole defines a connected component of the
exterior.
The preceding assertions make a Polygon
a
simple geometry.
A GeometryCollection
is a geometry that is
a collection of zero or more geometries of any class.
All the elements in a geometry collection must be in the same
spatial reference system (that is, in the same coordinate
system). There are no other constraints on the elements of a
geometry collection, although the subclasses of
GeometryCollection
described in the
following sections may restrict membership. Restrictions may
be based on:
Element type (for example, a MultiPoint
may contain only Point
elements)
Dimension
Constraints on the degree of spatial overlap between elements
A MultiPoint
is a geometry collection
composed of Point
elements. The points are
not connected or ordered in any way.
MultiPoint
Examples
On a world map, a MultiPoint
could
represent a chain of small islands.
On a city map, a MultiPoint
could
represent the outlets for a ticket office.
MultiPoint
Properties
A MultiPoint
is a zero-dimensional
geometry.
A MultiPoint
is simple if no two of its
Point
values are equal (have identical
coordinate values).
The boundary of a MultiPoint
is the
empty set.
A MultiCurve
is a geometry collection
composed of Curve
elements.
MultiCurve
is a noninstantiable class.
MultiCurve
Properties
A MultiCurve
is a one-dimensional
geometry.
A MultiCurve
is simple if and only if
all of its elements are simple; the only intersections
between any two elements occur at points that are on the
boundaries of both elements.
A MultiCurve
boundary is obtained by
applying the “mod 2 union rule” (also known
as the “odd-even rule”): A point is in the
boundary of a MultiCurve
if it is in
the boundaries of an odd number of
Curve
elements.
A MultiCurve
is closed if all of its
elements are closed.
The boundary of a closed MultiCurve
is
always empty.
A MultiLineString
is a
MultiCurve
geometry collection composed of
LineString
elements.
MultiLineString
Examples
On a region map, a MultiLineString
could represent a river system or a highway system.
A MultiSurface
is a geometry collection
composed of surface elements. MultiSurface
is a noninstantiable class. Its only instantiable subclass is
MultiPolygon
.
MultiSurface
Assertions
Surfaces within a MultiSurface
have no
interiors that intersect.
Surfaces within a MultiSurface
have
boundaries that intersect at most at a finite number of
points.
A MultiPolygon
is a
MultiSurface
object composed of
Polygon
elements.
MultiPolygon
Examples
On a region map, a MultiPolygon
could
represent a system of lakes.
MultiPolygon
Assertions
A MultiPolygon
has no two
Polygon
elements with interiors that
intersect.
A MultiPolygon
has no two
Polygon
elements that cross (crossing
is also forbidden by the previous assertion), or that
touch at an infinite number of points.
A MultiPolygon
may not have cut lines,
spikes, or punctures. A MultiPolygon
is
a regular, closed point set.
A MultiPolygon
that has more than one
Polygon
has an interior that is not
connected. The number of connected components of the
interior of a MultiPolygon
is equal to
the number of Polygon
values in the
MultiPolygon
.
MultiPolygon
Properties
A MultiPolygon
is a two-dimensional
geometry.
A MultiPolygon
boundary is a set of
closed curves (LineString
values)
corresponding to the boundaries of its
Polygon
elements.
Each Curve
in the boundary of the
MultiPolygon
is in the boundary of
exactly one Polygon
element.
Every Curve
in the boundary of an
Polygon
element is in the boundary of
the MultiPolygon
.
Two standard spatial data formats are used to represent geometry objects in queries:
Well-Known Text (WKT) format
Well-Known Binary (WKB) format
Internally, MySQL stores geometry values in a format that is not identical to either WKT or WKB format. (Internal format is like WKB but with an initial 4 bytes to indicate the SRID.)
There are functions available to convert between different data formats; see Section 12.16.6, “Geometry Format Conversion Functions”.
The following sections describe the spatial data formats MySQL uses:
The Well-Known Text (WKT) representation of geometry values is designed for exchanging geometry data in ASCII form. The OpenGIS specification provides a Backus-Naur grammar that specifies the formal production rules for writing WKT values (see Section 11.4, “Spatial Data Types”).
Examples of WKT representations of geometry objects:
A Point
:
POINT(15 20)
The point coordinates are specified with no separating
comma. This differs from the syntax for the SQL
Point()
function, which
requires a comma between the coordinates. Take care to use
the syntax appropriate to the context of a given spatial
operation. For example, the following statements both use
ST_X()
to extract the
X-coordinate from a Point
object. The
first produces the object directly using the
Point()
function. The
second uses a WKT representation converted to a
Point
with
ST_GeomFromText()
.
mysql>SELECT X(Point(15, 20));
+------------------+ | X(POINT(15, 20)) | +------------------+ | 15 | +------------------+ mysql>SELECT X(GeomFromText('POINT(15 20)'));
+---------------------------------+ | X(GeomFromText('POINT(15 20)')) | +---------------------------------+ | 15 | +---------------------------------+
A LineString
with four points:
LINESTRING(0 0, 10 10, 20 25, 50 60)
The point coordinate pairs are separated by commas.
A Polygon
with one exterior ring and
one interior ring:
POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))
A MultiPoint
with three
Point
values:
MULTIPOINT(0 0, 20 20, 60 60)
A MultiLineString
with two
LineString
values:
MULTILINESTRING((10 10, 20 20), (15 15, 30 15))
A MultiPolygon
with two
Polygon
values:
MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((5 5,7 5,7 7,5 7, 5 5)))
A GeometryCollection
consisting of two
Point
values and one
LineString
:
GEOMETRYCOLLECTION(POINT(10 10), POINT(30 30), LINESTRING(15 15, 20 20))
The Well-Known Binary (WKB) representation of geometric values
is used for exchanging geometry data as binary streams
represented by BLOB
values
containing geometric WKB information. This format is defined
by the OpenGIS specification (see
Section 11.4, “Spatial Data Types”). It is also defined in the
ISO SQL/MM Part 3: Spatial standard.
WKB uses 1-byte unsigned integers, 4-byte unsigned integers, and 8-byte double-precision numbers (IEEE 754 format). A byte is eight bits.
For example, a WKB value that corresponds to POINT(1
-1)
consists of this sequence of 21 bytes, each
represented by two hexadecimal digits:
0101000000000000000000F03F000000000000F0BF
The sequence consists of the components shown in the following table.
Table 11.2 WKB Components Example
Component | Size | Value |
---|---|---|
Byte order | 1 byte | 01 |
WKB type | 4 bytes | 01000000 |
X coordinate | 8 bytes | 000000000000F03F |
Y coordinate | 8 bytes | 000000000000F0BF |
Component representation is as follows:
The byte order indicator is either 1 or 0 to signify little-endian or big-endian storage. The little-endian and big-endian byte orders are also known as Network Data Representation (NDR) and External Data Representation (XDR), respectively.
The WKB type is a code that indicates the geometry type.
MySQL uses values from 1 through 7 to indicate
Point
, LineString
,
Polygon
, MultiPoint
,
MultiLineString
,
MultiPolygon
, and
GeometryCollection
.
A Point
value has X and Y coordinates,
each represented as a double-precision value.
WKB values for more complex geometry values have more complex data structures, as detailed in the OpenGIS specification.
MySQL stores geometry values using 4 bytes to indicate the SRID followed by the WKB representation of the value. For a description of WKB format, see Well-Known Binary (WKB) Format.
For the WKB part, these MySQL-specific considerations apply:
The byte-order indicator byte is 1 because MySQL stores geometries as little-ending values.
MySQL supports geometry types of Point
,
LineString
, Polygon
,
MultiPoint
,
MultiLineString
,
MultiPolygon
, and
GeometryCollection
. Other geometry
types are not supported.
The LENGTH()
function returns
the space in bytes required for value storage. Example:
mysql>SET @g = ST_GeomFromText('POINT(1 -1)');
mysql>SELECT LENGTH(@g);
+------------+ | LENGTH(@g) | +------------+ | 25 | +------------+ mysql>SELECT HEX(@g);
+----------------------------------------------------+ | HEX(@g) | +----------------------------------------------------+ | 000000000101000000000000000000F03F000000000000F0BF | +----------------------------------------------------+
The value length is 25 bytes, made up of these components (as can be seen from the hexadecimal value):
4 bytes for integer SRID (0)
1 byte for integer byte order (1 = little-endian)
4 bytes for integer type information (1 =
Point
)
8 bytes for double-precision X coordinate (1)
8 bytes for double-precision Y coordinate (−1)
MySQL provides a standard way of creating spatial columns for
geometry types, for example, with CREATE
TABLE
or ALTER TABLE
.
Spatial columns are supported for
MyISAM
,
InnoDB
,
NDB
, and
ARCHIVE
tables. See also the notes
about spatial indexes under
Section 11.4.8, “Creating Spatial Indexes”.
Use the CREATE TABLE
statement to create a table with a spatial column:
CREATE TABLE geom (g GEOMETRY);
Use the ALTER TABLE
statement
to add or drop a spatial column to or from an existing
table:
ALTER TABLE geom ADD pt POINT; ALTER TABLE geom DROP pt;
After you have created spatial columns, you can populate them with spatial data.
Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT values to internal geometry format:
The following examples insert more complex geometries into the table:
SET @g = 'LINESTRING(0 0,1 1,2 2)'; INSERT INTO geom VALUES (GeomFromText(@g)); SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))'; INSERT INTO geom VALUES (GeomFromText(@g)); SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))'; INSERT INTO geom VALUES (GeomFromText(@g));
The preceding examples use
GeomFromText()
to create geometry
values. You can also use type-specific functions:
SET @g = 'POINT(1 1)'; INSERT INTO geom VALUES (PointFromText(@g)); SET @g = 'LINESTRING(0 0,1 1,2 2)'; INSERT INTO geom VALUES (LineStringFromText(@g)); SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))'; INSERT INTO geom VALUES (PolygonFromText(@g)); SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))'; INSERT INTO geom VALUES (GeomCollFromText(@g));
A client application program that wants to use WKB representations of geometry values is responsible for sending correctly formed WKB in queries to the server. There are several ways to satisfy this requirement. For example:
Inserting a POINT(1 1)
value with hex
literal syntax:
INSERT INTO geom VALUES (GeomFromWKB(X'0101000000000000000000F03F000000000000F03F'));
An ODBC application can send a WKB representation, binding
it to a placeholder using an argument of
BLOB
type:
INSERT INTO geom VALUES (GeomFromWKB(?))
Other programming interfaces may support a similar placeholder mechanism.
In a C program, you can escape a binary value using
mysql_real_escape_string()
and include the result in a query string that is sent to the
server. See Section 23.7.6.54, “mysql_real_escape_string()”.
Geometry values stored in a table can be fetched in internal format. You can also convert them to WKT or WKB format.
Fetching spatial data in internal format:
Fetching geometry values using internal format can be useful in table-to-table transfers:
CREATE TABLE geom2 (g GEOMETRY) SELECT g FROM geom;
Fetching spatial data in WKT format:
The AsText()
function
converts a geometry from internal format to a WKT string.
SELECT AsText(g) FROM geom;
Fetching spatial data in WKB format:
The AsBinary()
function
converts a geometry from internal format to a
BLOB
containing the WKB
value.
SELECT AsBinary(g) FROM geom;
For MyISAM
tables, search
operations in columns containing spatial data can be optimized
using SPATIAL
indexes. The most typical
operations are:
Point queries that search for all objects that contain a given point
Region queries that search for all objects that overlap a given region
MySQL uses R-Trees with quadratic
splitting for SPATIAL
indexes on
spatial columns. A SPATIAL
index is built
using the minimum bounding rectangle (MBR) of a geometry. For
most geometries, the MBR is a minimum rectangle that surrounds
the geometries. For a horizontal or a vertical linestring, the
MBR is a rectangle degenerated into the linestring. For a point,
the MBR is a rectangle degenerated into the point.
It is also possible to create normal indexes on spatial columns.
In a non-SPATIAL
index, you must declare a
prefix for any spatial column except for
POINT
columns.
MyISAM
supports both
SPATIAL
and non-SPATIAL
indexes. Other storage engines support
non-SPATIAL
indexes, as described in
Section 13.1.13, “CREATE INDEX Statement”.
For MyISAM
tables, MySQL can create spatial
indexes using syntax similar to that for creating regular
indexes, but using the SPATIAL
keyword.
Columns in spatial indexes must be declared NOT
NULL
. The following examples demonstrate how to create
spatial indexes:
With CREATE TABLE
:
CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g)) ENGINE=MyISAM;
With ALTER TABLE
:
CREATE TABLE geom (g GEOMETRY NOT NULL) ENGINE=MyISAM; ALTER TABLE geom ADD SPATIAL INDEX(g);
With CREATE INDEX
:
CREATE TABLE geom (g GEOMETRY NOT NULL) ENGINE=MyISAM; CREATE SPATIAL INDEX g ON geom (g);
SPATIAL INDEX
creates an R-tree index. For
storage engines that support nonspatial indexing of spatial
columns, the engine creates a B-tree index. A B-tree index on
spatial values is useful for exact-value lookups, but not for
range scans.
For more information on indexing spatial columns, see Section 13.1.13, “CREATE INDEX Statement”.
To drop spatial indexes, use ALTER
TABLE
or DROP INDEX
:
With ALTER TABLE
:
ALTER TABLE geom DROP INDEX g;
With DROP INDEX
:
DROP INDEX g ON geom;
Example: Suppose that a table geom
contains
more than 32,000 geometries, which are stored in the column
g
of type GEOMETRY
. The
table also has an AUTO_INCREMENT
column
fid
for storing object ID values.
mysql>DESCRIBE geom;
+-------+----------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------+----------+------+-----+---------+----------------+ | fid | int(11) | | PRI | NULL | auto_increment | | g | geometry | | | | | +-------+----------+------+-----+---------+----------------+ 2 rows in set (0.00 sec) mysql>SELECT COUNT(*) FROM geom;
+----------+ | count(*) | +----------+ | 32376 | +----------+ 1 row in set (0.00 sec)
To add a spatial index on the column g
, use
this statement:
mysql> ALTER TABLE geom ADD SPATIAL INDEX(g) ENGINE=MyISAM;
Query OK, 32376 rows affected (4.05 sec)
Records: 32376 Duplicates: 0 Warnings: 0
The optimizer investigates whether available spatial indexes can
be involved in the search for queries that use a function such
as MBRContains()
or
MBRWithin()
in the
WHERE
clause. The following query finds all
objects that are in the given rectangle:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>SELECT fid,AsText(g) FROM geom WHERE
->MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+ | fid | AsText(g) | +-----+---------------------------------------------------------------+ | 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... | | 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... | | 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... | | 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... | | 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... | | 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... | | 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... | | 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... | | 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... | | 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... | | 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... | | 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... | | 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... | | 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... | | 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... | | 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... | | 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... | | 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... | | 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... | | 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... | +-----+---------------------------------------------------------------+ 20 rows in set (0.00 sec)
Use EXPLAIN
to check the way this
query is executed:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>EXPLAIN SELECT fid,AsText(g) FROM geom WHERE
->MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: geom type: range possible_keys: g key: g key_len: 32 ref: NULL rows: 50 Extra: Using where 1 row in set (0.00 sec)
Check what would happen without a spatial index:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>EXPLAIN SELECT fid,AsText(g) FROM g IGNORE INDEX (g) WHERE
->MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: geom type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 32376 Extra: Using where 1 row in set (0.00 sec)
Executing the SELECT
statement
without the spatial index yields the same result but causes the
execution time to rise from 0.00 seconds to 0.46 seconds:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>SELECT fid,AsText(g) FROM geom IGNORE INDEX (g) WHERE
->MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+ | fid | AsText(g) | +-----+---------------------------------------------------------------+ | 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... | | 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... | | 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... | | 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... | | 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... | | 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... | | 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... | | 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... | | 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... | | 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... | | 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... | | 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... | | 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... | | 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... | | 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... | | 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... | | 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... | | 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... | | 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... | | 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... | +-----+---------------------------------------------------------------+ 20 rows in set (0.46 sec)
Data type specifications can have explicit or implicit default values.
A DEFAULT
clause in a data type specification explicitly indicates a
default value for a column. Examples:
value
CREATE TABLE t1 ( i INT DEFAULT -1, c VARCHAR(10) DEFAULT '', price DOUBLE(16,2) DEFAULT '0.00' );
SERIAL DEFAULT VALUE
is a special case. In
the definition of an integer column, it is an alias for
NOT NULL AUTO_INCREMENT UNIQUE
.
With one exception, the default value specified in a
DEFAULT
clause must be a literal constant; it
cannot be a function or an expression. This means, for example,
that you cannot set the default for a date column to be the
value of a function such as NOW()
or CURRENT_DATE
. The exception is
that, for TIMESTAMP
and
DATETIME
columns, you can specify
CURRENT_TIMESTAMP
as the default.
See Section 11.2.6, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
The BLOB
and
TEXT
data types cannot be
assigned a default value.
If a data type specification includes no explicit
DEFAULT
value, MySQL determines the default
value as follows:
If the column can take NULL
as a value, the
column is defined with an explicit DEFAULT
NULL
clause.
If the column cannot take NULL
as a value,
MySQL defines the column with no explicit
DEFAULT
clause. Exception: If the column is
defined as part of a PRIMARY KEY
but not
explicitly as NOT NULL
, MySQL creates it as a
NOT NULL
column (because PRIMARY
KEY
columns must be NOT NULL
), but
also assigns it a DEFAULT
clause using the
implicit default value. To prevent this, include an explicit
NOT NULL
in the definition of any
PRIMARY KEY
column.
For data entry into a NOT NULL
column that
has no explicit DEFAULT
clause, if an
INSERT
or
REPLACE
statement includes no
value for the column, or an
UPDATE
statement sets the column
to NULL
, MySQL handles the column according
to the SQL mode in effect at the time:
If strict SQL mode is enabled, an error occurs for transactional tables and the statement is rolled back. For nontransactional tables, an error occurs, but if this happens for the second or subsequent row of a multiple-row statement, the preceding rows will have been inserted.
If strict mode is not enabled, MySQL sets the column to the implicit default value for the column data type.
Suppose that a table t
is defined as follows:
CREATE TABLE t (i INT NOT NULL);
In this case, i
has no explicit default, so
in strict mode each of the following statements produce an error
and no row is inserted. When not using strict mode, only the
third statement produces an error; the implicit default is
inserted for the first two statements, but the third fails
because DEFAULT(i)
cannot produce
a value:
INSERT INTO t VALUES(); INSERT INTO t VALUES(DEFAULT); INSERT INTO t VALUES(DEFAULT(i));
See Section 5.1.10, “Server SQL Modes”.
For a given table, the SHOW CREATE
TABLE
statement displays which columns have an
explicit DEFAULT
clause.
Implicit defaults are defined as follows:
For numeric types, the default is 0
, with
the exception that for integer or floating-point types
declared with the AUTO_INCREMENT
attribute, the default is the next value in the sequence.
For date and time types other than
TIMESTAMP
, the default is the
appropriate “zero” value for the type. This is
also true for TIMESTAMP
if
the
explicit_defaults_for_timestamp
system variable is enabled (see
Section 5.1.7, “Server System Variables”). Otherwise, for
the first TIMESTAMP
column in
a table, the default value is the current date and time. See
Section 11.2, “Date and Time Data Types”.
For string types other than
ENUM
, the default value is
the empty string. For ENUM
,
the default is the first enumeration value.
The storage requirements for table data on disk depend on several factors. Different storage engines represent data types and store raw data differently. Table data might be compressed, either for a column or an entire row, complicating the calculation of storage requirements for a table or column.
Despite differences in storage layout on disk, the internal MySQL APIs that communicate and exchange information about table rows use a consistent data structure that applies across all storage engines.
This section includes guidelines and information for the storage requirements for each data type supported by MySQL, including the internal format and size for storage engines that use a fixed-size representation for data types. Information is listed by category or storage engine.
The internal representation of a table has a maximum row size of
65,535 bytes, even if the storage engine is capable of supporting
larger rows. This figure excludes
BLOB
or
TEXT
columns, which contribute only
9 to 12 bytes toward this size. For
BLOB
and
TEXT
data, the information is
stored internally in a different area of memory than the row
buffer. Different storage engines handle the allocation and
storage of this data in different ways, according to the method
they use for handling the corresponding types. For more
information, see Chapter 15, Alternative Storage Engines, and
Section 8.4.7, “Limits on Table Column Count and Row Size”.
See Section 14.11, “InnoDB Row Formats” for information about
storage requirements for InnoDB
tables.
NDB
tables use
4-byte alignment; all
NDB
data storage is done in
multiples of 4 bytes. Thus, a column value that would
typically take 15 bytes requires 16 bytes in an
NDB
table. For example, in
NDB
tables, the
TINYINT
,
SMALLINT
,
MEDIUMINT
, and
INTEGER
(INT
) column types each require
4 bytes storage per record due to the alignment factor.
Each BIT(
column takes M
)M
bits of storage
space. Although an individual
BIT
column is
not 4-byte aligned,
NDB
reserves 4 bytes (32 bits)
per row for the first 1-32 bits needed for
BIT
columns, then another 4 bytes for bits
33-64, and so on.
While a NULL
itself does not require any
storage space, NDB
reserves 4
bytes per row if the table definition contains any columns
defined as NULL
, up to 32
NULL
columns. (If an NDB Cluster table is
defined with more than 32 NULL
columns up
to 64 NULL
columns, then 8 bytes per row
are reserved.)
Every table using the NDB
storage
engine requires a primary key; if you do not define a primary
key, a “hidden” primary key is created by
NDB
. This hidden primary key
consumes 31-35 bytes per table record.
You can use the ndb_size.pl Perl script to
estimate NDB
storage requirements.
It connects to a current MySQL (not NDB Cluster) database and
creates a report on how much space that database would require
if it used the NDB
storage engine.
See Section 18.4.27, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator” for
more information.
Data Type | Storage Required |
---|---|
TINYINT |
1 byte |
SMALLINT |
2 bytes |
MEDIUMINT |
3 bytes |
INT ,
INTEGER |
4 bytes |
BIGINT |
8 bytes |
FLOAT( |
4 bytes if 0 <= p <= 24, 8 bytes if 25
<= p <= 53 |
FLOAT |
4 bytes |
DOUBLE [PRECISION] ,
REAL |
8 bytes |
DECIMAL( ,
NUMERIC( |
Varies; see following discussion |
BIT( |
approximately (M +7)/8 bytes |
Values for DECIMAL
(and
NUMERIC
) columns are represented
using a binary format that packs nine decimal (base 10) digits
into four bytes. Storage for the integer and fractional parts of
each value are determined separately. Each multiple of nine
digits requires four bytes, and the “leftover”
digits require some fraction of four bytes. The storage required
for excess digits is given by the following table.
Leftover Digits | Number of Bytes |
---|---|
0 | 0 |
1 | 1 |
2 | 1 |
3 | 2 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 4 |
8 | 4 |
For TIME
,
DATETIME
, and
TIMESTAMP
columns, the storage
required for tables created before MySQL 5.6.4 differs from
tables created from 5.6.4 on. This is due to a change in 5.6.4
that permits these types to have a fractional part, which
requires from 0 to 3 bytes.
Data Type | Storage Required Before MySQL 5.6.4 | Storage Required as of MySQL 5.6.4 |
---|---|---|
YEAR |
1 byte | 1 byte |
DATE |
3 bytes | 3 bytes |
TIME |
3 bytes | 3 bytes + fractional seconds storage |
DATETIME |
8 bytes | 5 bytes + fractional seconds storage |
TIMESTAMP |
4 bytes | 4 bytes + fractional seconds storage |
As of MySQL 5.6.4, storage for
YEAR
and
DATE
remains unchanged. However,
TIME
,
DATETIME
, and
TIMESTAMP
are represented
differently. DATETIME
is packed
more efficiently, requiring 5 rather than 8 bytes for the
nonfractional part, and all three parts have a fractional part
that requires from 0 to 3 bytes, depending on the fractional
seconds precision of stored values.
Fractional Seconds Precision | Storage Required |
---|---|
0 | 0 bytes |
1, 2 | 1 byte |
3, 4 | 2 bytes |
5, 6 | 3 bytes |
For example, TIME(0)
,
TIME(2)
,
TIME(4)
, and
TIME(6)
use 3, 4, 5, and 6 bytes,
respectively. TIME
and
TIME(0)
are equivalent and
require the same storage.
For details about internal representation of temporal values, see MySQL Internals: Important Algorithms and Structures.
In the following table, M
represents
the declared column length in characters for nonbinary string
types and bytes for binary string types.
L
represents the actual length in
bytes of a given string value.
Data Type | Storage Required |
---|---|
CHAR( |
The compact family of InnoDB row formats optimize storage for
variable-length character sets. See
COMPACT Row Format Storage Characteristics.
Otherwise, M ×
w bytes, <=
255, where
w is the number of bytes
required for the maximum-length character in the character
set. |
BINARY( |
M bytes, 0 <=
255 |
VARCHAR( ,
VARBINARY( |
L + 1 bytes if column values require 0
− 255 bytes, L + 2 bytes
if values may require more than 255 bytes |
TINYBLOB ,
TINYTEXT |
L + 1 bytes, where
L <
28 |
BLOB , TEXT |
L + 2 bytes, where
L <
216 |
MEDIUMBLOB ,
MEDIUMTEXT |
L + 3 bytes, where
L <
224 |
LONGBLOB ,
LONGTEXT |
L + 4 bytes, where
L <
232 |
ENUM(' |
1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
SET(' |
1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum) |
Variable-length string types are stored using a length prefix
plus data. The length prefix requires from one to four bytes
depending on the data type, and the value of the prefix is
L
(the byte length of the string).
For example, storage for a
MEDIUMTEXT
value requires
L
bytes to store the value plus three
bytes to store the length of the value.
To calculate the number of bytes used to store a particular
CHAR
,
VARCHAR
, or
TEXT
column value, you must take
into account the character set used for that column and whether
the value contains multibyte characters. In particular, when
using a utf8
Unicode character set, you must
keep in mind that not all characters use the same number of
bytes. utf8mb3
and utf8mb4
character sets can require up to three and four bytes per
character, respectively. For a breakdown of the storage used for
different categories of utf8mb3
or
utf8mb4
characters, see
Section 10.9, “Unicode Support”.
VARCHAR
,
VARBINARY
, and the
BLOB
and
TEXT
types are variable-length
types. For each, the storage requirements depend on these
factors:
The actual length of the column value
The column's maximum possible length
The character set used for the column, because some character sets contain multibyte characters
For example, a VARCHAR(255)
column can hold a
string with a maximum length of 255 characters. Assuming that
the column uses the latin1
character set (one
byte per character), the actual storage required is the length
of the string (L
), plus one byte to
record the length of the string. For the string
'abcd'
, L
is 4 and
the storage requirement is five bytes. If the same column is
instead declared to use the ucs2
double-byte
character set, the storage requirement is 10 bytes: The length
of 'abcd'
is eight bytes and the column
requires two bytes to store lengths because the maximum length
is greater than 255 (up to 510 bytes).
The effective maximum number of bytes that
can be stored in a VARCHAR
or
VARBINARY
column is subject to
the maximum row size of 65,535 bytes, which is shared among all
columns. For a VARCHAR
column
that stores multibyte characters, the effective maximum number
of characters is less. For example,
utf8mb3
characters can require up to three
bytes per character, so a VARCHAR
column that uses the utf8mb3
character set
can be declared to be a maximum of 21,844 characters. See
Section 8.4.7, “Limits on Table Column Count and Row Size”.
InnoDB
encodes fixed-length fields greater
than or equal to 768 bytes in length as variable-length fields,
which can be stored off-page. For example, a
CHAR(255)
column can exceed 768 bytes if the
maximum byte length of the character set is greater than 3, as
it is with utf8mb4
.
The NDB
storage engine supports
variable-width columns. This means that a
VARCHAR
column in an NDB Cluster
table requires the same amount of storage as would any other
storage engine, with the exception that such values are 4-byte
aligned. Thus, the string 'abcd'
stored in a
VARCHAR(50)
column using the
latin1
character set requires 8 bytes (rather
than 5 bytes for the same column value in a
MyISAM
table).
TEXT
and
BLOB
columns are implemented
differently in the NDB
storage
engine, wherein each row in a
TEXT
column is made up of two
separate parts. One of these is of fixed size (256 bytes), and
is actually stored in the original table. The other consists of
any data in excess of 256 bytes, which is stored in a hidden
table. The rows in this second table are always 2000 bytes long.
This means that the size of a
TEXT
column is 256 if
size
<= 256 (where
size
represents the size of the row);
otherwise, the size is 256 +
size
+ (2000 ×
(size
− 256) % 2000).
The size of an ENUM
object is
determined by the number of different enumeration values. One
byte is used for enumerations with up to 255 possible values.
Two bytes are used for enumerations having between 256 and
65,535 possible values. See Section 11.3.5, “The ENUM Type”.
The size of a SET
object is
determined by the number of different set members. If the set
size is N
, the object occupies
(
bytes,
rounded up to 1, 2, 3, 4, or 8 bytes. A
N
+7)/8SET
can have a maximum of 64
members. See Section 11.3.6, “The SET Type”.
MySQL stores geometry values using 4 bytes to indicate the SRID
followed by the WKB representation of the value. The
LENGTH()
function returns the
space in bytes required for value storage.
For descriptions of WKB and internal storage formats for spatial values, see Section 11.4.3, “Supported Spatial Data Formats”.
For optimum storage, you should try to use the most precise type
in all cases. For example, if an integer column is used for values
in the range from 1
to
99999
, MEDIUMINT UNSIGNED
is
the best type. Of the types that represent all the required
values, this type uses the least amount of storage.
All basic calculations (+
,
-
, *
, and
/
) with DECIMAL
columns are done with precision of 65 decimal (base 10) digits.
See Section 11.1.1, “Numeric Data Type Syntax”.
If accuracy is not too important or if speed is the highest
priority, the DOUBLE
type may be
good enough. For high precision, you can always convert to a
fixed-point type stored in a
BIGINT
. This enables you to do all
calculations with 64-bit integers and then convert results back to
floating-point values as necessary.
PROCEDURE ANALYSE
can be used to obtain
suggestions for optimal column data types. For more information,
see Section 8.4.2.4, “Using PROCEDURE ANALYSE”.
To facilitate the use of code written for SQL implementations from other vendors, MySQL maps data types as shown in the following table. These mappings make it easier to import table definitions from other database systems into MySQL.
Other Vendor Type | MySQL Type |
---|---|
BOOL |
TINYINT |
BOOLEAN |
TINYINT |
CHARACTER VARYING( |
VARCHAR( |
FIXED |
DECIMAL |
FLOAT4 |
FLOAT |
FLOAT8 |
DOUBLE |
INT1 |
TINYINT |
INT2 |
SMALLINT |
INT3 |
MEDIUMINT |
INT4 |
INT |
INT8 |
BIGINT |
LONG VARBINARY |
MEDIUMBLOB |
LONG VARCHAR |
MEDIUMTEXT |
LONG |
MEDIUMTEXT |
MIDDLEINT |
MEDIUMINT |
NUMERIC |
DECIMAL |
Data type mapping occurs at table creation time, after which the
original type specifications are discarded. If you create a table
with types used by other vendors and then issue a
DESCRIBE
statement, MySQL reports the table structure using the equivalent
MySQL types. For example:
tbl_name
mysql>CREATE TABLE t (a BOOL, b FLOAT8, c LONG VARCHAR, d NUMERIC);
Query OK, 0 rows affected (0.00 sec) mysql>DESCRIBE t;
+-------+---------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+---------------+------+-----+---------+-------+ | a | tinyint(1) | YES | | NULL | | | b | double | YES | | NULL | | | c | mediumtext | YES | | NULL | | | d | decimal(10,0) | YES | | NULL | | +-------+---------------+------+-----+---------+-------+ 4 rows in set (0.01 sec)