In general, what is the difference between the char and string data types?

This topic describes the string/text data types, including binary strings, supported in Snowflake, along with the supported formats for string constants/literals.


Data Types for Text Strings¶

Snowflake supports the following data types for text (i.e. character) strings.

VARCHAR¶

VARCHAR holds Unicode UTF-8 characters.

When you declare a column of type VARCHAR, you can specify an optional parameter (N), which is the maximum number of characters to store. For example:

create table t1 (v varchar(16777216));

If no length is specified, the default is the maximum allowed length (16,777,216).

Although a VARCHAR’s maximum length is specified in characters, a VARCHAR is also limited to a maximum of 16,777,216 bytes (16 MB). The maximum number of Unicode characters that can be stored in a VARCHAR column is shown below:

Single-byte: 16,777,216.

Multi-byte: Between 8,388,608 (2 bytes per character) and 4,194,304 (4 bytes per character).

For example, if you declare a column as VARCHAR(16777216), the column can hold a maximum of 8,388,608 2-byte Unicode characters, even though you specified a maximum length of 16777216.
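The character limits above follow directly from the width of each character in UTF-8. The following Python sketch only illustrates that arithmetic; it is not Snowflake code, and the names `MAX_VARCHAR_BYTES` and `max_chars` are hypothetical.

```python
MAX_VARCHAR_BYTES = 16_777_216  # the 16 MB byte limit stated above

def max_chars(sample: str) -> int:
    """Return how many copies of a one-character sample fit in the byte limit."""
    width = len(sample.encode("utf-8"))  # bytes this character needs in UTF-8
    return MAX_VARCHAR_BYTES // width

print(max_chars("a"))           # 16777216 (1 byte per character)
print(max_chars("\u00e9"))      # 8388608  (é, 2 bytes per character)
print(max_chars("\U0001F600"))  # 4194304  (emoji, 4 bytes per character)
```

A 3-byte character such as `"\u26c4"` falls between the 2-byte and 4-byte limits.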

A column consumes storage for only the amount of actual data stored. For example, a 1-character string in a VARCHAR(16777216) column consumes the storage of only a single character.

Note that in some systems, data types such as CHAR and VARCHAR store ASCII while data types such as NCHAR and NVARCHAR store Unicode. In Snowflake, VARCHAR and all other string data types store Unicode UTF-8 characters. There is no difference with respect to Unicode handling between CHAR and NCHAR data types. Synonyms such as NCHAR are primarily for syntax compatibility when porting DDL commands to Snowflake.

There is no performance difference between using the full-length VARCHAR declaration VARCHAR(16777216) or a smaller length. Note that in any relational database, SELECT statements in which a WHERE clause references VARCHAR columns or string columns are not as fast as SELECT statements filtered using a date or numeric column condition.

Some BI/ETL tools define the maximum size of the VARCHAR data in storage or in memory. If you know the maximum size for a column, you could limit the size when you add the column.

CHAR, CHARACTER, NCHAR¶

Synonymous with VARCHAR, except that if the length is not specified, CHAR(1) is the default.

Note

Snowflake currently deviates from common CHAR semantics in that strings shorter than the maximum length are not space-padded at the end.

STRING, TEXT, NVARCHAR, NVARCHAR2, CHAR VARYING, NCHAR VARYING¶

Synonymous with VARCHAR.

String Examples in Table Columns¶

CREATE OR REPLACE TABLE test_text(v VARCHAR,
                                  v50 VARCHAR(50),
                                  c CHAR,
                                  c10 CHAR(10),
                                  s STRING,
                                  s20 STRING(20),
                                  t TEXT,
                                  t30 TEXT(30)
                                  );

DESC TABLE test_text;

+------+-------------------+--------+-------+---------+-------------+------------+-------+------------+---------+
| name | type              | kind   | null? | default | primary key | unique key | check | expression | comment |
|------+-------------------+--------+-------+---------+-------------+------------+-------+------------+---------|
| V    | VARCHAR(16777216) | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| V50  | VARCHAR(50)       | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| C    | VARCHAR(1)        | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| C10  | VARCHAR(10)       | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| S    | VARCHAR(16777216) | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| S20  | VARCHAR(20)       | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| T    | VARCHAR(16777216) | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| T30  | VARCHAR(30)       | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
+------+-------------------+--------+-------+---------+-------------+------------+-------+------------+---------+

Data Types for Binary Strings¶

Snowflake supports the following data types for binary strings.

BINARY¶

The maximum length is 8 MB (8,388,608 bytes). Unlike VARCHAR, the BINARY data type has no notion of Unicode characters, so the length is always measured in terms of bytes.

(BINARY values are limited to 8 MB so that they fit within 16 MB when converted to hexadecimal strings, e.g. via TO_CHAR(<binary_expression>, 'HEX').)
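The sizing argument above rests on hexadecimal text needing two characters per byte. A minimal Python check of that arithmetic follows; it is an illustration only, and the constant names and the `sample` value are hypothetical.

```python
MAX_BINARY_BYTES = 8 * 1024 * 1024     # 8,388,608 bytes, the BINARY limit
MAX_VARCHAR_BYTES = 16 * 1024 * 1024   # 16,777,216 bytes, the VARCHAR limit

sample = bytes([0x00, 0x7F, 0xFF])     # three arbitrary bytes
hex_text = sample.hex().upper()        # two hexadecimal characters per byte

print(hex_text)                                 # 007FFF
print(len(hex_text) == 2 * len(sample))         # True
print(2 * MAX_BINARY_BYTES == MAX_VARCHAR_BYTES)  # True
```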

If a length is not specified, the default is the maximum length.

VARBINARY¶

VARBINARY is synonymous with BINARY.

Internal Representation¶

The BINARY data type holds a sequence of 8-bit bytes.

When Snowflake displays BINARY data values, Snowflake often represents each byte as 2 hexadecimal characters. For example, the word “HELP” might be displayed as 48454C50, where “48” is the hexadecimal equivalent of the ASCII (Unicode) letter “H”, “45” is the hexadecimal representation of the letter “E”, etc.
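The "HELP" example can be reproduced outside Snowflake. This Python sketch assumes ASCII input and only illustrates the byte-to-hex mapping; it says nothing about how Snowflake stores or formats the value internally.

```python
word = "HELP"
raw = word.encode("ascii")                    # one byte per ASCII character
hex_pairs = [f"{byte:02X}" for byte in raw]   # two hex digits per byte

print(hex_pairs)            # ['48', '45', '4C', '50']
print("".join(hex_pairs))   # 48454C50

# Decoding the hexadecimal text recovers the original word.
round_trip = bytes.fromhex("48454C50").decode("ascii")
print(round_trip)           # HELP
```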

For more information about entering and displaying BINARY data, see: Binary Input and Output.

Binary Examples in Table Columns¶

CREATE OR REPLACE TABLE test_binary(b BINARY,
                                    b100 BINARY(100),
                                    vb VARBINARY
                                    );

DESC TABLE test_binary;

+------+-----------------+--------+-------+---------+-------------+------------+-------+------------+---------+
| name | type            | kind   | null? | default | primary key | unique key | check | expression | comment |
|------+-----------------+--------+-------+---------+-------------+------------+-------+------------+---------|
| B    | BINARY(8388608) | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| B100 | BINARY(100)     | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
| VB   | BINARY(8388608) | COLUMN | Y     | NULL    | N           | N          | NULL  | NULL       | NULL    |
+------+-----------------+--------+-------+---------+-------------+------------+-------+------------+---------+

String Constants¶

Constants (also known as literals) refer to fixed data values. String constants in Snowflake must always be enclosed between delimiter characters. Snowflake supports using either of the following to delimit string constants:

  • Single quotes.

  • Pairs of dollar signs.

Single-Quoted String Constants¶

A string constant can be enclosed between single quote delimiters (e.g. 'This is a string'). To include a single quote character within a string constant, type two adjacent single quotes (e.g. '').

For example:

SELECT 'Today''s sales projections', '-''''-';

+------------------------------+----------+
| 'TODAY''S SALES PROJECTIONS' | '-''''-' |
|------------------------------+----------|
| Today's sales projections    | -''-     |
+------------------------------+----------+
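The quote-doubling rule is easy to apply mechanically. The Python sketch below uses two hypothetical helpers, `quote` and `unquote`, and handles only doubled single quotes; it assumes the text contains no backslashes, which a single-quoted constant would treat as the start of an escape sequence.

```python
def quote(text: str) -> str:
    """Wrap text in single quotes, doubling any single quote inside it."""
    return "'" + text.replace("'", "''") + "'"

def unquote(literal: str) -> str:
    """Reverse quote(): strip the delimiters and collapse doubled quotes."""
    body = literal[1:-1]
    return body.replace("''", "'")

print(quote("Today's sales projections"))  # 'Today''s sales projections'
print(unquote("'-''''-'"))                 # -''-
```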

Note

Two adjacent single quotes are not the same as the double quote character ("), which is used (as needed) for delimiting object identifiers. For more information, see Identifier Requirements.

Escape Sequences in Single-Quoted String Constants¶

To include a single quote or other special characters (e.g. newlines) in a single-quoted string constant, you must escape these characters by using backslash escape sequences. A backslash escape sequence is a sequence of characters that begins with a backslash (\).

Note

If the string contains many single quotes, backslashes, or other special characters, you can use a dollar-quoted string constant instead to avoid escaping these characters.

You can also use escape sequences to insert ASCII characters by specifying their code points (the numeric values that correspond to those characters) in octal or hexadecimal. For example, in ASCII, the code point for the space character is 32, which is 20 in hexadecimal. To specify a space, you can use the hexadecimal escape sequence \x20.

You can also use escape sequences to insert Unicode characters, for example \u26c4.
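To find the digits to put in an escape sequence, look up the character's code point and print it in the required base. The Python lines below are only a lookup aid, not Snowflake code; Python string literals happen to accept the same `\xhh`, `\ooo`, and `\uhhhh` forms, which makes the result easy to check.

```python
space = " "
print(ord(space))              # 32   (decimal code point)
print(f"{ord(space):02x}")     # 20   (hexadecimal, for \x20)
print(f"{ord('!'):03o}")       # 041  (octal, for \041)
print(f"{0x26C4:04x}")         # 26c4 (four hex digits, for \u26c4)

# The escaped forms denote the same characters as the code points.
print("\x20" == chr(32))       # True
print("\041" == chr(33))       # True
print("\u26c4" == chr(0x26C4)) # True
```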

The following table lists the supported escape sequences in four categories: simple, octal, hexadecimal, and Unicode:

Escape Sequence    Character Represented

Simple Escape Sequences
\'                 A single quote (') character
\"                 A double quote (") character
\\                 A backslash (\) character
\b                 A backspace character
\f                 A formfeed character
\n                 A newline (linefeed) character
\r                 A carriage return character
\t                 A tab character
\0                 An ASCII NUL character

Octal Escape Sequences
\ooo               ASCII character in octal notation (i.e., where each o represents an octal digit).

Hexadecimal Escape Sequences
\xhh               ASCII character in hexadecimal notation (i.e., where each h represents a hexadecimal digit).

Unicode Escape Sequences
\uhhhh             Unicode character in hexadecimal notation (i.e., where each h represents a hexadecimal digit). The number of hexadecimal digits must be exactly 4.

As shown in the table above, if a string constant must include a backslash character (e.g. C:\ in a Windows path or \d in a regular expression), you must escape the backslash with a second backslash. For example, to include \d in a regular expression in a string constant, you must use \\d.

Note that if a backslash is used in sequences other than the ones listed above, the backslash is ignored. For example, the sequence of characters '\z' is interpreted as 'z'.
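The rules above (the four escape categories, plus dropping the backslash in any unrecognized sequence) can be modeled in a few lines. The Python sketch below is an illustration of those rules, not Snowflake's parser. The names `SIMPLE`, `ESCAPE`, and `decode_escapes` are hypothetical, the sketch assumes octal escapes always have exactly three digits, and it does not handle doubled single quotes.

```python
import re

# Simple escape sequences from the table above.
SIMPLE = {"'": "'", '"': '"', "\\": "\\", "b": "\b", "f": "\f",
          "n": "\n", "r": "\r", "t": "\t", "0": "\0"}

# Try \uhhhh, \xhh and \ooo first, then fall back to any single character.
ESCAPE = re.compile(
    r"\\(?:u([0-9A-Fa-f]{4})|x([0-9A-Fa-f]{2})|([0-7]{3})|(.))",
    re.DOTALL,
)

def decode_escapes(body: str) -> str:
    """Decode the body of a single-quoted constant (delimiters removed)."""
    def replace(match: re.Match) -> str:
        uni, hexa, octal, single = match.groups()
        if uni is not None:
            return chr(int(uni, 16))
        if hexa is not None:
            return chr(int(hexa, 16))
        if octal is not None:
            return chr(int(octal, 8))
        # Known simple escape; otherwise drop the backslash, keep the character.
        return SIMPLE.get(single, single)
    return ESCAPE.sub(replace, body)

print(decode_escapes(r"C:\\user"))   # C:\user
print(decode_escapes(r"-\041-"))     # -!-
print(decode_escapes(r"\z"))         # z
```

Raw string literals (`r"..."`) are used for the inputs so that Python itself does not interpret the backslashes before the function sees them.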

The following example demonstrates how to use backslash escape sequences. This includes examples of specifying:

  • a tab character

  • a newline

  • a backslash

  • the octal and hexadecimal escape sequences for an exclamation mark (code point 33, which is \041 in octal and \x21 in hexadecimal)

  • the Unicode escape sequence for a small image of a snowman

  • something that is not a valid escape sequence

SELECT $1, $2 FROM
VALUES
('Tab','Hello\tWorld'),
('Newline','Hello\nWorld'),
('Backslash','C:\\user'),
('Octal','-\041-'),
('Hexadecimal','-\x21-'),
('Unicode','-\u26c4-'),
('Not an escape sequence', '\z')
;

+------------------------+---------------+
| $1                     | $2            |
|------------------------+---------------|
| Tab                    | Hello   World |
| Newline                | Hello         |
|                        | World         |
| Backslash              | C:\user       |
| Octal                  | -!-           |
| Hexadecimal            | -!-           |
| Unicode                | -⛄-           |
| Not an escape sequence | z             |
+------------------------+---------------+

Dollar-Quoted String Constants¶

In some cases, you might need to specify a string constant that contains:

  • Single quote characters.

  • Backslash characters (e.g. in a regular expression).

  • Newline characters (e.g. in the body of a stored procedure or function that you specify in CREATE PROCEDURE or CREATE FUNCTION).

In these cases, you can avoid escaping these characters by using a pair of dollar signs ($$) rather than a single quote (') to delimit the beginning and ending of the string.

In a dollar-quoted string constant, you can include quotes, backslashes, newlines and any other special character (except for double-dollar signs) without escaping those characters. The content of a dollar-quoted string constant is always interpreted literally.

The following examples are equivalent ways of specifying string constants:

Example Using Single Quote Delimiters                           Example Using Double Dollar Sign Delimiters

'string with a \' character'                                    $$string with a ' character$$

'regular expression with \\ characters: \\d{2}-\\d{3}-\\d{4}'   $$regular expression with \ characters: \d{2}-\d{3}-\d{4}$$

'string with a newline\ncharacter'                              $$string with a newline
                                                                character$$
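The two delimiter styles can be generated from the same underlying text. The Python sketch below uses two hypothetical helpers, `single_quoted` and `dollar_quoted`. It escapes only the characters handled in the code, and rejecting a trailing `$` in the dollar-quoted form is a cautious assumption rather than a documented rule.

```python
def single_quoted(text: str) -> str:
    """Render text as a single-quoted constant using backslash escapes."""
    escaped = (text.replace("\\", "\\\\")   # backslash first, so later escapes are not doubled
                   .replace("'", "\\'")
                   .replace("\n", "\\n")
                   .replace("\r", "\\r")
                   .replace("\t", "\\t"))
    return "'" + escaped + "'"

def dollar_quoted(text: str) -> str:
    """Render text as a dollar-quoted constant; the content is taken literally."""
    if "$$" in text or text.endswith("$"):
        raise ValueError("text would collide with the $$ delimiter")
    return "$$" + text + "$$"

pattern = r"\d{2}-\d{3}-\d{4}"
print(single_quoted(pattern))   # '\\d{2}-\\d{3}-\\d{4}'
print(dollar_quoted(pattern))   # $$\d{2}-\d{3}-\d{4}$$
```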

The following example uses a dollar-quoted string constant that contains newlines and several escape sequences.

SELECT $1, $2 FROM VALUES ('row1', $$a
                                            ' \ \t
                                            \x21 z $ $$);

+------+-------------------------------------------------------+
| $1   | $2                                                    |
|------+-------------------------------------------------------|
| row1 | a                                                     |
|      |                                             ' \ \t    |
|      |                                             \x21 z $  |
+------+-------------------------------------------------------+

In this example, note how the escape sequences are interpreted as their individual characters (e.g. a backslash followed by a “t”), rather than as escape sequences.

What is the difference between char and string data type?

The main difference between a character and a string is that a character is a single letter, digit, space, punctuation mark, or symbol that a computer can represent, while a string is a sequence of characters. In C, the char data type stores a single character, and a string is stored as an array of char values terminated by a null character.

What are the differences between character data type and string data type with example?

Difference between String and Character array in Java.

What is the difference between char and string data types in Java?

A char is a single character, which is a 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive). On the other hand, a string consists of a sequence of characters (zero or more characters).

What is the difference between the char data type and the VARCHAR data types and when should each be used?

Differences between CHAR and VARCHAR: The CHAR data type is used to store character strings of fixed length and uses static memory allocation. The VARCHAR data type is used to store character strings of variable length.