JSON5 Formal Grammar

The complete formal grammar specification for JSON5 implementers and language enthusiasts.

TL;DR: This page contains the formal grammar specification for JSON5 in ABNF notation. It's primarily intended for implementers building JSON5 parsers or tools.

Grammar Notation

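This specification uses a modified ABNF (Augmented Backus-Naur Form) notation modeled on the grammar notation of the ECMAScript specification. It departs from RFC 5234 ABNF by writing alternation as "|" rather than "/" and repetition as a postfix "?", "*", or "+".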

Notation Conventions

Notation   Meaning                               Example
A B        A followed by B (concatenation)       { } means { then }
A | B      A or B (alternation)                  true | false
A?         Zero or one A (optional)              Sign? (optional sign)
A*         Zero or more A                        Digit* (any digits)
A+         One or more A                         Digit+ (at least one digit)
"text"     Literal text                          "null" (the word null)
[a-z]      Character range                       [0-9] (digits 0-9)
<NAME>     Unicode category or named character   <TAB> (tab character)

JSON5 Values

A JSON5 document consists of a single JSON5 value optionally surrounded by whitespace and comments.

Value Productions

JSON5Text :
    JSON5Value

JSON5Value :
      JSON5Null
    | JSON5Boolean
    | JSON5String
    | JSON5Number
    | JSON5Object
    | JSON5Array

JSON5Null :
    "null"

JSON5Boolean :
      "true"
    | "false"

Note: JSON5 allows any value type as the root element. Current JSON (RFC 8259) does as well; only the original RFC 4627 restricted the top level to an object or array.

Object Grammar

Objects are unordered collections of key-value pairs enclosed in curly braces.

Object Productions

JSON5Object :
      "{" "}"
    | "{" JSON5MemberList "}"
    | "{" JSON5MemberList "," "}"

JSON5MemberList :
      JSON5Member
    | JSON5MemberList "," JSON5Member

JSON5Member :
    JSON5MemberName ":" JSON5Value

JSON5MemberName :
      JSON5Identifier
    | JSON5String

Identifier Grammar

JSON5Identifier :
    IdentifierName but not ReservedWord

IdentifierName :
      IdentifierStart
    | IdentifierName IdentifierPart

IdentifierStart :
      UnicodeLetter
    | "$"
    | "_"
    | "\" UnicodeEscapeSequence

IdentifierPart :
      IdentifierStart
    | UnicodeCombiningMark
    | UnicodeDigit
    | UnicodeConnectorPunctuation
    | <ZWNJ>
    | <ZWJ>

UnicodeLetter :
    any character in Unicode categories "Lu", "Ll", "Lt", "Lm", "Lo", "Nl"

UnicodeDigit :
    any character in Unicode category "Nd"

UnicodeCombiningMark :
    any character in Unicode categories "Mn", "Mc"

UnicodeConnectorPunctuation :
    any character in Unicode category "Pc"
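
The identifier productions map fairly directly onto Unicode general categories. The following is a minimal sketch, not taken from any JSON5 library: the function names are hypothetical, the input is assumed to contain no "\" UnicodeEscapeSequence, and the ReservedWord exclusion is not applied. Category data comes from Python's unicodedata module, whose Unicode version may differ from the one a given parser targets.

```python
import unicodedata

# General categories named by UnicodeLetter in the grammar above.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Extra categories allowed by IdentifierPart: combining marks, digits,
# connector punctuation.
PART_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}
ZWNJ_ZWJ = "\u200c\u200d"


def is_identifier_start(ch: str) -> bool:
    # Escape sequences are out of scope for this sketch.
    return ch in "$_" or unicodedata.category(ch) in LETTER_CATEGORIES


def is_identifier_part(ch: str) -> bool:
    return (
        is_identifier_start(ch)
        or ch in ZWNJ_ZWJ
        or unicodedata.category(ch) in PART_CATEGORIES
    )


def is_identifier_name(text: str) -> bool:
    # IdentifierName: one IdentifierStart followed by any IdentifierPart.
    if not text:
        return False
    return is_identifier_start(text[0]) and all(
        is_identifier_part(ch) for ch in text[1:]
    )
```

A key such as foo_1 or $key passes this check, while 1abc and a-b do not and would have to be written as quoted strings.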

Array Grammar

Arrays are ordered sequences of values enclosed in square brackets.

Array Productions

JSON5Array :
      "[" "]"
    | "[" JSON5ElementList "]"
    | "[" JSON5ElementList "," "]"

JSON5ElementList :
      JSON5Value
    | JSON5ElementList "," JSON5Value
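
The three JSON5Array alternatives (empty, plain list, and list with exactly one trailing comma) can be illustrated with a small sketch. It assumes the input has already been split into tokens and treats every element token as opaque, so nested arrays and objects are not handled. The name parse_flat_array is hypothetical.

```python
def parse_flat_array(tokens):
    # tokens is a pre-split list such as ["[", "1", ",", "2", ",", "]"].
    if not tokens or tokens[0] != "[":
        raise ValueError("expected '['")
    elements = []
    i = 1
    while i < len(tokens) and tokens[i] != "]":
        if tokens[i] == ",":
            # Rejects "[,]" and "[1,,]": a comma must follow an element.
            raise ValueError("missing element before ','")
        elements.append(tokens[i])
        i += 1
        if i < len(tokens) and tokens[i] == ",":
            i += 1  # separator, or the single permitted trailing comma
        elif i < len(tokens) and tokens[i] != "]":
            raise ValueError("expected ',' or ']'")
    if i != len(tokens) - 1:
        raise ValueError("unterminated array or trailing tokens")
    return elements
```

The same loop shape applies to JSON5MemberList, with each element replaced by a name, a colon, and a value.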

String Grammar

Strings can use single or double quotes and support various escape sequences.

String Productions

JSON5String :
      '"' JSON5DoubleStringCharacters? '"'
    | "'" JSON5SingleStringCharacters? "'"

JSON5DoubleStringCharacters :
    JSON5DoubleStringCharacter JSON5DoubleStringCharacters?

JSON5SingleStringCharacters :
    JSON5SingleStringCharacter JSON5SingleStringCharacters?

JSON5DoubleStringCharacter :
      SourceCharacter but not '"' or "\" or LineTerminator
    | "\" EscapeSequence
    | LineContinuation
    | <LS>
    | <PS>

JSON5SingleStringCharacter :
      SourceCharacter but not "'" or "\" or LineTerminator
    | "\" EscapeSequence
    | LineContinuation
    | <LS>
    | <PS>

LineContinuation :
    "\" LineTerminatorSequence

Escape Sequences

EscapeSequence :
      CharacterEscapeSequence
    | "0" [lookahead not in DecimalDigit]
    | HexEscapeSequence
    | UnicodeEscapeSequence

CharacterEscapeSequence :
      SingleEscapeCharacter
    | NonEscapeCharacter

SingleEscapeCharacter :
    "'" | '"' | "\" | "b" | "f" | "n" | "r" | "t" | "v"

NonEscapeCharacter :
    SourceCharacter but not EscapeCharacter or LineTerminator

EscapeCharacter :
      SingleEscapeCharacter
    | DecimalDigit
    | "x"
    | "u"

HexEscapeSequence :
    "x" HexDigit HexDigit

UnicodeEscapeSequence :
    "u" HexDigit HexDigit HexDigit HexDigit

HexDigit :
    [0-9a-fA-F]
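
To make the escape rules concrete, here is a hedged sketch of a decoder for the text between the quotes of a JSON5 string. It assumes the surrounding quotes have already been removed and that unescaped characters were validated elsewhere. The function name is hypothetical, and surrogate pairs written as two "u" escapes are left as two separate code units rather than combined.

```python
SINGLE_ESCAPES = {
    "'": "'", '"': '"', "\\": "\\",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v",
}
LINE_TERMINATORS = "\n\r\u2028\u2029"
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_string_body(body: str) -> str:
    out = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # step over the backslash
        if i >= n:
            raise ValueError("dangling backslash")
        esc = body[i]
        if esc in SINGLE_ESCAPES:
            out.append(SINGLE_ESCAPES[esc])
            i += 1
        elif esc == "0":
            # "0" is only an escape when no decimal digit follows it.
            if i + 1 < n and body[i + 1] in "0123456789":
                raise ValueError("octal-style escape is not allowed")
            out.append("\0")
            i += 1
        elif esc in "xu":
            width = 2 if esc == "x" else 4
            digits = body[i + 1:i + 1 + width]
            if len(digits) != width or not set(digits) <= HEX_DIGITS:
                raise ValueError("malformed hex or unicode escape")
            out.append(chr(int(digits, 16)))
            i += 1 + width
        elif esc in LINE_TERMINATORS:
            # LineContinuation contributes nothing; CR LF is one sequence.
            i += 2 if esc == "\r" and body[i + 1:i + 2] == "\n" else 1
        elif esc in "123456789":
            raise ValueError("decimal digit escapes are not allowed")
        else:
            out.append(esc)  # NonEscapeCharacter stands for itself
            i += 1
    return "".join(out)
```

Note that a backslash before an ordinary character such as q simply yields that character, which follows from the NonEscapeCharacter production.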

Number Grammar

Numbers can be decimal or hexadecimal, and include special values like Infinity and NaN.

Number Productions

JSON5Number :
      JSON5NumericLiteral
    | "+" JSON5NumericLiteral
    | "-" JSON5NumericLiteral

JSON5NumericLiteral :
      NumericLiteral
    | "Infinity"
    | "NaN"

NumericLiteral :
      DecimalLiteral
    | HexIntegerLiteral

DecimalLiteral :
      DecimalIntegerLiteral "." DecimalDigits? ExponentPart?
    | "." DecimalDigits ExponentPart?
    | DecimalIntegerLiteral ExponentPart?

DecimalIntegerLiteral :
      "0"
    | NonZeroDigit DecimalDigits?

DecimalDigits :
      DecimalDigit
    | DecimalDigits DecimalDigit

DecimalDigit :
    [0-9]

NonZeroDigit :
    [1-9]

ExponentPart :
    ExponentIndicator SignedInteger

ExponentIndicator :
    "e" | "E"

SignedInteger :
      DecimalDigits
    | "+" DecimalDigits
    | "-" DecimalDigits

HexIntegerLiteral :
      "0x" HexDigit+
    | "0X" HexDigit+

Note: Unlike JSON, JSON5 allows leading decimal points (.5), trailing decimal points (5.), explicit positive signs (+5), and hexadecimal literals (0xFF).
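
The number productions are regular, so they can be sketched as a single regular expression. The code below is an illustration under stated assumptions, not a reference implementation: the names NUMBER_RE and parse_number are hypothetical, an optional sign is accepted before Infinity and NaN as in the JSON5 specification, decimal literals are returned as Python floats, and hexadecimal literals as ints.

```python
import math
import re

NUMBER_RE = re.compile(
    r"""
    (?P<sign>[+-])?
    (?:
        (?P<special>Infinity|NaN)
      | (?P<hex>0[xX][0-9a-fA-F]+)
      | (?P<dec>
            (?:0|[1-9][0-9]*)\.[0-9]*(?:[eE][+-]?[0-9]+)?
          | \.[0-9]+(?:[eE][+-]?[0-9]+)?
          | (?:0|[1-9][0-9]*)(?:[eE][+-]?[0-9]+)?
        )
    )
    """,
    re.VERBOSE,
)


def parse_number(text: str):
    match = NUMBER_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a JSON5 number: {text!r}")
    if match["special"] == "Infinity":
        value = math.inf
    elif match["special"] == "NaN":
        value = math.nan
    elif match["hex"]:
        value = int(match["hex"], 16)  # int() accepts the 0x prefix in base 16
    else:
        value = float(match["dec"])  # float() accepts both ".5" and "5."
    return -value if match["sign"] == "-" else value
```

Inputs such as 012 are rejected because DecimalIntegerLiteral allows a leading 0 only when it is the entire integer part.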

Lexical Grammar

The lexical grammar defines the whitespace, line terminators, and comments that may appear between tokens.

Whitespace

WhiteSpace :
      <TAB>    U+0009   Tab
    | <VT>     U+000B   Vertical Tab
    | <FF>     U+000C   Form Feed
    | <SP>     U+0020   Space
    | <NBSP>   U+00A0   No-Break Space
    | <BOM>    U+FEFF   Byte Order Mark
    | <USP>    Any Unicode "Zs" category character

Line Terminators

LineTerminator :
      <LF>   U+000A   Line Feed
    | <CR>   U+000D   Carriage Return
    | <LS>   U+2028   Line Separator
    | <PS>   U+2029   Paragraph Separator

LineTerminatorSequence :
      <LF>
    | <CR> [lookahead not in <LF>]
    | <LS>
    | <PS>
    | <CR> <LF>

Comments

Comment :
      MultiLineComment
    | SingleLineComment

MultiLineComment :
    "/*" MultiLineCommentChars? "*/"

MultiLineCommentChars :
      MultiLineNotAsteriskChar MultiLineCommentChars?
    | "*" PostAsteriskCommentChars?

PostAsteriskCommentChars :
      MultiLineNotForwardSlashOrAsteriskChar MultiLineCommentChars?
    | "*" PostAsteriskCommentChars?

MultiLineNotAsteriskChar :
    SourceCharacter but not "*"

MultiLineNotForwardSlashOrAsteriskChar :
    SourceCharacter but not "/" or "*"

SingleLineComment :
    "//" SingleLineCommentChars?

SingleLineCommentChars :
    SingleLineCommentChar SingleLineCommentChars?

SingleLineCommentChar :
    SourceCharacter but not LineTerminator
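
A tokenizer typically skips whitespace, line terminators, and comments in one loop before reading each token. The sketch below shows one way this might look. The names is_whitespace and skip_trivia are hypothetical, and the multi-line comment rule is implemented by searching for the first "*/", which accepts the same text as the recursive productions above.

```python
import unicodedata

LINE_TERMINATORS = "\n\r\u2028\u2029"
# TAB, VT, FF, SP, NBSP, BOM; other spaces are caught by the Zs check.
EXPLICIT_WHITESPACE = "\t\v\f \u00a0\ufeff"


def is_whitespace(ch: str) -> bool:
    return ch in EXPLICIT_WHITESPACE or unicodedata.category(ch) == "Zs"


def skip_trivia(text: str, pos: int = 0) -> int:
    # Return the index of the first character at or after pos that is not
    # whitespace, a line terminator, or part of a comment.
    n = len(text)
    while pos < n:
        ch = text[pos]
        if is_whitespace(ch) or ch in LINE_TERMINATORS:
            pos += 1
        elif text.startswith("//", pos):
            pos += 2
            # A single-line comment runs up to, not through, the terminator.
            while pos < n and text[pos] not in LINE_TERMINATORS:
                pos += 1
        elif text.startswith("/*", pos):
            end = text.find("*/", pos + 2)
            if end == -1:
                raise ValueError("unterminated multi-line comment")
            pos = end + 2
        else:
            break
    return pos
```

Calling skip_trivia before and after the root value is enough to honor the rule that a document is a single value optionally surrounded by whitespace and comments.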

Reserved Words

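Under this grammar, the following ECMAScript 5.1 reserved words cannot be used as unquoted object keys. The JSON5 specification itself defines JSON5Identifier as any IdentifierName, without the ReservedWord exclusion, so some parsers may accept them; quoting remains the portable choice: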

ReservedWord :
      Keyword
    | FutureReservedWord
    | NullLiteral
    | BooleanLiteral

Keyword :
      "break" | "case" | "catch" | "continue" | "debugger" | "default"
    | "delete" | "do" | "else" | "finally" | "for" | "function" | "if"
    | "in" | "instanceof" | "new" | "return" | "switch" | "this"
    | "throw" | "try" | "typeof" | "var" | "void" | "while" | "with"

FutureReservedWord :
      "class" | "const" | "enum" | "export" | "extends" | "import" | "super"

FutureReservedWord (strict mode) :
      "implements" | "interface" | "let" | "package" | "private"
    | "protected" | "public" | "static" | "yield"

NullLiteral :
    "null"

BooleanLiteral :
      "true"
    | "false"

Workaround: To use reserved words as object keys, simply quote them: { "class": "value" } or { 'class': 'value' }
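
A serializer can apply this workaround automatically. The following sketch uses the reserved-word list above, including the strict-mode entries, and assumes the key has already been checked to be a valid IdentifierName. The names RESERVED_WORDS and quote_if_reserved are hypothetical.

```python
import json

RESERVED_WORDS = frozenset(
    # Keyword
    "break case catch continue debugger default delete do else finally "
    "for function if in instanceof new return switch this throw try "
    "typeof var void while with "
    # FutureReservedWord, including the strict-mode entries
    "class const enum export extends import super "
    "implements interface let package private protected public static yield "
    # NullLiteral and BooleanLiteral
    "null true false".split()
)


def quote_if_reserved(key: str) -> str:
    # Emit the key in double quotes when it collides with a reserved word;
    # otherwise keep the unquoted identifier form.
    return json.dumps(key) if key in RESERVED_WORDS else key
```

With this helper, the key class is written as "class" while a key such as name stays unquoted.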

Building a JSON5 Parser?

Check out the reference implementation or explore language-specific libraries.