This document defines the complete syntax and semantics of cspaced, an indentation-based dialect of the C programming language. This specification is comprehensive enough to enable implementation of parsers, syntax highlighters, and language servers.
- Overview
- Lexical Structure
- Syntax Grammar
- Expressions
- Statements
- Declarations
- Preprocessor
- Tree-Sitter Grammar Notes
cspaced is a transpiler that converts clean, indentation-based C syntax to traditional brace-based C. Key design principles:
- Significant Indentation: Uses 2-space indentation instead of braces
- Optional Semicolons: Inferred from newlines (except in some complex expressions)
- Full C Compatibility: All C features supported, zero runtime overhead
- Line-Oriented: Each statement/declaration typically occupies one line
// cspaced syntax
#include <stdio.h>

int factorial(int n):
  if (n <= 1):
    return 1
  else:
    return n * factorial(n - 1)

int main(void):
  printf("Hello World!\n")
  for (int i = 0; i < 10; i++):
    printf("factorial(%d) = %d\n", i, factorial(i))
  return 0
The transpiler generates equivalent C code with braces and semicolons.
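As a rough illustration of that conversion, the sketch below turns indentation into braces and appends semicolons. It is not the cspaced implementation: the function name `to_braces` is hypothetical, and the sketch assumes 2-space indentation and ignores comments, string literals, `case` labels, and line continuations.

```python
def to_braces(source: str) -> str:
    """Convert indentation-based blocks to brace-based C (sketch only)."""
    out = []
    depth = 0  # number of blocks currently open
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure
        stripped = raw.lstrip(" ")
        level = (len(raw) - len(stripped)) // 2  # assumes 2 spaces per level
        # Close every block that the new, shallower indentation ends.
        while depth > level:
            depth -= 1
            out.append("  " * depth + "}")
        pad = "  " * level
        if stripped.startswith("#"):
            out.append(stripped)  # preprocessor lines pass through unchanged
        elif stripped.endswith(":"):
            out.append(pad + stripped[:-1] + " {")  # block opener
            depth = level + 1
        elif stripped.endswith(";"):
            out.append(pad + stripped)  # explicit semicolon already present
        else:
            out.append(pad + stripped + ";")  # inferred semicolon
    # Close any blocks still open at end of input.
    while depth > 0:
        depth -= 1
        out.append("  " * depth + "}")
    return "\n".join(out)
```

Calling `to_braces` on a two-level function body yields the same statements wrapped in `{` and `}`, with one closing brace emitted per level of dedent.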
- UTF-8 encoded source files
- Whitespace: space (U+0020), tab (U+0009), newline (U+000A)
- Significant whitespace: indentation level determines block structure
auto break case char const
continue default do double else
enum extern float for goto
if inline int long register
restrict return short signed sizeof
static struct switch typedef union
unsigned void volatile while _Alignas
_Alignof _Atomic _Bool _Complex _Generic
_Imaginary _Noreturn _Static_assert _Thread_local
+ - * / % ++ -- ~
& | ^ << >> && ||
== != < > <= >= ? :
= += -= *= /= %= &= |=
^= <<= >>= .
-> !
( ) [ ] { } ; ,
. -> : ::
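Several of these tokens are prefixes of others (`<`, `<<`, `<<=`), so a lexer normally takes the longest match at each position. The following is a minimal sketch of that rule over the operators and punctuators listed above; `OPERATORS` and `lex_operators` are hypothetical names, and identifiers and literals are deliberately out of scope.

```python
# Operators and punctuators from the tables above, sorted longest first so
# that "<<=" is tried before "<<" and "<".
OPERATORS = sorted(
    ["+", "-", "*", "/", "%", "++", "--", "~", "&", "|", "^", "<<", ">>",
     "&&", "||", "==", "!=", "<", ">", "<=", ">=", "?", ":", "=", "+=",
     "-=", "*=", "/=", "%=", "&=", "|=", "^=", "<<=", ">>=", ".", "->",
     "!", "(", ")", "[", "]", "{", "}", ";", ",", "::"],
    key=len,
    reverse=True,
)


def lex_operators(text: str) -> list[str]:
    """Split a string of operators into tokens using longest match."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for op in OPERATORS:
            if text.startswith(op, i):
                tokens.append(op)
                i += len(op)
                break
        else:
            # No operator starts here (for example a letter or digit).
            raise ValueError(f"unexpected character {text[i]!r} at offset {i}")
    return tokens
```

With this ordering, `+++` lexes as `++` followed by `+`, which matches how C compilers tokenize the same input.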
integer-literal: decimal | octal | hexadecimal | binary
floating-literal: decimal-floating | hexadecimal-floating
character-literal: 'c-char-sequence'
string-literal: "s-char-sequence"
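The four integer-literal forms can be told apart by their prefix. The sketch below shows one way a highlighter might classify them; the name `classify_integer` and the exact suffix pattern are assumptions made for illustration, not part of the specification.

```python
import re

# Optional unsigned/long suffixes such as u, UL, ll (an assumption based on
# standard C; the grammar above does not spell suffixes out).
_SUFFIX = r"(?:[uU](?:ll|LL|[lL])?|(?:ll|LL|[lL])[uU]?)?"

_INTEGER_FORMS = [
    ("hexadecimal", re.compile(r"0[xX][0-9a-fA-F]+" + _SUFFIX)),
    ("binary", re.compile(r"0[bB][01]+" + _SUFFIX)),
    ("octal", re.compile(r"0[0-7]*" + _SUFFIX)),
    ("decimal", re.compile(r"[1-9][0-9]*" + _SUFFIX)),
]


def classify_integer(text):
    """Return the integer-literal form of text, or None if it is not one."""
    for name, pattern in _INTEGER_FORMS:
        if pattern.fullmatch(text):
            return name
    return None
```

Note that a bare `0` is classified as octal, as in standard C, and that `09` is rejected because 9 is not an octal digit.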
- Indentation: Leading spaces at line start determine indent level
- Newline Handling: Newlines terminate statements unless in continuations
- Semicolon Inference: Added automatically unless explicitly present
- Comment Handling: // and /* */ comments are preserved
indent-level ::= (space space)* // exactly 2 spaces per level
block ::= ':' newline indent statements dedent
// Examples:
if (condition): // level 0
  statement1 // level 1
  if (nested): // level 1
    statement2 // level 2
  statement3 // level 1
function(args): // level 0
  statement // level 1
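The `indent-level` rule can be checked line by line. Here is a small sketch of such a check; the function name `indent_level` is hypothetical, and rejecting tabs in indentation is an assumption of this sketch, since the specification only says that each level is exactly 2 spaces.

```python
def indent_level(line: str) -> int:
    """Return the nesting level of a line, assuming 2 spaces per level."""
    stripped = line.lstrip(" ")
    if stripped.startswith("\t"):
        # Assumption: tabs are not accepted as indentation.
        raise ValueError("tabs are not allowed in indentation")
    width = len(line) - len(stripped)
    if width % 2 != 0:
        raise ValueError(f"indentation of {width} spaces is not a multiple of 2")
    return width // 2
```

For the example above, `statement2` sits behind 4 spaces and therefore reports level 2.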
translation-unit ::= external-declaration*
external-declaration ::= function-definition
| declaration
| ';'
declaration ::= declaration-specifiers init-declarator-list? ';'
declaration-specifiers ::= storage-class-specifier declaration-specifiers?
| type-specifier declaration-specifiers?
| type-qualifier declaration-specifiers?
| function-specifier declaration-specifiers?
| alignment-specifier declaration-specifiers?
init-declarator-list ::= init-declarator (',' init-declarator)*
init-declarator ::= declarator ('=' initializer)?
declarator ::= pointer? direct-declarator
direct-declarator ::= identifier
| '(' declarator ')'
| direct-declarator '[' type-qualifier-list? assignment-expression? ']'
| direct-declarator '[' 'static' type-qualifier-list? assignment-expression ']'
| direct-declarator '[' type-qualifier-list 'static' assignment-expression ']'
| direct-declarator '[' type-qualifier-list? '*' ']'
| direct-declarator '(' parameter-type-list ')'
| direct-declarator '(' identifier-list? ')'
pointer ::= '*' type-qualifier-list?
type-qualifier-list ::= type-qualifier+
parameter-type-list ::= parameter-list (',' '...')?
parameter-list ::= parameter-declaration (',' parameter-declaration)*
parameter-declaration ::= declaration-specifiers declarator
| declaration-specifiers abstract-declarator?
type-name ::= specifier-qualifier-list abstract-declarator?
abstract-declarator ::= pointer
| pointer? direct-abstract-declarator
direct-abstract-declarator ::= '(' abstract-declarator ')'
| direct-abstract-declarator? '[' type-qualifier-list? assignment-expression? ']'
| direct-abstract-declarator? '[' 'static' type-qualifier-list? assignment-expression ']'
| direct-abstract-declarator? '[' type-qualifier-list 'static' assignment-expression ']'
| direct-abstract-declarator? '[' '*' ']'
| direct-abstract-declarator? '(' parameter-type-list? ')'
initializer ::= assignment-expression
| '{' initializer-list ','? '}'
initializer-list ::= designation? initializer (',' designation? initializer)*
designation ::= designator+
designator ::= '[' constant-expression ']'
| '.' identifier
// Storage class specifiers
storage-class-specifier ::= 'auto' | 'register' | 'static' | 'extern' | 'typedef' | '_Thread_local'
// Type specifiers
type-specifier ::= 'void' | 'char' | 'short' | 'int' | 'long' | 'float' | 'double'
| 'signed' | 'unsigned' | '_Bool' | '_Complex' | '_Imaginary'
| 'struct' struct-specifier | 'union' union-specifier | 'enum' enum-specifier
| typedef-name
struct-specifier ::= identifier? '{' struct-declaration+ '}'
| identifier
union-specifier ::= identifier? '{' struct-declaration+ '}'
| identifier
struct-declaration ::= specifier-qualifier-list struct-declarator-list ';'
struct-declarator-list ::= struct-declarator (',' struct-declarator)*
struct-declarator ::= declarator? ':' constant-expression
| declarator
enum-specifier ::= identifier? '{' enumerator-list ','? '}'
| identifier
enumerator-list ::= enumerator (',' enumerator)*
enumerator ::= enumeration-constant ('=' constant-expression)?
enumeration-constant ::= identifier
// Type qualifiers
type-qualifier ::= 'const' | 'restrict' | 'volatile' | '_Atomic'
// Function specifiers
function-specifier ::= 'inline' | '_Noreturn'
// Alignment specifier
alignment-specifier ::= '_Alignas' '(' type-name ')'
| '_Alignas' '(' constant-expression ')'
function-definition ::= declaration-specifiers declarator declaration-list? compound-statement
declaration-list ::= declaration+
// cspaced specific: compound-statement uses indentation
compound-statement ::= ':' newline indent statement-list dedent
statement-list ::= (declaration | statement)*
// In traditional C: compound-statement ::= '{' block-item-list? '}'
// block-item-list ::= block-item+
// block-item ::= declaration | statement
statement ::= labeled-statement
| compound-statement // indentation-based blocks
| expression-statement
| selection-statement
| iteration-statement
| jump-statement
labeled-statement ::= identifier ':' statement
| 'case' constant-expression ':' statement
| 'default' ':' statement
// cspaced extends selection with indentation
selection-statement ::= 'if' '(' expression ')' compound-statement ('else' compound-statement)?
| 'switch' '(' expression ')' compound-statement
// cspaced extends iteration with indentation
iteration-statement ::= 'while' '(' expression ')' compound-statement
| 'do' compound-statement 'while' '(' expression ')' ';'
| 'for' '(' for-init ';' expression? ';' expression? ')' compound-statement
for-init ::= declaration
| expression?
jump-statement ::= 'goto' identifier ';'
| 'continue' ';'
| 'break' ';'
| 'return' expression? ';'
// cspaced: semicolons optional in jump statements
jump-statement ::= 'goto' identifier
| 'continue'
| 'break'
| 'return' expression?
expression ::= assignment-expression (',' assignment-expression)*
assignment-expression ::= conditional-expression
| unary-expression assignment-operator assignment-expression
assignment-operator ::= '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
conditional-expression ::= logical-or-expression ('?' expression ':' conditional-expression)?
logical-or-expression ::= logical-and-expression ('||' logical-and-expression)*
logical-and-expression ::= inclusive-or-expression ('&&' inclusive-or-expression)*
inclusive-or-expression ::= exclusive-or-expression ('|' exclusive-or-expression)*
exclusive-or-expression ::= and-expression ('^' and-expression)*
and-expression ::= equality-expression ('&' equality-expression)*
equality-expression ::= relational-expression (('==' | '!=') relational-expression)*
relational-expression ::= shift-expression (('<' | '>' | '<=' | '>=') shift-expression)*
shift-expression ::= additive-expression (('<<' | '>>') additive-expression)*
additive-expression ::= multiplicative-expression (('+' | '-') multiplicative-expression)*
multiplicative-expression ::= cast-expression (('*' | '/' | '%') cast-expression)*
cast-expression ::= '(' type-name ')' cast-expression
| unary-expression
unary-expression ::= postfix-expression
| ('++' | '--') unary-expression
| unary-operator cast-expression
| 'sizeof' '(' type-name ')'
| 'sizeof' unary-expression
| '_Alignof' '(' type-name ')'
unary-operator ::= '&' | '*' | '+' | '-' | '~' | '!'
postfix-expression ::= primary-expression
| postfix-expression '[' expression ']'
| postfix-expression '(' argument-expression-list? ')'
| postfix-expression '.' identifier
| postfix-expression '->' identifier
| postfix-expression ('++' | '--')
| '(' type-name ')' '{' initializer-list ','? '}'
primary-expression ::= identifier
| constant
| string-literal
| '(' expression ')'
| generic-selection
generic-selection ::= '_Generic' '(' assignment-expression ',' generic-assoc-list ')'
generic-assoc-list ::= generic-association (',' generic-association)*
generic-association ::= type-name ':' assignment-expression
| 'default' ':' assignment-expression
constant ::= integer-constant | floating-constant | enumeration-constant | character-constant
argument-expression-list ::= assignment-expression (',' assignment-expression)*
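The chain of rules from `logical-or-expression` down to `multiplicative-expression` encodes operator precedence, with every level left-associative. One compact way to implement such a chain is precedence climbing, sketched below for the binary operators only. The names `LEVELS`, `tokenize`, `parse`, and `group` are hypothetical, and unary, postfix, cast, conditional, and assignment expressions are left out.

```python
import re

# Binary operator levels, lowest precedence first, mirroring the grammar
# from logical-or-expression down to multiplicative-expression.
LEVELS = [
    ["||"], ["&&"], ["|"], ["^"], ["&"],
    ["==", "!="], ["<", ">", "<=", ">="],
    ["<<", ">>"], ["+", "-"], ["*", "/", "%"],
]
PRECEDENCE = {op: i for i, ops in enumerate(LEVELS) for op in ops}
TOKEN = re.compile(r"\s*(\|\||&&|==|!=|<=|>=|<<|>>|[-+*/%<>&|^()]|\w+)")


def tokenize(text):
    """Split text into operators, parentheses, identifiers, and numbers."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"bad input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_primary(tokens):
    """Parse an identifier, a number, or a parenthesized expression."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("expected ')'")
        return inner
    return token


def parse(tokens, min_level=0):
    """Parse left-associative binary operators at or above min_level."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_level:
        op = tokens.pop(0)
        # The right operand may only contain tighter-binding operators.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = f"({left} {op} {right})"
    return left


def group(text):
    """Return text with explicit parentheses showing how it is grouped."""
    return parse(tokenize(text))
```

The result is a fully parenthesized string rather than a tree, which makes it easy to see that `a + b * c` groups the multiplication first while `a - b - c` groups from the left.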
block ::= ':' newline indent statement* dedent
indent ::= exactly 2 spaces per nesting level
dedent ::= reduce indentation by multiples of 2 spaces
// Level 0
if (condition):
  // Level 1 (2 spaces)
  statement1
  if (nested):
    // Level 2 (4 spaces)
    statement2
  // Level 1 again
  statement3
// Level 0 again
statement ::= ... \
  continued-line
// Automatically handled for:
// - Function calls: func(arg1,
//                        arg2,
//                        arg3)
// - Binary ops: result = a + b +
//                        c + d
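The continuation rules above can be approximated by merging physical lines into logical lines before block structure is analyzed. The sketch below joins a line with the next one when it ends in a backslash, has unclosed parentheses or brackets, or ends in a binary operator. The names `join_continuations` and `TRAILING_OPERATORS` are hypothetical, the operator list is an assumption, and string literals and comments are not taken into account.

```python
# Line endings that signal an unfinished expression (an assumed subset).
TRAILING_OPERATORS = ("+", "-", "*", "/", "%", "&&", "||", "&", "|", "^",
                      "=", ",", "?")


def ends_with_operator(text):
    """True if text ends in a binary operator rather than ++ or --."""
    if text.endswith(("++", "--")):
        return False
    return text.endswith(TRAILING_OPERATORS)


def join_continuations(lines):
    """Merge physical lines into logical lines (sketch)."""
    logical = []
    pending = ""
    depth = 0  # count of currently unclosed ( and [
    for line in lines:
        text = line.rstrip()
        explicit = text.endswith("\\")
        if explicit:
            text = text[:-1].rstrip()  # drop the backslash itself
        depth += sum(text.count(c) for c in "([")
        depth -= sum(text.count(c) for c in ")]")
        # The first physical line keeps its indentation; later ones do not.
        pending = f"{pending} {text.strip()}" if pending else text
        if explicit or depth > 0 or ends_with_operator(text):
            continue
        logical.append(pending)
        pending, depth = "", 0
    if pending:
        logical.append(pending)
    return logical
```

Because the first physical line keeps its leading spaces, the logical line still carries the indentation level needed for block detection.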
cspaced supports all standard C preprocessor directives:
preprocessing-file ::= group*
group ::= group-part+
group-part ::= if-section
| control-line
| text-line
| '# non-directive'
if-section ::= if-group elif-groups? else-group? endif-line
if-group ::= '# if' constant-expression newline group?
| '# ifdef' identifier newline group?
| '# ifndef' identifier newline group?
elif-groups ::= elif-group*
elif-group ::= '# elif' constant-expression newline group?
else-group ::= '# else' newline group?
endif-line ::= '# endif' newline
control-line ::= '# include' pp-tokens newline
| '# define' identifier replacement-list newline
| '# define' identifier '(' identifier-list? ')' replacement-list newline
| '# define' identifier '(' '...' ')' replacement-list newline
| '# define' identifier '(' identifier-list ',' '...' ')' replacement-list newline
| '# undef' identifier newline
| '# line' pp-tokens newline
| '# error' pp-tokens? newline
| '# pragma' pp-tokens? newline
| '# ' pp-tokens? newline
text-line ::= pp-tokens? newline
replacement-list ::= pp-tokens?
pp-tokens ::= preprocessing-token*
preprocessing-token ::= header-name
| identifier
| pp-number
| character-constant
| string-literal
| punctuator
| 'each non-white-space character'
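A transpiler or highlighter mostly needs to recognize which lines are directives so that they can be passed through untouched. The following sketch classifies a single line against the `control-line` and `if-group` alternatives above; `classify_directive`, `DIRECTIVE`, and `KNOWN` are hypothetical names, and the returned tuple shape is an assumption of this example.

```python
import re

# '#' may be preceded and followed by spaces or tabs.
DIRECTIVE = re.compile(r"^[ \t]*#[ \t]*([A-Za-z_]\w*)?(.*)$")

KNOWN = {"if", "ifdef", "ifndef", "elif", "else", "endif", "include",
         "define", "undef", "line", "error", "pragma"}


def classify_directive(line):
    """Return (kind, rest) for a preprocessor line, or None for a text-line."""
    match = DIRECTIVE.match(line)
    if not match:
        return None
    name, rest = match.group(1), match.group(2).strip()
    if name is None:
        return ("null", rest)  # '#' with no directive name
    if name in KNOWN:
        return (name, rest)
    return ("non-directive", f"{name} {rest}".strip())
```

Lines for which the function returns `None` are ordinary text lines and go through the normal indentation and semicolon handling.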
// grammar.js structure for tree-sitter-cspaced
module.exports = grammar({
  name: 'cspaced',

  rules: {
    // Top level
    translation_unit: $ => repeat($.external_declaration),

    // External declarations
    external_declaration: $ => choice(
      $.function_definition,
      $.declaration,
      $.preproc_directive
    ),

    // Function definitions with indented bodies
    function_definition: $ => seq(
      $.declaration_specifiers,
      $.declarator,
      $.compound_statement
    ),

    // Indented compound statements (key innovation)
    compound_statement: $ => seq(
      ':',
      $._indent,
      repeat($.statement),
      $._dedent
    ),

    // Significant indentation: $._indent and $._dedent are external tokens
    // (see `externals` below), so they are not defined as rules here.
    // A regular expression cannot track the indentation stack.

    // Statement rules with optional semicolons
    statement: $ => choice(
      $.if_statement,
      $.while_statement,
      $.for_statement,
      $.return_statement,
      $.expression_statement
    ),

    // Selection statements with indented blocks
    if_statement: $ => seq(
      'if',
      $.parenthesized_expression,
      $.compound_statement,
      optional(seq('else', $.compound_statement))
    ),

    // Expression statements (optional semicolon)
    expression_statement: $ => seq(
      $.expression,
      optional(';')
    ),

    // ... rest of rules
  },

  // Reserved words
  word: $ => $.identifier,

  // Externals for indentation tracking (produced by an external scanner)
  externals: $ => [
    $._indent,
    $._dedent,
    $._newline
  ],

  // Conflict resolution for optional semicolons
  conflicts: $ => [
    [$.expression_statement, $.declaration]
  ],

  // Precedence and associativity
  precedences: $ => [
    // ... operator precedence rules
  ]
});
- Lexer must track indentation levels in a stack
- INDENT/DEDENT tokens generated for level changes
- Consistency enforced within files
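The stack-based tracking described above can be sketched as a generator that emits INDENT and DEDENT tokens around each line. This is an illustration under stated assumptions, not the scanner cspaced ships: `indentation_tokens` and the `(kind, value)` token shape are hypothetical, and blank lines are simply skipped.

```python
def indentation_tokens(lines):
    """Yield (kind, value) tokens: INDENT, DEDENT, or LINE (sketch)."""
    stack = [0]  # indentation widths of the currently open blocks
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines do not change block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            yield ("INDENT", width)
        while width < stack[-1]:
            stack.pop()
            yield ("DEDENT", stack[-1])
        if width != stack[-1]:
            # The line dedents to a width that no open block uses.
            raise SyntaxError(
                f"line {number}: inconsistent dedent to {width} spaces")
        yield ("LINE", line.strip())
    # Close every block still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("DEDENT", stack[-1])
```

A dedent that skips several levels produces one DEDENT token per closed block, which is what lets the parser match each `:` with its own block end.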
- Statements ending at a newline get an implicit ;
- Complex expressions may require an explicit ;
- Context-aware insertion rules
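A minimal sketch of these insertion rules for a single logical line follows. The function name `add_semicolon` is hypothetical, and the rules it applies (skip blank lines, directives, block openers, and lines that already end in `;`, and insert before a trailing `//` comment) are assumptions drawn from the bullets above. A `//` inside a string literal would be misread as a comment, so a real implementation would need proper tokenization.

```python
def add_semicolon(line: str) -> str:
    """Append ';' to a statement line when one is needed (sketch)."""
    code, sep, comment = line.partition("//")
    body = code.rstrip()
    stripped = body.strip()
    needs_semicolon = (
        stripped != ""
        and not stripped.startswith("#")       # preprocessor directive
        and not stripped.endswith((":", ";"))  # block opener or explicit ';'
    )
    if not needs_semicolon:
        return line
    trailing = code[len(body):]  # whitespace between the code and the comment
    return body + ";" + trailing + sep + comment
```

Placing the semicolon before the comment keeps the comment text and its spacing exactly as written, in line with the rule that comments are preserved.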
- Skip to next consistent indentation level
- Report indentation mismatches as syntax errors
- Allow recovery from malformed blocks
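One way to read these recovery rules is that a badly indented line is reported and skipped, and checking resumes at the next line whose indentation is consistent. The sketch below follows that reading; `check_indentation` and the error tuple shape are hypothetical, and it assumes 2-space levels and that a line ending in `:` opens a block.

```python
def check_indentation(lines):
    """Collect indentation errors instead of stopping at the first one."""
    errors = []
    max_level = 0  # deepest level the next line may legally use
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        width = len(line) - len(line.lstrip(" "))
        level, remainder = divmod(width, 2)
        if remainder or level > max_level:
            errors.append((number, f"unexpected indentation of {width} spaces"))
            continue  # skip this line; resume at the next consistent one
        opens_block = line.rstrip().endswith(":")
        max_level = level + 1 if opens_block else level
    return errors
```

Because a skipped line does not update the expected level, one malformed line produces one diagnostic rather than a cascade of follow-on errors.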
- Detects tcc, gcc, clang automatically
- Passes through compiler flags
- Generates intermediate .c files
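Compiler detection of this kind can be done by probing the PATH. The sketch below is only an illustration: `find_compiler` and `build_command` are hypothetical names, and trying the compilers in the order they are listed above is an assumption, since the specification does not state a preference order.

```python
import shutil

# Assumed preference order, taken from the order of the list above.
CANDIDATES = ("tcc", "gcc", "clang")


def find_compiler(candidates=CANDIDATES):
    """Return (name, path) for the first compiler found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return name, path
    return None


def build_command(compiler, c_file, output, extra_flags=()):
    """Build an argument list that passes user flags through unchanged."""
    return [compiler, *extra_flags, c_file, "-o", output]
```

The resulting list can be handed to `subprocess.run` as is, which avoids shell quoting problems with user-supplied flags.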
- .csp for cspaced source files
- Auto-generates .c files
- Executables named without extensions
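These naming conventions amount to a simple path mapping, sketched below. The helper name `output_paths` is hypothetical, and placing the generated files next to the source file is an assumption of this example.

```python
from pathlib import Path


def output_paths(source):
    """Map a .csp source path to its (.c file, executable) paths."""
    src = Path(source)
    if src.suffix != ".csp":
        raise ValueError(f"expected a .csp file, got {src.name}")
    # Same directory and stem: hello.csp -> hello.c and hello
    return src.with_suffix(".c"), src.with_suffix("")
```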
- async/await syntax for coroutines
- Pattern matching for switch expressions
- Type inference for variable declarations
- Module system beyond #include
- Named parameters in function calls
- String interpolation
- Range-based for loops (for x in array)
This specification provides a complete foundation for implementing cspaced parsers, syntax highlighters, formatters, and language servers.