Add a real README, license, and rewrite spec.

Havoc Pennington 2011-11-15 19:14:22 -05:00
parent aae70ecb8e
commit 1841dd9c67
4 changed files with 1372 additions and 348 deletions

929
HOCON.md Normal file

@ -0,0 +1,929 @@
# HOCON (Human-Optimized Config Object Notation)
This is an informal spec, but hopefully it's clear.
## Goals / Background
The primary goal is: keep the semantics (tree structure; set of
types; encoding/escaping) from JSON, but make it more convenient
as a human-editable config file format.
The following features are desirable, to support human usage:
- less noisy / less pedantic syntax
- ability to refer to another part of the configuration (set a value to
another value)
- import/include another configuration file into the current file
- a mapping to a flat properties list such as Java's System properties
- ability to get values from environment variables
- ability to write comments
Implementation-wise, the format should have these properties:
- a JSON superset, that is, all valid JSON should be valid and
should result in the same in-memory data that a JSON parser
would have produced.
- be deterministic; the format is flexible, but it is not
heuristic. It should be clear what's invalid and invalid files
should generate errors.
- require minimal look-ahead; should be able to tokenize the file
by looking at only the current character and the next
character.
HOCON is significantly harder to specify and to parse than
JSON. Think of it as moving the work from the person maintaining
the config file to the computer program.
## Definitions
- a _key_ is a string JSON would have to the left of `:` and a _value_ is
anything JSON would have to the right of `:`, i.e. the two
halves of an object _field_.
- a _value_ is any "value" as defined in the JSON spec, plus
unquoted strings and substitutions as defined in this spec.
- a _simple value_ is any value excluding an object or array
value.
- a _field_ is a key, any separator such as ':', and a value.
- references to a _file_ ("the file being parsed") can be
understood to mean any byte stream being parsed, not just
literal files in a filesystem.
## Syntax
Much of this is defined with reference to JSON; you can find the
JSON spec at http://json.org/ of course.
### Unchanged from JSON
- files must be valid UTF-8
- quoted strings are in the same format as JSON strings
- values have possible types: string, number, object, array, boolean, null
- allowed number formats match JSON; as in JSON, some possible
floating-point values are not represented, such as `NaN`
### Comments
Anything between `//` or `#` and the next newline is considered a comment
and ignored, unless the `//` or `#` is inside a quoted string.
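
As an illustration only, the comment rule can be sketched in Python for a single line of input. The function name `strip_comment` is hypothetical, and the sketch assumes quoted strings use JSON-style backslash escapes and never span lines.

```python
def strip_comment(line: str) -> str:
    """Return `line` without a trailing `//` or `#` comment.

    Comment markers inside a quoted string are left alone.
    """
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 2  # skip the backslash and the character it escapes
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "#" or line.startswith("//", i):
            return line[:i]  # everything from the marker onward is a comment
        i += 1
    return line
```

Note that whitespace before the comment marker is kept; a real tokenizer would discard it later as ordinary whitespace.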
### Omit root braces
JSON documents must have an array or object at the root. Empty
files are invalid documents, as are files containing only a
non-array non-object value such as a string.
In HOCON, if the file does not begin with a square bracket or
curly brace, it is parsed as if it were enclosed with `{}` curly
braces.
A HOCON file is invalid if it omits the opening `{` but still has
a closing `}`; the curly braces must be balanced.
### Key-value separator
The `=` character can be used anywhere JSON allows `:`, i.e. to
separate keys from values.
If a key is followed by `{`, the `:` or `=` may be omitted. So
`"foo" {}` means `"foo" : {}`.
### Commas
Values in arrays, and fields in objects, need not have a comma
between them as long as they have at least one ASCII newline
(`\n`, decimal value 10) between them.
The last element in an array or last field in an object may be
followed by a single comma. This extra comma is ignored.
- `[1,2,3,]` and `[1,2,3]` are the same array.
- `[1\n2\n3]` and `[1,2,3]` are the same array.
- `[1,2,3,,]` is invalid because it has two trailing commas.
- `[,1,2,3]` is invalid because it has an initial comma.
- `[1,,2,3]` is invalid because it has two commas in a row.
- these same comma rules apply to fields in objects.
### Whitespace
The JSON spec simply says "whitespace"; in HOCON whitespace is
defined as follows:
- any Unicode space separator (Zs category), line separator (Zl
category), or paragraph separator (Zp category), including
nonbreaking spaces (such as 0x00A0, 0x2007, and 0x202F).
- tab (`\t` 0x0009), newline (`\n` 0x000A), vertical tab (`\v`
0x000B), form feed (`\f` 0x000C), carriage return (`\r`
0x000D), file separator (0x001C), group separator (0x001D),
record separator (0x001E), unit separator (0x001F).
In Java, the `Character.isWhitespace()` method covers these characters with
the exception of nonbreaking spaces.
While all Unicode separators should be treated as whitespace, in
this spec "newline" refers only and specifically to ASCII newline
0x000A.
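
A minimal Python sketch of this whitespace definition follows. The function names are hypothetical; the only library call used is `unicodedata.category()`, which returns the two-letter Unicode general category of a character.

```python
import unicodedata

# Control characters listed explicitly in the whitespace definition.
_CONTROL_WHITESPACE = {
    "\t", "\n", "\v", "\f", "\r",
    "\x1c", "\x1d", "\x1e", "\x1f",
}


def is_hocon_whitespace(ch: str) -> bool:
    """True if `ch` counts as whitespace under the definition above."""
    return ch in _CONTROL_WHITESPACE or unicodedata.category(ch) in ("Zs", "Zl", "Zp")


def is_hocon_newline(ch: str) -> bool:
    """Only ASCII newline is a "newline", whatever else is whitespace."""
    return ch == "\n"
```

Unlike the Java method mentioned above, this check accepts nonbreaking spaces, since they are in the Zs category.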
### Duplicate keys
The JSON spec does not clarify how duplicate keys in the same
object should be handled. In HOCON, duplicate keys that appear
later override those that appear earlier, unless both values are
objects. If both values are objects, then the objects are merged.
Note: this would make HOCON a non-superset of JSON if you assume
that JSON requires duplicate keys to have a behavior. The
assumption here is that duplicate keys are invalid JSON.
To merge objects:
- add fields present in only one of the two objects to the merged
object.
- for non-object-valued fields present in both objects,
the field found in the second object must be used.
- for object-valued fields present in both objects, the
object values should be recursively merged according to
these same rules.
Object merge can be prevented by setting the key to another value
first.
These two are equivalent:

    {
        "foo" : { "a" : 42 },
        "foo" : { "b" : 43 }
    }

    {
        "foo" : { "a" : 42, "b" : 43 }
    }
And these two are equivalent:

    {
        "foo" : { "a" : 42 },
        "foo" : null,
        "foo" : { "b" : 43 }
    }

    {
        "foo" : { "b" : 43 }
    }
The intermediate setting of `"foo"` to `null` prevents the object merge.
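
The merge rules can be sketched in Python, assuming objects are represented as `dict` and `null` as `None`. The helper names `merge_objects` and `build_object` are hypothetical and exist only for this illustration.

```python
def merge_objects(first: dict, second: dict) -> dict:
    """Merge `second` into `first`; `second` wins for non-object values."""
    merged = dict(first)
    for key, value in second.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            # Both sides are objects: merge recursively.
            merged[key] = merge_objects(merged[key], value)
        else:
            # Otherwise the later value replaces the earlier one.
            merged[key] = value
    return merged


def build_object(fields):
    """Fold (key, value) pairs into one object, in file order."""
    result = {}
    for key, value in fields:
        result = merge_objects(result, {key: value})
    return result
```

Feeding the fields of the two examples above through `build_object` in file order gives the merged object in the first case and only `{"b": 43}` under `foo` in the second, because the intermediate `None` is not an object.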
### Unquoted strings
A sequence of characters outside of a quoted string is a string
value if:
- it does not contain "forbidden characters" '$', '"', '{', '}',
'[', ']', ':', '=', ',', '+', '#', '/', '\' (backslash), or
whitespace.
- its initial characters do not parse as `true`, `false`, `null`,
or a number.
Unquoted strings are used literally; they do not support any kind
of escaping. Quoted strings may always be used as an alternative
when you need to write a character that is not permitted in an
unquoted string.
`truefoo` parses as the boolean token `true` followed by the
unquoted string `foo`. However, `footrue` parses as the unquoted
string `footrue`. Similarly, `10.0bar` is the number `10.0` then
the unquoted string `bar` but `bar10.0` is the unquoted string
`bar10.0`.
In general, once an unquoted string begins, it continues until a
forbidden character is encountered. Embedded (non-initial)
booleans, nulls, and numbers are not recognized as such; they are
part of the string.
An unquoted string may not _begin_ with the digits 0-9 or with a
hyphen (`-`, 0x002D) because those are valid characters to begin a
JSON number. The initial number character, plus any valid-in-JSON
number characters that follow it, must be parsed as a number
value. Again, these characters are not special _inside_ an
unquoted string; they only trigger number parsing if they appear
initially.
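
To illustrate how only the initial characters trigger keyword or number parsing, here is a small Python sketch. The name `first_token` is hypothetical. It assumes the input starts at a character that is neither whitespace nor a forbidden character, and it uses Python's `str.isspace()` as an approximation of the whitespace definition in this spec.

```python
import re

# JSON number grammar, restricted to ASCII digits.
_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
_FORBIDDEN = set('$"{}[]:=,+#/\\')


def first_token(text: str):
    """Return (kind, token_text) for the first token of `text`."""
    for keyword, kind in (("true", "boolean"), ("false", "boolean"), ("null", "null")):
        if text.startswith(keyword):
            return kind, keyword
    match = _NUMBER.match(text)
    if match:
        return "number", match.group()
    if text[:1] == "-":
        raise ValueError("an unquoted string may not begin with '-'")
    # An unquoted string runs until a forbidden character or whitespace.
    end = 0
    while end < len(text) and text[end] not in _FORBIDDEN and not text[end].isspace():
        end += 1
    return "unquoted", text[:end]
```

A tokenizer built this way would call the function again on the remaining text, so `truefoo` yields `true` and then `foo`.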
### Value concatenation
The value of an object field or an array element may consist of
multiple values which are concatenated into one string.
Only simple values participate in value concatenation. Recall that
a simple value is any value other than arrays and objects.
As long as simple values are separated only by non-newline
whitespace, the _whitespace between them is preserved_ and the
values, along with the whitespace, are concatenated into a string.
Value concatenations never span a newline, or a character that is
not part of a simple value.
A value concatenation may appear in any place that a string may
appear, including object keys, object values, and array elements.
Whenever a value would appear in JSON, a HOCON parser instead
collects multiple values (including the whitespace between them)
and concatenates those values into a string.
Whitespace before the first and after the last simple value must
be discarded. Only whitespace _between_ simple values must be
preserved.
So for example ` foo bar baz ` parses as three unquoted strings,
and the three are value-concatenated into one string. The inner
whitespace is kept and the leading and trailing whitespace is
trimmed. The equivalent string, written in quoted form, would be
`"foo bar baz"`.
Value concatenation `foo bar` (two unquoted strings with
whitespace) and quoted string `"foo bar"` would result in the same
in-memory representation, seven characters.
For purposes of value concatenation, non-string values are
converted to strings as follows (strings shown as quoted strings):
- `true` and `false` become the strings `"true"` and `"false"`.
- `null` becomes the string `"null"`.
- quoted and unquoted strings are themselves.
- numbers should be kept as they were originally written in the
file. For example, if you parse `1e5` then you might render
it alternatively as `1E5` with capital `E`, or just `100000`.
For purposes of value concatenation, it should be rendered
as it was written in the file.
- a substitution is replaced with its value which is then
converted to a string as above, except that a substitution
which evaluates to `null` becomes the empty string `""`.
- it is invalid for arrays or objects to appear in a value
concatenation.
A single value is never converted to a string. That is, it would
be wrong to value concatenate `true` by itself; that should be
parsed as a boolean-typed value. Only `true foo` (`true` with
another simple value on the same line) should be parsed as a value
concatenation and converted to a string.
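
The concatenation rules can be sketched as follows, assuming a tokenizer has already produced a flat list of tokens for one value. The token shapes are hypothetical: `("ws", text)` for whitespace and `("value", rendered, parsed)` for a simple value, where `rendered` is the text the value contributes to a concatenation (a number as written in the file, the unescaped contents of a quoted string) and `parsed` is the typed value. Substitutions are left out of this sketch.

```python
def concatenate(tokens):
    """Combine the tokens of one value according to the rules above."""
    # Whitespace before the first and after the last value is discarded.
    while tokens and tokens[0][0] == "ws":
        tokens = tokens[1:]
    while tokens and tokens[-1][0] == "ws":
        tokens = tokens[:-1]
    values = [token for token in tokens if token[0] == "value"]
    if len(values) == 1:
        # A single value keeps its type and is never converted to a string.
        return values[0][2]
    # Several values: join the rendered text, keeping the inner whitespace.
    return "".join(token[1] for token in tokens)
```

For example, the token list for `true` alone yields the boolean `True`, while the token list for `true foo` yields the string `"true foo"`.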
### Path expressions
Path expressions are used to write out a path through the object
graph. They appear in two places; in substitutions, like
`${foo.bar}`, and as the keys in objects like `{ foo.bar : 42 }`.
Path expressions are syntactically identical to a value
concatenation, except that they may not contain
substitutions. This means that you can't nest substitutions inside
other substitutions, and you can't have substitutions in keys.
When concatenating the path expression, any `.` characters outside
quoted strings or numbers are understood as path separators, while
inside quoted strings and numbers `.` has no special meaning. So
`foo.bar."hello.world"` would be a path with three elements,
looking up key `foo`, key `bar`, then key `hello.world`.
- `10.0foo` is a number then unquoted string `foo` so this would
be a single-element path.
- `foo10.0` is an unquoted string with a `.` in it, so this would
be a two-element path with `foo10` and `0` as the elements.
- `foo"10.0"` is an unquoted then a quoted string which are
concatenated, so this is a single-element path.
Unlike value concatenations, path expressions are _always_
converted to a string, even if they are just a single value.
If you have an array element or field value consisting of the single
value `true`, it's a value concatenation and retains its character
as a boolean value.
If you have a path expression (in a key or substitution) then it
must always be converted to a string, so `true` becomes the string
that would be quoted as `"true"`.
If a path element is an empty string, it must always be quoted.
That is, `a."".b` is a valid path with three elements, and the
middle element is an empty string. But `a..b` is invalid and
should generate an error. Following the same rule, a path that
starts or ends with a `.` is invalid and should generate an error.
### Paths as keys
If a key is a path expression with multiple elements, it is
expanded to create an object for each path element other than the
last. The last path element, combined with the value, becomes a
field in the most-nested object.
In other words:
foo.bar : 42
is equivalent to:
foo { bar : 42 }
and:
foo.bar.baz : 42
is equivalent to:
foo { bar { baz : 42 } }
and so on. These values are merged in the usual way; which implies
that:
a.x : 42, a.y : 43
is equivalent to:
a { x : 42, y : 43 }
Because path expressions work like value concatenations, you can
have whitespace in keys:
a b c : 42
is equivalent to:
"a b c" : 42
Because path expressions are always converted to strings, even
single values that would normally have another type become
strings.
- `true : 42` is `"true" : 42`
- `3.14 : 42` is `"3.14" : 42`
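
A short Python sketch of the expansion, assuming the key has already been parsed into a list of path elements. Both function names are hypothetical, and `merge` is a compact version of the object merge described under "Duplicate keys".

```python
def expand_path_key(path, value):
    """Turn (['foo', 'bar', 'baz'], 42) into {'foo': {'bar': {'baz': 42}}}."""
    result = value
    for element in reversed(path):
        result = {element: result}
    return result


def merge(first, second):
    """Object merge: recurse when both sides are objects, else second wins."""
    merged = dict(first)
    for key, value in second.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Merging `expand_path_key(["a", "x"], 42)` with `expand_path_key(["a", "y"], 43)` gives `{"a": {"x": 42, "y": 43}}`, matching the `a.x : 42, a.y : 43` example.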
As a special rule, the unquoted string `include` may not begin a
path expression in a key, because it has a special interpretation
(see below).
### Substitutions
Substitutions are a way of referring to other parts of the
configuration tree.
For substitutions which are not found in the configuration tree,
implementations may try to resolve them by looking at system
environment variables, Java system properties, or other external
sources of configuration.
The syntax is `${pathexpression}` where the `pathexpression` is a
path expression as described above. This path expression has the
same syntax that you could use for an object key.
Substitutions are not parsed inside quoted strings. To get a
string containing a substitution, you must use value concatenation
with the substitution in the unquoted portion:
key : ${animal.favorite} is my favorite animal
Or you could quote the non-substitution portion:
key : ${animal.favorite}" is my favorite animal"
Substitutions are resolved by looking up the path in the
configuration. The path begins with the root configuration object,
i.e. it is "absolute" rather than "relative."
Substitution processing is performed as the last parsing step, so
a substitution can look forward in the configuration. If a
configuration consists of multiple files, it may even end up
retrieving a value from another file. If a key has been specified
more than once, the substitution will always evaluate to its
latest-assigned value (the merged object or the last non-object
value that was set).
If a substitution does not match any value present in the
configuration, implementations may look up that substitution in
one or more external sources, such as a Java system property or an
environment variable. (More detail on this in a later section.)
If a configuration sets a value to `null` then it should not be
looked up in the external source. Unfortunately there is no way to
"undo" this in a later configuration file; if you have `{ "HOME" :
null }` in a root object, then `${HOME}` will never look at the
environment variable. In other words, there is no equivalent to
JavaScript's `delete` operation.
If a substitution does not match any value present in the
configuration and is not resolved by an external source, it is
evaluated to `null`.
Substitutions are only allowed in object field values and array
elements (value concatenations); they are not allowed in keys or
nested inside other substitutions (path expressions).
A substitution is replaced with any value type (number, object,
string, array, true, false, null). If the substitution is the only
part of a value, then the type is preserved. Otherwise, it is
value-concatenated to form a string. There is one special rule:
- `null` is converted to an empty string, not the string `null`.
Because missing substitutions are evaluated to `null`, either
missing or explicitly-set-to-null substitutions become an empty
string when concatenated.
Circular substitutions are invalid and should generate an error.
Implementations must take care, however, to allow objects to refer
to paths within themselves. For example, this must work:

    bar : { foo : 42,
            baz : ${bar.foo}
          }
Here, if an implementation resolved all substitutions in `bar` as
part of resolving the substitution `${bar.foo}`, there would be a
cycle. The implementation must only resolve the `foo` field in
`bar`, rather than recursing the entire `bar` object.
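
One way to get this behavior is to resolve lazily: look up the raw target of each substitution and resolve only what lies on that path, while tracking which substitutions are currently being resolved. The Python sketch below illustrates the idea under simplifying assumptions: the parsed tree is made of `dict`, `list` and plain values, a substitution is represented by the hypothetical `Sub` class, and external fallbacks and value concatenation are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sub:
    """A parsed substitution such as ${bar.foo}, holding its path elements."""
    path: tuple


def resolve(root):
    """Return a copy of `root` with every substitution replaced by its value."""
    in_progress = set()

    def lookup(path):
        # Walk the *unresolved* tree from the root (paths are absolute).
        node = root
        for key in path:
            if isinstance(node, Sub):
                node = resolve_value(node)
            if not isinstance(node, dict) or key not in node:
                return None  # missing substitutions evaluate to null
            node = node[key]
        return node

    def resolve_value(value):
        if isinstance(value, Sub):
            if value.path in in_progress:
                raise ValueError("circular substitution: " + ".".join(value.path))
            in_progress.add(value.path)
            try:
                return resolve_value(lookup(value.path))
            finally:
                in_progress.discard(value.path)
        if isinstance(value, dict):
            return {k: resolve_value(v) for k, v in value.items()}
        if isinstance(value, list):
            return [resolve_value(v) for v in value]
        return value

    return resolve_value(root)
```

With this approach `${bar.foo}` inside `bar` only touches the `foo` field, so no cycle is reported, while `a : ${b}, b : ${a}` raises an error.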
### Includes
#### Include syntax
An _include statement_ consists of the unquoted string `include`
and a single quoted string immediately following it. An include
statement can appear in place of an object field.
If the unquoted string `include` appears at the start of a path
expression where an object key would be expected, then it is not
interpreted as a path expression or a key.
Instead, the next value must be a _quoted_ string. The quoted
string is interpreted as a filename or resource name to be
included.
Together, the unquoted `include` and the quoted string substitute
for an object field syntactically, and are separated from the
following object fields or includes by the usual comma (and as
usual the comma may be omitted if there's a newline).
If an unquoted `include` at the start of a key is followed by
anything other than a single quoted string, it is invalid and an
error should be generated.
There can be any amount of whitespace, including newlines, between
the unquoted `include` and the quoted string.
Value concatenation is NOT performed on the "argument" to
`include`. The argument must be a single quoted string. No
substitutions are allowed, and the argument may not be an unquoted
string or any other kind of value.
Unquoted `include` has no special meaning if it is not the start
of a key's path expression.
It may appear later in the key:
# this is valid
{ foo include : 42 }
# equivalent to
{ "foo include" : 42 }
It may appear as an object or array value:
{ foo : include } # value is the string "include"
[ include ] # array of one string "include"
You can quote `"include"` if you want a key that starts with the
word `"include"`; only unquoted `include` is special:
{ "include" : 42 }
#### Include semantics: merging
An _including file_ contains the include statement and an
_included file_ is the one specified in the include statement.
(They need not be regular files on a filesystem, but assume they
are for the moment.)
An included file must contain an object, not an array. This is
significant because both JSON and HOCON allow arrays as root
values in a document.
If an included file contains an array as the root value, it is
invalid and an error should be generated.
The included file should be parsed, producing a root object. The
keys from the root object are conceptually substituted for the
include statement in the including file.
- If a key in the included object occurred prior to the include
statement in the including object, the included key's value
overrides or merges with the earlier value, exactly as with
duplicate keys found in a single file.
- If the including file repeats a key from an earlier-included
object, the including file's value would override or merge
with the one from the included file.
#### Include semantics: substitution
Recall that substitution happens as a final step, _after_
parsing. It should be done for the entire app's configuration, not
for single files in isolation.
Therefore, if an included file contains substitutions, they must
be "fixed up" to be relative to the app's configuration root.
Say for example that the root configuration is this:
{ a : { include "foo.conf" } }
And "foo.conf" might look like this:
{ x : 10, y : ${x} }
If you parsed "foo.conf" in isolation, then `${x}` would evaluate
to 10, the value at the path `x`. If you include "foo.conf" in an
object at key `a`, however, then it must be fixed up to be
`${a.x}` rather than `${x}`.
Say that the root configuration redefines `a.x`, like this:

    {
        a : { include "foo.conf" }
        a : { x : 42 }
    }
Then the `${x}` in "foo.conf", which has been fixed up to
`${a.x}`, would evaluate to `42` rather than to `10`.
Substitution happens _after_ parsing the whole configuration.
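
The fix-up step can be sketched in Python as a tree walk that prefixes every substitution path with the path of the object containing the include statement. The `Sub` class and the `fix_up` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sub:
    """A parsed, not yet resolved substitution, holding its path elements."""
    path: tuple


def fix_up(value, prefix):
    """Make substitutions in an included tree relative to the app's root."""
    if isinstance(value, Sub):
        return Sub(tuple(prefix) + value.path)
    if isinstance(value, dict):
        return {key: fix_up(child, prefix) for key, child in value.items()}
    if isinstance(value, list):
        return [fix_up(child, prefix) for child in value]
    return value
```

For the "foo.conf" example, `fix_up({"x": 10, "y": Sub(("x",))}, ("a",))` turns `${x}` into `${a.x}` and leaves the value `10` untouched. The substitution stays unresolved until the whole configuration has been parsed.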
#### Include semantics: missing files
If an included file does not exist, the include statement should
be silently ignored (as if the included file contained only an
empty object).
#### Include semantics: file formats and extensions
Implementations may support including files in other formats.
Those formats must be compatible with the JSON type system, or
have some documented mapping to JSON's type system.
If an implementation supports multiple formats, then the extension
may be omitted from the name of included files:
include "foo"
If a filename has no extension, the implementation should treat it
as a basename and try loading the file with all known extensions.
If the file exists with multiple extensions, they should _all_ be
loaded and merged together.
Files in HOCON format should be parsed last. Files in JSON format
should be parsed next-to-last.
In short, `include "foo"` might be equivalent to:
include "foo.properties"
include "foo.json"
include "foo.conf"
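
The load order can be sketched in Python. The function name, the default extension list, and the use of `posixpath.splitext()` to decide whether a name already has an extension are all assumptions made for this illustration; checking which candidates actually exist is left out.

```python
import posixpath


def include_candidates(name, extensions=(".properties", ".json", ".conf")):
    """List the names to try for `include "name"`, in load order."""
    if posixpath.splitext(name)[1]:
        return [name]  # an explicit extension is used as-is

    def priority(extension):
        # HOCON is parsed last, JSON next-to-last, anything else before them.
        return {".conf": 2, ".json": 1}.get(extension, 0)

    return [name + extension for extension in sorted(extensions, key=priority)]
```

Because later fields override or merge with earlier ones, loading `.conf` last means HOCON files take precedence over the other formats.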
#### Include semantics: locating resources
Conceptually speaking, the quoted string in an include statement
identifies a file or other resource "adjacent to" the one being
parsed and of the same type as the one being parsed. The meaning
of "adjacent to", and the string itself, has to be specified
separately for each kind of resource.
Implementations may vary in the kinds of resources they support
including.
For plain files on the filesystem:
- if the included file is an absolute path then it should be kept
absolute and loaded as such.
- if the included file is a relative path, then it should be
located relative to the directory containing the including
file. The current working directory of the process parsing a
file must NOT be used when interpreting included paths.
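
For plain files, the "adjacent to" rule might look like this in Python. The function name is hypothetical, and `posixpath` is used so that the behavior does not depend on the platform running the example.

```python
import posixpath


def resolve_included_file(including_file: str, included_name: str) -> str:
    """Locate an included file relative to the file that includes it."""
    if posixpath.isabs(included_name):
        return included_name  # absolute paths are kept as they are
    directory = posixpath.dirname(including_file)
    # The process's current working directory is deliberately not consulted.
    return posixpath.normpath(posixpath.join(directory, included_name))
```

So an `include "foo.conf"` inside `/etc/app/main.conf` refers to `/etc/app/foo.conf`, wherever the parsing process happens to be running.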
For resources located on the Java classpath:
- included resources are looked up in the same class or class
loader as the including resource.
- if the included resource name starts with '/' then it
should be passed to `getResource()` as-is.
- if the included resource name does not start with '/'
then it should have the "directory" of the including resource
prepended to it, before passing it to `getResource()`.
- it would be wrong to use `getResource()` to get a URL and then
locate the included name relative to that URL, because a class
loader is not required to have a one-to-one mapping between
paths in its URLs and the paths it handles in `getResource()`.
In other words, the "adjacent to" computation should be done
on the resource name not on the resource's URL.
URLs:
- for both filesystem files and Java resources, if the
included name is a URL (begins with a protocol), it would
be reasonable behavior to try to load the URL rather than
treating the name as a filename or resource name.
- for files loaded from a URL, "adjacent to" should be based
on parsing the URL's path component, replacing the last
path element with the included name.
Implementations need not support files, Java resources, or URLs;
and they need not support particular URL protocols. However, if
they do support them they should do so as described above.
## API Recommendations
Implementations of HOCON ideally follow certain conventions and
work in a predictable way.
### Automatic type conversions
If an application asks for a value with a particular type, the
implementation should attempt to convert types as follows:
- number to string: convert the number into a string
representation that would be a valid number in JSON.
- boolean to string: should become the string "true" or "false"
- string to number: parse the number with the JSON rules
- string to boolean: the strings "true", "yes", "false", "no"
should be converted to boolean values. It's tempting to
support a long list of other ways to write a boolean, but
for interoperability and keeping it simple, it's recommended to
stick to these four.
- string to null: the string `"null"` should be converted to a
null value if the application specifically asks for a null
value, though there's probably no reason an app would do this.
The following type conversions should NOT be performed:
- null to anything: If the application asks for a specific type
and finds null instead, that should usually result in an error.
- object to anything
- array to anything
- anything to object
- anything to array
Converting objects and arrays to and from strings is tempting, but
in practical situations raises thorny issues of quoting and
double-escaping.
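
A Python sketch of two of these conversions follows. The function names are hypothetical. Matching the four boolean strings exactly (case-sensitively) is an assumption, since this section does not say how case should be handled.

```python
import json


def as_boolean(value):
    """Convert a config value to a boolean, or raise TypeError."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        if value in ("true", "yes"):
            return True
        if value in ("false", "no"):
            return False
    # null, objects, arrays and other strings are not converted.
    raise TypeError("cannot convert %r to a boolean" % (value,))


def as_string(value):
    """Convert a config value to a string, or raise TypeError."""
    if isinstance(value, str):
        return value
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        # json.dumps renders a number that is valid in JSON.
        return json.dumps(value, allow_nan=False)
    raise TypeError("cannot convert %r to a string" % (value,))
```

Both functions refuse `None`, `dict` and `list`, in line with the list of conversions that should not be performed.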
### Units format
Implementations may wish to support interpreting a value with some
family of units, such as time units or memory size units: `10ms`
or `512K`. HOCON does not have an extensible type system and there
is no way to add a "duration" type. However, for example, if an
application asks for milliseconds, the implementation can try to
interpret a value as a milliseconds value.
If an API supports this, for each family of units it should define
a default unit in the family. For example, the family of duration
units might default to milliseconds (see below for details on
durations). The implementation should then interpret values as
follows:
- if the value is a number, it is taken to be a number in
the default unit.
- if the value is a string, it is taken to be:
- optional whitespace
- a number
- optional whitespace
- an optional unit name consisting only of letters (letters
are the Unicode `L*` categories, Java `isLetter()`)
- optional whitespace
If a string value has no unit name, then it should be
interpreted with the default unit, as if it were a number. If a
string value has a unit name, that name of course specifies the
value's interpretation.
### Duration format
Implementations may wish to support a `getMilliseconds()` method (and
similar methods for other time units).
This can use the general "units format" described above; bare
numbers are taken to be in milliseconds already, while strings are
parsed as a number plus an optional unit string.
The supported unit strings for duration are case sensitive and
must be lowercase. Exactly these strings are supported:
- `ns`, `nanosecond`, `nanoseconds`
- `us`, `microsecond`, `microseconds`
- `ms`, `millisecond`, `milliseconds`
- `s`, `second`, `seconds`
- `m`, `minute`, `minutes`
- `h`, `hour`, `hours`
- `d`, `day`, `days`
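
As a rough sketch, a millisecond getter along these lines could be written in Python as follows. The name `duration_ms` is hypothetical, the number part is matched with the JSON number grammar, and Python's `\s` stands in for the whitespace definition of this spec.

```python
import re

_NUMBER = r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
# optional whitespace, a number, optional whitespace, optional unit, optional whitespace
_VALUE = re.compile(r"\s*(" + _NUMBER + r")\s*(.*?)\s*", re.DOTALL)

# Nanoseconds per unit, for exactly the unit strings listed above.
_NANOS = {}
for _names, _nanos in [
    (("ns", "nanosecond", "nanoseconds"), 1),
    (("us", "microsecond", "microseconds"), 1_000),
    (("ms", "millisecond", "milliseconds"), 1_000_000),
    (("s", "second", "seconds"), 1_000_000_000),
    (("m", "minute", "minutes"), 60_000_000_000),
    (("h", "hour", "hours"), 3_600_000_000_000),
    (("d", "day", "days"), 86_400_000_000_000),
]:
    for _name in _names:
        _NANOS[_name] = _nanos


def duration_ms(value) -> float:
    """Interpret a config value as a duration in milliseconds."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)  # bare numbers are already milliseconds
    if not isinstance(value, str):
        raise TypeError("expected a number or a string")
    match = _VALUE.fullmatch(value)
    if match is None:
        raise ValueError("not a duration: %r" % value)
    number, unit = match.groups()
    if unit == "":
        return float(number)  # no unit name: use the default unit
    if unit not in _NANOS:
        raise ValueError("unknown duration unit: %r" % unit)
    return float(number) * _NANOS[unit] / 1_000_000
```

Unit names are looked up case-sensitively, so `10MS` is rejected rather than read as milliseconds.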
### Size in bytes format
Implementations may wish to support a `getMemorySizeInBytes()`
returning a size in bytes.
This can use the general "units format" described above; bare
numbers are taken to be in bytes already, while strings are
parsed as a number plus an optional unit string.
The one-letter unit strings may be uppercase (note: duration units
are always lowercase, so this convention is specific to size
units).
Exactly these strings are supported:
- `B`, `b`, `byte`, `bytes`
- `K`, `k`, `kilobyte`, `kilobytes`
- `M`, `m`, `megabyte`, `megabytes`
- `G`, `g`, `gigabyte`, `gigabytes`
- `T`, `t`, `terabyte`, `terabytes`
Values are interpreted as for memory (powers of two scale) not as
for hard drives (powers of ten scale).
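
The same approach works for sizes. In this Python sketch the name `memory_size_bytes` is hypothetical, and truncating a fractional byte count to an integer is an assumption the text above does not address.

```python
import re

_NUMBER = r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
_VALUE = re.compile(r"\s*(" + _NUMBER + r")\s*(.*?)\s*", re.DOTALL)

# Power of 1024 for exactly the unit strings listed above.
_POWERS = {}
for _power, _names in enumerate([
    ("B", "b", "byte", "bytes"),
    ("K", "k", "kilobyte", "kilobytes"),
    ("M", "m", "megabyte", "megabytes"),
    ("G", "g", "gigabyte", "gigabytes"),
    ("T", "t", "terabyte", "terabytes"),
]):
    for _name in _names:
        _POWERS[_name] = _power


def memory_size_bytes(value) -> int:
    """Interpret a config value as a memory size in bytes."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return int(value)  # bare numbers are already bytes
    if not isinstance(value, str):
        raise TypeError("expected a number or a string")
    match = _VALUE.fullmatch(value)
    if match is None:
        raise ValueError("not a size: %r" % value)
    number, unit = match.groups()
    if unit and unit not in _POWERS:
        raise ValueError("unknown size unit: %r" % unit)
    power = _POWERS[unit] if unit else 0
    # Powers of two scale; fractional bytes are truncated (an assumption).
    return int(float(number) * 1024 ** power)
```

Here `m` means megabytes, whereas in a duration it means minutes; the unit table is chosen by the getter the application calls, not by the value itself.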
(A generic `getBytes()`, as opposed to `getMemorySizeInBytes()`,
might wish to support both the SI power of ten units and the IEC
power of two units. But until an implementation needs that, no
such thing is documented here.)
### Java properties mapping
It may be useful to merge Java properties data with data loaded
from JSON or HOCON. See the Java properties spec here:
http://download.oracle.com/javase/7/docs/api/java/util/Properties.html#load%28java.io.Reader%29
Java properties parse as a one-level map from string keys to
string values.
To convert to HOCON, first split each key on the `.` character,
keeping any empty strings (including leading and trailing empty
strings). Note that this is _very different_ from parsing a path
expression.
The key split on `.` is a series of path elements. So the
properties key with just `.` is a path with two elements, both of
them an empty string. `a.` is a path with two elements, `a` and
empty string. (Java's `String.split()` does NOT do what you want
for this.)
It is impossible to represent a key with a `.` in it in a
properties file. If a JSON/HOCON key has a `.` in it, which is
possible if the key is quoted, then there is no way to refer to it
as a Java property. It is not recommended to name HOCON keys with
a `.` in them, since it would be confusing at best in any case.
Once you have a path for each value, construct a tree of
JSON-style objects with the string value of each property located
at that value's path.
Values from properties files are _always_ strings, even if they
could be parsed as some other type. Implementations should do type
conversion if an app asks for an integer, as described in an
earlier section.
When Java loads a properties file, unfortunately it does not
preserve the order of the file. As a result, there is an
intractable case where a single key needs to refer to both a
parent object and a string value. For example, say the Java
properties file has:
a=hello
a.b=world
In this case, `a` needs to be both an object and a string value.
The _object_ must always win in this case... the "object wins"
rule throws out at most one value (the string) while "string wins"
would throw out all values in the object. Unfortunately, when
properties files are mapped to the JSON structure, there is no way
to access these strings that conflict with objects.
The usual rule in HOCON would be that the later assignment in the
file wins, rather than "object wins"; but implementing that for
Java properties would require implementing a custom Java
properties parser, which is surely not worth it.
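
The mapping, including the "object wins" rule, can be sketched in Python. The function name is hypothetical, and the input is assumed to be a `dict` of string keys to string values, as a properties parser would produce. Python's `str.split(".")` keeps leading and trailing empty strings, which is the behavior required here.

```python
def properties_to_tree(properties: dict) -> dict:
    """Build a tree of JSON-style objects from flat properties."""
    root = {}
    for key, value in properties.items():
        path = key.split(".")  # "a." -> ["a", ""], "." -> ["", ""]
        node = root
        for element in path[:-1]:
            child = node.get(element)
            if not isinstance(child, dict):
                # Object wins: a string stored here earlier is thrown out.
                child = {}
                node[element] = child
            node = child
        # Object wins: never replace an existing object with a string.
        if not isinstance(node.get(path[-1]), dict):
            node[path[-1]] = value
    return root
```

With `a=hello` and `a.b=world`, the result is `{"a": {"b": "world"}}` whichever property is seen first, so the outcome does not depend on the unordered iteration of a Java `Properties` object.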
### Root paths
By convention, a given application or library has a "root path."
Most commonly the root path has a single path element - "akka" for
example. But it could have multiple.
Conventional config file names and property names are derived from
the root path.
If an API looks like `load(rootPath)` then it would return an
object conceptually "at" the root path, not an object containing
the root path.
### Conventional configuration file names for JVM apps
To get config file names, join the elements of the root path with
a hyphen, then add appropriate filename extensions.
If the root path is `foo.bar` (two elements, `foo` and `bar`),
then the configuration files should be searched for under the
following resource names on the classpath:
- /foo-bar.conf
- /foo-bar.json
- /foo-bar.properties
- /foo-bar-reference.conf
- /foo-bar-reference.json
- /foo-bar-reference.properties
The .json and .properties files are examples; different
implementations may support different file types. The "reference"
files are intended to contain defaults and be shipped with the
library or application being configured.
Note that the configuration files are absolute resource paths, not
relative to the package. So you would call
`klass.getResource("/foo-bar.conf")` not
`klass.getResource("foo-bar.conf")`.
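
As a small illustration, the resource names can be derived from the root path like this in Python. The function name and the default extension tuple are hypothetical; the output order simply follows the list above.

```python
def config_resource_names(root_path, extensions=(".conf", ".json", ".properties")):
    """Absolute classpath resource names for a root path such as ['foo', 'bar']."""
    base = "/" + "-".join(root_path)
    return [
        base + suffix + extension
        for suffix in ("", "-reference")
        for extension in extensions
    ]
```

Calling `config_resource_names(["foo", "bar"])` yields the six names listed above, each starting with `/` so that it is an absolute resource path.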
### Conventional override by system properties
For an application's config, Java System properties _override_
HOCON found in the configuration file. This supports specifying
config options on the command line.
Those system properties which begin with an application's root
path should override the configuration for that application.
For example, if your config is for root path "akka", then your
config key "foo" would go with `-Dakka.foo=10`. When loading your
config, any system properties starting with `akka.` would be
merged into the config.
### Substitution fallback to system properties
Recall that if a substitution is not present (not even set to
`null`) within a configuration tree, implementations may search
for it from external sources. One such source could be Java system
properties.
To find a value for substitution, Java applications should look at
system properties directly, without the root path namespace.
Remember that namespaced system properties were already used as
overrides.
`${user.home}` would first look for a `user.home` in the
configuration tree (which has a scoped system property like
`akka.user.home` merged in!).
If no value for `${user.home}` exists in the configuration, the
implementation would look at system property `user.home` without
the `akka.` prefix.
The unprefixed system properties are _not_ merged in to the
configuration tree; if you iterate over your configuration, they
should not be in there. They are only used as a fallback when
evaluating substitutions.
The effect is to allow using generic system properties like
`user.home` and also to allow overriding those per-app.
So if someone wants to set their home directory for _all_ apps,
they set the `user.home` system property. If they then want to
force a particular home directory only for Akka, they could set
`akka.user.home` instead.
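
The lookup order described in this section could be sketched as follows. This is a simplification under stated assumptions: the configuration tree is modelled as a flat map from path to value (with any `akka.`-prefixed overrides already merged in), and `SubstitutionFallback.resolve` is a hypothetical name, not an existing API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SubstitutionFallback {
    // Hypothetical helper: the configuration tree is consulted first, and the
    // unprefixed system properties are used only when the path is absent.
    public static Object resolve(String path, Map<String, Object> config, Properties systemProps) {
        if (config.containsKey(path)) {
            // Present in the tree, even if set to null: no fallback happens.
            return config.get(path);
        }
        return systemProps.getProperty(path);
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("user.home", "/srv/akka"); // e.g. merged in from -Dakka.user.home
        Properties sys = new Properties();
        sys.setProperty("user.home", "/home/me");
        sys.setProperty("user.name", "me");
        System.out.println(resolve("user.home", config, sys)); // prints /srv/akka
        System.out.println(resolve("user.name", config, sys)); // prints me
    }
}
```

Note that the system properties are only read during the lookup; they are never copied into `config`, which matches the rule that fallback values do not show up when iterating over the configuration.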
### Substitution fallback to environment variables
Substitutions not found in the configuration may also fall back to
environment variables. In Java, fallback should be to system
properties first and environment variables second.
It's recommended that HOCON keys always use lowercase, because
environment variables generally are capitalized. This avoids
naming collisions between environment variables and configuration
properties. (While on Windows getenv() is generally not
case-sensitive, the lookup will be case sensitive all the way
until the env variable fallback lookup is reached.)
An application can explicitly block looking up a substitution in
the environment by setting a value in the configuration, with the
same name as the environment variable. You could set `HOME : null`
in your root object to avoid expanding `${HOME}` from the
environment, for example.
Environment variables are interpreted as follows:
- present and set to empty string: treated as not present
- System.getenv throws SecurityException: treated as not present
- encoding is handled by Java (System.getenv already returns
a Unicode string)
- environment variables always become a string value, though
if an app asks for another type, automatic type conversion
would kick in
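
The interpretation rules above might be implemented along these lines. `EnvFallback.lookup` is a hypothetical helper; the environment accessor is passed in as a function (an assumption made here so the rules can be exercised without touching the real environment), and `System::getenv` would be used in practice.

```java
import java.util.function.Function;

public class EnvFallback {
    // Hypothetical helper: returns the environment value for name, or null
    // when the variable should count as "not present".
    public static String lookup(String name, Function<String, String> getenv) {
        try {
            String value = getenv.apply(name);
            if (value == null || value.isEmpty()) {
                return null; // unset or set to the empty string: not present
            }
            return value; // always a string; type conversion is left to the caller
        } catch (SecurityException e) {
            return null; // access denied: not present
        }
    }

    public static void main(String[] args) {
        // System.getenv(String) already returns a Unicode string.
        System.out.println(lookup("HOME", System::getenv));
    }
}
```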
## Open issues
- should a few more special characters be banned from unquoted strings, to allow future extensions?

202
LICENSE-2.0.txt Normal file

@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

242
README.md

@ -1 +1,241 @@
Configuration library for Java.
Configuration library for JVM languages.
## Overview
- implemented in plain Java with no dependencies
- _extensive_ test coverage
- supports files in three formats: Java properties, JSON, and a
human-friendly JSON superset
- merges multiple files across all formats
- can load from files, URLs, or classpath
- good support for "nesting" (treat any subtree of the config the
same as the whole config)
- users can override the config with Java system properties,
`java -Dmyapp.foo.bar=10`
- parses duration and size settings, "512k" or "10 seconds"
- converts types, so if you ask for a boolean and the value
is the string "yes", or you ask for a float and the value is
an int, it will figure it out.
- JSON superset features:
    - comments
    - includes
    - substitutions (`"foo" : ${bar}`, `"foo" : Hello ${who}`)
    - properties-like notation (`a.b=c`)
    - less noisy, more lenient syntax
    - substitute environment variables and system properties
This library limits itself to config files. If you want to load
config from a database or something, you would need to build a
config object yourself and then merge it in.
## License
The license is Apache 2.0; see LICENSE-2.0.txt.
## Bugs and Patches
Report bugs to the GitHub issue tracker. Send patches as pull
requests on GitHub.
Along with any pull requests (or other means of contributing),
please state that the contribution is your original work (or that
you have the authority to license it) and that you license the
work under the Apache 2.0 license.
Whether or not you state this explicitly, by submitting any
copyrighted material via pull request, email, or other means you
agree to license the material under the Apache 2.0 license
and warrant that you have the legal authority to do so.
## API Example
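    // Load the merged configuration for root path "myapp"
    ConfigRoot conf = Config.load("myapp");
    // Read a nested value by path, or through an intermediate object
    int a = conf.getInt("foo.bar");
    ConfigObject obj = conf.getObject("foo");
    int b = obj.getInt("bar");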
## Standard behavior
You can load any files and merge them in any order, but the
convenience method `Config.load()` loads the following
(first-listed are higher priority):
- `myapp.*` system properties
- `myapp.conf` (these files are all from classpath)
- `myapp.json`
- `myapp.properties`
- `myapp-reference.conf`
- `myapp-reference.json`
- `myapp-reference.properties`
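
The priority order above amounts to "the first source that defines a key wins". A minimal sketch of that rule, assuming each source has already been flattened to a map from path to string (a simplification; `PriorityMerge` is a hypothetical name and not how the library represents config):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PriorityMerge {
    // Merges sources listed from highest to lowest priority; the first
    // source that defines a key supplies its value.
    public static Map<String, String> merge(List<Map<String, String>> sources) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> source : sources) {
            for (Map.Entry<String, String> e : source.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> systemProps = Map.of("foo.bar", "10");  // myapp.* system properties
        Map<String, String> conf = Map.of("foo.bar", "1");          // myapp.conf
        Map<String, String> reference = Map.of("foo.baz", "12");    // myapp-reference.conf
        Map<String, String> merged = merge(List.of(systemProps, conf, reference));
        System.out.println(merged.get("foo.bar")); // prints 10
        System.out.println(merged.get("foo.baz")); // prints 12
    }
}
```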
## JSON Superset
Tentatively called "Human-Optimized Config Object Notation" or
HOCON, also called `.conf`, see HOCON.md in this directory for more
detail.
### Features of HOCON
- Comments, with `#` or `//`
- Allow omitting the `{}` around a root object
- Allow `=` as a synonym for `:`
- Allow omitting the `=` or `:` before a `{` so
`foo { a : 42 }`
- Allow omitting commas as long as there's a newline
- Allow trailing commas after last element in objects and arrays
- Allow unquoted strings for keys and values
- Unquoted keys can use dot-notation for nested objects,
`foo.bar=42` means `foo { bar : 42 }`
- Duplicate keys are allowed; later values override earlier,
except for object-valued keys where the two objects are merged
recursively
- `include` feature merges root object in another file into
current object, so `foo { include "bar.json" }` merges keys in
`bar.json` into the object `foo`
- include with no file extension includes any of `.conf`,
`.json`, `.properties`
- substitutions `foo : ${a.b}` sets key `foo` to the same value
as the `b` field in the `a` object
- substitutions concatenate into unquoted strings, `foo : the
quick ${colors.fox} jumped`
- substitutions fall back to system properties and then
environment variables if they don't resolve in the
config itself, so `${HOME}` or `${user.home}` would
work as you expect.
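
The duplicate-key rule in the list above (later values override earlier ones, except that two objects are merged recursively) can be sketched with plain maps. This is only an illustration: `ObjectMerge` is a hypothetical class, and objects are modelled as `Map<String, Object>` rather than the library's own types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ObjectMerge {
    // Applies "later" on top of "earlier": when both values for a key are
    // objects they are merged recursively, otherwise the later value wins.
    @SuppressWarnings("unchecked")
    public static Map<String, Object> merge(Map<String, Object> earlier, Map<String, Object> later) {
        Map<String, Object> result = new LinkedHashMap<>(earlier);
        for (Map.Entry<String, Object> e : later.entrySet()) {
            Object oldValue = result.get(e.getKey());
            Object newValue = e.getValue();
            if (oldValue instanceof Map && newValue instanceof Map) {
                result.put(e.getKey(),
                        merge((Map<String, Object>) oldValue, (Map<String, Object>) newValue));
            } else {
                result.put(e.getKey(), newValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("foo", new LinkedHashMap<String, Object>(Map.of("bar", 10)));
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("foo", new LinkedHashMap<String, Object>(Map.of("baz", 12)));
        // Equivalent to writing foo { bar : 10 } and then foo { baz : 12 }
        System.out.println(merge(first, second)); // prints {foo={bar=10, baz=12}}
    }
}
```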
### Examples of HOCON
Start with valid JSON:

    {
        "foo" : {
            "bar" : 10,
            "baz" : 12
        }
    }

Drop root braces:

    "foo" : {
        "bar" : 10,
        "baz" : 12
    }

Drop quotes:

    foo : {
        bar : 10,
        baz : 12
    }

Use `=` and omit it before `{`:

    foo {
        bar = 10,
        baz = 12
    }

Remove commas:

    foo {
        bar = 10
        baz = 12
    }

Use dotted notation for unquoted keys:

    foo.bar=10
    foo.baz=12
The syntax is well-defined (including handling of whitespace and
escaping). But it handles many reasonable ways you might want to
format the file.
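
To show why the dotted form in the last example is equivalent to the nested form, here is a small sketch that expands a dotted key into nested maps. `DottedKeys.put` is a hypothetical helper, and it deliberately ignores quoted keys and value merging, which a real parser would have to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DottedKeys {
    // Stores value under a dotted path, creating intermediate objects as needed.
    // Simplification: every "." is a separator; quoted keys are not handled.
    @SuppressWarnings("unchecked")
    public static void put(Map<String, Object> root, String path, Object value) {
        String[] parts = path.split("\\.");
        Map<String, Object> current = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = current.get(parts[i]);
            if (!(child instanceof Map)) {
                child = new LinkedHashMap<String, Object>();
                current.put(parts[i], child);
            }
            current = (Map<String, Object>) child;
        }
        current.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> root = new LinkedHashMap<>();
        put(root, "foo.bar", 10);
        put(root, "foo.baz", 12);
        System.out.println(root); // prints {foo={bar=10, baz=12}}
    }
}
```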
## Future Directions
Here are some features that might be nice to add.
- "Type consistency": if a later config file changes the type of a
value from its type in `myapp-reference.conf` then complain
at parse time.
Right now if you set the wrong type, it will only complain
when the app tries to use the setting, not when the config
file is loaded.
- "myapp.d directory": allow parsing a directory. All `.json`,
`.properties` and `.conf` files should be loaded in a
deterministic order based on their filename.
- some way to merge array and object types. One approach could
be: `searchPath=${searchPath} ["/usr/local/foo"]`, which
involves two features: 1) substitutions referring to the key
being assigned would have to look at that key's value later in
the merge stack (rather than complaining about circularity); 2)
objects and arrays would have to be merged if a series of them
appear after a key, similar to how strings are concatenated
already. A simpler but much more limited approach would add
`+=` as an alternative to `:`/`=`, where `+=` would append an
array value to the array's previous value.
(Note that regular `=` already merges object values; to avoid
object merge you have to first set the object to a non-object
such as null, then set a new object.)
- "application.conf": normally there is no "global"
configuration; each application does its own
`Config.load("myapp")`. However, it might be nice if you could
put all your config for your app and libraries you use in a
single file. This could be called "application.conf" for
example. `Config.load("myapp")` would load "application.conf"
and merge in the `"myapp"` object from "application.conf",
so if "application.conf" contained: `myapp { foo=3 }` then
the key `foo` would be set in the result of
`Config.load("myapp")`. Apps could then put all their config
in "application.conf", if desired.
- "delete": allow deleting a field, which is slightly different
from setting it to null (deletion allows fallback to values
in system properties and the environment, for example).
This could be done using the same syntax as `include`,
potentially. It is not a backward-compatible change though.
## Rationale
(For the curious.)
The three file formats each have advantages.
- Java `.properties`:
    - Java standard, built in to JVM
    - Supported by many tools such as IDEs
- JSON:
    - easy to generate programmatically
    - well-defined and standard
    - bad for human maintenance, with no way to write comments,
      and no mechanisms to avoid duplication of similar config
      sections
- HOCON/`.conf`:
    - nice for humans to read, type, and maintain, with more
      lenient syntax
    - built-in tools to avoid cut-and-paste
    - ways to refer to the system environment, such as system
      properties and environment variables
The idea would be to use JSON if you're writing a script to spit
out config, and use HOCON if you're maintaining config by hand.
If you're doing both, then mix the two.
Two alternatives to HOCON syntax could be:
- YAML is also a JSON superset and has a mechanism for adding
custom types, so the include statements in HOCON could become
a custom type tag like `!include`, and substitutions in HOCON
could become a custom tag such as `!subst`, for example. The
result is somewhat clunky to write, but would have the same
in-memory representation as the HOCON approach.
- Put a syntax inside JSON strings, so you might write something
like `"$include" : "filename"` or allow `"foo" : "${bar}"`.
This is a way to tunnel new syntax through a JSON parser, but
other than the implementation benefit (using a standard JSON
parser), it doesn't really work. It's a bad syntax for human
maintenance, and it's not valid JSON anymore because properly
interpreting it requires treating some valid JSON strings as
something other than plain strings. A better approach is to
allow mixing true JSON files into the config but also support
a nicer format.

347
SPEC.md

@ -1,347 +0,0 @@
# HOCON (Human-Optimized Config Object Notation)
Very informal spec.
In this, "application" really could mean "library," it means "a thing
defining a configuration, such as Akka"
Some Java-specific stuff is in here, though a Java-independent version would
also be possible, without the system properties parts.
Many existing akka.conf and Play application.conf files would probably parse in this format, though details would be a little different (encoding, escaping, whitespace) and that could affect some configurations.
## Goals
The primary goal is: keep the semantics (tree structure; set of types;
encoding/escaping) from JSON, but make it more convenient as a
human-editable config file format.
The following features are desirable, to support human usage:
- less noisy / less pedantic syntax
- ability to refer to another part of the configuration (set a value to
another value)
- import/include another configuration file into the current file
- a mapping to a flat properties hierarchy such as Java's System.properties
- ability to get values from environment variables
- ability to write comments
The first implementation should have these properties:
- pure Java with no external dependencies
- API easily supports "nesting" (get a subtree then treat it as a root)
- API supports a "root" name which is used to scope Java properties.
So if you say your config is for "akka" then your config key "foo"
would go with `-Dakka.foo=10`, but in the config file people don't
have to write the root `akka` object.
- application can define the search path for include statements
(with a function from string to input stream)
- API supports treating values as "Durations" and "Memory sizes" in
addition to basic JSON types (for example "50ms" or "128M")
- API should attempt to perform reasonable type conversions.
This might include: treating numbers 0 or 1 as booleans, treating
strings yes/no/y/n as booleans, integer to and from float,
any number or boolean to a string. This puts "figure out
what people mean" in the API rather than in the syntax spec.
- API should support reloading the config dynamically (some kind of reload listener functionality)
## Syntax
### Basic syntax
This describes a delta between HOCON and JSON.
The same as JSON:
- files must be valid UTF-8
- quoted strings are in the same format as JSON strings
- values have possible types: string, number, object, array, boolean, null
(more types can be introduced by an API but the syntax just has those)
- allowed number formats match JSON
Different from JSON:
- anything between "//" or "#" and the next newline is considered a comment
and ignored (unless the "//" or "#" is inside a quoted string)
- a _key_ is a string JSON would have to the left of `:` and a _value_ is
anything JSON would have to the right of `:`
- a stream may begin with a key, rather than the opening brace of a root
object. In this case a root object is implied.
- `=` is used in place of `:`
- the comma after a value may be omitted as long as there is a newline
instead
- keys with an object as their value may omit `=`, so `foo { }` means
`foo = { }`
- keys may be unquoted strings (see below for detailed definition)
- only if a key is unquoted, the `.` character has a special meaning and
creates a new object. So `foo.bar = 10` means to create an object at key
`foo`, then inside that object, create a key `bar` with value
`10`.
- quoted keys _should not_ contain the `.` character because it's
confusing, but it is permitted (to preserve the ability to use any string
as a key and thus convert an arbitrary map or JavaScript object into HOCON)
- if a key is defined twice, it is not an error; the later definition
wins for all value types except object. For objects, the two objects
are merged with the later object's keys winning.
- because later objects are merged, you can do `foo.bar=10,foo.baz=12` and
get `foo { bar=10,baz=12 }` rather than overwriting `bar` with `baz`
- also this means you can include a file that defines all keys and then
a later file can override only some of those keys
- to replace an object entirely you can first set it to `null` or other
non-object value, and then set it to an object again
- arrays may be merged by using `++=` or `+=` rather than `=` to separate key
from value.
- `++=` must have an array value on the right, and will concatenate it
with any previous array value (a previously-undefined value
is treated as `[]`)
- `+=` can have any value on the right and will append it to a
previously-defined array or if none was defined, make it
an array of one element
- FIXME prepend operator?
- a new type of value exists, substitution, which looks like `${some.path}`
(details below)
- String values may sometimes omit quotes.
- Unquoted strings may not contain '$', '"', '{', '}',
'[', ']', ':', '=', ',', '+', '#', '/' or '\' (backslash)
and may not contain whitespace (including newlines).
- Unquoted strings do not support any form of escaping; the
characters are all left as-is. If you need to use special
characters or escaping, you have to quote the string.
- Because of "value concatenation" rules (see below) you can
write a sentence with whitespace unquoted, though.
- Any unquoted series of characters that parses as a
substitution, true, false, null, number, or quoted string
will be treated as the type it parses as, rather than as
an unquoted string. However, in "value concatenation"
the non-string types convert to strings, which means
you can have the word "true" in an unquoted sentence.
- true, false, null, numbers only parse as such if they
immediately follow at least one character that is not
allowed in unquoted strings. That is, `truefoo` is
the value `true` then the unquoted string `foo`, but
`footrue` is the unquoted string `footrue`.
- quoted strings and substitutions always parse as such
since they begin with a character that can't be in an
unquoted string.
- Value concatenation: to support substitutions, and unquoted
sentences with whitespace, a value may consist of multiple
values which are concatenated into one
string. `"foo"${some.path}"bar"` or `The quick brown fox`.
- let a "simple value" be the set of JSON values excluding
objects and arrays, and including unquoted strings and
substitutions.
- as long as simple values are separated only by non-newline
whitespace, the _whitespace between them is preserved_
and the values, along with the whitespace, are concatenated
into a string.
- Whitespace before the first and after the last simple value
will be discarded. Only whitespace _between_ simple values
is preserved.
- concatenation never spans a newline or a non-simple-value
token.
- the result of the concatenation is a string value.
- the special key `include` followed directly by a string value (with no
`=`) means to treat that string value as a filename and merge the
object defined in that file into the current object, overriding
any keys defined earlier and being overridden by any keys defined later.
The search path for the file (which may include the classpath or certain
directories in the filesystem) will be application-defined.
- An included filename can refer to any of a HOCON file or a JSON file or a
Java properties file. These are distinguished by extension:
- `.properties` for Java properties (parser built into the JDK)
- `.json` for JSON (can be parsed with a slightly modified HOCON parser)
- `.conf` or `.hocon` for HOCON
- If the included filename has no extension, then any of the above
extensions are allowed. If the included filename has an extension
already then it refers to precisely that filename and the format
is not flexible.
### Path expressions
Path expressions are used to write out a path through the object
graph. They appear in two places; in substitutions, like
`${foo.bar}`, and as the keys in objects like `{ foo.bar : 42 }`.
Path expressions work like a value concatenation, except that they
may not contain substitutions. This means that you can't nest
substitutions inside other substitutions, and you can't have
substitutions in keys.
When concatenating the path expression, any `.` characters outside quoted
strings or numbers are understood as path separators, while inside quoted
strings `.` has no special meaning. So `foo.bar."hello.world"` would be
a path with three elements, looking up key `foo`, key `bar`, then key
`hello.world`.
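
The splitting rule above can be sketched as a small scanner. This is a simplified illustration: `PathElements.split` is a hypothetical helper, and it handles neither escape sequences inside quoted strings nor the special treatment of numbers.

```java
import java.util.ArrayList;
import java.util.List;

public class PathElements {
    // Splits a path expression on "." characters that appear outside quotes.
    // Simplification: no escape handling inside quoted strings.
    public static List<String> split(String path) {
        List<String> elements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // quotes delimit but are not part of the key
            } else if (c == '.' && !inQuotes) {
                elements.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        elements.add(current.toString());
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(split("foo.bar.\"hello.world\"")); // prints [foo, bar, hello.world]
    }
}
```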
### Java properties mapping
See the Java properties spec here: http://download.oracle.com/javase/7/docs/api/java/util/Properties.html#load%28java.io.Reader%29
There is a mapping from Java properties to HOCON,
`foo.bar.baz=10` to `foo={ bar={ baz=10 } }`. If an HOCON key has a `.` in it
(possible by quoting the HOCON key) then there is no way to refer to it as a
Java property; it is not recommended to name HOCON keys with a `.` in them.
For an application's config, Java System properties _override_ HOCON found
in the configuration file. This supports specifying config options on the
command line.
When loading a configuration, all System properties should be merged in.
Generally an application's configuration should be under a root namespace,
to avoid merging every system property in the whole process.
For example, say your config is for "akka" then your config key "foo" would
go with `-Dakka.foo=10`. When loading your config, any system properties
starting with `akka.` would be merged into the config.
System properties always have string values, but they are parsed as a
simplified HOCON value when merged:
- valid boolean, null, and number literals become those types
- anything else is a string
- no substitution, unescaping, unquoting is performed
- the idea here is to avoid "doubling" the string escaping/encoding rules and making it hard to put a string in a property, but the price is that you can't put objects and arrays in properties. FIXME: alternative is to key off whether the property starts with a `[` or `{` or to do a delayed parse when we see what type the app asks for.
### Substitutions
Substitutions are a way of referring to other parts of the configuration
tree.
The syntax is `${stringvalue}` where the `stringvalue` is a path expression
(see above).
Substitution processing is performed as the last parsing step, so a
substitution can look forward in the configuration file and even retrieve a
value from a Java System property. This also means that a substitution will
evaluate to the last-assigned value for a given option (since identical keys
later in the stream override those earlier in the stream).
Circular substitutions are an error; implementations might try to detect
them in a nicer way than stack overflow, for bonus points.
Substitutions are allowed in values, but not in keys. Substitutions are not
evaluated inside quoted strings, they must be "toplevel" values.
A substitution is replaced with any value type (number, object, string,
array, true, false, null). If the substitution is the only part of a value,
then the type is preserved. If the substitution is part of a string (needs
to be concatenated) then it is an error if the substituted value is an
object or array. Otherwise the value is converted to a string value as follows:
- `null` is converted to an empty string, not the string `null`
(note that this differs from a literal null value in a value
concatenation, which becomes the string "null")
- strings are already strings
- numbers are converted to a string that would parse as a valid number in
HOCON
- booleans are converted to `true` or `false`
Substitutions are looked up in the root object, using the regular key
syntax, so `foo.bar` is split to be "key `bar` inside the object at key
`foo`", but `"foo.bar"` is not split because it's quoted.
Recall that the root object already has system properties merged in as
overrides. So looking up `foo.bar` in root object `akka` would get the
system property `akka.foo.bar` if that system property were present.
Substitutions are looked up in three namespaces, in order:
- the application's normal root object.
- System properties directly, without the root namespace. So
  `${user.home}` would first look for a `user.home` in the root
  configuration (which has a scoped system property like `akka.user.home`
  merged in!) and if that failed, it would look at the system property
  `user.home` without the `akka.` prefix.
    - the intent is to allow using generic system properties like
      `user.home` and also to allow overriding those per-app.
    - the intent is NOT to allow accessing config from other apps,
      only to allow access to global system properties
- system environment variables
If a substitution is not resolved, it evaluates to JSON value `null`, which
would then be converted to an empty string if the substitution is part of a
string.
If a substitution evaluates to `null` (which may mean it was either unset,
or explicitly set to `null`), then environment variables should be used as a
fallback.
Note that environment variables and global system properties are fallbacks,
while app-scoped system properties are an override.
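The lookup order can be sketched as a chain of fallbacks. This is only an illustration under stated assumptions: each namespace is passed in as a flat `Map<String, String>` (the root map is assumed to already have app-scoped system properties merged in as overrides), values are left as raw strings, and `SubstitutionLookup.lookup` is a hypothetical name.

```java
import java.util.Map;

// Hypothetical sketch of the three-namespace lookup order.
final class SubstitutionLookup {
    static String lookup(String path,
                         Map<String, String> root,
                         Map<String, String> systemProps,
                         Map<String, String> env) {
        // 1. The application's root object (app-scoped overrides included).
        String value = root.get(path);
        // 2. Global system properties, without the root prefix.
        if (value == null) {
            value = systemProps.get(path);
        }
        // 3. Environment variables; an empty value counts as not present.
        if (value == null) {
            String fromEnv = env.get(path);
            if (fromEnv != null && !fromEnv.isEmpty()) {
                value = fromEnv;
            }
        }
        return value; // null means the substitution is unresolved
    }
}
```

Passing the namespaces as maps keeps the sketch testable; a real program could pass `System.getenv()` as the last argument.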
It's recommended that HOCON keys always use lowercase, because environment
variables are generally written in uppercase. This avoids naming collisions
between environment variables and configuration properties. (While on Windows
getenv() is generally not case-sensitive, the lookup will be case-sensitive
all the way until the env variable fallback lookup is reached.)
An application can explicitly block looking up a substitution in the
environment by setting a non-`null` value in the configuration, with the
same name as the environment variable. But there is no way to set a key to
`null` if a non-empty environment variable is present.
Environment variables are interpreted as follows:
- present and set to empty string: treated as not present
- `System.getenv` throws `SecurityException`: treated as not present
- encoding is handled by Java (`System.getenv` already returns
  a Unicode string)
- parsed as a simplified value token, as with system properties:
    - valid boolean, null, and number literals become those types
    - anything else is a string
    - no substitution, unescaping, unquoting is performed
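A minimal sketch of these rules, assuming the JSON number grammar defines a "valid number literal" and modeling the results as `Boolean`, `Long`, `Double`, `String`, or Java `null`. The names `EnvValue`, `read`, and `parse` are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of environment variable interpretation.
final class EnvValue {
    // JSON number grammar: optional minus, integer part, optional fraction
    // and optional exponent.
    private static final Pattern NUMBER =
        Pattern.compile("-?(0|[1-9]\\d*)(\\.\\d+)?([eE][+-]?\\d+)?");

    // Returns the raw text of a variable, or null for "not present".
    // Empty values and SecurityException are both treated as not present.
    static String read(String name) {
        try {
            String raw = System.getenv(name);
            return (raw == null || raw.isEmpty()) ? null : raw;
        } catch (SecurityException e) {
            return null;
        }
    }

    // Interprets non-empty raw text as a simplified value token.
    // Returns Java null for the literal "null".
    static Object parse(String raw) {
        if (raw.equals("true")) return Boolean.TRUE;
        if (raw.equals("false")) return Boolean.FALSE;
        if (raw.equals("null")) return null;
        if (NUMBER.matcher(raw).matches()) {
            try {
                return Long.valueOf(raw);
            } catch (NumberFormatException e) {
                // Has a fraction or exponent, or is too large for a long.
                return Double.valueOf(raw);
            }
        }
        return raw; // no substitution, unescaping, or unquoting
    }
}
```

Note that a quoted value such as `"hi"` keeps its quote characters, and text such as `${x}` stays a plain string, since neither unquoting nor substitution is applied.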
## Examples
To get this JSON:

    {
        "foo" : {
            "bar" : 10,
            "baz" : 12
        }
    }

You could write any of:

    foo.bar=10
    foo.baz=12

or

    foo {
        bar=10
        baz=12
    }

or

    foo {
        bar=10, baz=12
    }

or

    foo {
        "bar"=10
        "baz"=12
    }

or

    foo {
        bar=10
    }
    foo {
        baz=12
    }
## application.conf
It might be nice if the API by default loaded an `application.{conf,properties,json}` which would be merged into any config and rooted at the true global root.
So for example, if Akka said its config root was `akka`, then by default an `akka.conf` is loaded and for conversion to and from system properties, it's rooted at `akka.`. Then inside `application.conf`, you could have an `akka { timeout=5 }` sort of section. The `application.conf` would load later, after `akka.conf`, and then system properties would override everything.
The purpose of `application.conf` is to allow apps to configure everything in a single file.
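The proposed load order (library defaults such as `akka.conf`, then `application.conf`, then system properties) amounts to merging layers where later ones win. The sketch below illustrates that idea only; it assumes each layer is already flattened to a `Map<String, String>` keyed by full path, and `LayeredConfig.merge` is a hypothetical name rather than a proposed API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: merge config layers so that later layers override
// earlier ones, e.g. merge(akkaConf, applicationConf, systemProperties).
final class LayeredConfig {
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... layers) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer); // existing keys are overwritten
        }
        return merged;
    }
}
```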