LCP: Refactor the type representation
Summary:
# Summary
## Concepts and terminology
A **namestring for <object>** is a Lisp string that names the C++ language element
`<object>` (such as a namespace, a variable, a class, etc.). Therefore we have a
"namestring for a namespace", a "namestring for a variable", etc.
A **designator for a namestring for <object>** is a Lisp object that denotes a
"namestring for <object>". These are symbols and strings themselves.
A **typestring** is a Lisp string that represents a C++ type, i.e. its
declaration.
A **typestring designator** is a Lisp object that denotes a typestring.
A typestring (and the corresponding C++ type) is said to be **supported** if it
can be parsed using `parse-cpp-type-declaration`. This concept is important and
should form the base of our design because we can't really hope to ever support
all of C++'s type declarations.
A typestring (and the corresponding C++ type) that is not supported is
**unsupported**.
A **processed typestring** is a typestring that is either fully qualified or
unqualified.
A C++ type is said to be **known** if LCP knows extra information about the type,
rather than just what kind of type it is, in which namespace it lives, etc. For
now, the only known types are those that are defined within LCP itself using
`define-class` & co.
A C++ type is **unknown** if it is not known.
**Typestring resolution** is the process of resolving a (processed) typestring
into an instance of `cpp-type` or `unsupported-cpp-type`.
**Resolving accessors** are accessors which perform typestring resolution.
## Changes
Explicitly introduce the concept of supported and known types. `cpp-type` models
supported types while `unsupported-cpp-type` models unsupported types.
Subclasses of `cpp-type` model known types. `general-cpp-type` is either a
`cpp-type` or an `unsupported-cpp-type`.
Add various type queries.
Fix `define-rpc`'s `:initarg` (remove it from the `cpp-member` struct).
Introduce namestrings and their designators in `names.lisp`.
Introduce typestrings and their designators.
Introduce **processed typestrings**. Our DSL's macros (`define-class` & co.)
convert all of the given typestrings into processed typestrings because we don't
attempt to support partially qualified names and relative name lookup yet. A
warning is signalled when a partially qualified name is treated as a fully
qualified name.
The slots of `cpp-type`, `cpp-class`, `cpp-member` and `cpp-capnp-opts` now
store processed typestrings which are lazily resolved into their corresponding
C++ types.
The only thing that instances of `unsupported-cpp-type` are good for is getting
the typestring that was used to construct them. Most of LCP's functions only
work with known C++ types, i.e. `cpp-type` instances. The only function so far
that works for both of them is `cpp-type-decl`.
Since "unsupportedness" is now explicitly part of LCP, client code is expected
to manually check whether a returned type is unsupported or not (or face
receiving an error otherwise), unless a function is documented to return only
`cpp-type` instances.
A similar thing goes for "knowness". Client code is expected to manually check
whether a returned type is known or not, unless a function is documented to
return only (instances of `cpp-type` subclasses) known types.
## TODO
Resolution still has to be done for the following slots of the
following structures:
- `slk-opts`
- `save-args`
- `load-args`
- `clone-opts`
- `args`
- `return-type`
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1962
2019-05-10 19:55:21 +08:00
|
|
|
(in-package #:lcp)
|
|
|
|
|
|
|
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
|
|
|
;;;
|
|
|
|
;;; Name operations on strings
|
|
|
|
|
|
|
|
(defun uppercase-property-resolver (name)
|
|
|
|
(when (string= name "Uppercase")
|
|
|
|
#'upper-case-p))
|
|
|
|
|
|
|
|
(defun string-upcase-first (string)
|
|
|
|
"Upcase the first letter of the string STRING, if any."
|
|
|
|
(check-type string string)
|
|
|
|
(if (string= string "") "" (string-upcase string :end 1)))
|
|
|
|
|
|
|
|
(defun string-downcase-first (string)
|
|
|
|
"Downcase the first letter of the string STRING, if any."
|
|
|
|
(check-type string string)
|
|
|
|
(if (string= string "") "" (string-downcase string :end 1)))
|
|
|
|
|
|
|
|
(defun split-camel-case-string (string)
|
|
|
|
"Split the camelCase string STRING into a list of parts. The parts are
|
|
|
|
delimited by uppercase letters."
|
|
|
|
(check-type string string)
|
|
|
|
;; NOTE: We use a custom property resolver which handles the Uppercase
|
|
|
|
;; property by forwarding to UPPER-CASE-P. This is so that we avoid pulling
|
|
|
|
;; CL-PPCRE-UNICODE & co.
|
|
|
|
(let ((cl-ppcre:*property-resolver* #'uppercase-property-resolver))
|
|
|
|
;; NOTE: We use an explicit CREATE-SCANNER call in order to avoid issues
|
|
|
|
;; with CL-PPCRE's compiler macros which use LOAD-TIME-VALUE which evaluates
|
|
|
|
;; its forms within a null lexical environment (so our
|
|
|
|
;; CL-PPCRE:*PROPERTY-RESOLVER* binding would not be seen). Edi actually
|
|
|
|
;; hints at the problem within the documentation with the sentence "quiz
|
|
|
|
;; question - why do we need CREATE-SCANNER here?". :-)
|
|
|
|
;;
|
|
|
|
;; NOTE: This regex is a zero-width positive lookahead regex. It'll match
|
|
|
|
;; any zero-width sequence that is followed by an uppercase letter.
|
|
|
|
(cl-ppcre:split (cl-ppcre:create-scanner "(?=\\p{Uppercase})") string)))
|
|
|
|
|
|
|
|
(defun split-pascal-case-string (string)
|
|
|
|
"Split the PascalCase string STRING into a list of parts. The parts are
|
|
|
|
delimited by uppercase letters."
|
|
|
|
(check-type string string)
|
|
|
|
(split-camel-case-string string))
|
|
|
|
|
|
|
|
(defun split-snake-case-string (string)
|
|
|
|
"Split the snake_case string STRING into a list of parts. The parts are
|
|
|
|
delimited by underscores. The underscores are not preserved. Empty parts are
|
|
|
|
trimmed on both sides."
|
|
|
|
(check-type string string)
|
|
|
|
(cl-ppcre:split "_" string))
|
|
|
|
|
|
|
|
(defun split-kebab-case-string (string)
|
|
|
|
"Split the kebab-case string STRING into a list of parts. The parts are
|
|
|
|
delimited by dashes. The dashes are not preserved. Empty parts are trimmed on
|
|
|
|
both sides."
|
|
|
|
(check-type string string)
|
|
|
|
(cl-ppcre:split "-" string))
|
|
|
|
|
|
|
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
|
|
|
;;;
|
|
|
|
;;; Name operations on "things"
|
|
|
|
|
|
|
|
(defun split-cased-thing (thing &key from-style)
|
|
|
|
"Split THING into a list of parts according to its type.
|
|
|
|
|
|
|
|
- If THING is a symbol, it is split using SPLIT-KEBAB-CASE-STRING.
|
|
|
|
|
|
|
|
- If THING is a string, it is split according to the FROM-STYLE keyword
|
|
|
|
argument. FROM-STYLE can be one of :CAMEL, :PASCAL, :SNAKE or :KEBAB and
|
|
|
|
denotes splitting using SPLIT-CAMEL-CASE-STRING, SPLIT-PASCAL-CASE-STRING,
|
|
|
|
SPLIT-SNAKE-CASE-STRING and SPLIT-KEBAB-CASE-STRING respectively. If
|
|
|
|
FROM-STYLE is omitted or NIL, it is treated as if :CAMEL was given."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(ctypecase thing
|
|
|
|
(symbol (split-kebab-case-string (string thing)))
|
|
|
|
(string
|
|
|
|
(ccase from-style
|
|
|
|
((nil :camel :pascal) (split-camel-case-string thing))
|
|
|
|
(:snake (split-snake-case-string thing))
|
|
|
|
(:kebab (split-kebab-case-string thing))))))
|
|
|
|
|
|
|
|
(defun camel-case-name (thing &key from-style)
|
|
|
|
"Return a camelCase string from THING.
|
|
|
|
|
|
|
|
The string is formed according to the parts of THING as returned by
|
|
|
|
SPLIT-CASED-THING. FROM-STYLE is passed to SPLIT-CASED-THING."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(string-downcase-first
|
|
|
|
(format nil "~{~A~}"
|
|
|
|
(mapcar (alexandria:compose #'string-upcase-first #'string-downcase)
|
|
|
|
(split-cased-thing thing :from-style from-style)))))
|
|
|
|
|
|
|
|
(defun pascal-case-name (thing &key from-style)
|
|
|
|
"Return a PascalCase string from THING.
|
|
|
|
|
|
|
|
The string is formed according to the parts of THING as returned by
|
|
|
|
SPLIT-CASED-THING. FROM-STYLE is passed to SPLIT-CASED-THING."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(string-upcase-first (camel-case-name thing :from-style from-style)))
|
|
|
|
|
|
|
|
(defun lower-snake-case-name (thing &key from-style)
|
|
|
|
"Return a lower_snake_case string from THING.
|
|
|
|
|
|
|
|
The string is formed according to the parts of THING as returned by
|
|
|
|
SPLIT-CASED-THING. FROM-STYLE is passed to SPLIT-CASED-THING."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(string-downcase
|
|
|
|
(format nil "~{~A~^_~}" (split-cased-thing thing :from-style from-style))))
|
|
|
|
|
|
|
|
(defun upper-snake-case-name (thing &key from-style)
|
|
|
|
"Return a UPPER_SNAKE_CASE string from THING.
|
|
|
|
|
|
|
|
The string is formed according to the parts of THING as returned by
|
|
|
|
SPLIT-CASED-THING. FROM-STYLE is passed to SPLIT-CASED-THING."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(string-upcase
|
|
|
|
(format nil "~{~A~^_~}" (split-cased-thing thing :from-style from-style))))
|
|
|
|
|
|
|
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
|
|
|
;;;
|
|
|
|
;;; Namestrings
|
|
|
|
|
|
|
|
(defun ensure-namestring-for (thing func)
|
|
|
|
"Return the namestring corresponding to the namestring designator THING.
|
|
|
|
|
|
|
|
- If THING is a symbol, return the result of calling FUNC on its name.
|
|
|
|
|
|
|
|
- If THING is a string, return it."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(ctypecase thing
|
|
|
|
(symbol (funcall func thing))
|
|
|
|
(string thing)))
|
|
|
|
|
|
|
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
|
|
|
;;;
|
|
|
|
;;; C++ names and namestrings
|
|
|
|
|
|
|
|
(eval-when (:compile-toplevel :load-toplevel :execute)
|
|
|
|
(defparameter +cpp-namestring-docstring+
|
|
|
|
"Return the ~A namestring corresponding to the ~A namestring designator
|
|
|
|
THING.
|
|
|
|
|
|
|
|
- If THING is a symbol, return the result of calling ~A on its name.
|
|
|
|
|
|
|
|
- If THING is a string, return it."))
|
|
|
|
|
|
|
|
(defmacro define-cpp-name (cpp-object name-op)
|
|
|
|
"Define a name function and a namestring function for the C++ language element
|
|
|
|
named by the symbol CPP-OBJECT. Both functions rely on the function named by the
|
|
|
|
symbol NAME-OP to perform the actual operation.
|
|
|
|
|
|
|
|
The name function's name is of the form CPP-<CPP-OBJECT>-NAME.
|
|
|
|
|
|
|
|
The namestring function's name is of the form
|
|
|
|
ENSURE-NAMESTRING-FOR-<CPP-OBJECT>."
|
2019-05-24 20:16:42 +08:00
|
|
|
(let ((cpp-name-for (alexandria:symbolicate 'cpp-name-for- cpp-object)))
|
|
|
|
`(progn
|
|
|
|
(defun ,cpp-name-for (thing &key from-style)
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(,name-op thing :from-style from-style))
|
|
|
|
(setf (documentation ',cpp-name-for 'function)
|
|
|
|
(documentation ',name-op 'function))
|
|
|
|
(defun ,(alexandria:symbolicate 'ensure-namestring-for- cpp-object) (thing)
|
|
|
|
,(format nil +cpp-namestring-docstring+
|
|
|
|
(string-downcase cpp-object)
|
|
|
|
(string-downcase cpp-object)
|
|
|
|
name-op)
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(ensure-namestring-for thing #',name-op)))))
|
LCP: Refactor the type representation
Summary:
# Summary
## Concepts and terminology
A **namestring for <object>** is a Lisp string that names the C++ language element
`<object>` (such as a namespace, a variable, a class, etc.). Therefore we have a
"namestring for a namespace", a "namestring for a variable", etc.
A **designator for a namestring for <object>** is a Lisp object that denotes a
"namestring for <object>". These are symbols and strings themselves.
A **typestring** is a Lisp string that represents a C++ type, i.e. its
declaration.
A **typestring designator** is a Lisp object that denotes a typestring.
A typestring (and the corresponding C++ type) is said to be **supported** if it
can be parsed using `parse-cpp-type-declaration`. This concept is important and
should form the base of our design because we can't really hope to ever support
all of C++'s type declarations.
A typestring (and the corresponding C++ type) that is not supported is
**unsupported**.
A **processed typestring** is a typestring that is either fully qualified or
unqualified.
A C++ type is said to be **known** if LCP knows extra information about the type,
rather than just what kind of type it is, in which namespace it lives, etc. For
now, the only known types are those that are defined within LCP itself using
`define-class` & co.
A C++ type is **unknown** if it is not known.
**Typestring resolution** is the process of resolving a (processed) typestring
into an instance of `cpp-type` or `unsupported-cpp-type`.
**Resolving accessors** are accessors which perform typestring resolution.
## Changes
Explicitly introduce the concept of supported and known types. `cpp-type` models
supported types while `unsupported-cpp-type` models unsupported types.
Subclasses of `cpp-type` model known types. `general-cpp-type` is either a
`cpp-type` or an `unsupported-cpp-type`.
Add various type queries.
Fix `define-rpc`'s `:initarg` (remove it from the `cpp-member` struct).
Introduce namestrings and their designators in `names.lisp`.
Introduce typestrings and their designators.
Introduce **processed typestrings**. Our DSL's macros (`define-class` & co.)
convert all of the given typestrings into processed typestrings because we don't
attempt to support partially qualified names and relative name lookup yet. A
warning is signalled when a partially qualified name is treated as a fully
qualified name.
The slots of `cpp-type`, `cpp-class`, `cpp-member` and `cpp-capnp-opts` now
store processed typestrings which are lazily resolved into their corresponding
C++ types.
The only thing that instances of `unsupported-cpp-type` are good for is getting
the typestring that was used to construct them. Most of LCP's functions only
work with known C++ types, i.e. `cpp-type` instances. The only function so far
that works for both of them is `cpp-type-decl`.
Since "unsupportedness" is now explicitly part of LCP, client code is expected
to manually check whether a returned type is unsupported or not (or face
receiving an error otherwise), unless a function is documented to return only
`cpp-type` instances.
A similar thing goes for "knowness". Client code is expected to manually check
whether a returned type is known or not, unless a function is documented to
return only (instances of `cpp-type` subclasses) known types.
## TODO
Resolution still has to be done for the following slots of the
following structures:
- `slk-opts`
- `save-args`
- `load-args`
- `clone-opts`
- `args`
- `return-type`
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1962
2019-05-10 19:55:21 +08:00
|
|
|
|
|
|
|
(define-cpp-name namespace lower-snake-case-name)
|
|
|
|
(define-cpp-name class pascal-case-name)
|
|
|
|
(define-cpp-name enum pascal-case-name)
|
|
|
|
(define-cpp-name type-param pascal-case-name)
|
|
|
|
(define-cpp-name variable lower-snake-case-name)
|
|
|
|
(define-cpp-name enumerator upper-snake-case-name)
|
|
|
|
|
|
|
|
(defun cpp-name-for-member (thing &key from-style structp)
|
|
|
|
"Just like CPP-NAME-FOR-VARIABLE except that the suffix \"_\" is added unless
|
|
|
|
STRUCTP is true."
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(format nil "~A~@[_~]"
|
|
|
|
(cpp-name-for-variable thing :from-style from-style)
|
|
|
|
(not structp)))
|
|
|
|
|
|
|
|
(defun ensure-namestring-for-member (thing &key structp)
|
|
|
|
(check-type thing (or symbol string))
|
|
|
|
(ensure-namestring-for
|
|
|
|
thing (lambda (symbol) (cpp-name-for-member symbol :structp structp))))
|
|
|
|
|
|
|
|
(setf (documentation 'ensure-namestring-for-member 'function)
|
|
|
|
(format nil +cpp-namestring-docstring+
|
|
|
|
"member" "member" 'cpp-member-name))
|