standardese documentation generator version 0.3: Groups, inline documentation, template mode & more
After two bugfix releases for the parsing code, I finally got around to implementing more features for standardese. A complete refactoring of the internal code allowed me to implement some advanced features: standardese now comes with member groups, the ability to show inline documentation, a template language and many minor things that just improve the overall documentation generation.
standardese is a documentation generator specifically designed for C++ code. It supports and detects many idioms for writing C++ documentation. It aims to be a replacement for Doxygen.
Yet again an update on the parsing situation
I’m using libclang for the parsing, but because it has many limitations, I’m forced to run my own parser over the tokens of each entity to get the required information.
But because libclang’s tokenizer does not preprocess the tokens, I used Boost.Wave to preprocess the tokens and then parsed them. This leads to problems if you have source entities that are generated by a macro, like in the following example:
#define MAKE_STRUCT(name) \
struct name \
{ \
int a; \
};
MAKE_STRUCT(foo)
MAKE_STRUCT(bar)
When parsing foo or bar, I’ll get the tokens of the macro instead of the expanded tokens.
Because I do not want to influence the way you write C++ code, I was forced to do something else.
This is one of standardese’s main goals: There should be no need to adapt your code to standardese. It should just support everything out of the box. That’s why it provides ways to completely modify the synopsis of each entity, etc.
In the 0.2-2 patch, I changed the preprocessing code so that Boost.Wave preprocesses the entire file, and the result is then parsed with libclang. That way I do not have to worry about any preprocessing.
But Boost.Wave is slow and also can’t handle many of the extensions used by the standard library headers, so I needed a lot of workarounds there.
In this version I finally replaced Boost.Wave and now I use clang for the preprocessing.
I literally use clang: I call the binary from the code with the -E flag to get the preprocessed output and parse that.
I know that this is a bad solution, but it is just a temporary solution until I find a proper library for preprocessing.
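The "call the binary" approach can be sketched roughly like this. It is only an illustration under a few assumptions, not the code standardese actually uses: the helper names make_preprocess_command and run_and_capture are made up, the compiler is assumed to be reachable as clang++ on the PATH, and popen() makes the sketch POSIX-only.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds the command line that asks clang to run
// only the preprocessor (-E) on the given file.
// The path is quoted naively; a real tool would have to escape it properly.
std::string make_preprocess_command(const std::string& path)
{
    return "clang++ -E \"" + path + "\"";
}

// Hypothetical helper: runs a shell command and returns everything it
// wrote to stdout. Uses POSIX popen()/pclose().
std::string run_and_capture(const std::string& command)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe)
        throw std::runtime_error("could not start: " + command);

    std::string output;
    char        buffer[4096];
    std::size_t n;
    while ((n = std::fread(buffer, 1, sizeof(buffer), pipe)) > 0)
        output.append(buffer, n);

    if (pclose(pipe) != 0)
        throw std::runtime_error("command failed: " + command);
    return output;
}
```

The preprocessed text of a header would then be obtained with run_and_capture(make_preprocess_command("foo.hpp")) and handed to the parser.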
But let’s talk about interesting features.
Member groups
You often have code that looks like this:
class foo
{
public:
…
/// \returns A reference to the variable.
T& get_variable()
{
return var_;
}
/// \returns A reference to the variable.
const T& get_variable() const
{
return var_;
}
};
Multiple functions do practically the same thing but have slightly different signatures. It would be very tedious to repeat the documentation over and over again.
With member groups you don’t have to:
class foo
{
public:
/// \returns A reference to the variable.
/// \group get_variable
T& get_variable()
{
return var_;
}
/// \group get_variable
const T& get_variable() const
{
return var_;
}
};
The \group command adds an entity to a member group.
As the name implies, this only works for entities that are members of the same class/namespace/etc.
The group name is just an internal identifier for the group and only needs to be unique in that scope.
The first entity with a new group identifier is the main entity for the group: Its comment will be taken for the group comment and its type defines the header used for the group. With groups the output will look like this:
Function foo::get_variable
(1) T& get_variable();
(2) const T& get_variable() const;
Returns: A reference to the variable.
This is similar to the way cppreference.com does its documentation.
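The "first entity is the main entity" rule can be modelled in a few lines. This is a simplified sketch, not standardese's internal data model: the entity and member_group types and the make_groups function are hypothetical names chosen for the illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of a documented entity.
struct entity
{
    std::string signature; // e.g. "T& get_variable();"
    std::string comment;   // documentation text, may be empty
    std::string group;     // name given to \group, empty if none
};

// A member group: lists all signatures but has a single comment.
struct member_group
{
    std::vector<std::string> signatures;
    std::string              comment;
};

// Collects the entities of one scope into groups, keyed by group name.
// Entities without a group are skipped here for brevity.
std::map<std::string, member_group> make_groups(const std::vector<entity>& scope)
{
    std::map<std::string, member_group> groups;
    for (const auto& e : scope)
    {
        if (e.group.empty())
            continue;

        auto  result = groups.emplace(e.group, member_group{});
        auto& group  = result.first->second;
        if (result.second)
            group.comment = e.comment; // new group: this is the main entity
        group.signatures.push_back(e.signature);
    }
    return groups;
}
```

Feeding it the two get_variable() overloads from above yields one group with two signatures and the comment of the first overload.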
Modules
I’ve also added modules as a way to group related entities together.
The \module command adds an entity to a module. An entity can be in at most one module, and the module will be passed on to all children.
For example, if you do it in a namespace, it will add all entities in that namespace to that module.
The module will be shown in the documentation of the entity by default (this can be controlled by the output.show_modules option), and a new index file standardese_modules will list all modules with all entities in each module.
They are useful if you have multiple logical components in your project and want to give a quick overview.
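In a header this could look roughly like the following. Treat it as a sketch: the names containers and stack are invented for the example, and I'm assuming that \module takes the module name as its argument the same way \group takes the group name.

```cpp
/// Container utilities.
/// \module containers
namespace containers
{
    // No \module here: the struct and its members are children of the
    // namespace and therefore end up in the same module.

    /// A very small stack.
    struct stack
    {
        /// \returns Whether the stack has no elements.
        bool empty() const
        {
            return size_ == 0;
        }

    private:
        int size_ = 0;
    };
}
```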
Entity linking improvements
Inside a comment there are two syntaxes for linking to a different entity:
- [some text](<> "unique-name") (CommonMark link without URL but with title)
- [unique-name]() (CommonMark link without URL)
The unique-name is the unique identifier of the entity you want to refer to.
The correct URL will be filled in by standardese.
The unique-name can also refer to an external entity: for example, by default std::XXX will create a link to the corresponding C++ reference page. This can be customized and extended by the comment.external_doc option.
Now I’ve added a third syntax: [some-text](standardese://unique-name/), i.e. a CommonMark link with a URL in the standardese:// protocol.
Like with the other two options, standardese will fill in the URL automatically.
This syntax was mainly added for the template mode, see below.
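To make the shape of such a URL concrete, here is a small sketch that extracts the unique name again. The function unique_name_from_url is hypothetical and only mirrors the syntax described above; it says nothing about how standardese resolves the link internally.

```cpp
#include <string>

// Hypothetical helper: extracts the unique name from a URL in the
// standardese:// protocol, e.g. "standardese://func()/" -> "func()".
// Returns an empty string if the URL does not use that protocol.
std::string unique_name_from_url(const std::string& url)
{
    const std::string protocol = "standardese://";
    if (url.compare(0, protocol.size(), protocol) != 0)
        return "";

    std::string name = url.substr(protocol.size());
    // in the link syntax the unique name is followed by a trailing slash
    if (!name.empty() && name.back() == '/')
        name.pop_back();
    return name;
}
```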
But a problem with that linking model was that the unique-name is verbose:
// unique name is: ns
namespace ns
{
// unique name is: ns::foo(void*)
// unique name of param is: ns::foo(void*).param
void foo(void* param);
// unique name is: ns::bar<T>
template <typename T> // unique name of `T` is: ns::bar<T>.T
struct bar
{
// unique name is: ns::bar<T>::f1()
void f1();
// unique name is: ns::bar<T>::f2()
void f2();
};
}
While you don’t need the signature for functions that are not overloaded, and while you can rename the unique name to an arbitrary string with the \unique_name command, this is still verbose.
For example, if you wanted to link from f2() to f1(), you had to type: [ns::bar<T>::f1()]().
Now I’ve added a link mode with name lookup.
Simply start the unique name with * or ? and standardese will search for an entity with rules similar to the regular C++ name lookup.
So with that you can simply link to f1() from f2() by writing: [*f1()]().
Name lookup only works in comments associated with a C++ entity and will be done from that C++ entity. It does not work in template files, for example.
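The idea of "start at the entity and search outwards" can be sketched as below. This is my simplified reading of "similar to the regular C++ name lookup", not the actual algorithm: the lookup function, the registry of unique names and the list of enclosing scopes are all assumptions made for the illustration, and the leading * or ? is expected to be stripped already.

```cpp
#include <set>
#include <string>
#include <vector>

// Sketch of an outward search: try the name in the innermost scope first,
// then in each enclosing scope, finally in the global scope.
// `registry` holds all known unique names, `scopes` the enclosing scopes
// of the commented entity from outermost to innermost.
// Returns the full unique name, or an empty string if nothing matches.
std::string lookup(const std::set<std::string>& registry,
                   std::vector<std::string>     scopes,
                   const std::string&           name)
{
    for (;;)
    {
        std::string candidate;
        for (const auto& scope : scopes)
            candidate += scope + "::";
        candidate += name;

        if (registry.count(candidate) != 0)
            return candidate;
        if (scopes.empty())
            return ""; // global scope was already tried
        scopes.pop_back(); // retry one scope further out
    }
}
```

With the unique names from the example above, looking up f1() from inside ns::bar<T> finds ns::bar<T>::f1(), and looking up foo(void*) from the same place falls back to ns::foo(void*).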
Inline documentation
The documentation for some entities will now be shown inline by default.
This applies to parameters, member variables of a struct, enum values or base classes.
Previously, if you documented them, standardese would add a new section for them, repeat their synopsis, etc.
Enumeration foo
enum class foo
{
a,
b,
c
};
An enum.
Enumeration constant foo::a
a
The value a.
Enumeration constant foo::b
b
The value b.
Enumeration constant foo::c
c
The value c.
Struct bar
struct bar
{
int a;
};
A struct.
Variable bar::a
int a;
Some variable.
Function func
void func(int a);
A function.
Parameter func::a
int a
A parameter.
Now they can be shown inline, in a little list:
Enumeration foo
enum class foo
{
a,
b,
c
};
An enum.
Enum values:
* a - The value a.
* b - The value b.
* c - The value c.
Struct bar
struct bar
{
int a;
};
A struct.
Members:
* a - Some variable.
Function func
void func(int a);
A function.
Parameters:
* a - A parameter.
It goes without saying that links to those entities will resolve to the correct list position, right?
Other improvements
There are many smaller things.
You can now completely control the synopsis of an entity with the \synopsis command.
Simply set the synopsis to an arbitrary string that will be shown instead of the actual synopsis.
Previously you could only hide, for example, certain parameters of a function.
The headings are now improved.
Previously they only showed the type of the entity: Function bar(), Constructor foo(const foo&).
Now standardese detects certain signatures and gives them more semantic meaning: Copy constructor foo(const foo&), Comparison operator operator==, etc.
The “definition” of a macro can now be hidden from the synopsis by the global output.show_macro_replacement option.
This is useful as macro definitions are often implementation details.
There are also a few breaking changes:
To do a hard line break in a comment, you cannot use the CommonMark backslash at the end of a line anymore; you have to use a forward slash instead (this is a technical limitation).
The \entity and \file commands for remote comments must now be at the beginning of a comment and not at an arbitrary position.
Also the unique name of function templates got simplified: you must not pass the template parameters there anymore.
But let’s address the biggest and most powerful feature: template mode.
Template mode
standardese now also works as a basic templating language.
If you pass in files that are not header files, they will be preprocessed.
This does two things: it correctly links all URLs in the standardese:// protocol and it replaces special commands.
This can be best shown by an example. Consider the following C++ input file:
/// Struct a.
struct a {};
/// A function.
void func();
/// Struct b.
struct b {};
A non-source file input like this one:
### A heading
This file is in Markdown format, but you can use *anything* you want.
standardese doesn't care about the format,
it just does dumb text manipulation.
I can link to [the function](standardese://func()/) and it will be resolved.
But I can also show output of standardese here:
{ { standardese_doc_synopsis func() commonmark } }
This line will be replaced with the synopsis of `func()` in the commonmark format.
But it can be more advanced:
{ { standardese_for $entity file.hpp } }
{ { standardese_if $entity name func() } }
{ { standardese_else } }
* { { standardese_doc_text $entity commonmark } }
{ { standardese_end } }
{ { standardese_end } }
This will show the documentation text of the two structs.
Note: I had to add spaces between the { { and } }, because Jekyll was parsing them. The actual syntax does not use those spaces, and standardese will silently ignore any commands not starting with standardese_, so it works nicely with it.
Pass both files to standardese and it will create the regular documentation for the C++ file as well as preprocess the template file to this:
A heading
This file is in Markdown format, but you can use anything you want. standardese doesn’t care about the format, it just does dumb text manipulation.
I can link to the function (manual edit: link doesn’t work here obviously) and it will be resolved. But I can also show output of standardese here:
void func();
This line will be replaced with the synopsis of func() in the CommonMark format.
But it can be more advanced:
* Struct a.
* Struct b.
This will show the documentation text of the two structs.
This is useful if you want to write additional files, like tutorials.
But with the --template.default_template option you can pass a file that will customize the entire output.
If you pass none it will behave like this:
{ { standardese_doc $file $format } }
Again, in reality no spaces.
$file will refer to the current file, $format to the specified output format.
This will render the documentation for each file as standardese would do it.
Check out the readme for a quick template syntax overview.
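The "dumb text manipulation" and the rule that commands without the standardese_ prefix are left alone can be sketched like this. It is a toy version for illustration only: process_template and the expand callback are hypothetical, the real command syntax (written here without the extra spaces between the braces) is richer, and block commands such as standardese_for are not handled.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Replaces every {{ standardese_... }} command in `text` with the result
// of `expand(command)`. Anything else in double braces is copied
// unchanged, so commands meant for another tool survive.
std::string process_template(
    const std::string&                                    text,
    const std::function<std::string(const std::string&)>& expand)
{
    const std::string open = "{{", close = "}}", prefix = "standardese_";

    std::string result;
    std::size_t pos = 0;
    for (;;)
    {
        auto begin = text.find(open, pos);
        auto end   = begin == std::string::npos
                         ? begin
                         : text.find(close, begin + open.size());
        if (end == std::string::npos)
        {
            result += text.substr(pos); // no complete command left
            return result;
        }

        // text between the braces, with surrounding spaces trimmed
        auto command = text.substr(begin + open.size(), end - begin - open.size());
        auto first   = command.find_first_not_of(' ');
        auto last    = command.find_last_not_of(' ');
        command = first == std::string::npos
                      ? std::string()
                      : command.substr(first, last - first + 1);

        result += text.substr(pos, begin - pos);
        if (command.compare(0, prefix.size(), prefix) == 0)
            result += expand(command);
        else
            result += text.substr(begin, end + close.size() - begin);
        pos = end + close.size();
    }
}
```

The callback receives the trimmed command text, for example "standardese_doc $file $format", and returns whatever should appear in its place.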
But if you want to use additional files, you’ll love the standardese_doc_anchor command.
With the standardese:// protocol you can link to parts of the generated documentation.
But with the anchor command, you can link back:
{ { standardese_doc_anchor unique-name <format> } }
Without the spaces.
This will create an anchor in the file.
But the unique-name will be registered, so you can use it as a link target inside the documentation comments!
If the unique-name already exists, this will change the link for that entity. With it you can override where the “actual” documentation is.
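One way to picture the override behaviour is a table from unique names to link targets in which an anchor always wins. The link_registry class below is a hypothetical sketch of that idea, not standardese's implementation; in particular, that the regular documentation never replaces an existing target is my assumption.

```cpp
#include <map>
#include <string>

// Hypothetical link registry: maps a unique name to the URL where its
// documentation lives.
class link_registry
{
public:
    // For the regular, generated documentation of an entity.
    // Assumption: keeps an existing target, e.g. one set by an anchor.
    void register_entity(const std::string& unique_name, const std::string& url)
    {
        targets_.emplace(unique_name, url);
    }

    // For an anchor command: always sets the target, so it can override
    // where the "actual" documentation of an existing entity is.
    void register_anchor(const std::string& unique_name, const std::string& url)
    {
        targets_[unique_name] = url;
    }

    // Returns the URL for a unique name, or an empty string if unknown.
    std::string resolve(const std::string& unique_name) const
    {
        auto iter = targets_.find(unique_name);
        return iter == targets_.end() ? std::string() : iter->second;
    }

private:
    std::map<std::string, std::string> targets_;
};
```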
The template language is currently very basic and the error messages if you mess up are bad, but it’s already worth it and will be improved in the future.
What’s next?
With this release, standardese is at a point where I’m going to migrate my Doxygen documentation to it. But I’ll continue working on it. I have many features planned and I might already start tackling automated comment generation based on the code alone.
If you want to see a live demo, check out my Meeting C++ Lightning Talk. You can get the tool from the GitHub page; read the readme for more information.
This blog post was written for my old blog design and ported over. If there are any issues, please let me know.