Why I Never Met an Auto-Generated Doc I Liked
Guest article - Dana Fujikawa
Tools for auto-generating API Documentation from code comments suffer from a variety of problems, and their output has never been popular with developers.
Thankfully, there is now a drive towards high-quality, hand-curated, developer-focused Documentation (or Docs) built on online platforms like Readme and GitBook. Solutions like these allow all team members to create documentation that is well organized and partitioned into formal guides and API references, while guiding the developer audience on their journey to consuming your services and APIs.
Despite this, many developer programs continue to use tools like Doxygen and Javadoc, which are essentially 1990s documentation technology. It's 2021. You can do better!
What's the Problem?
The first and biggest problem with auto-generated Docs is that they almost always lack substantive information. You may be asking how this is the fault of the tool, so allow me to elaborate. The problem usually stems from the false belief that, because developers can write the Docs as they code (i.e., via code comments), the result will be good Docs that are kept up to date. While the theory is good, I've rarely, if ever, seen it pan out this way in practice.
For example, functions and members in auto-generated Docs tend to be missing descriptions or, worse, have really poor descriptions. The most frequent ones I come across are what I call Captain Obvious Descriptions. Given a method like GetX(), most auto-generated Docs will contain a description like "Returns X." Your developer audience needs more substantive information, such as the legal range of values, meanings for specific values (e.g., when 0 or negative numbers have special meanings), units (e.g., pixels, meters, etc.), and guidance on when the method should be called or used.
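To make the contrast concrete, here is a minimal sketch in Java. The Sprite class, its UNPLACED value, and the pixel semantics are all invented for illustration and are not taken from any real SDK. The first accessor carries a Captain Obvious description, while the second documents units, legal range, the special value, and when to check for it.

```java
/**
 * Hypothetical class, invented purely for illustration.
 * Declared without {@code public} so the snippet compiles under any file name.
 */
final class Sprite {
    /** Value returned by {@link #getX()} when the sprite has not been placed yet. */
    static final int UNPLACED = -1;

    private final int x;

    Sprite(int x) {
        if (x < UNPLACED) {
            throw new IllegalArgumentException("x must be -1 (unplaced) or >= 0");
        }
        this.x = x;
    }

    /** Returns X. */
    int getXCaptainObvious() {
        return x;
    }

    /**
     * Returns the horizontal position of this sprite's left edge.
     *
     * <p>The value is in pixels, measured from the left edge of the parent
     * canvas. Placed sprites report 0 or greater; {@link #UNPLACED} (-1)
     * means the sprite has not been positioned yet, so callers should check
     * for it before using the value in layout calculations.
     *
     * @return the x position in pixels, or {@link #UNPLACED} if unplaced
     */
    int getX() {
        return x;
    }
}
```

Both methods return the same value. Only the second comment tells the reader what that value means.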
That so much auto-generated documentation content is like this probably shouldn't be a surprise, because most developers don't like writing documentation; it's not why they enrolled in computer programming school. As a result, many end up doing minimal documentation work because they're solely focused on problem solving (as they should be), and not necessarily on the Developer Experience or Developer Journey of those who will consume their APIs. Of course, there are exceptions, especially in Developer-First companies where programmers are obliged to take the time to write great Docs. However, throughout my 20+ year career in tech, I have seen poor-quality content like that mentioned above time and time again.
Compounding this problem is a general lack of formatting and content organization options in these documentation-generation tools. The resulting output tends to be a raw dump of classes, methods, namespaces/packages, etc., that lacks a well-thought-out organization and a proper table of contents, has poor search capabilities, and cannot be integrated with other Docs like Tutorials or Getting started guides.
Here is one example:
Figure 1: Auto-generated documentation with an overview lacking concrete details about what the SDK actually does.
As can be seen in the screenshot, there are a number of serious problems, including:
The overview doesn't state the problem that the SDK solves, when it is to be used, or what hardware platforms it can be used on. All we know is that it's somehow used for ADAS systems.
The TOC comprises nothing more than groups of programmatic entities (modules, data structures, etc.) and lacks any topics to help guide the developer. These lists of entities are only usable once the developer knows what they're looking for.
And if we look at one of the Structs, we can see those Captain Obvious Descriptions:
Figure 2: Example of non-substantive Captain Obvious descriptions.
Here, the descriptions for data and bAllocated (which, by the way, are mislabelled as Parameters when they should be labelled as Fields) provide no substantive information to help the developer understand how those items are to be used or interpreted.
Another example is this Java SDK Documentation:
Figure 3: A giant, mostly useless, list of modules in the Java SDK.
Figure 3 illustrates how typical auto-generated documentation ends up as a giant list of modules. While this may be useful for developers who know exactly what they're looking for, it provides little help to developers who are just trying to understand the SDK. One of the underlying causes is that the developers writing the Documentation make assumptions about the developer audience's level of knowledge about the API.
There are also some additional problems with auto-generated documentation tools. For one, tech writers within your organization can't write or edit the content unless they've been granted access to the source code files. And unless those tech writers have a programming background, or are provided with tools to compile and verify their changes to the code comments, they could potentially break the code. Those writers also need to be familiar with the cryptic markup often required to add content via code comments, which can be error-prone and time-consuming to write.
Another major problem is that not all source code repos cleanly separate the private code from the public-facing code that comprises the API. This often happens when an organization decides to open up their API to others, after the fact. So, when these auto-generation tools do their giant dump of modules, functions, etc., more often than not, the output includes entities that developers shouldn't know or care about. This can add extra confusion for developers just getting started with your API.
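Where the language supports it, visibility modifiers are one way to keep internal entities out of the generated output. The sketch below assumes a Java codebase and relies on Javadoc's default setting, which documents only public and protected classes and members. PaymentClient and its methods are hypothetical names invented for this example, and the publicSurface() helper simply uses reflection to list the members that setting would pick up.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical client class; every name here is invented for illustration.
 * Declared without {@code public} so the snippet compiles under any file name;
 * in a real API the class itself would also be public.
 */
final class PaymentClient {
    /** Intended API surface: public, so default Javadoc settings include it. */
    public String charge(int amountCents) {
        return sign("charge:" + amountCents);
    }

    // Internal helper: private, so default Javadoc settings leave it out.
    private String sign(String payload) {
        return payload + ":signed";
    }

    /** Returns the sorted names of declared methods that are public or protected. */
    static List<String> publicSurface() {
        List<String> names = new ArrayList<>();
        for (Method m : PaymentClient.class.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (!m.isSynthetic() && (Modifier.isPublic(mods) || Modifier.isProtected(mods))) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

If a repo was never organized this way, with internal helpers left public, the generator has no way to tell them apart from the real API, which is exactly how the extra entities end up in the Docs.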
Perhaps the most unfortunate part is that in many cases, these poorly written, auto-generated Docs are the only Docs available for the developer audience. In these situations, it's usually the case that the company hasn't invested the time to understand the Developer Journey or the need for a more complete set of Getting started guides, Tutorials, etc. Consequently, the developer audience is often left to effectively reverse engineer the API using what little they can glean from the Docs.
Understand your Developer Audience
It doesn't have to be like this. The first step to great Documentation is to understand your Developer Personas and the journey they take in discovering your APIs and solutions. For this you can use our Developer Segmentation and Developer Persona templates, and Developer Journey map.
Once you understand your developer audience and their needs, the next step is to identify the types of Documentation that will best help them. With our clients, we bring a holistic, external perspective to help them build a content strategy that fits their specific solution. This typically means identifying what the Overview, Quickstart, and initial Tutorials must look like and how they must align with the API documentation. This also involves developing a formal content organization (i.e., table of contents) for the documentation based around the Developer Journey.
Use the Proper Tools
With that completed, the next step is to switch to a modern documentation system designed for the 21st century and beyond. SaaS documentation platforms like Readme and GitBook have all but replaced old-school, monolithic products like Doxygen, RoboHelp, FrameMaker, etc. While today's SaaS offerings arguably have less functionality than the FrameMakers of yesteryear, they focus on simplicity.
Key features that make these new systems so powerful include:
Concurrent accessibility and editing by everyone on the team, without the need to install any software.
Simple formatting using Markdown that anyone can quickly learn.
Content organization structured around a formally thought-out table of contents.
Version control and review functionality.
In the case of Readme, a partitioning of user guides and API documentation, where the latter uses the popular three-column layout with the table of contents on the left, content in the middle, and code samples on the right.
Install a Tech Writer
While your developers can still provide the bulk of the technical information, having a separate technical writer who can critique that content and organize it is essential. This frees up your developers to remain focused on creating and coding, while the technical writer takes care of formatting and overseeing all content. The technical writer effectively acts as a gatekeeper, ensuring that all content is of high quality, fits into the narrative of the established developer journey, avoids assumptions, and is kept up-to-date.
No one likes a Captain Obvious, and developers want substantive information that guides them on their journey of discovering your offering and using your API.
Today, this is best accomplished using online documentation platforms like Readme or GitBook and employing the services of a skilled technical writer.
With these resources in place, along with a strong understanding of your Developer Personas and their journey with your offerings, you will be in a much stronger position to offer high-quality documentation that your developer audience will thank you for.