gardener / docforge Goto Github PK

Scalable build tool for distributed documentation sources

License: Apache License 2.0

Go 90.82% Shell 7.22% Dockerfile 0.14% Makefile 1.24% Python 0.59%

docforge's Introduction

Gardener

Gardener implements the automated management and operation of Kubernetes clusters as a service and provides a fully validated extensibility framework that can be adjusted to any programmatic cloud or infrastructure provider.

Gardener is 100% Kubernetes-native and exposes its own Cluster API to create homogeneous clusters on all supported infrastructures. This API differs from SIG Cluster Lifecycle's Cluster API that only harmonizes how to get to clusters, while Gardener's Cluster API goes one step further and also harmonizes the make-up of the clusters themselves. That means, Gardener gives you homogeneous clusters with exactly the same bill of material, configuration and behavior on all supported infrastructures, which you can see further down below in the section on our K8s Conformance Test Coverage.

In 2020, SIG Cluster Lifecycle's Cluster API made a huge step forward with v1alpha3 and the newly added support for declarative control plane management. This made it possible to integrate managed services like GKE or Gardener. We would be more than happy, if the community would be interested, to contribute a Gardener control plane provider. For more information on the relation between Gardener API and SIG Cluster Lifecycle's Cluster API, please see here.

Gardener's main principle is to leverage Kubernetes concepts for all of its tasks.

In essence, Gardener is an extension API server that comes along with a bundle of custom controllers. It introduces new API objects in an existing Kubernetes cluster (which is called garden cluster) in order to use them for the management of end-user Kubernetes clusters (which are called shoot clusters). These shoot clusters are described via declarative cluster specifications which are observed by the controllers. They will bring up the clusters, reconcile their state, perform automated updates and make sure they are always up and running.

To accomplish these tasks reliably and to offer a high quality of service, Gardener controls the main components of a Kubernetes cluster (etcd, API server, controller manager, scheduler). These so-called control plane components are hosted in Kubernetes clusters themselves (which are called seed clusters). This is the main difference compared to many other OSS cluster provisioning tools: The shoot clusters do not have dedicated master VMs. Instead, the control plane is deployed as a native Kubernetes workload into the seeds (the architecture is commonly referred to as kubeception or inception design). This does not only effectively reduce the total cost of ownership but also allows easier implementations for "day-2 operations" (like cluster updates or robustness) by relying on all the mature Kubernetes features and capabilities.

Gardener reuses the identical Kubernetes design to span a scalable multi-cloud and multi-cluster landscape. Such familiarity with known concepts has proven to quickly ease the initial learning curve and accelerate developer productivity:

Kubernetes API Server = Gardener API Server
Kubernetes Controller Manager = Gardener Controller Manager
Kubernetes Scheduler = Gardener Scheduler
Kubelet = Gardenlet
Node = Seed cluster
Pod = Shoot cluster

Please find more information regarding the concepts and a detailed description of the architecture in our Gardener Wiki and our blog posts on kubernetes.io: Gardener - the Kubernetes Botanist (17.5.2018) and Gardener Project Update (2.12.2019).

K8s Conformance Test Coverage

Gardener takes part in the Certified Kubernetes Conformance Program to attest its compatibility with the K8s conformance testsuite. Currently Gardener is certified for K8s versions up to v1.30, see the conformance spreadsheet.

Continuous conformance test results of the latest stable Gardener release are uploaded regularly to the CNCF test grid:

Provider/K8s	v1.30	v1.29	v1.28	v1.27	v1.26	v1.25
AWS
Azure
GCP
OpenStack
Alicloud
Equinix Metal	N/A	N/A	N/A	N/A	N/A	N/A
vSphere	N/A	N/A	N/A	N/A	N/A	N/A

Get an overview of the test results at testgrid.

Start using or developing the Gardener locally

See our documentation in the /docs repository, please find the index here.

Setting up your own Gardener landscape in the Cloud

The quickest way to test drive Gardener is to install it virtually onto an existing Kubernetes cluster, just like you would install any other Kubernetes-ready application. You can do this with our Gardener Helm Chart.

Alternatively you can use our garden setup project to create a fully configured Gardener landscape which also includes our Gardener Dashboard.

Feedback and Support

Feedback and contributions are always welcome!

All channels for getting in touch or learning about our project are listed under the community section. We are cordially inviting interested parties to join our bi-weekly meetings.

Please report bugs or suggestions about our Kubernetes clusters as such or the Gardener itself as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).

Learn More!

Please find further resources about our project here:

docforge's People

Contributors

Stargazers

Watchers

Forkers

g-pavlov shturec swilen-iwanow kristian-zh vlvasilev dimitar-kostadinov kostov6 isabella232 dimityrmirchev oliver-goetz luis-sousa-pinto raphaelvogel vogelhome lucabernstein

docforge's Issues

Silence forge command usage on error except wrong flags/cmd/args

What would you like to be added:
Silence forge command usage on error except wrong flags/cmd/args.

Why is this needed / describe a real-world scenario:
Displaying the usage on every error is useless and clutters the screen pushing up the stack the errors that a user is actually interested in.

How to categorize this?:
/kind enhancement
/priority normal

Integrate broken links checker for docforge docs

What would you like to be added:
Broken links checker for docforge docs

Why is this needed:
Automatically sanitize documentation material

merge instead of append resolved nodesSelector nodes

Consider a situation with a node that declares both explicitly defined nodes and a nodesSelector that resolves to nodes that overlap in names and hierarchy. This approach is a way to address the limited means to control the nodes generated nodes from a nodesSelector if that is necessary.

Currently however the nodes resolved form a nodesSelector are not merged but appended, which will lead to nodes duplication in this scenario. This PR comment describes the situation and the expected behavior.

How to categorize this issue?
/kind bug
/priority normal

Preserving Pulled Documents Consistency

Pulled markdown documents are very likely to contain links to other resources, such as multimedia files (e.g. images), locations in the same document (e.g. a section head), other markdown documents or websites. To keep their content consistent after the relocation, some linked resources may need to be downloaded too and links to be adjusted accordingly.

Links in Markdown Documents

The links that will be processed are anything that falls in this scope:

All forms of image, hyperlink or autolink markdown as specified in Commonmark and the GitHub flavored markdown.
Any HTML element with "src" or "href" attribute, because Markdown permits raw HTML and it's fairly common practice to make use of that.

Links to documents

Markdown documents are downloaded only if they are nodes in the target documentation model, defined explicitly or implicitly with nodeSelector. All cross links to downloaded documents are converted to relative. The links destinations are calculated and adjusted to reflect correctly the potentially new location of the referenced documents, defined in the documentation model. This applies both to originally relative and absolute links and links between GitHub repositories.

If a linked document is not a node in the documentation model, then it will not be downloaded and the link to it is rewritten to its resolved absolute form.

Cascading download of documents based on hyperlinks in their content is not supported intentionally to ensure predictable results and avoid accidental downloads.

Links to resources

Resources linked by downloaded documents are downloaded if they are in a linking document locality domain, and an exclusion setting specified in the document model does not prevent that.

The definition of a "locality domain" is the information space where all resources can be considered local to each other. Obviously the actual interpretation of such definition will vary across source systems and project organizations. For documents and resources stored in GitHub, the natural locality domains are the repositories where they are managed. In a file system, that may be the root of the file system or a directory. A locality domain may also be interpreted by a logical criteria, not only physical. In a project, the locality domain may be a component, that is spread across several GitHub repositories.

Resources are downloaded in a dedicated structure, potentially split into subpaths for efficiency, with their names changed to UUIDs to avoid potential name and path clashes. Links in all downloaded documents originally referencing a resource that has been downloaded and processed like that are adjusted according to the documents relative position to the new location of the resource and rewritten as relative links.

Link adjustment to downloaded resource and rewrite to relative applies to all downloaded documents that resource in any way regardless of their original locality domain.

Absolute links that do not need to be processed because of a reason outlined so far are left in tact.
Relative links that have not been processed for a reason outlined so far are converted to absolute.

Links to internal document sections

Internal document links (e.g. #heading-section-id) are not processed and are left as is, unless the document section they reference has been removed by an exclusion pattern. In this case, the link is removed.

Linked documents and resources versions

The topic is discussed in detail in issue #17

docforge option to pull files github info

What would you like to be added:
Optionally, flag docforge to pull also github information for (resolved) documentation nodes in a manifest. The gitinfo contains each document node source:

author details
last update date
commits history

Git info is fetched only for document nodes with a source property. nodes with contentSelector and template are not supported.
Files are serialize din json format in a dedicated destination provided as flag or sit side-by-side with the documents they apply to. Their naming format is name-of-document-source.json.
Fetching git info must be a non-blocking parallel job similar to downloading resources.

Why is this needed:
Git information is often handy to provide context to pages. Such data is used extensively on the gardener.cloud website and is the last reason to keep the nodejs script that is part of the build. Docforge can do that relatively easy and take it over for nodejs.

downloads of resources with correct path and wrong resource name fail silently

What happened:
Downloads of resources with correct path and wrong resource name fail silently.

What you expected to happen:
A not found error is supposed to indicate the failed resource URL

/priority normal

Replicate file/folder structure by spec: preserve with path rules

Installation with brew

What would you like to be added:
Installation with brew similar to what we have for gardener/gardenctl

Why is this needed / describe a real-world scenario:
Consumption

How to categorize this?:
/kind enhancement
/priority normal

Exclude a subset of nodes selected by NodeSelector

It is convenient to grab a document structure that is good enough as it stands and pull it with a node selector, but sometimes only a subset may be necessary. It should be possible to exclude document nodes from a structure built by a node selector specifying the pattern in the document structure manifest's node selector element.

Create CI/CD pipeline

What would you like to be added:
CICD pipeline supporting the automation of the the engineering process around the tool
Support:

head-update, pull requests
- check, test and build
- publish image at eu.gcr.io/gardener-project/docode
release jobs
- check, test and build
- publish image at eu.gcr.io/gardener-project/docode
- publish in slack upon release ti_workspace/G0170ECNADC
- automatically update minor version upon release
- publish Git release with release notes and binaries for supported platforms (linux/amd64, darwin/amd64 and windows/386)

Why is this needed:
Automatic quality control and engineering

Fix silent 'not found' errors

What would you like to be added:
This PR fixes silent errors when document node source's resource name is wrong. Same fix is necessary also for ContentSelector.Source, Template.Path and NodeSelector.Path.

Why is this needed / describe a real-world scenario:
Find out the reason for files that fail to download in that case.

How to categorize this?:

/kind enhancement
/priority normal

Module docforge manifests

What would you like to be added:
"Modules" or "structure includes" are pointers to external structure that work similar to nodeSelector - they are resolved to a sub-structure of nodes hierarchy inside a node and are modeled as node field.
This is a recursive property, and its resolution must cascade down to fully a resolved node hierarchy. Circular dependencies are not allowed.

Why is this needed:
It should be possible to modularize manifests to support component documentation aggregation and long documentation structure breakdown scenarios.

Failed to parse a link that contains src= with --hugo flag enabled

What happened:
Failed to parse a link that contains src=. Same as the issue fixed with #82 but applies when dockforge is run with --hugo flag

W1109 18:38:04.663167  814998 processor.go:77] Invalid link:gardener.cloud.community%40gmail.com), or paste this [iCal url](https://calendar.google.com/calendar/ical/gardener.cloud.community%40gmail.com/public/basic.ics) into any iCal client.

If you have a topic you'

What you expected to happen:
HTML links should be parsed correctly

How to reproduce it (as minimally and precisely as possible):
Use https://raw.githubusercontent.com/gardener/documentation/master/CONTRIBUTING.md or use a document with similiar content:

[test](https://a.b/c?src=x)

How to categorize this issue?
/kind bug
/priority critical

Updates of md links text and title do not work

What happened:
Trying to update a text or title of an md link doesn't work

What you expected to happen:
They are expected to be updated

How to reproduce it (as minimally and precisely as possible):
Create a rewrite rule that changes text/title of a link in document

/kind bug
/priority normal

Forging documents with links to GitHub commit(s) fail

What happened:
Forging documents with links to GitHub commit(s) resources fail:
Unsupported GitHub URL: https://github.com/gardener/gardener/commit/17084191c752c206537b9506b54828f4d723d9b7 . Unknown resource type string 'commit'. Must be one of [tree blob raw wiki releases issues issue pulls pull]

What you expected to happen:
Documents with GitHub commit(s) resource type links should not fail.

How to categorize this issue?
/priority normal

Support variables substitution

What would you like to be added:
Optional definition of variables in a documentations structure manifest.

Why is this needed:
Variables can be used to parametrize a structure, encapsulating changes only within a single variables block in the manifest. That:

reduces redundancies and thus error-proneness and potentially the length of the manifest
improves the automation opportunities for employing docforge

Concerning the latter point, a reference case could be the following.

Consider a documentation structure manifest in the Gardener documentation designed for the Gardener website. The documentation repo is piped in the cicd to release a build upon a release by the Gardener component. With component versions parameterized for each link to a component documentation material in the manifest, the documentation release job build can consult with Gardener's component descriptor and its dependencies versions for the new release and automatically update the component versions variables in the manifest. That will ensure automated consistency between the Gardener BoM and the documentation structure BoM encoded in the manifest.

Even without fully automating the release process it is significantly easier to maintain variables values instead of the version in each link in a large document structure.
Example reference structure with variables:

var:
- gardener-version: v1.12.0
- gardener-extension-provider-aws-version: v1.15.3
- gardener-github-org-path: https://github.com/gardener
root:
  name: doc
  nodes:
  - name: gardenlet
    contentSelectors:
      - source: ${gardener-github-org-path}/gardener/blob/${gardener-version}/docs/concepts/gardenlet.md
  - name: aws-provider
    contentSelectors:
      - source: ${gardener-github-org-path}/gardener-extension-provider-aws/blob/${gardener-extension-provider-aws-version}/docs/usage-as-operator.md
localityDomain:
  ${gardener-github-org-path}/gardener:
    version: ${gardener-version}
    path: gardener/gardener
  ${gardener-github-org-path}/gardener-extension-provider-aws:
    version: ${gardener-extension-provider-aws-version}
    path: gardener/gardener-extension-provider-aws

File system resource handler

What would you like to be added:
A file system resource handler that resolves and reads links from local file system.

Why is this needed:
The file system handler enables the mashup of local content with content from other storage systems such as GitHub.

Replicate file/folder structure by spec: by metadata selector

Adapt tests for the GitHub backend task

Mock createBlobFromTask

Prevent circular references in included documentation structures

What would you like to be added:
#87 adds support for including a documentation structure into another documentations structure. Moreover, this can go on recursively. However, it is necessary to address a situation with possible circular inclusions and yield error an error.

Why is this needed / describe a real-world scenario:

Manifest A declares:

nodesSelector:
  path: manifestB

Manifest B declares:

nodesSelector:
  path: manifestA

and this cannot be resolved into a meaningful structure.

How to categorize this?:
/kind enhancement
/priority normal

Docforge File system handler fails to resolve documentation manifest that are not in cwd when using nodesSelector

What happened:
With the manifest.yaml

nodesSelector:
  path: module.yaml

and filesystem structure

c
| - manifest.yaml
| - module.yaml

Docforge fails to find module.yaml because it is not in the cwd
What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

How to categorize this issue?

/kind bug
/priority normal

Addressing versions in document cross-repository/component links

Relative links in source documents are resolved to absolute and that will set their version to the same as the linking document.
Absolute links to other nodes in the same repository (that should be relative) that feature a specific version may do that on purpose (to pin to exact state) or as a result of bad practices.

Absolute links to another repository are normally using its master. As development progresses it may turn out that an older version of a linking document is pointing to a different, updated state of the master version of the linked document. This does not reflect consistently the state of the product for a particular version.

To resolve this, the older version of the document should be updated to link to a valid state of the linked document to keep the common information space consistent for that version. Or even better that should happen upon release to have all links to master versions changed with the respective component versions.

Another approach would be to manage the created bundles versioned in a repo too but that will not enable reproducible builds for a particular version of the whole product.

Managing this manually can be an overkill and should be aided by automation if that's what's necessary to happen. We need to further discuss the following options:

If the absolute link version doesn't match a node, then keep the original link
If the absolute link points a node, update absolute link with the version of the document representing the node (e.g. master -> 1.8.0).
Do not mandate an approach but make it configurable (either of the above) e.g. using semver format to instruct behavior per node or globally

markdown links that contain src= or href= fail processing

What happened:
Markdown links that contain src= in their query string or URL in general, are erroneously matched by the regex for HTML src attributes and processed as tag attributes, which predictably fails.

What you expected to happen:
The regex for href|src attributes to consider more precisely if the match is within a tag.

How to reproduce it (as minimally and precisely as possible):

structure:
- source: https://github.com/gardener/documentation/blob/master/CONTRIBUTING.md

or use a document reference that contains something in the lines of: [calendar.google.com](https://calendar.google.com/calendar/embed?src=gardener.cloud.community%40gmail.com)

/kind bug
/priority normal

Concurrent map writes panic on git info contributors

What happened:

fatal error: concurrent map writes

goroutine 59 [running]:
runtime.throw(0xb4af44, 0x15)
	/usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc0006bfbe8 sp=0xc0006bfbb8 pc=0x433812
runtime.mapassign_faststr(0xa5aac0, 0xc0001bf590, 0xc0016b18a0, 0x12, 0xc0008a40f0)
	/usr/local/go/src/runtime/map_faststr.go:291 +0x3de fp=0xc0006bfc50 sp=0xc0006bfbe8 pc=0x41315e
github.com/gardener/docforge/pkg/reactor.(*gitInfoController).updateContributors(0xc0001bf560, 0xc000792600, 0x960, 0x1047, 0x960, 0x0)
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/reactor/gitinfo_controller.go:170 +0x18c fp=0xc0006bfcd0 sp=0xc0006bfc50 pc=0x98250c
github.com/gardener/docforge/pkg/reactor.(*gitInfoWorker).Work(0xc00000dca0, 0xc31c80, 0xc0000a3b00, 0xc0001bf560, 0xa06d60, 0xc0006b4248, 0xc31ec0, 0xc0001bf530, 0xc0006bfe78)
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/reactor/gitinfo_controller.go:124 +0x45c fp=0xc0006bfdf8 sp=0xc0006bfcd0 pc=0x981efc
github.com/gardener/docforge/pkg/reactor.withGitInfoController.func1(0xc31c80, 0xc0000a3b00, 0xa06d60, 0xc0006b4248, 0xc31ec0, 0xc0001bf530, 0xc005fbc1f8)
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/reactor/gitinfo_controller.go:97 +0x72 fp=0xc0006bfe50 sp=0xc0006bfdf8 pc=0x988d82
github.com/gardener/docforge/pkg/jobs.WorkerFunc.Work(0xc00000dce0, 0xc31c80, 0xc0000a3b00, 0xa06d60, 0xc0006b4248, 0xc31ec0, 0xc0001bf530, 0x0)
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/jobs/jobs.go:103 +0x62 fp=0xc0006bfe98 sp=0xc0006bfe50 pc=0x8c2b22
github.com/gardener/docforge/pkg/jobs.(*Job).startWorkers.func1(0xc00000dfa0, 0xc000088d70, 0xc00068f070, 0xc31c80, 0xc0000a3b00, 0x4, 0xc31ec0, 0xc0001bf530, 0xc0004625a0)
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/jobs/jobs.go:223 +0x326 fp=0xc0006bff98 sp=0xc0006bfe98 pc=0x8c3fc6
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc0006bffa0 sp=0xc0006bff98 pc=0x4634d1
created by github.com/gardener/docforge/pkg/jobs.(*Job).startWorkers
	/tmp/build/80754af9/git-gardener.docforge-master.master/pkg/jobs/jobs.go:207 +0x127

What you expected to happen:
no panic

How to categorize this issue?
/kind bug
/priority critical

Missing tests for the core reactor's logic that resolves the documentation structure nodesSelector

Test coverage of the reactor package has to be improved.

HTML links are not updated in cases they should

What happened:
HTML links do not get updated to absolute or downloaded resource reference in cases they need to.

What you expected to happen:
HTML links should be updated similar to the MD links.

How to reproduce it (as minimally and precisely as possible):
Try a structure with a document with source referencing content with a an HTML relative link that should convert to absolute because it is not in download scope. Or a link that has to convert to downloaded resource when in scope.

How to categorize this issue?
/kind bug
/priority critical

Do not rely on file extension to figure its content type

What would you like to be added:
Do not rely on file extension to figure its content type.

Why is this needed / describe a real-world scenario:

For example in Hugo files may have extension html, because commonly IDEs are relying on this to match editors for content, but it's still treated like MD if it starts with frontmatter.
Markdown can come from any API, not only GitHub, and the resource path may not necessarily include file extension.

How to categorize this?:
/kind enhancement
/priority normal

Maintain valid absolute links to tree objects when rewriting them

What happened:
When converting GitHub document links to absolute for indexing, we always assume blobs but in practice a link can have a tree object as destination, turning it to invalid one when it is necessary to serialize the absolute link in the document.

What you expected to happen:
Maintain valid absolute links.

Test with:

Build Documentation Source into Bundle

Documentation source are markdown documents that are potentially spread across multiple storage systems of various type, including GitHub repos and local file system. The task is to replicate and structure this content into a documentation bundle, preserving the consistency of the source material in it. A documentation bundle can then be further processed or directly published on a suitable channel.

The target documentation bundle is modeled to specify the intended logical document structure and the source locations of the documents in it. It can be serialized into a (manifest) file and managed in a source control system, enabling release specific versions of a documentation bundle configuration. This is the input to the system that provides the description of the desired result.

It should work at minimum with GitHub and file system as documentation source storage systems, but it must be left open to extend to others too, e.g. ssh, ftp, confluence, or any meaningful API.

Once pulled from their storages into the intended structure of the bundle, documents are processed further to preserve their consistency. This largely means evaluating linked resources and potentially downloading them, and rewriting links as necessary to reflect changes in location. Preserving documents consistency is described in detail in #7.

It should be possible to add also custom processors to the pulled documents and keep the whole pre-publish process encapsulated within that tool.

Finally, the processed documents are serialized at a chosen destination. At minimum, it must be possible to write the result to the file system, but it must be left open to extend to other storage systems, such as an API. Custom implementations may also go as far as serializing into structures specific to a markdown-to-html build tools (e.g. Hugo) amending or renaming files.

Functional requirements in a nutshell are to deliver a tool that orchestrates the:

pull of documentation sources from multiple locations according to a manifest for documentation structure
their reconciliation into a consistent documentation
the invocation of configured postprocessors on the documents
writing the documents to a target location

The non-functional properties of the tool are:

extensible:
- add new source storages
- add new post-processors for documents
- add new document writers serializing to other storages than the file system (e.g. an API)
scriptable: provided as CLI tool that can be utilized in a container or other scripts
configurable on demand (incl. extensions)
performant: <20sec for 100 documents

Documents cross links consistency

What would you like to be added:
Links in a document are resolved to absolute, and:

If a linked document is a node in the model (either explicitly defined or via nodeSelector), then it will be(or has been) downloaded and the link to it is rewritten to a relative link pointing to its potentially new in the document structure location that has to be calculated relative to the linking document position. This mechanism works both for relative and absolute links (potentially between repos too) in the original documents, converting them to relative links to downloaded documents.
If the linked document is not a node in the model, then it will not be downloaded and the link to it rewritten to its resolved absolute form.

Why is this needed:
A stable, reliable, predictable and sound mechanism for maintaining documents consistency is at the core of the documentation build mechanism.

Implement exclude/include to locality domain

Resources linked by documents can be in or out the scope of the documentation that is required to be downloaded. It should be possible to specify that in a documentation structure manifest. Links to excluded resources are converted to absolute for now.
In future we may consider a more agile approach, e.g. with removing links and text replacements.

How to handle links to tree object

How do we address links to tree objects (folders) in a markdown document?
Download vs convert to absolute link?

Documentation Source Processors

There are a number of tasks that could be performed on a markdown source prior to building it into deployable bundle with tools such as Hugo.

Regardless of the tool that will build the markdown source it is very helpful to be able to keep source minimal and autogenerate and process certain parts of according to some globally defined rules. This epic aims at delivering a set of standalone tasks that can do that for various aspects. Those tools will be applied and validated in the Gardener project but will be generally applicable in the context of delivering documentation-as-code.

The list can expand base don popular demand. A reasonable starter set is:

Image processor
The image processor uses embedded (raster) images as template to create web content that is considerate about its environment and deliver images with sizes that are applicable to the requesting device to save bandwidth and improve user experience.
- Processes images embedded in a document converting them to HTML <img> tag with srcset attribute.
- Processes the images to generate downsized variants, considering the supplied image as the most high-resolution one, and link them in the corresponding srcset attributes. The supported dimensions are configurable.
- The generated images that are large enough, are generated with progressive rendering.
- Converts images to a single format which is a configurable option
Fenced code block processor
Fenced code can be inline, but with the help of a tool can also be imported from a file at build time. That is a particular advantage if it's part of the source code, if the source is a snippet reused across several documents, and because having it out-of-document enables the language-specific tools to act on it as on any source code. Standard markdown does not support anything but inline code. A specific markup that the tool can process is required:
Two options are to use structured comment or standard link markdown with attributes support that will be parsed by a custom markdown parser.
For source code snippets, it is particularly handy to be able to specify a range that is obtained from a source file and displayed as fenced code block. It would be useful to use GitHub's way to specify range in a file (#LX-LY, where X is starting line and Y is end line of the range) along the file location link.

md-2-md parser and renderer retaining original format and supporting changes

What would you like to be added:
Parse markdown and render it back to original without transforming the document format. changes such a as link updates on link destinations or removing links should be possible too with scope of the transformation limited to the changed markup and surrounding elements only and no larger effect on the document format.

Why is this needed:
Currently, the markdown parser we use builds an abstract syntax tree from the source markdown file, completely ignoring the original format of the document, as any markdown parser out there would. The reason is that parsers are extensively created to convert to HML so the task to retain original format is not in scope for most. In our case, we want to make changes to markdown files and retain the original format of the document, including markdown, order and whitespaces.

Having the option to apply formatting rule, preferably configurable, would be a second priority target. It would come in handy as markdownfmt tool, and it certainly has a use case for us too, but our first priority is to be able to reliably parse and make changes to markdown without transforming the original.

Support multiple GitHub instances

Support multiple GitHub instance and credentials for them. For example github.com and github.tools.sap.

Propagate 'not found' errors context

What would you like to be added:
Currently an incorrect resource name (with correct path) yields a warning log entry but the final error says "not found" without any context to find out the the reason at a glance.

Why is this needed / describe a real-world scenario:
Finding out which resource name needs to be corrected.

How to categorize this?:
/kind enhancement
/priority normal

Replicate file/folder structure by spec: preserve same folder structure

Handling links in documents referencing resources and their downloads

What would you like to be added:
All resources in a repository are downloaded if they are linked by a document from the same repository, unless an exclusion pattern supplied globally or per node configuration prevents that. Resources are downloaded in a dedicated folder with their names changed to UUIDs to avoid potential name and path clashes. Links to them are then rewritten as relative and to point to their new location and name. That applies also to any absolute links pointing to this resource regardless whether from a document in the same or a different repository.

Example exclusion pattern:

Why is this needed:
A stable, reliable, predictable and sound mechanism for maintaining documents consistency is at the core of the documentation build mechanism.

See #7 , #24

relative links in file system documents are not resolved correctly

What happened:
A markdown in a document at //a.md for a relative link like [test](./b.md) that is not in download scope gets resolved to [test](/<path>/a.md/b.md)

What you expected to happen:
The resolved path should be /<path>/b.md instead

How to reproduce it (as minimally and precisely as possible):

Create file a.md with content:
```
[test](./a.md)
```
Create manifest like:
```
structure:
   - source: <path>/a.md
```
where <path> is path to the a.md file you created.
Invoke docforge with that manifest and inspect the results in a.md
You should observe
```
[test](/<path>/a.md/a.md)
```
where <path> is path to the a.md file you created.

Using docforge with `--github-info-destination` flag fails when all commits for a file are internal

What happened:
Using docforge with --github-info-destination flag fails with nil pointer reference panic when all commits for a file are internal.

What you expected to happen:
Ignore the file and proceed without panic.

How to categorize this issue?
/kind bug
/priority critical

Updates of text and title in a markdown link fail when longer than the original ones

What happened:
Link updates of markdown link text or title components fail if the update string is longer than the original text. The failure is in the offset of the rest of the sequence of the link components in the changed link.

If a text is changed and it's longer than the original the new text will replace the first characters of the link, because the destination and title components are not offset.

Similar symptoms are observed when the destination is changed and they affect the title component because it is after the destination bytes.

What you expected to happen:
The link text and title should update regardless of the ratio of the changed and changing texts.

/kind bug
/priority normal

Modify MD links with minimal changes on the rest of the document

One of the important tasks to keep documents consistent after download is to change links in the document (and nothing else). Formatting the document according to rules should be a separate and optional task in building a document bundle. In fact it may well be not part of the document bundle build but part of the development process.

Interrupting the program yields `panic: send on closed channel`

What happened:
Interrupting the program yields panic: send on closed channel

What you expected to happen:
Interrupt silently

How to reproduce it (as minimally and precisely as possible):
Run a build with docforge and hit ctrl+C (or the Mac analogue) before it ends

autolinks ending with * are not parsed correctly

What happened:
The link in **Go ahead and help us spread the word: https://gardener.cloud** is parsed as https://gardener.cloud**

What you expected to happen:
The link should be parsed as https://gardener.cloud

/priority normal

Links images and list items text word-wrapped regardless of nowrap renderer setting

What happened:

Text for links, images and list items markdown is word-wrapped, regardless of the global renderer setting to not apply word-wrapping (TextWrap: -1), producing invalid markdown.

What you expected to happen:
A markdown renderer configured for nowrap should behave like that.

How to reproduce it (as minimally and precisely as possible):
[something longer that will lead to wordwrap line of text](https://github.com/gardener/docforge/issues)

round up the docofrge api

What would you like to be added:

Why is this needed:
The lessons learned and feedback that was received for the API it needs to be updated.

Use URL in manifest file path

What would you like to be added:
Optionally use URL in addition to file system path as value of flag --manifest (-f).
Some URLs can reference resources that require authentication for read access.

Why is this needed:
To avoid boilerplate for downloading configuration files from GitHub prior to using them as arguments of the flag.

False error "Multiple peer nodes with property index.." when --hugo

What happened:
Getting "Multiple peer nodes with property index: true detected in ..." for any structure that features node with

properties
  index: true
...

when built with --hugo flag, even if there is only one declaring index:true.

What you expected to happen:
The error should not be yielded if there is a single or 0 nodes declaring index:true, when built with --hugo.

/kind bug
/priority critical

docforge blocks forever when the `--github-info-destination` flag is used

What happened:
docforge blocks forever on updates to contributors list when the --github-info-destination flag is used.

What you expected to happen:
finish successfully

How to reproduce it (as minimally and precisely as possible):
use the --github-info-destination flag

How to categorize this issue?
/kind bug
/priority blocker

Prevent redundant title front-matter property

What would you like to be added:
Use the document front matter if it is available and the node-defined front matter only if it is not in preparation for a more thorough implementation with merge strategies.

Why is this needed / describe a real-world scenario:
If there is front matter in a document and in the node that has source referencing this document, currently it is concatenated as two sequential byte data streams and that leads to potentially redundant properties.

How to categorize this?:
/kind enhancement
/priority critical

gardener / docforge Goto Github PK

docforge's Introduction

K8s Conformance Test Coverage

Start using or developing the Gardener locally

Setting up your own Gardener landscape in the Cloud

Feedback and Support

Learn More!

docforge's People

Contributors

Stargazers

Watchers

Forkers

docforge's Issues

Preserving Pulled Documents Consistency

Links in Markdown Documents

Links to documents

Links to resources

Links to internal document sections

Other links

Linked documents and resources versions

Build Documentation Source into Bundle

Recommend Projects

Recommend Topics

Recommend Org