Summer of Code Week 1
As I said in my previous blog post, I am working with coala on language-independent documentation extraction for this year’s Google Summer of Code.
It has been one week since the coding period started, and some work has been done! I would like to explain a few things before we get started on the real work.
So, my project deals with language-independent documentation extraction. Turns out, documentation isn’t that independent of the language. Most programming languages don’t have an official documentation specification. But it could be said that documentation is independent of the documentation standard (hereafter referred to as docstyle) it uses.
I have to extract parts/metadata from the documentation, like descriptions, parameters and their descriptions, and return descriptions, and then perform various analysis routines on this parsed metadata.
Most of my work is with the DocumentationComment class, where I have to implement routines for each language/docstyle. I started out with python first because of two reasons:
- It's my favourite programming language (Duh!)
- coala is written in python! (Duh again!)
python has its own docstyle, known as “docstrings”, and they are clearly defined in PEP 257. Note that PEP 257 is just a general style guide on how to write docstrings:
“The PEP contains conventions, not laws or syntax.”
It is not a specification.
So, I have come up with the following signature for DocumentationComment:
DocumentationComment(documentation, language, docstyle, indent, marker, range)
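To make the shape of that signature a bit more concrete, here is a rough sketch of a container with the same six fields. This is not the actual coala class: the name DocumentationCommentSketch, the field types, and the default values are all assumptions made for illustration. In particular, treating marker as a (start, each-line, end) triple and range as a (first line, last line) pair is a guess.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DocumentationCommentSketch:
    """Hypothetical stand-in mirroring the six fields of the signature."""

    documentation: str          # raw text of the comment, formatting intact
    language: str               # e.g. "python"
    docstyle: str = "default"   # assumed default, not taken from coala
    indent: str = ""            # whitespace in front of the comment
    # Assumed shape: (start marker, per-line marker, end marker).
    marker: Optional[Tuple[str, str, str]] = None
    # Assumed shape: (first line, last line) in the source file.
    range: Optional[Tuple[int, int]] = None


# Only the first two fields are required in this sketch.
doc = DocumentationCommentSketch("Does something.\n", "python")
```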
Now let’s say doc is an instance of DocumentationComment. doc would have a function named parse_documentation() that would do the parsing and get the metadata. So if I have a function with a docstring:
And I load this into the DocumentationComment class and then apply the parsing:
Note: Not all parameters are required for instantiation.
repr(docdata) would print:
You may ask about the strange formatting. That is because it retains the exact formatting, as displayed in the docstring. This is important, because whatever analyzing routines I run, I should always be able to “assemble” back to the original docstring.
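The idea of parsing without losing any formatting can be sketched in a few lines of standalone python. This is only a minimal illustration, not the coala implementation: the helper names parse_docstring and assemble, the namedtuples, and the ":param name:" / ":return:" markers are assumptions (PEP 257 does not prescribe any of them), and the markers are assumed to sit at the very start of a line.

```python
from collections import namedtuple

# Hypothetical metadata containers; each desc keeps its exact whitespace.
Description = namedtuple("Description", "desc")
Parameter = namedtuple("Parameter", "name desc")
ReturnValue = namedtuple("ReturnValue", "desc")

PARAM = ":param "
RETURN = ":return:"


def parse_docstring(documentation):
    """Split a docstring into sections without dropping any character."""
    parsed = []
    for line in documentation.splitlines(keepends=True):
        rest = line[len(PARAM):]
        if line.startswith(PARAM) and ":" in rest:
            # ":param name: text" -> name, and everything after the colon.
            name, _, desc = rest.partition(":")
            parsed.append(Parameter(name=name, desc=desc))
        elif line.startswith(RETURN):
            parsed.append(ReturnValue(desc=line[len(RETURN):]))
        elif parsed:
            # Any other line continues the section that came before it.
            last = parsed[-1]
            parsed[-1] = last._replace(desc=last.desc + line)
        else:
            parsed.append(Description(desc=line))
    return parsed


def assemble(parsed):
    """Rebuild the original docstring text from the parsed sections."""
    parts = []
    for item in parsed:
        if isinstance(item, Parameter):
            parts.append(PARAM + item.name + ":" + item.desc)
        elif isinstance(item, ReturnValue):
            parts.append(RETURN + item.desc)
        else:
            parts.append(item.desc)
    return "".join(parts)


example = (
    "\n"
    "This is the best function ever.\n"
    "\n"
    ":param name:  Who to greet.\n"
    "              Can span lines.\n"
    ":return:      The greeting.\n"
)
parsed = parse_docstring(example)
print(parsed)
print(assemble(parsed) == example)
```

Because every desc keeps its leading spaces and trailing newlines, the repr of the parsed list looks oddly padded, but that padding is exactly what lets assemble() give back the original text.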
That’s it! This was my milestone for week 1, to parse and extract metadata out of python docstrings! I have already started developing a simple Bear, that I will talk about later this week.
PS: I would really like to thank my mentor Mischa Krüger for his thoughts on the API design and for doing reviews on my ugly code. :P