Documenting Spark Code with Scaladoc

Matthew Powers
Feb 18, 2018

How to generate documentation

spark-daria transformations documentation
Uncollapsed documentation view
/**
  * Removes all whitespace in a string
  *
  * {{{
  * val actualDF = sourceDF.withColumn(
  *   "some_string_without_whitespace",
  *   removeAllWhitespace(col("some_string"))
  * )
  * }}}
  *
  * Removes all whitespace in a string (e.g. changes `"this has some"` to `"thishassome"`).
  *
  * @group string_funcs
  * @since 0.16.0
  */
def removeAllWhitespace(col: Column): Column = {
  regexp_replace(col, "\\s+", "")
}
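The regular expression in `removeAllWhitespace` can be tried out without a Spark session. The snippet below is a minimal sketch, not part of spark-daria: `removeAllWhitespaceFromString` is a hypothetical plain-Scala stand-in that applies the same `\\s+` pattern to an ordinary `String` instead of a `Column`.

```scala
// Hypothetical stand-in for the Spark column function:
// same regex, applied to a plain String via String.replaceAll.
def removeAllWhitespaceFromString(s: String): String =
  s.replaceAll("\\s+", "")

// Spaces, tabs and newlines all match \s and are removed.
val cleaned = removeAllWhitespaceFromString("this has some")
// cleaned == "thishassome"
```

This mirrors the example given in the Scaladoc comment, so the documented behavior and the regex can be checked against each other.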
`monospace`
''italic text''
'''bold text'''
__underline__
{{{
val whatIsThis = "a code snippet!"
}}}
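Here is a rough sketch of how these markup elements might be combined in a single comment. The function `trimAndUpper` is hypothetical and exists only to give the comment something to document.

```scala
/**
  * Trims a string and converts it to upper case.
  *
  * Uses `String.trim`, so ''only'' leading and trailing whitespace is removed.
  * '''Note:''' a `null` input is not handled.
  *
  * {{{
  * val shout = trimAndUpper("  hello ")
  * }}}
  */
def trimAndUpper(s: String): String = s.trim.toUpperCase
```

The backticks, doubled quotes, tripled quotes and triple braces map to the monospace, italic, bold and code snippet styles listed above.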

Grouping functions

spark-daria function groupings
/**
  * @groupname datetime_funcs Date time functions
  * @groupname string_funcs String functions
  * @groupname collection_funcs Collection functions
  * @groupname misc_funcs Misc functions
  * @groupname Ungrouped Support functions for DataFrames
  */
/**
  * @group string_funcs
  */
def removeAllWhitespace(col: Column): Column = {
  regexp_replace(col, "\\s+", "")
}
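The two pieces above (group names on the enclosing object, a `@group` tag on each member) can be put together as in the sketch below. `ExampleFunctions` and its methods are hypothetical and use plain strings rather than Spark columns. The `@groupdesc` and `@groupprio` tags are standard Scaladoc tags that do not appear in the snippets above; they are included here on the assumption that a description and an explicit ordering are wanted.

```scala
/**
  * Hypothetical helper object used only to illustrate grouping.
  *
  * @groupname string_funcs String functions
  * @groupdesc string_funcs Functions that clean up or reshape strings.
  * @groupprio string_funcs 10
  * @groupname misc_funcs Misc functions
  * @groupprio misc_funcs 20
  */
object ExampleFunctions {

  /**
    * Collapses every run of whitespace into a single space.
    *
    * @group string_funcs
    */
  def singleSpace(s: String): String = s.trim.replaceAll("\\s+", " ")

  /**
    * Returns the first non-empty string, if any.
    *
    * @group misc_funcs
    */
  def firstNonEmpty(values: String*): Option[String] = values.find(_.nonEmpty)
}
```

Groups with a lower `@groupprio` value are listed first on the generated page, so "String functions" would appear above "Misc functions" here.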
scalacOptions in (Compile, doc) += "-groups"
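The `in (Compile, doc)` form is the older sbt scoping syntax. On sbt 1.1 or later the same setting can be written with the slash syntax; this is a sketch for a `build.sbt` file and assumes nothing else about the build.

```scala
// build.sbt (sbt 1.1+): pass -groups to scaladoc only, not to the regular compiler
Compile / doc / scalacOptions += "-groups"
```

Scoping the option to the `doc` task keeps it from being passed to ordinary `compile` runs.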

Limiting the public interface with the private keyword

package com.github.mrpowers.spark.daria.sql

import org.apache.spark.sql._

case class ProhibitedDataFrameColumnsException(smth: String) extends Exception(smth)

private[sql] class DataFrameColumnsAbsence(df: DataFrame, prohibitedColNames: Seq[String]) {

  val extraColNames = (df.columns.toSeq).intersect(prohibitedColNames)

  def extraColumnsMessage(): String = {
    val extra = extraColNames.mkString(", ")
    val all = df.columns.mkString(", ")
    s"The [${extra}] columns are not allowed to be included in the DataFrame with the following columns [${all}]"
  }

  def validateAbsenceOfColumns(): Unit = {
    if (extraColNames.nonEmpty) {
      throw new ProhibitedDataFrameColumnsException(extraColumnsMessage())
    }
  }

}
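The effect of `private[sql]` can be sketched without Spark. In the snippet below every name is hypothetical: an enclosing object called `sql` stands in for the package so the example fits in one file, and column names are plain strings instead of a `DataFrame`.

```scala
object sql {
  case class ProhibitedColumnsException(msg: String) extends Exception(msg)

  // Usable anywhere inside `sql`, but not accessible from code outside it.
  private[sql] class ColumnsAbsence(columns: Seq[String], prohibited: Seq[String]) {
    val extraColNames: Seq[String] = columns.intersect(prohibited)

    def validate(): Unit =
      if (extraColNames.nonEmpty)
        throw ProhibitedColumnsException(
          s"The [${extraColNames.mkString(", ")}] columns are not allowed"
        )
  }

  // Public entry point: the only part of this sketch meant for callers.
  object Validator {
    def validateAbsenceOfColumns(columns: Seq[String], prohibited: Seq[String]): Unit =
      new ColumnsAbsence(columns, prohibited).validate()
  }
}
```

Code outside `sql` can call `sql.Validator.validateAbsenceOfColumns(...)`, while `new sql.ColumnsAbsence(...)` from outside would not compile, which keeps the helper class out of the public interface.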
private def toSnakeCase(str: String): String = {
  str
    .replaceAll("\\s+", "_")
    .toLowerCase
}
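A private helper like `toSnakeCase` is normally reached through a public method. The sketch below assumes a hypothetical `StringHelpers` object with a public `snakeCaseAll` wrapper; only the body of `toSnakeCase` comes from the code above.

```scala
object StringHelpers {
  // Private: not callable from outside this object.
  private def toSnakeCase(str: String): String =
    str
      .replaceAll("\\s+", "_")
      .toLowerCase

  /** Converts every name to snake_case. */
  def snakeCaseAll(names: Seq[String]): Seq[String] = names.map(toSnakeCase)
}

// Each run of whitespace becomes a single underscore, then the result is lower-cased.
val renamed = StringHelpers.snakeCaseAll(Seq("First Name", "Zip  Code"))
// renamed == Seq("first_name", "zip_code")
```

Only `snakeCaseAll` is part of the object's public interface, so it is the only method callers (and readers of the docs) need to know about.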

Hosting the documentation

Next steps
