A Tutorial on Approaching the Topic Modeling of Bank Regulation


Regtech, the application of new technologies to bank regulation, prompts a conversation about reducing the burden of bank regulation by letting computers take over some of the handling of regulatory text. Bank regulations and the related manuals, guidance, and other supplements are mostly unstructured text. Software tools and statistical models have evolved to “read” unstructured text and create actionable insights by way of “text analytics”, but use cases within bank regulation remain limited. I contribute to this discussion by reviewing the text of regulatory guidance using text analytics tools. The model I employ seeks to determine the “topics” into which documents may be categorized. In this context, a “topic” division may be based on the bank activity to which the regulation applies, the regulator who authored the text, or even the time period in which the regulatory text was relevant. My objective was to assess whether the model could identify the first of these: topics based upon the bank activity to which the regulatory text applies. I find that the model’s “topics” align with those of experts and also suggest a further level of detail below the experts’ topics. However, the model is sensitive to the formatting and word choices in the underlying text and to the processing choices applied. While the findings suggest that managing regulatory text with text analytics can create efficiency for the human implementers of regulation, they also show the importance of considering how the underlying text will affect the outcome. To that end, I recommend that creators of regulatory text take the needs of text analytics work into account.
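The sensitivity to formatting, word choice, and processing choices can be illustrated with a minimal sketch. It is not the model or pipeline used in this work. It only shows, under stated assumptions, how the word counts that a topic model would receive change with preprocessing. The sample sentences, the stop-word list, and the names `DOCS`, `STOP_WORDS`, `tokenize`, and `term_counts` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical snippets standing in for regulatory guidance text.
DOCS = [
    "The Bank must report Liquidity-Risk exposures. The bank's liquidity is reviewed.",
    "Banks shall maintain capital. Capital adequacy is reviewed by the regulator.",
]

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "is", "by", "must", "shall", "s"}


def tokenize(text, lowercase=True, split_hyphens=True, drop_stop_words=True):
    """Split text into word tokens under the chosen processing options."""
    if lowercase:
        text = text.lower()
    # Either break hyphenated compounds apart or keep them as single tokens.
    pattern = r"[A-Za-z]+" if split_hyphens else r"[A-Za-z]+(?:-[A-Za-z]+)*"
    tokens = re.findall(pattern, text)
    if drop_stop_words:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    return tokens


def term_counts(docs, **options):
    """Return one word-count table per document (a bag-of-words view)."""
    return [Counter(tokenize(doc, **options)) for doc in docs]


# Same text, two processing choices, two different inputs to a topic model.
clean = term_counts(DOCS)
raw = term_counts(DOCS, lowercase=False, split_hyphens=False, drop_stop_words=False)

print(sorted(clean[0].items()))
print(sorted(raw[0].items()))
```

With the default options, "Bank" and "bank" are counted as one term and "Liquidity-Risk" contributes to the count for "liquidity". With the options switched off, they become separate terms, so a model fit to each version sees different evidence for the same document.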

Last modified
  • 07/26/2018