Analogous to the contrast between waterfall and agile methodologies in software development, Agile Data focuses on delivering a minimum viable, releasable improvement. Set the scrum masters aside for a moment: what allows organizations to follow continuous deployment methodologies is a laser focus on incrementalism.
A typical timeline for an analytical team to build and deploy an improved customer workflow, for instance a new fraud workflow within a large bank, might look like the following (illustrative):
Month 1: Internal and external data discovery and exploration
Month 2–3: Data access, contracting, and extraction
Month 4: Merging and data cleansing
Month 5: Modeling, out of sample validation, and what-if impact assessments
Month 6: Compliance and regulatory reviews
Month 7: Algorithm implementation and UAT
Month 8–9: Passive usage and validation against the analytical approach
Month 10: Go live
Of course, this timeline varies wildly by organization, some of which have automated the end-to-end process. Increasingly, though, workflows need to be dynamic. Customer acquisition algorithms, fraud algorithms, and product backends demand constant innovation, not one-time changes, especially as new entrants disrupt and reshape end-customer expectations. This analytical process, even when compressed aggressively, cannot survive. For all of these reasons, we at Demyst advocate an Agile Data methodology centered on attribute-level changes. In data-enabled workflows, there is a range of levels of abstraction at which one can operate, each with its pros and cons, laid out below:
My personal experience and belief is that clients all too often operate at the function or model level for the sake of organizational simplicity. Establishing systems to support attribute-level changes is not straightforward. Once established, however, attribute-level changes have major benefits:
Launching one attribute at a time is almost always the statistically better choice. In insurance pricing, where regulators only allow changes once or twice per year, or where major systems changes mean models cannot be tuned, a larger change may be needed. In most cases, though, a greedy search, in which each attribute is evaluated on its own merit, is an optimal path forward.
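To make the greedy search concrete, here is a minimal sketch: starting from a baseline, each candidate attribute is scored on its own merit and kept only if it improves the current model score. This is an illustration, not Demyst's implementation; the attribute names and the toy scoring function are assumptions for the example.

```python
def greedy_attribute_search(candidates, score, baseline=()):
    """Greedily add one attribute at a time, keeping each only if it
    improves the score (e.g. out-of-sample lift) of the selected set."""
    selected = list(baseline)
    best = score(selected)
    remaining = set(candidates)
    improved = True
    while improved and remaining:
        improved = False
        # Evaluate each remaining attribute's marginal contribution.
        gains = {a: score(selected + [a]) for a in remaining}
        best_attr = max(gains, key=gains.get)
        if gains[best_attr] > best:
            selected.append(best_attr)
            best = gains[best_attr]
            remaining.remove(best_attr)
            improved = True
    return selected, best

# Toy scoring function: two hypothetical fraud attributes add lift to a
# 0.70 baseline, while a noise attribute subtracts from it.
VALUE = {"device_fingerprint": 0.05, "email_age": 0.03, "noise": -0.01}

def toy_score(attrs):
    return 0.70 + sum(VALUE.get(a, 0.0) for a in attrs)

selected, final = greedy_attribute_search(
    ["noise", "email_age", "device_fingerprint"], toy_score)
print(selected, round(final, 2))  # the noise attribute is never adopted
```

Because each attribute must justify itself against the current best score, low-value additions are rejected automatically, which is exactly the atomic, evaluate-on-merit discipline described above.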
Data ethics and governance processes become tightly coupled to the analytical decisions. Each attribute is decided atomically and reviewed with ethics top of mind.
Analytical progress is made, tracked, and justified far faster, in days rather than months. Organizations quickly see payback on incremental data and, with the right processes, know if anything is off track.
Attributes become reusable assets across analytical workflows, creating a virtuous cycle within an organization as a by-product of this deployment approach.
Going down this path does require awareness of key interactions, though, across both models and attributes. Organizations need to be able to make attribute changes and model changes independently, which can create side effects that can only be resolved through an algorithm rebuild. Business leaders accountable for function-level outcomes still need to be able to effect change at a lower level of abstraction.
At Demyst, insofar as the above relates to external data, we support a far smoother buying process. We help clients buy one attribute at a time, on a variable-cost basis, and scale up as they innovate. In fact, we are launching an attribute-level version of our already market-leading catalog at demyst.com. But first, executives need to establish a process to effect change with those incremental data.