Data can be used to free up time, streamline operations and improve processes, and even to generate new business. Evaluating your data-driven improvement projects against 15 specific questions helps you avoid much frustration and many unwelcome setbacks, and considerably increases the chances of success throughout.
We have previously written about why it is so important to first identify requirements, problems and challenges that are worth resolving, and to define projects that return value quickly and clearly. Doing so avoids ploughing resources into a black hole and seeing everything come to naught because of the complexity.
Once the project has been defined, it is necessary to review how mature the organisation is, from a data perspective, when it comes to implementing the project.
– "By asking the questions, project owners get an indication of the departments that need to be involved in the journey. New data projects often involve more aspects of the business and the organisation than you might at first think. It is good for the various people involved to get on track at an early stage," says Stella Riad, unit manager at RISE.
Sheds light on the current situation and the challenges
The purpose of the method is to quickly get an answer to the question: Where do we actually stand?
And above all, what milestones must the project pass so as to move forward?
– "The 15 questions are a practical application of Neil D. Lawrence's research into 'data readiness levels', and they open up a vital discussion of the project from various standpoints."
In the research the different data-maturity levels are called Band C, Band B and Band A:
- Band C is about availability of the data. Does the data exist? Can it be used?
- Band B is about the reliability of the data. Is it validated?
- Band A is about the usability of the data. Can the data be used for the task at hand?
Each level simply sheds light on various challenges, and through the 15 questions that RISE has drawn up you become aware of parameters that are otherwise easy to miss.
– "The best thing is for all the people and departments concerned to spend 1-2 hours going over the questions at a Teams meeting. You will save a lot of time by doing so. We recently had a case, for example, in which nobody had thought certain information could be sensitive. The sooner something like this becomes known, the better."
Questions for "Band C" (availability of data)
Do you have programmatic access to data?
Can your developers access it, e.g. using an API or database?
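One quick way to answer this question is to try reading a handful of rows from the source in code. The sketch below uses an in-memory SQLite database as a stand-in for a real data source; the table name `measurements`, its columns and the helper `can_read_sample` are all hypothetical and only serve as illustration.

```python
import sqlite3


def can_read_sample(conn, table, limit=5):
    """Return up to `limit` rows from `table`, or None if the query fails."""
    try:
        # Table names cannot be bound as parameters, so the identifier is quoted.
        cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,))
        return cursor.fetchall()
    except sqlite3.Error:
        return None


# In-memory stand-in for a real data source (hypothetical table and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor_id TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("s1", 20.5), ("s2", 21.0)],
)

rows = can_read_sample(conn, "measurements")      # two rows come back
missing = can_read_sample(conn, "no_such_table")  # None: no programmatic access
```

If a check like this cannot be written at all, for instance because the data only exists in e-mail attachments or on paper, that is a strong sign the project is not yet past this question.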
Do you have the requisite licences for use of data?
If you buy data from a third party it is important that you ensure that the licence and the terms of service cover your use case. There can, for example, be a big difference between the way you can use data in an academic context and how you can use the same data commercially.
Have you ensured that you have the legal right to use data?
Have issues such as GDPR been dealt with?
Have you performed an ethical analysis of the use of data?
In certain cases you have to carry out an ethical review in order to be able to use the data for research purposes. Consult a lawyer if you are unsure. The Swedish Ethical Review Authority provides further reading on this topic.
Is data converted to a machine-readable format?
In order to be able to use data in an automated process, the data has to be readable and manageable by the programs that are to implement the process. Exactly how the data should be formatted depends on the tools that will be using the data. For example, if you intend to use methods for automatic language analysis (known as natural language processing, NLP), then the documents for analysis must first be converted into machine-readable text.
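As a small illustration of such a conversion, the sketch below turns an HTML document into plain text using only the Python standard library. The class `TextExtractor`, the function `html_to_text` and the sample document are made up for this example. Other formats, such as scanned PDFs, need other tools that are not covered here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text content of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical sample document.
sample = (
    "<html><body><h1>Report</h1>"
    "<p>Pump 3 was <b>serviced</b> in May.</p></body></html>"
)
text = html_to_text(sample)
# text == "Report Pump 3 was serviced in May."
```

Note that this simple extractor also keeps the contents of script and style elements, so a real pipeline would probably need to filter those out.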
Questions for "Band B" (reliability of data)
Are the data characteristics known?
Has anyone carried out an exploratory data analysis, for example, or looked through examples from the data set, to ensure you have an idea of its properties and shortcomings?
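A first exploratory pass does not need to be advanced. The sketch below, which assumes the data can be read as CSV, counts missing and distinct values per column and reports the most common value. The function `profile` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def profile(rows):
    """Summarise each column: missing count, distinct count, most common value."""
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v not in ("", None)]
        most_common = Counter(present).most_common(1)
        summary[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": most_common[0][0] if most_common else None,
        }
    return summary


# Hypothetical sample data with one missing temperature.
raw = "city,temperature\nLund,12.5\nLund,\nUmeå,3.0\n"
rows = list(csv.DictReader(io.StringIO(raw)))
report = profile(rows)
# report["temperature"]["missing"] == 1
```

Even a summary this simple tends to surface surprises, such as columns that are mostly empty or identifiers that are not unique.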
Is data validated?
Have duplicates been removed, have missing values been dealt with etc.?
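The two checks mentioned here can be sketched as follows. This is a minimal example under the assumption that each record is a flat dictionary with hashable values. The function `validate` and the sample records are hypothetical, and whether records with missing values should be rejected, filled in or kept depends on the use case.

```python
def validate(records, required):
    """Drop exact duplicates and set aside records missing a required field."""
    seen = set()
    clean, rejected = [], []
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        if any(record.get(field) in (None, "") for field in required):
            rejected.append(record)
        else:
            clean.append(record)
    return clean, rejected


# Hypothetical records.
records = [
    {"id": 1, "value": 10.0},
    {"id": 1, "value": 10.0},  # duplicate
    {"id": 2, "value": None},  # missing value
    {"id": 3, "value": 7.5},
]
clean, rejected = validate(records, required=["id", "value"])
# clean holds ids 1 and 3, rejected holds id 2
```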
Questions for "Band A" (usability of data)
Are all interested parties in agreement on the use case the data is intended to solve?
Band A is closely linked to the task to be performed. The first and maybe most difficult issue for a new project is agreeing on what requirement the project is intended to meet.
Are all interested parties in agreement on the aim of using the data?
Not until the need has been identified can you start to think about what the solution might be. Part of the solution is data; do the interested parties concerned know why (i.e. how) the data may be part of the solution? The answer to that question also includes clarification of the expectations for the data, e.g. in terms of availability/access, format, latency etc.
Is the data sufficient for dealing with the use case?
In other words, is the data to which the project has access sufficient for realising the solution or solutions to the requirement that has been established? Is there, for example, annotated training data of appropriate quantity and quality?
Is data sufficient to evaluate a solution to the use case?
In other words, is there any appropriate validation and evaluation data? Or are there other ways of evaluating the solution?
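One common way to obtain evaluation data is to set aside part of the available data before any solution is developed. The sketch below shows such a hold-out split. The function `split`, the 20 percent fraction and the fixed seed are assumptions made for illustration, and a random split is not appropriate for every kind of data, for example time series.

```python
import random


def split(items, eval_fraction=0.2, seed=0):
    """Shuffle the items reproducibly and hold out a fraction for evaluation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# Ten hypothetical examples: eight for development, two for evaluation.
train, evaluation = split(range(10))
```

If the evaluation part ends up too small to say anything meaningful, that is in itself an answer to the question above.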
Are your organisation's data-gathering processes such that they will support gathering of new data of the kind you are using in this project?
A project sometimes leads to a service that is to be put into production and managed. Can the organisation that is to implement these stages ensure that new (and future) data will be gathered and handled in such a way that the service can survive? Can the organisation, for example, convert incoming reports from PDF to an appropriate text format, or ensure that the measurement points from the sensors used in the project correspond to those used in the production environment?
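The sensor example can be turned into a simple automated check that compares the measurement points used in the project with those available in production. The sketch below is only an illustration. The function `schema_drift` and the field names are hypothetical.

```python
def schema_drift(project_fields, production_fields):
    """Report fields that exist on one side but not on the other."""
    project, production = set(project_fields), set(production_fields)
    return {
        "missing_in_production": sorted(project - production),
        "only_in_production": sorted(production - project),
    }


# Hypothetical measurement points.
drift = schema_drift(
    ["temperature", "pressure", "flow"],
    ["temperature", "pressure", "vibration"],
)
# drift["missing_in_production"] == ["flow"]
```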
Is your data secured with regard to hacking and business risks?
Data must not only be available to participants in the project – one should also ensure it is not available to unauthorised users. You need to work proactively on minimising the risk of hacking, as well as developing plans of action that tell the organisation how to act in the event of any hacking. This is seldom something that lies within the scope of an individual project.
Is it safe for you to share data with others (with regard to business risk)?
Insofar as you wish to share data, you will need to ask yourself whether it is safe to do so, considering the business you may potentially win using the data. Does sharing the data unintentionally reveal information about your capabilities and future business plans?
Are you allowed to share data with others, e.g. considering licences and GDPR?
Insofar as you wish to share data, you need to make sure you are actually allowed to do so, bearing in mind such things as GDPR and the licences and terms & conditions linked to any data from a third party.
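The outcome of a review meeting can be recorded in a simple structure that shows, per band, which questions are still open. The sketch below is one possible way of doing this and is not part of the method itself. The function `open_items`, the shortened question labels and the answers are all hypothetical.

```python
def open_items(answers):
    """Return, per band, the questions not yet answered with yes."""
    return {
        band: [question for question, ok in questions.items() if not ok]
        for band, questions in answers.items()
    }


# Hypothetical answers from a review meeting; question labels are shortened.
answers = {
    "C": {"programmatic access": True, "licences": True, "legal right": False},
    "B": {"characteristics known": True, "validated": False},
    "A": {"agreed use case": True},
}
todo = open_items(answers)
# todo == {"C": ["legal right"], "B": ["validated"], "A": []}
```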
Feel free to make use of our expertise and experience
If your organisation wishes to benefit from RISE's experience of helping organisations through use of this process, please contact us directly.