If data really is money
A significant part of the literature and of business strategy rests on the idea that data is money (the new oil) and that digital services will bring a company new sources of revenue.
If we really embrace this idea, we should manage data the same way we manage our money. So, if data really is money, let’s look together at several points of comparison between our data and our money.
Centralize or decentralize?
Everyone has their own way of managing money, but for the purpose of this comparison we can isolate three main strategies:
- Hide money everywhere in your house.
- Put all the money in one account.
- Spread the money across several accounts.
Each strategy has its advantages and drawbacks, and we can test them with a few questions.
If your priority is to always have some money close at hand (to pay daily expenses, for example), your strategy can be to spread your assets across many locations: your pocket, a book on a shelf, your car…
However, this strategy becomes a challenge if you want to be sure that your money has not been stolen, because you need to check in every place that the money is still there (and in the amount that you left there).
Also, if you want to make a big purchase, you will have to collect all the money and check whether you have enough funds for it, unless you keep a global register.
By putting all your money in one account, you simplify the inventory and collection process, but you rely on the high availability of this single source, and the way you can use the money will not always be optimal for your needs.
For example, some expenses require cash or a check, and a given account does not reward all of those operations at the same level.
The last strategy is then to use several accounts with specialized benefits for the different transactions that you plan to make, together with a global register or management tool that allows you to transfer information from one account to another, to prepare tax summaries…
Each of these three strategies has a matching strategy in the data world:
- Data local to each service
- A single centralized database
- A central data lake with several database options based on the nature of the data and the expected processing
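To make the "global register" idea concrete for the third strategy, here is a minimal sketch in Python. It assumes a purely hypothetical `DataCatalog` class and invented dataset and store names; it is only meant to show how a register can answer "where is this data?" and "how much do we hold in total?" without visiting every store.

```python
from dataclasses import dataclass, field


@dataclass
class DataCatalog:
    """Hypothetical global register: maps each dataset to the store that holds it."""

    # dataset name -> (store name, number of records)
    entries: dict = field(default_factory=dict)

    def register(self, dataset: str, store: str, record_count: int) -> None:
        """Record (or update) where a dataset lives and how large it is."""
        self.entries[dataset] = (store, record_count)

    def locate(self, dataset: str) -> str:
        """Return the store that holds the given dataset."""
        return self.entries[dataset][0]

    def total_records(self) -> int:
        """Inventory across all stores without querying each one."""
        return sum(count for _, count in self.entries.values())

    def by_store(self) -> dict:
        """Group dataset names by the store that holds them."""
        grouped = {}
        for dataset, (store, _) in self.entries.items():
            grouped.setdefault(store, []).append(dataset)
        return grouped


# Illustrative names only (assumptions, not taken from any real system).
catalog = DataCatalog()
catalog.register("orders", "relational-db", 1200)
catalog.register("clickstream", "data-lake", 50000)
catalog.register("sensor-readings", "time-series-db", 8000)
```

With such a register, `catalog.locate("orders")` tells you which "account" to open, and `catalog.total_records()` plays the role of the consolidated balance.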
Your bank account
Let’s now look at your bank account. What is there to say about it? Money is money, right?
Yes, but beyond the account balance, you want to be able to analyze the flow of money in and out, focus on a specific provider or recipient (like a shop), or assign a category to each transaction so that you can plan a specific budget.
This data about data (or metadata) is the information that your bank account manages on top of the money itself.
Also, your bank account has a unique public address that you can share with providers to receive or send money, independently of the features and the specific behavior of your account (like a rewards program).
Your satisfaction with your account provider depends on the availability of the information, the simplicity of performing your actions, and the automatic metadata collection that the system performs for your convenience.
Of course, access to this account is secured by two- or three-factor authentication, because you need to be confident that the actions that you perform, or that another party (like PayPal) performs on your behalf, are controlled and monitored.
Availability versus interest
As we continue to analyze our data strategy, note that for most investments a balance has to be found between the interest rate and the availability of the funds.
On average, an account with a high interest rate will not let you move the money at any time.
Conversely, keeping all the funds in a checking account does not earn much interest.
The same strategic choices have to be made for data.
Data with high availability will have high maintenance and storage costs.
Data with a high change rate will be difficult to process for trend analysis or consolidation.
Data kept in distributed storage will have an extraction cost but will be efficient for parallel processing and data consolidation.
Likewise, each type of bank account has a different transfer speed for retrieving money or transferring it to others. Once you have made your selection, you expect the account to carry out those operations as stated in its performance description.
What about the risk to the money?
When we deposit money in our bank account, we are sensitive to the stability of this money; in other words, to the fact that the money will still be there when we need it.
Even if money is now mostly virtual, the stability of any fund relies on two main elements (aside from customer identification):
- Stability of the currency
- Safety of its storage
The stability of a currency comes from a set of concrete values associated with it (like gold) and from regulation of its availability and issuance. In the digital age, virtual money (like cryptocurrency) carries a significant risk in this respect.
For data, this value can be measured by its trusted use in operations and by the amount of such data created on the market.
For example, if aircraft schedule information is made available for free on the market and produced at a high pace, the value of this information will decrease quickly over time.
Also, if the data offered by a provider is not a market standard, its value differs from that of data from another provider that is the current market reference in that domain.
The second aspect of the value of data is its safety. In the days of the Wild West, physical safety was essential, and even now some of our assets are exposed to robbery.
For data, the risk is not only of being lost or stolen but also of being corrupted by the system that uses it (or by an external system). Many actions have to be performed (like backup and recovery strategies) to guarantee the safety and integrity of the data.
With the increasing virtualization of money, some additional risks, like bank failure, need to be considered.
For data, digital services must ensure that the data assets are not stolen (security leak) or damaged (virus or external attack). On this topic, the current trend is to move from a passive approach (check when requested) to a dynamic one (proactive behavior analysis and recurring integrity controls).
Examples include detecting inappropriate access to data, unusual actions, or the absence of expected events (for example, your salary that should have reached your account two days ago).
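The "absence of expected events" case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `missing_expected_events`, the event names, and the dates are all hypothetical, and a real monitoring service would read them from its own event store.

```python
from datetime import date, timedelta


def missing_expected_events(expected, received, today, grace_days=0):
    """Return the sorted names of expected events that are overdue.

    expected:   dict mapping event name -> due date
    received:   set of event names already observed
    grace_days: extra days tolerated after the due date before alerting
    """
    overdue = []
    for name, due in expected.items():
        deadline = due + timedelta(days=grace_days)
        if name not in received and today > deadline:
            overdue.append(name)
    return sorted(overdue)


# Hypothetical data: the salary was due two days ago and never arrived.
today = date(2024, 3, 3)
expected = {"salary": date(2024, 3, 1), "rent-refund": date(2024, 3, 10)}
received = {"grocery-payment"}

alerts = missing_expected_events(expected, received, today)
```

Here `alerts` contains only `"salary"`: the refund is not yet due, so it raises no alert. The `grace_days` parameter is a design choice to avoid alerting on small, normal delays.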
Reliability and security
We would rarely choose a credit card that does not work in shops, or that works only once every two or three transactions.
The reliability of the service is one of the factors of choice and satisfaction for any provider.
A few years ago, some credit cards worked only in some countries and were not accepted in others because of differences in payment systems or the need for a specific card reader.
Now, especially in the US, credit cards are widely used in every kind of transaction, and digital payment solutions (like PayPal) are also expanding their range of application.
All of them are considered reliable and trustworthy.
For data, reliability of the services and safety of the data created are a core need for any digital offering.
If we now look at the other side of the transaction process, we expect immediate safety controls on our transactions. Dynamic credential changes, identity checks, controls based on geographical proximity, and three-factor confirmation of transactions are examples of safety activities performed seamlessly during the process.
Safe operation is the guarantee of the value of the money processed.
For data, identification of services or users, rotation of credentials, and the use of encryption or of a security authority in transactions are similar operations that need to be embedded in digital services to ensure that the data's value is not lost.
Finally, the last aspect of the reliability of money is knowing whether the money has been stolen.
For data, this check is performed by integrity-check algorithms, regular re-creation of a known state (immutable infrastructure), and other processes often associated with the 3Rs (Rotate, Repave, Repair).
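As one possible illustration of an integrity check, the sketch below records a SHA-256 fingerprint of the data when it is known to be good and compares against it later. The function names `fingerprint` and `is_intact` and the sample record are assumptions made for this example; they use only Python's standard `hashlib` module.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_fingerprint: str) -> bool:
    """Check that the data still matches the fingerprint recorded earlier."""
    return fingerprint(data) == expected_fingerprint


# Hypothetical record: fingerprint it while it is known to be good.
original = b"account=42;balance=100"
reference = fingerprint(original)

# Later, recompute and compare: any modification changes the digest.
unchanged_ok = is_intact(original, reference)
tampered_ok = is_intact(b"account=42;balance=900", reference)
```

In this sketch `unchanged_ok` is true and `tampered_ok` is false. A hash detects corruption but does not prevent it, so the reference fingerprint has to be stored somewhere an attacker cannot rewrite it along with the data.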
Value of the data
As a final thought, let’s come back to the value of data.
In a world where the volume of data doubles in less than two years, a strict analogy with money would suggest that a currency with such a depreciation rate is a dangerous and risky investment.
The risk is clearly high for data because, like fresh produce, its implicit value decreases over time.
However, to make a fair comparison, we need to take several other factors into account:
- the cost of data storage and processing is also decreasing at a high pace
- transmission speed and capacity are increasing at a slower rate but still at a high pace
- in many domains, the digitalization of activities makes it possible to use new types of data in operations with a very high return on investment.
So, overall, data has a value that is perishable and tied to the ability to deliver the associated digital service with a short time to market.
This dimension needs to be the primary focus when building a business plan for digital services.