We all have gigabytes of business and personal data that is held in a variety of formats. Do you know what data you have, if it is secured correctly and where it all is?
This subject is of massive importance to companies, but is equally important to individuals since you have a trove of data that hackers want.
How Data is Generated
Organizations that are in a highly regulated industry have to retain transaction data for up to 10 years. In the UK we have to retain financial records supporting our tax returns for at least 7 years. This data often sits in filing cabinets, boxes in a locked room (or maybe even a secure storage facility) or on a disk somewhere possibly in a database. It is often never looked at beyond the initial generation and use until the retention period expires, at which time the records are just shredded and burned.
Various industries are digitizing their workforce, which means a lot more data is being collected electronically and filed on hard drives or in databases in data centres. This opens up a whole area of cyber security risk if that database is exposed to the Internet and for some reason has not been secured.
As consumers we often receive invoices via email, or by printing them to an electronic format (e.g. PDF) from the ordering system (e.g. Amazon). We may also use paperless invoicing where we have to go online to download our latest bill. For example, credit cards can be paid through direct debit and you will receive an email when your latest bill is available online.
All this data can be downloaded, stored on a hard drive or cloud storage, or just remain on the source system (e.g. your credit card or energy provider's website). Beyond the initial consumption of the bill, it is largely unused by you unless you need to go back and compare prices or payments.
As we introduce more IoT devices, these have the ability to generate gigabytes of data every day. For example each autonomous/connected car could generate up to 300 TB of data in a year (and that is only one car).
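To put that yearly figure into daily terms, here is a quick back-of-the-envelope calculation in Python. It assumes decimal units (1 TB = 1000 GB) and that the data is generated evenly across the year, neither of which is stated in the figure itself.

```python
TB_PER_YEAR = 300   # upper figure quoted above for one connected car
GB_PER_TB = 1000    # assumption: decimal units
DAYS_PER_YEAR = 365

# Average daily volume if the data were generated evenly over the year.
gb_per_day = TB_PER_YEAR * GB_PER_TB / DAYS_PER_YEAR
print(f"{gb_per_day:.0f} GB per day")  # about 822 GB per day
```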
Data Protection Legislation
As a business, if we are recording Private and Personally Identifiable (PPI) data, this is governed by data protection laws and in particular the GDPR in the EU and the UK. These laws typically stipulate a number of principles:
- Personal data shall be processed fairly and lawfully
- Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes
- Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed
- Personal data shall be accurate and, where necessary, kept up to date
- Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose
- Personal data shall be processed in accordance with the rights of data subjects (individuals)
- Appropriate technical and organizational measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data
- Personal data shall not be transferred to a country or territory outside the country or territory in which it was collected unless that country or territory ensures an adequate level of protection for the rights and freedoms of data subjects in relation to the processing of personal data.
The GDPR has further principles, certainly relating to electronic data and in particular to the reporting of data breaches involving personal data.
In the UK the GDPR was enacted under the Data Protection Act 2018 and will remain in force even once the UK leaves the EU fully in 2021 (on the current timeline). For EU member states, the GDPR continues to apply to data on EU citizens even when that data is held in countries outside the EU.
One of the principles above states “Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose”. Businesses have to establish their own data retention policy and, once that retention period has expired, they have to destroy the data. This covers hard copy as well as data held digitally.
I don’t intend to go into the details of the GDPR in this blog. If you need to do that then there are people with a legal background who are more qualified than me and can help you out. I am more concerned about the processing of this data, in particular the principle “Appropriate technical and organizational measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data”.
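As a minimal sketch of how a retention policy could be checked against a catalogue of records, the Python below flags records that have passed their retention period. The category names, the `RETENTION_YEARS` table and the `add_years` and `is_expired` helpers are hypothetical and only for illustration; the real periods must come from your own policy and your regulator.

```python
from datetime import date

# Hypothetical retention periods in years, per record category.
# The 7 and 10 year figures echo the examples given earlier in this post.
RETENTION_YEARS = {
    "tax_record": 7,
    "transaction": 10,
}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def is_expired(category: str, created: date, today: date) -> bool:
    """True when the record has reached the end of its retention period."""
    return today >= add_years(created, RETENTION_YEARS[category])
```

A record for which `is_expired` returns `True` would then be a candidate for destruction, whether it is held in hard copy or digitally.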
Cataloguing your Data
If you are in a highly regulated industry, your regulator will stipulate the procedures you need to establish around cataloguing and storage of your data. They will also stipulate that you should be able to extract the data within a reasonable time period should the regulator require an audit. Your data may be stored in various media, for example:
- Filing Cabinets in your office
- Filing Boxes in secure storage
- Hard Drives on a PC or some form of file server
- Cloud storage
- On a provider's website (e.g. energy company).
There may be other means of retaining records not listed above, and we are always inventing new means of storing data (some of the latest are sheets of glass and DNA).
In order to meet the needs of your regulator you need to be able to recover records easily. You can’t do that if they are not catalogued in some form.
For filing boxes and filing cabinets, it is wise to retain a list of what is in each filing box or drawer of a filing cabinet, where that cabinet or box is located and a reference to locate that information should you need to. The reference you use could be some internal standard, a barcode or an RFID tag.
Large scale hard copy data storage is big business and often uses barcodes and RFID tags in the box that can be read from a distance of several metres in secure warehouses. Access to these warehouses will be strictly controlled and recovery of specific boxes will be governed by access and secure shipping rules.
Data retained in electronic media still needs to be catalogued. If we are talking about folders on a hard drive, then the same principles apply as for storing hard copy. You need a catalogue of what each folder contains and appropriate permissions on the folder to stop people from casually browsing private/confidential data. You need to be able to locate data easily and securely. Copying of that data should also be controlled.
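A folder catalogue can be as simple as a list with one entry per file. The sketch below is one possible way to build such a list in Python; the `catalogue_folder` name and the fields in each entry are my own choices for illustration, and it reads each file fully into memory, so it is only suited to modest file sizes.

```python
import hashlib
from pathlib import Path


def catalogue_folder(root: str) -> list[dict]:
    """Walk `root` and return one catalogue entry per file found."""
    entries = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip sub-folders themselves, only files are listed
        entries.append({
            "path": str(path.relative_to(root)),
            "size_bytes": path.stat().st_size,
            # Identical SHA-256 digests indicate copies of the same file.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return entries
```

Because copies of a file share the same digest, a catalogue like this also gives you a starting point for tracking where copies of a document have ended up.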
If data is in databases, cataloguing is often done by the set of indexes you establish when you build the database as well as the data table structure. Databases are often accessed through applications and in this form can be secured via the application. For example, as a member of the accounting team in a financial services company I need access to the billing information for our clients, but I don’t need access to data around the financial instruments. The Billing team may need access to the actual transactions but would not be able to update the terms of the financial instrument (e.g. a Bond, Mortgage, Loan).
With databases we can extract data as reports, which also need to be access controlled and secured in line with the appropriate regulator rules and data privacy legislation. These reports also need to be controlled so that:
- Only the authorized people can see them
- Copying and distribution is controlled and each copy is catalogued
- Data retention rules are applied to each original and copy.
Securing Data in an Organization
You need to think about how your data is secured, irrespective of how it is stored.
Hard Copy data is the most problematic since it is easily seen and copied. A lot of companies now use secure photocopying facilities where you have to log in to the printer to retrieve your printouts and to copy/scan paper copies.
Data often has data classifications. These will depend on your organization and maybe even your regulator. As a guide, these data classifications are often used in an organization:
- Public – Data is in the public domain (e.g. on your website or on printed publicity material)
- Internal Only – Information that is for use within an organization and should not be disclosed outside of the organization (e.g. draft press releases, talking points with clients)
- Confidential – The data contains some information that the organization would prefer not to be made public, but if it was the damage caused by the disclosure would be limited
- Highly Confidential – This will often contain information that if disclosed would be highly damaging to the organization (e.g. financial strategies, private client information)
- Personal and Private Information (PPI) – This is data that will identify an individual, company or other organization (e.g. your home address, date of birth, healthcare information, HR records).
In a military/government scenario you will often see classifications such as:
- Secret (including sub-classifications such as ‘Eyes Only’ meaning you can read it, but cannot copy or retain it)
- Top Secret.
Each item of data in an organization should have an appropriate classification with the appropriate systemic safeguards in place around:
- Visibility within the organization.
We often impose a ‘Need to Know’ principle around data. This principle is a governing rule that determines whether you need to know a piece of information to do your job. For example, as a colleague I don’t need to know your HR record or how much you are paid. As a manager I may need to be able to view the HR records of the people in my team, but not those of people in other teams with a different manager. In this case the ‘Need to Know’ principle ensures that access is granted only for the purpose of managing an individual.
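The HR example can be expressed as a very small rule in code. This is only a sketch: the `Employee` type and the `can_view_hr_record` function are hypothetical, and allowing people to see their own record is my assumption rather than something every organization will permit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Employee:
    name: str
    manager: Optional[str]  # name of the line manager, None at the top


def can_view_hr_record(viewer: Employee, subject: Employee) -> bool:
    """Need-to-know rule: your own record (assumption) or a direct report's."""
    if viewer.name == subject.name:
        return True
    return subject.manager == viewer.name
```

With this rule a manager can view the records of their own team, while a colleague, or a manager of a different team, is refused.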
Data stored in hard copy media needs to be secured based on its classification. Hard Copy may involve:
- Paper files and notebooks
- Removable electronic media (e.g. portable hard drives, flash/pen drives, CD/DVD/BR discs).
For Public data this is normally not secured beyond the media it is stored in as by its nature it is available to anyone who wants to see it. For data with any of the other classifications a level of security needs to be applied. For hard copy this often requires anything from a locked filing cabinet to a secure storage facility/warehouse.
Individual organizations may have rules around retaining Internal/confidential data in your desk drawer, so long as it is lockable. Others may impose more strict rules. Anything with a higher classification needs secure storage.
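One way to make rules like these checkable is to give the classifications an order and map each one to a minimum level of storage. The sketch below does that in Python. The ordering (in particular placing PPI at the top) and the storage descriptions are my assumptions for illustration; your organization's policy decides the real mapping.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Classifications from this post, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL_ONLY = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3
    PPI = 4  # assumption: treated as the most sensitive level


def minimum_hard_copy_storage(level: Classification) -> str:
    """Return the least secure storage this sketch would accept for a level."""
    if level == Classification.PUBLIC:
        return "no additional security"
    if level <= Classification.CONFIDENTIAL:
        return "locked drawer or filing cabinet"
    return "secure storage"
```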
When we are dealing with electronic means, we need technology to be appropriately configured to ensure data is not visible to anyone who should not have access. In the case of hard drive folders, you can set up permissions at the:
- Disk Level, which limits access to the hard drive
- Folder Level, which limits access to all files within a folder
- File Level, which limits access to an individual file.
Under Windows 10, if you right click on a disk/folder/file and select Properties, the Security tab allows you to set various permissions:
- Full Control – you have full access to the folder/file
- Modify – you can open and change the file
- Read & Execute – you can read and execute an executable file (e.g. a .EXE application)
- List folder contents – you can see what is in the folder but may not be able to open the files
- Read – you can open the file but cannot change it
- Write – you can write to the folder and/or file
- Special permissions – allows you to change the owner of the file and a number of other system related permissions.
You can apply these permissions based on named users or user groups (see the Group or user names section).
If you are using macOS, Linux or some other operating system the actual process of setting file permissions will be different, but you should be able to set similar permissions at the Disk, Folder and File levels. Mobile devices (Phones/Tablets) often don’t allow this level of access since they are typically single user devices.
Another category is shared drives/folders. Under Windows you can share disks/folders/files over a network and apply permissions to each. It is normally the role of the network administrator to set these up, so I won’t be going into this in any detail here.
However, while on the subject of Shared Drives, these can be a serious security risk if they are not secured in line with the classification of the data they hold.
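On Linux and macOS, permissions are expressed as read/write/execute bits for the owner, the group and everyone else. As a minimal sketch, the Python below uses the standard `os` and `stat` modules to restrict a file to its owner and to check the result. The function names `restrict_to_owner` and `is_private` are mine, and on Windows `os.chmod` can only toggle the read-only flag, so this does not replace the Security tab described above.

```python
import os
import stat


def restrict_to_owner(path: str) -> None:
    """Give the owner read/write access and remove all access for group and others."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # same effect as `chmod 600`


def is_private(path: str) -> bool:
    """True when neither the group nor other users have any permission on the path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```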
If data is in a database, it has to be secured at the database level and at the user application level. A database will likely have a limited number of users that can access the underlying data. These accounts should be secured with a unique username and strong password (at least) and preferably some form of second factor (e.g. biometric, authentication app). This is particularly important if the database is connected to an internet facing application like a website.
Securing Data at Home
As a consumer you may not consider your data to be that sensitive. However knowing some private data about you is all a criminal needs to impersonate you and take out loans or some other form of credit or transaction. Identity theft is a big business.
You need to make sure any online account you have is secured behind a strong password and preferably some form of second factor authentication (e.g. an authenticator app, bio-metrics).
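If you want a strong password without relying on your own imagination, Python's standard `secrets` module can generate one. This is a minimal sketch; the `generate_password` name, the default length of 20 and the minimum of 12 characters are my assumptions, not a recognised standard, and a password manager will do the same job for you.

```python
import secrets
import string

# Characters to draw from: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password built with a cryptographically secure generator."""
    if length < 12:  # assumed minimum for this sketch
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```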
If you use Cloud Storage (e.g. Google Drive, Microsoft OneDrive, Apple iCloud, Amazon cloud, Dropbox), this account must be secured. As best practice, anything you put up there should also be encrypted. This is easy to accomplish as there are applications that allow you to encrypt files (WinZip being one of them). If you are paying for cloud storage, make sure the provider encrypts your data when it is stored on their cloud and when it is transferred to you for processing. Free services are often not encrypted, but encryption is becoming the norm.
If you are storing your data on removable media look into encrypting that device. On Windows you can use BitLocker for both internal hard drives and removable drives.
Make sure your devices are also encrypted. This means your phone, tablet, laptop and desktop PC. If you have a home server, make sure all disks in that machine are encrypted and that data is transferred between devices in an encrypted form. It is not impossible to remove hard drives (as well as embedded SSDs) and connect them to another device where they can be read.
A data breach is the intentional, or unintentional, release of secure or private/confidential information to an untrusted environment, person or organisation. Other terms used for this include:
- Unintentional information disclosure
- Data leak
- Information leakage
- Data spill.
Data breaches can happen in various forms. For example:
- You leave a briefcase on the train containing confidential documents
- Someone sees information internally that they should not have seen (e.g. PPI)
- An account's login details are disclosed and someone logs in as you with higher privileges than they have
- Someone gains access to a secure storage facility without the appropriate authorization
- Careless disposal of printed material as well as computer equipment
- Theft of an unencrypted device
- A hacker uses malware and/or zero day vulnerabilities to break into a secured database and extracts data
- Using insecure http access for websites – make sure you always use https
- A database is exposed online without being secured.
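The http versus https item in the list above is easy to check automatically, for example across a list of bookmarks or links in a document. Here is a minimal sketch using Python's standard `urllib.parse`; the `uses_https` and `insecure_urls` names and the example.com addresses are placeholders for illustration.

```python
from urllib.parse import urlparse


def uses_https(url: str) -> bool:
    """True when the URL's scheme is https."""
    return urlparse(url).scheme.lower() == "https"


def insecure_urls(urls: list[str]) -> list[str]:
    """Return the URLs that do not use https, in their original order."""
    return [u for u in urls if not uses_https(u)]


links = ["https://example.com/login", "http://example.com/bill.pdf"]
print(insecure_urls(links))  # ['http://example.com/bill.pdf']
```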
Incidents range from concerted attacks by cyber criminals, nation state hackers (Advanced Persistent Threat – APT) or individuals who hack for some kind of personal gain, whether associated with organized crime, political activists or national governments, to the careless disposal of used computer equipment or data storage media.
A large number of data breaches that are reported are down to the administrator not securing the database and then by mistake exposing it to the Internet.
Databases stored within an organization need to be secured to comply with regulator and data privacy requirements. Even backups should be secured and encrypted. However, due to an oversight in the available network paths around a company’s network, it is often possible for a database to be exposed to the internet without anyone doing so deliberately. The Network Administrator should be careful enough to know which paths are fully protected and internal, and which are open to the Internet. But not all network admins are created equal. There is a whole science around securing a network through segmentation, sub-nets and virtual networks that I won’t go into here, but a network admin should know about these things.
With the arrival of readily available cloud services, unqualified people often set these up and forget to fully secure them with the appropriate network masking and authentication. I regularly see databases breached on Amazon Web Services (AWS) specifically because they were discovered and accessed without any authentication, or with easily crackable/default passwords.
As an example, a group of security researchers set up a Fake Smart Factory that represented a totally insecure environment. This was set up as a honeypot to see how long it would take for hackers to find the unsecured environment and exploit it. It took two months in all for someone to find it and install malware.
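A first, very basic check for accidental exposure is to see whether a database port accepts connections from somewhere it should not, for example from a machine outside your network. The Python sketch below only tests whether a TCP connection can be opened; it says nothing about whether authentication is then required. The `is_port_open` helper is my own naming, the ports listed are the usual defaults for those products, and you should only run this against systems you own or are authorized to test.

```python
import socket

# Usual default ports for some widely used database servers.
DEFAULT_DB_PORTS = {
    "PostgreSQL": 5432,
    "MySQL": 3306,
    "MongoDB": 27017,
    "Redis": 6379,
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable or timed out
        return False
```

Looping over `DEFAULT_DB_PORTS` for one of your own servers, from outside your network, gives a quick indication of whether any of these services can be reached from the Internet.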
Data is a useful resource as it can be used to gain insights into your company using Machine Learning, AI tools and Analytics tools. This also used to be called ‘Big Data’ but is now called ‘Data Science’ (we keep inventing new names for the same thing).
Dark data is data which is acquired through various means but not used in any manner and in particular to derive insights or for decision making.
If you don’t know what data you have, then you cannot use it to gain insight into your company. You also won’t know whether you have lost it in a data breach, and you will have no idea of the impact that losing it may have.
This has been a long blog, with probably more technical content than is usual. However you must take control of your data whether you are a company or individual.
As a company, if your data is disclosed in a data breach you have to be able to assess the impact of that breach and make appropriate disclosures under the GDPR as well as legislation in other countries. Failure to do this can result in heavy fines and imprisonment, and that doesn’t include the reputational damage. A lot of small/medium size companies who suffer a data breach are out of business within 6-12 months.
You can limit the damage through encryption, but this is not totally foolproof since weak encryption and passwords will just allow the encryption to be broken. If a hacker is motivated enough, even the strongest encryption can eventually be broken. In this case all you can do is put enough obstacles in the way (in the form of strong encryption) to make it not worth their while.