How to Store and Secure Sensitive Data in Web Applications

According to the 2020 Verizon Data Breach Investigations Report (DBIR), nearly half (45%) of breaches featured hacking, and these were tied to web application vulnerabilities.

The report notes that this figure has more than doubled year over year. It also attributes 22% of breaches to social attacks and malware attacks, 17% to misconfigurations, 8% to unauthorized access, and 4% to physical attacks.

Hosting and running a web application requires several components. A basic environment needs at least web server software (such as Apache or IIS), a web server operating system (such as Windows, Linux, or macOS), a database server (such as MySQL, MSSQL, or PostgreSQL), and a network-based service such as FTP or SFTP.

For a web server to be secure, every one of these components must be protected so that sensitive data is properly secured. If security breaks at any point, attackers can gain access to the web application and retrieve data from the database or tamper with it.

Sensitive data in web applications

Sensitive data can be any sort of information that needs to be protected from unauthorized access to safeguard the privacy or security of an individual or organisation. It can include any information pertaining to:

It can also be personally identifiable information (PII) or high business impact (HBI) data. What counts as sensitive data varies considerably from country to country, and the way you have to store and secure it varies accordingly. Compliance standards such as the Payment Card Industry (PCI) standard require special measures when collecting sensitive data in order to remain compliant.

In today's world of infrastructure security at the network, host, and application levels, data security becomes more important. Data security includes the security of:

The right storage mechanisms should be chosen for this data. A good storage mechanism saves information reliably, reduces bandwidth use, and improves responsiveness.

Data Model

A data model is a subset of the implementation model that describes the logical and physical representation of persistent data in the system.

Persistence

Storage methods for web applications can be evaluated according to the scope over which data is made persistent.

Client-side Data storage

Client-side storage lets a web application store different types of data on the client, with the user's permission, and retrieve them whenever needed. This makes it possible to persist data for long-term storage, save sites or documents for offline use, keep user-specific settings for the site, and more.

Data can be stored in different ways, such as session storage, local storage, cookies, WebSQL, the application cache, and IndexedDB.

SessionStorage

The SessionStorage object stores data temporarily and is cleared when the page session ends. SessionStorage is tab specific and is not accessible from web workers or service workers. It is limited to about 5 MB and can contain only strings. It may be useful for storing small amounts of session-specific information, for example an IndexedDB key.

LocalStorage

The LocalStorage object stores data for the entire website persistently, until it is explicitly cleared. LocalStorage is not accessible from web workers or service workers. It is limited to about 5 MB and can contain only strings. LocalStorage should be avoided because it is synchronous and will block the main thread.

Cookies

Cookies are sent with every HTTP request, so storing data in them significantly increases the size of web requests. They are synchronous and are not accessible from web workers. Like LocalStorage and SessionStorage, cookies are limited to strings. Cookies have their uses, but they are not a good choice for storage.
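Where a cookie is still needed, for example to carry a session identifier, a common approach is to keep its value small and opaque and to set the standard protective attributes. The sketch below uses Python's built-in http.cookies module on the server side; the cookie name, the identifier value, and the 30-minute lifetime are assumptions chosen for illustration, not recommendations from the report cited above.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value for an opaque session identifier."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id            # small value: it travels with every request
    cookie["session_id"]["secure"] = True        # only sent over HTTPS
    cookie["session_id"]["httponly"] = True      # not readable by client-side scripts
    cookie["session_id"]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["max-age"] = 1800       # assumed lifetime: 30 minutes
    return cookie["session_id"].OutputString()


# Hypothetical identifier; a real one would be a long random token.
header_value = build_session_cookie("hypothetical-opaque-id")
print(header_value)
```

The sensitive data itself stays on the server; the cookie only holds a reference to it.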

WebSQL

WebSQL support has been removed from almost all major browsers. The W3C stopped maintaining the Web SQL spec in 2010, with no further updates planned. WebSQL should not be used, and existing usage should be migrated to IndexedDB.

Cache

The application cache has been deprecated and support will be removed from browsers in the future. It should not be used, and existing usage should be migrated to service workers and the Cache API.

IndexedDB

Unlike most modern promise-based APIs, IndexedDB is event based. Promise wrappers for IndexedDB, such as idb, hide some of its powerful features but, more importantly, also hide the complex machinery (e.g. transactions, schema versioning) that comes with the IndexedDB API. It is a low-level API that requires significant setup before use, which can be particularly painful when storing simple data.

Server-side Data storage

Data storage is usually handled server-side. Data can be stored on physical hard drives, disk drives, or USB drives, or virtually in the cloud. With backups in place, files remain easily available even if a system crashes beyond repair.

There are three broad types of data storage: direct attached storage, network attached storage, and storage area networks.

Direct Attached Storage (DAS)

DAS is a storage system in which servers are directly connected to the storage device. Applications access data in DAS through a block-level access protocol. Some of the common devices in this category include:

Network Attached Storage (NAS)

Network-attached storage is a file-level data storage server connected to a computer network. It offers dedicated file serving and sharing over the network. It improves performance and reliability with features such as RAID and swappable drives designed for heavier multi-drive workloads.

Storage Area Network (SAN)

A storage area network is a dedicated and high-performance storage system. It transfers block-level data between servers and storage devices. SAN is usually used in data centers, enterprises or virtual computing environments.

Data Storage Devices

A computer storage device is any type of hardware that stores data. It retains information for the short term or the long term, and it can sit inside or outside a computer or server.

Hard Disk Drive

A hard disk drive (HDD), also known as a fixed disk drive, is a non-volatile hardware data storage device attached to a computer or server. It magnetically stores, retrieves, and outputs digital data using a series of stacked rotating metallic disks coated with magnetic material. The rotating disks are paired with an actuator arm that reads and writes the digital data.

Solid State Drive

A solid state drive (SSD) is a storage device that uses integrated circuit assemblies to store and retrieve data, typically using flash memory, and functions as secondary storage in the hierarchy of computer storage. It offers fast data transfer and a smaller physical size than a disk array.

Compact Disk/Digital Versatile Disk

An optical disc drive reads and writes all common Compact Disk (CD) and Digital Versatile Disk (DVD) formats. Optical drives are often built into computers. A DVD holds more information than a CD and can therefore be used for a wider variety of media and storage.

Hybrid Flash Arrays

These storage devices combine flash memory drives and hard disk drives for balanced performance. Hybrid flash arrays use form factors and electrical interfaces that are compatible with common HDD bays. They offer a low startup cost, reasonable performance for the cost, and fast data access on demand.

Hybrid Cloud Storage

Hybrid cloud storage is an approach for managing cloud storage that uses both local and off-site resources. It offers a secure and compliant option that helps to assure business continuity. It accommodates frequent backups and long-term archives as well as future scaling and always-on availability. The combination of cloud and on-premises storage adds a layer of safety to ensure data is protected and available, and storage space could potentially be unlimited.

Backup Software

Backup software consists of computer programs that perform backups by creating additional exact copies of files, databases, or entire computers. Software for system and enterprise backups typically comes with a license or a subscription billed monthly or annually.
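The idea of an "exact copy" can be checked rather than assumed. The following is a minimal sketch, using only Python's standard library, that copies a file and compares SHA-256 digests of the original and the copy; the helper names, the file name, and the directory layout are hypothetical and do not describe any particular backup product.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target


# Demonstration in a throwaway directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "customers.db"
    original.write_bytes(b"example records")
    copy = backup_file(original, Path(workdir) / "backups")
    backup_matches = sha256_of(original) == sha256_of(copy)

print(backup_matches)
```

Backups of sensitive data need the same protection as the originals, so a real setup would also restrict access to the backup location.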

Backup Appliances

A backup appliance combines the backup software and hardware components within a single device. Configuration can be complicated, and misconfiguration or incorrect software tuning can put reliability at risk.

Cloud Storage

Complete cloud-based or online storage solutions offer virtual data storage: data is stored on the internet through a cloud computing provider rather than only on a local computer or external hard disk. The provider manages the storage and is responsible for data availability and accessibility. Reliability tends to be good, but organizations need to consider a cloud storage security strategy before adopting one.

Web application vulnerabilities that lead to sensitive data leakage

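The OWASP Top 10 is a list of the ten most common application vulnerabilities, together with their risks, impact, and countermeasures, and it is updated every three to four years. The 2017 edition lists: injection; broken authentication; sensitive data exposure; XML external entities (XXE); broken access control; security misconfiguration; cross-site scripting (XSS); insecure deserialization; using components with known vulnerabilities; and insufficient logging and monitoring.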
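Injection is one of the best-known entries in the OWASP Top 10, and it leads directly to data leakage when user input is concatenated into a query. The following is a minimal sketch, not a complete defense, using Python's built-in sqlite3 module with a parameterized query; the users table, its columns, and the find_email helper are hypothetical names chosen for illustration.

```python
import sqlite3
from typing import Optional

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "alice@example.com"))
conn.commit()


def find_email(connection: sqlite3.Connection, name: str) -> Optional[str]:
    """Look up an email address; the name is bound as data, never spliced into SQL."""
    row = connection.execute(
        "SELECT email FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


legitimate = find_email(conn, "alice")
# A classic injection attempt is treated as a literal (non-matching) name.
attempted_injection = find_email(conn, "alice' OR '1'='1")
print(legitimate, attempted_injection)
```

Because the placeholder binds the input as a value, the injection string cannot change the structure of the query.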

Sensitive data protective measures and mechanisms

Authentication

User authentication plays an important role in addressing many data protection principles, as it is essential to meeting security, access, consent, and accountability requirements.

Maintaining confidentiality, integrity, and availability is fundamental to securing data. Users, and even communicating systems, are authenticated through various mechanisms, but the foundation of all of them is cryptography.
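As one illustration of cryptography underpinning authentication between communicating systems, the sketch below signs a message with an HMAC and verifies it on the receiving side. It uses only Python's standard library and assumes both systems already share a secret key; the key handling, function names, and message contents are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumption: both systems obtained this key through some secure channel.
SHARED_KEY = secrets.token_bytes(32)


def sign(message: bytes, key: bytes = SHARED_KEY) -> str:
    """Return a hex HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_authentic(message: bytes, tag: str, key: bytes = SHARED_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(message, key), tag)


request = b'{"action": "export", "account": 42}'
tag = sign(request)

print(is_authentic(request, tag))                      # untouched message
print(is_authentic(b'{"action": "delete"}', tag))      # tampered message
```

An HMAC covers integrity and authenticity only; it does not hide the message, so confidentiality still requires encryption.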

Authentication of users takes several forms, but all are based on one or more authentication factors: something an individual knows (such as a password), something they possess (such as a security token), or something they are, meaning some measurable quality (such as a fingerprint).

Single-factor authentication is based on only one authentication factor. Stronger authentication requires additional factors; for instance, two-factor authentication is based on two authentication factors (such as a PIN and a fingerprint).
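A two-factor check can be sketched by combining a knowledge factor with a possession factor. The example below is a hedged illustration using only Python's standard library: the password is verified against a salted PBKDF2 hash, and the second factor is a time-based one-time code in the style of RFC 6238, as generated by a token or authenticator app. The record layout, function names, and iteration count are assumptions for illustration, not a production design.

```python
import hashlib
import hmac
import secrets
import struct
import time

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so stored passwords are costly to guess."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def totp(secret: bytes, at_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time code (HMAC-SHA1 with dynamic truncation)."""
    counter = int(at_time // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_login(password: str, code: str, record: dict, now: float) -> bool:
    """Both factors must pass: something known and something possessed."""
    knows = hmac.compare_digest(
        hash_password(password, record["salt"]), record["password_hash"]
    )
    expected = totp(record["totp_secret"], now)
    has = hmac.compare_digest(expected.encode(), code.encode())
    return knows and has


# Hypothetical stored user record.
salt = secrets.token_bytes(16)
record = {
    "salt": salt,
    "password_hash": hash_password("correct horse", salt),
    "totp_secret": secrets.token_bytes(20),
}
now = time.time()
print(verify_login("correct horse", totp(record["totp_secret"], now), record, now))
```

A real deployment would also rate-limit attempts and usually accept codes from an adjacent time step to tolerate clock drift.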

Access control

Access controls are generally described as discretionary or non-discretionary, and the most common access control models are:

Encryption

There are multiple ways to encrypt data at rest. The following is an outline of the forms of encryption used to protect data at rest:

The goal of securing data in motion is to prevent its confidentiality, integrity, or availability from being compromised. To protect data in motion:

The most common way to protect data in motion is to utilize encryption combined with authentication to create a conduit to safely pass data.
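In practice this conduit is most often TLS. The sketch below, which assumes a Python client and uses only the standard ssl module, builds a client context that encrypts traffic and authenticates the server through certificate and host name verification; the function name and the choice of TLS 1.2 as the minimum version are assumptions for illustration.

```python
import ssl


def build_client_context() -> ssl.SSLContext:
    """Create a TLS client context that verifies the server's identity."""
    # The default context loads the system CA certificates and enables
    # certificate verification and host name checking.
    context = ssl.create_default_context()
    # Assumed policy: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The context can then be passed to a client, for example by wrapping a socket with context.wrap_socket(sock, server_hostname=host), so that data is only sent once the server has been authenticated.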

Conclusion

To sum up, storing and securing sensitive data starts with choosing the right storage mechanism. The right storage mechanism alone cannot guarantee security, however; the application itself must also be properly secured. If the application is vulnerable, it becomes easier for an attacker to retrieve sensitive data.
