What is Big Data?
Big data refers to a new generation of technologies and architectures designed to economically extract insight from very large volumes of data, both structured and unstructured, leading to better decisions and strategic business moves.
Much of the data being analyzed contains sensitive information, which is why big data matters to us as security practitioners. The data is used to analyze surveillance information, but also entertainment and social media activity, traffic patterns, and customers' web-browsing habits — even to analyze customers' shopping patterns and predict what they will buy next.
WHY IS BIG DATA ON THE RISE?
- Data is the “new oil”: In recent years, and for the first time in history, data has surpassed oil as the world’s most valuable commodity.
- Data dividend: In the market for location data alone, market-research firm Opimas estimates the market will reach $250 million by 2020, according to The New York Times.
- Boosts productivity: Data collection and analysis help boost productivity and support better management decisions by enabling controlled experiments.
- Adjust business levels: It enables everything from basic low-frequency forecasting to high-frequency “nowcasting,” so businesses can adjust levels just in time.
- Improve decision making: It powers sophisticated analytics, which can substantially improve decision-making.
- Improve products and services: It supports the development of next-generation products and services.
BIG DATA TECHNOLOGIES AND THEIR RISKS
- Data Mappers: The process of associating large source data sets with the target data (as in MapReduce-style processing).
- NoSQL Databases: NoSQL databases are similar to their relational counterparts, but they employ fewer integrity checks and a less-constrained consistency model, so they are open to security threats and vulnerabilities.
- Storage Auto-tiering: With storage, speed is the priority above all else, so storage auto-tiering dynamically moves information between different disk types and RAID levels to meet space, performance, and cost requirements.
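A core worry with auto-tiering (revisited in the guidelines below) is losing track of where data lives as it moves between tiers. The following is a minimal sketch of a tiering policy that records every move in an audit trail; the threshold, tier names, and class design are illustrative assumptions, not any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical policy: frequently accessed objects belong on the fast tier.
HOT_THRESHOLD = 100  # accesses/day above which data is placed on SSD

@dataclass
class TieringEngine:
    placement: Dict[str, str] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def place(self, object_id: str, accesses_per_day: int) -> str:
        """Decide the tier for an object and log any move."""
        tier = "ssd" if accesses_per_day >= HOT_THRESHOLD else "hdd"
        previous: Optional[str] = self.placement.get(object_id)
        self.placement[object_id] = tier
        if previous != tier:
            # Record every move so we always know where the data is stored --
            # exactly the visibility gap the guidelines warn about.
            self.audit_log.append(f"{object_id}: {previous} -> {tier}")
        return tier
```

In a real deployment the audit log would be shipped to tamper-evident storage so a third-party auditor can verify placements independently.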
SECURITY GUIDELINES TO MITIGATE BIG DATA RISKS
| Risk | Mitigations |
| --- | --- |
| Untrusted mappers | Ensure the trustworthiness of mappers: trust establishment; Mandatory Access Controls (MAC) |
| Lack of support for explicitly enforcing security in NoSQL databases | Enforce security through an application or middleware layer; encryption, encryption, encryption |
| Auto-tiering storage solutions may not keep track of where the data is stored | Plan the tiering strategy carefully; outsource verification procedures to a third-party auditor |
| Validation / trust of endpoints where data is being collected | Implement BYOD solutions (personal device profiling, registration, onboarding) |
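Validating collection endpoints can be as simple as requiring each registered device to sign its submissions with a key provisioned during onboarding. Below is a minimal sketch using Python's standard `hmac` module; the device ID, key table, and function names are illustrative assumptions (a real deployment would issue per-device keys from an enrollment service).

```python
import hashlib
import hmac

# Hypothetical key table populated during device onboarding/registration.
DEVICE_KEYS = {"sensor-17": b"k3y-provisioned-at-onboarding"}

def sign_reading(device_id: str, payload: bytes) -> str:
    """Endpoint side: sign a data submission with the device's key."""
    key = DEVICE_KEYS[device_id]
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_reading(device_id: str, payload: bytes, signature: str) -> bool:
    """Collector side: accept data only from known, untampered endpoints."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False  # unregistered endpoint: reject outright
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison prevents timing attacks on the signature.
    return hmac.compare_digest(expected, signature)
```

Unknown devices and tampered payloads both fail verification, so only profiled, registered endpoints feed data into the pipeline.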
These security guidelines help us deal with this nontraditional approach to data mining. For instance, we can secure non-relational (NoSQL) databases by enforcing security at the application layer: writing APIs and using middleware software to access the database, adding security layers to that software, and encrypting — encrypting, and again encrypting — those NoSQL databases. For untrusted mappers or crawlers, the available security controls are still in their infancy; mandatory access control (MAC) is one example of a specific tool that can be used to establish the trustworthiness of a mapper.
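The middleware approach above can be sketched as a thin layer in front of a schemaless store that performs the integrity checks and field encryption the database itself skips. Everything here is an illustrative assumption: the class name, the required fields, and the toy keystream cipher (a real middleware layer would use an authenticated cipher such as AES-GCM from a vetted library).

```python
import hashlib

def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher for illustration only -- NOT production crypto."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

class SecureStore:
    """Middleware in front of a NoSQL store (a dict stands in for the client)."""

    REQUIRED_FIELDS = {"customer_id", "ssn"}  # hypothetical schema

    def __init__(self, key: bytes):
        self._key = key
        self._backend: dict = {}  # stand-in for a real NoSQL client

    def put(self, doc_id: str, doc: dict) -> None:
        # Enforce the integrity checks the NoSQL database does not.
        missing = self.REQUIRED_FIELDS - doc.keys()
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        nonce = doc_id.encode()
        # Field-level encryption: the backend never sees plaintext values.
        enc = {k: _keystream_xor(self._key, nonce + k.encode(), str(v).encode())
               for k, v in doc.items()}
        self._backend[doc_id] = enc

    def get(self, doc_id: str) -> dict:
        enc = self._backend[doc_id]
        nonce = doc_id.encode()
        # XOR with the same keystream decrypts.
        return {k: _keystream_xor(self._key, nonce + k.encode(), v).decode()
                for k, v in enc.items()}
```

The design point is that applications talk only to `SecureStore`, never to the database client directly, so validation and encryption cannot be bypassed.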
In some other cases, we have very few options. For instance, if we want higher visibility for the mappers and crawlers so they can evaluate and analyze more relevant data, what happens if that data is encrypted? There are protocols under research — notably homomorphic encryption — that allow us to analyze data while it remains encrypted and still obtain some information from it. Failing that, our best option today is to decrypt the data first; but given how fast crawlers and mappers need to access data, decryption may not be viable in terms of performance. So if we are to implement a security strategy for big data, we probably have to think outside the box and build additional layers of control, because there are no standard tools to do so.
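To make the homomorphic-encryption idea concrete, here is a minimal sketch of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts, so an analyzer can total values it cannot read. The fixed demo primes are an assumption for illustration; real use requires large random primes and a vetted library.

```python
import math
import random

def keygen(p: int = 1_000_003, q: int = 1_000_033):
    """Paillier key generation with small fixed demo primes (NOT secure)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because we use g = n + 1
    return (n, n * n), (lam, mu)

def encrypt(pub, m: int) -> int:
    n, n2 = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be invertible mod n
        r = random.randrange(1, n)
    # c = (1 + n)^m * r^n mod n^2
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c: int) -> int:
    n, n2 = pub
    lam, mu = priv
    # m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def add_encrypted(pub, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % pub[1]
```

For example, an untrusted aggregator could sum encrypted purchase amounts with `add_encrypted` and return the total for the data owner to decrypt, without ever seeing an individual amount.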