data lake : A Data Lake allows an organization to store all of its data, structured and unstructured, in one centralized repository. Because data can be stored as-is, there is no need to convert it to a predefined schema, and you no longer need to know beforehand what questions you want to ask of your data.
¶ A Data Lake should support the following capabilities:
· Collecting and storing any type of data, at any scale and at low cost
· Securing and protecting all of the data stored in the central repository
· Searching and finding the relevant data in the central repository
· Quickly and easily performing new types of data analysis on datasets
· Querying the data by defining the data’s structure at the time of use (schema on read)
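¶ As a minimal sketch of the last capability (schema on read), the Python below keeps raw records exactly as they arrived and applies a schema only when reading them. The record contents, the `read_with_schema` helper, and the `spend_schema` mapping are hypothetical names chosen for illustration; a real Data Lake would read from object storage rather than an in-memory list.

```python
import json

# Hypothetical raw records, stored as-is: no schema was enforced at write time.
RAW_LINES = [
    '{"user": "ana", "amount": "19.90", "ts": "2021-03-01"}',
    '{"user": "bo", "amount": 5, "note": "gift"}',
    '{"user": "cy"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select the named fields and cast them."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Fields missing from a raw record become None instead of failing.
            row[field] = cast(value) if value is not None else None
        yield row


# The schema is defined by the question being asked, not by the stored data.
spend_schema = {"user": str, "amount": float}
rows = list(read_with_schema(RAW_LINES, spend_schema))
total = sum(r["amount"] for r in rows if r["amount"] is not None)
```

A different question (for example, one that needs `ts` or `note`) would pass a different schema over the same raw lines, without rewriting the stored data.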
¶ Furthermore, a Data Lake isn’t meant to replace your existing Data Warehouses, but rather to complement them. If you’re already using a Data Warehouse, or are looking to implement one, a Data Lake can serve as a source of both structured and unstructured data, which can be converted into a well-defined schema before being ingested into your Data Warehouse. A Data Lake can also be used for ad hoc analytics on unstructured or unknown datasets, so you can quickly explore them and discover new insights without first converting them into a well-defined schema. (†2607)
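¶ The hand-off from a Data Lake to a Data Warehouse can be sketched as follows, assuming raw JSON records in the lake and using an in-memory SQLite database as a stand-in for the warehouse. The `orders` table, its columns, and the helper names are hypothetical; records that do not fit the schema are skipped here and simply remain in the lake.

```python
import json
import sqlite3

# Hypothetical raw events as they sit in the lake.
RAW_EVENTS = [
    '{"order_id": 1, "customer": "ana", "total": "19.90"}',
    '{"order_id": 2, "customer": "bo", "total": 5}',
    '{"customer": "cy"}',  # lacks required fields, so it is not loaded
]


def to_warehouse_rows(lines):
    """Convert raw records into tuples matching the warehouse schema."""
    for line in lines:
        rec = json.loads(line)
        try:
            yield (int(rec["order_id"]), str(rec["customer"]), float(rec["total"]))
        except (KeyError, TypeError, ValueError):
            continue  # record does not conform; leave it in the lake


def load_orders(lines):
    """Create the (stand-in) warehouse table and load conforming rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, "
        "total REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", to_warehouse_rows(lines))
    conn.commit()
    return conn


conn = load_orders(RAW_EVENTS)
loaded = conn.execute(
    "SELECT order_id, customer, total FROM orders ORDER BY order_id"
).fetchall()
```

Here the schema is enforced before loading (schema on write), which is the opposite trade-off from the ad hoc exploration described above.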