The Cloudera Data Platform (CDP) offers data integration, data governance, data cataloguing and data transformation capabilities, but not data quality per se.
CDP incorporates more than 30 open source (Apache) projects, particularly in its security, governance and analytics layers, and it would be tedious to call each of these out by name. Note that the Cloudera Data Warehouse is one use of CDP. For data ingestion, stream processing, real-time streaming analytics and large-scale data movement into a data lake or cloud stores, CDP’s DataFlow capabilities are powered by Apache NiFi, MiNiFi, Kafka, Flink or Spark Streaming.
Authors: Philip Howard, Daniel Howard