Apache Griffin (Incubating)

Big Data Quality Solution For Batch and Streaming

ABOUT APACHE GRIFFIN

Apache Griffin is an open source data quality solution for big data that supports both batch and streaming modes. It offers a unified process for measuring data quality from different perspectives, helping you build trusted data assets and boosting confidence in your business.


Apache Griffin offers a set of well-defined data quality domain models that cover most common data quality problems. It also defines a data quality DSL to help users specify their quality criteria. By extending the DSL, users can even implement their own features and functions in Apache Griffin.

Apache Griffin was accepted as an Apache Incubator project on December 7, 2016.

Apache Griffin handles data quality issues in three steps:

Step 1 Define Data Quality

Data scientists and analysts define their data quality requirements, such as accuracy, completeness, timeliness, and profiling.
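To make two of these dimensions concrete, here is a minimal sketch in plain Python. It is not Apache Griffin's API or DSL; the function names `completeness` and `accuracy`, the record layout, and the definitions used (non-null share of a field, and share of source records with a match in a target data set) are assumptions chosen for illustration.

```python
from typing import Any, Iterable, Mapping, Sequence

Record = Mapping[str, Any]


def completeness(records: Iterable[Record], field: str) -> float:
    """Fraction of records whose `field` is present and not None.

    Hypothetical helper for illustration; an empty input counts as complete.
    """
    rows = list(records)
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def accuracy(source: Iterable[Record],
             target: Iterable[Record],
             key_fields: Sequence[str]) -> float:
    """Fraction of source records that have a match in target on `key_fields`.

    Hypothetical helper for illustration; an empty source counts as accurate.
    """
    rows = list(source)
    if not rows:
        return 1.0
    # Build the set of key tuples seen in the target data set.
    target_keys = {tuple(row.get(f) for f in key_fields) for row in target}
    matched = sum(
        1 for row in rows
        if tuple(row.get(f) for f in key_fields) in target_keys
    )
    return matched / len(rows)
```

For example, with four source records of which one has a missing `email`, `completeness(source, "email")` would be 0.75; if only three of the four `id` values appear in the target, `accuracy(source, target, ["id"])` would also be 0.75.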

Step 2 Measure Data Quality

Source data is ingested into the Apache Griffin computing cluster, and Apache Griffin kicks off data quality measurement based on the defined data quality requirements.

Step 3 Metrics

Data quality reports are emitted as metrics to a designated destination.
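As a rough illustration of this step, the sketch below writes one metric as a JSON line to a file-like destination. It does not reflect Apache Griffin's actual metric schema or sink configuration; the function name `emit_metric` and the record fields `name`, `timestamp`, and `value` are assumptions made for the example.

```python
import io
import json
import time
from typing import Optional, TextIO


def emit_metric(name: str,
                value: float,
                sink: TextIO,
                timestamp: Optional[int] = None) -> dict:
    """Write one metric record as a JSON line to `sink` and return it.

    Hypothetical helper: the field names are illustrative, not Griffin's schema.
    """
    record = {
        "name": name,
        # Default to the current time in epoch milliseconds.
        "timestamp": int(time.time() * 1000) if timestamp is None else timestamp,
        "value": value,
    }
    sink.write(json.dumps(record, sort_keys=True) + "\n")
    return record


if __name__ == "__main__":
    # An in-memory buffer stands in for the designated destination.
    destination = io.StringIO()
    emit_metric("accuracy", 0.75, destination)
    print(destination.getvalue(), end="")
```

Any object with a `write` method can serve as the destination here, so the same function could target a log file or a buffer that is later forwarded to a metrics store.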

Additional Bonus

Apache Griffin provides a front tier that lets users easily onboard new data quality requirements onto the Apache Griffin platform and write comprehensive logic to define their data quality.

ARCHITECTURE

WHO USES Apache Griffin

COMMUNITY

Contribution

Get help using Apache Griffin or contribute to the project

Events

Learn more about Apache Griffin from conferences

Apache Griffin is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. Apache Griffin (incubating) is available under the Apache License, version 2.0.

Copyright © 2018 The Apache Software Foundation. All Rights Reserved. Apache, Apache Griffin and the Apache feather logo are trademarks of The Apache Software Foundation.