Redshift Lineage, Incubating Mode Integration, and More!
Happy Friday, DataHub Enthusiasts! We just made v0.8.18 available — let’s get you up to speed on what’s in store.
Vulnerability Alert!
We heard today about a vulnerability in log4j2; big thanks to @frsann for quickly pushing a fix. We encourage all DataHub users to update to the latest version as soon as possible.
Automatic Redshift Lineage
We now support automatically ingesting Dataset → Dataset lineage for Redshift Datasets! This includes Tables, Views, Late-binding Views, External Tables (i.e., Redshift Spectrum), and COPY statements from S3.
Check out this great walkthrough from
Metadata Service Authentication
You can now make authenticated requests to the Metadata Service APIs (GraphQL + Rest.li). Interested in a technical deep dive? Check out this post from John Joyce or give his demo a watch.
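To make the idea of an authenticated request concrete, here is a minimal sketch in Python that builds (but does not send) a GraphQL request carrying an access token. It is an illustration under stated assumptions, not the official client: the `/api/graphql` path, the `Bearer` header scheme, the `localhost:8080` address, and the helper name `build_graphql_request` are all assumptions, and the query string is only a placeholder. Check the Metadata Service docs for the exact details of your deployment.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request that carries an access token.

    Assumptions: the GraphQL endpoint lives at <base_url>/api/graphql and the
    token is passed as a Bearer token in the Authorization header.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/graphql",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_graphql_request(
    base_url="http://localhost:8080",
    token="<your-access-token>",
    query="{ placeholder }",
)
print(request.full_url)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen(request)` against a running Metadata Service.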
Incubating Ingestion Sources
This release contains two new ingestion sources. They are still incubating, but we encourage you to take them for a spin and let us know if you encounter any issues!
Apache NiFi: Now available for extracting metadata about DataJobs and DataFlows. Find the source docs here.
Mode Analytics: Use this to extract reports, charts, and more from your Mode Analytics instance. Read the source docs here.
Add New Aspects without a Fork
This is a major milestone towards No-Code UI — we’re super excited for you to start digging in! ICYMI, we demoed this upcoming functionality during the November Town Hall — watch it below!
Glossary Term Transformer
This transformer allows users to add tags or glossary terms to entities based on a regex match filter. Shoutout to @ecooklin for the first-time contribution to DataHub!
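The regex-match idea can be sketched in a few lines of plain Python. This is only an illustration of the matching logic, not the transformer's actual implementation or configuration format: the function name `terms_for_urn`, the rule dictionary, and the example URNs and glossary terms are all hypothetical.

```python
import re
from typing import Dict, List


def terms_for_urn(urn: str, rules: Dict[str, List[str]]) -> List[str]:
    """Collect glossary terms from every rule whose regex matches the entity URN."""
    matched: List[str] = []
    for pattern, terms in rules.items():
        # re.match anchors at the start of the URN, so patterns begin with ".*".
        if re.match(pattern, urn):
            for term in terms:
                if term not in matched:  # avoid duplicates across rules
                    matched.append(term)
    return matched


# Hypothetical rules: regex on the entity URN -> glossary terms to attach.
rules = {
    r".*example_db\.customers.*": ["urn:li:glossaryTerm:CustomerData"],
    r".*example_db\..*": ["urn:li:glossaryTerm:Internal"],
}

urn = "urn:li:dataset:(urn:li:dataPlatform:redshift,example_db.customers,PROD)"
print(terms_for_urn(urn, rules))
```

Here the example dataset matches both rules, so it picks up both terms; an entity whose URN matches no rule is left unchanged.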
Bug Fixes
Metadata Service
- Fix empty search query failing to resolve
- Log4j vulnerability addressed
- Improve search & recommendations performance by ~50%, homepage load by ~50%.
- Fix invalid Tag creation policy
- Fix Spring injection of Entity Client inside datahub-upgrade
Metadata Ingestion
- BigQuery: Fix handling of partitioned & snapshotted tables for lineage usage and basic table indexing.
- Recommendations: Fix issue where recently viewed and most popular recommendations were not showing up when user urn contains special characters.
- Add config to specify ca certificate path for datahub-rest sink
- Snowflake: Add handling for special characters in Snowflake databases and schemas. Map the “geo” type to NullType to prevent errors.
UI
- Fix Groups page not showing asset ownership correctly
- Fix issue where markdown links were not clickable.
- Fix issue where deletes by search could not accept an auth token
Backward Incompatible Changes
The standalone Spring GraphQL Service has been removed. (Replaced by Metadata Service GraphQL API)
Community Contributions
Congrats on first-time contributions! @adriangb @anshbansal @bartlomiejolma @robscriva @ecooklin
Big thanks for your ongoing support @arunvasudevan @aseembansal @claudio @dexter-mh-lee @EnricoMi @frsann @gabe @hsheth2 @jeffmerrick @jjoyce0510 @kevinhu @maggiehays @mayurinehate @pedro93 @rslanka @serefacet @shirshanka @swaroopjagadish @treff7es @varunbharill