Amundsen Monthly Update — July 2021

Mark Grover
Published in amundsen-io, Aug 12, 2021

Summary

July highlights from the Amundsen community:

  • ML features discovery in Amundsen
  • MySQL support in Amundsen
  • Compatibility with Elasticsearch 7
  • Salesforce & Oracle extractors
  • Adding support for nested data types in Delta
  • Proposal to make programmatic descriptions editable
  • How-to: Set Up OIDC Authentication in Amundsen

All that and more details below!

Check out last month’s highlights here.

Don’t forget: Join our Slack community at slack.amundsen.io. We can’t wait to meet you!

ML Features Discovery in Amundsen

Video: ML Features Discovery in Amundsen — July 2021 community meeting

We have exciting news for all the data scientists and ML modelers out there! Allison Suarez Miranda, a Software Engineer on the Amundsen team at Lyft, joined us at our July community meeting to speak about ML features discovery in Amundsen. She covered why we’re starting with features, how the first iteration was developed, and gave a sneak peek at the future work we have in store. You can now discover features the same way you discover data in Amundsen!

Check out Allison’s presentation to hear all the details.💡

MySQL Support in Amundsen

Video: MySQL Support in Amundsen — July 2021 community meeting

Amundsen now has initial support for using MySQL as the backend metadata store. 🎉 At our July community meeting, Xuan Shen, who’s on the Data Infra team at WePay, spoke about adding MySQL support in Amundsen. This will allow you to use existing MySQL infrastructure as Amundsen’s backend without having to rely on a graph database. Xuan dove into Amundsen RDS and how MySQL works in databuilder and the metadata service.

Check out Xuan’s presentation to learn more.💡

Compatibility with Elasticsearch 7

Amundsen is now compatible with Elasticsearch 7! This change is fully backward compatible. If you haven’t upgraded to Elasticsearch 7 yet, don’t worry: Amundsen works with both Elasticsearch 6 and Elasticsearch 7. 😀

Huge shoutout to Verdan Mahmood & Mariusz Gorski! You can see more details about the implementation here: https://github.com/amundsen-io/amundsen/pull/1386

Salesforce & Oracle Extractors

Added new Salesforce & Oracle connectors to Amundsen

Salesforce, Oracle, who’s next? We’ve got two new extractors in Amundsen. 🙌

Salesforce extractor — extracts basic Salesforce object metadata into Amundsen.

See PR details and docs for more info. Props to Ben Rifkind!

Oracle extractor — extracts table and column metadata (including database, schema, table name, table description, column name, and column description) from an Oracle database into Amundsen.

See PR details and docs for more info. Props to Sávio Teles!

Adding support for nested data types in Delta

Nested Type in Amundsen

We’re adding support for nested data types in Delta! Amundsen already supports extracting, displaying, and even searching nested data for other data sources (like BigQuery). Thanks to Jack Roof at Samsara, we recently added support for extracting and displaying nested data coming from Databricks Delta.
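
To give a feel for what "nested data" means here, the sketch below flattens a Hive-style type string (the kind of `struct<...>` / `array<...>` string a Delta column might report) into dotted field paths. This is only an illustration and is not the code from the PRs: the function names `split_top_level` and `flatten_nested_type` and the sample column are hypothetical, and `map<...>` types are deliberately left unexpanded.

```python
from typing import List, Tuple


def split_top_level(s: str) -> List[str]:
    """Split on commas that are not nested inside angle brackets."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts


def flatten_nested_type(name: str, type_str: str) -> List[Tuple[str, str]]:
    """Return (dotted path, type) rows for a column and its nested fields."""
    type_str = type_str.strip()
    rows = [(name, type_str)]
    if type_str.startswith("struct<") and type_str.endswith(">"):
        inner = type_str[len("struct<"):-1]
        for field in split_top_level(inner):
            field_name, field_type = field.split(":", 1)
            rows.extend(
                flatten_nested_type(f"{name}.{field_name.strip()}", field_type)
            )
    elif type_str.startswith("array<") and type_str.endswith(">"):
        inner = type_str[len("array<"):-1]
        # Expand the element type under the same path, skipping its own row.
        rows.extend(flatten_nested_type(name, inner)[1:])
    return rows


if __name__ == "__main__":
    # Hypothetical column, for illustration only.
    sample = "struct<id:int,tags:array<string>,loc:struct<lat:double,lon:double>>"
    for path, col_type in flatten_nested_type("event", sample):
        print(f"{path}: {col_type}")
```

Each row pairs a path such as `event.loc.lat` with its type, which is roughly the shape a UI needs to render a nested column as an expandable tree.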

Check out the PR details 👉 here & here.

Proposal to make programmatic descriptions editable

Programmatic descriptions in Amundsen

We’re proposing editable programmatic descriptions.💥 Programmatic descriptions are a useful way for us to store and display business-specific metadata about tables in Amundsen. This new feature will allow us to customize machine-generated metadata.

Check out the open RFC and let us know what you think! 👀

How-to: Set Up OIDC Authentication in Amundsen

OpenID Connect + Amundsen

Verdan Mahmood wrote a step-by-step guide on how to enable OIDC for your Amundsen installation. Check it out and let him know what you think!

Announcements

📣 We’d like to welcome 3 new maintainers to Amundsen! Maintainers are hugely important to our community: they make significant contributions to the project, and adding more of them is a sign of community growth. 🌱 This means we’ll be able to develop more and support an even greater number of users. Please welcome Dmitriy Kunitskiy from Lyft, Grant Seward from Stemma, and Dominik Choma from ING as new Amundsen maintainers! Don’t hesitate to lean on them for questions and support.

📣 We’ve done a bit of Slack housekeeping over the last month. What does this mean for you? Going forward, we’ll be using:

Updated Channels

#announcements — for all announcements for the project. Only maintainers can post on this channel (renamed from #general-no-use).

#amundsen-general-questions — for any general questions you may have about the project (renamed from #amundsen).

#troubleshoot — for specific debugging questions (no changes here).

New Channels (consider joining!)

#introductions — for introducing ourselves. All new members will be added here.

#randomness — because we’re all multifaceted people and sometimes things don’t fit in an existing bucket. 😁

Coming up next…

Next community meeting

  • Date: Thursday, September 2, 9am Pacific, 12pm Eastern, 6pm Central Europe
  • Add to your calendar: https://evt.to/hsusdihw

Join us on Slack: slack.amundsen.io

Subscribe for periodic updates: Medium & Twitter

Curated with ❤ by Stemma
