@pan3793
Member

@pan3793 pan3793 commented Dec 23, 2025

What changes were proposed in this pull request?

Mark `DerbyDialect` as deprecated and update the docs accordingly.

Why are the changes needed?

https://db.apache.org/derby/

On 2025-10-10, the Derby developers voted to retire the project into a read-only state.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Review.

Was this patch authored or co-authored using generative AI tooling?

No.

@pan3793
Member Author

pan3793 commented Dec 23, 2025

BTW, Spark ships Derby jars to support the embedded HMS. Should we switch to an alternative? It seems the only option is H2.

PS: there is no action in the Hive community as of now.

cc @dongjoon-hyun @cloud-fan @LuciferYang

* The included JDBC driver version supports kerberos authentication with keytab.
* There is a built-in connection provider which supports the used database.

There is a built-in connection providers for the following databases:
Member

Please spin this off as a separate documentation PR, @pan3793.

Member Author

@dongjoon-hyun, I opened #53598 to update the list and will rebase after that is merged.

import org.apache.spark.sql.types._


@deprecated(since = "4.2.0")
Member

Although I understand the intention, it looks a little weird because this is a private case class. Maybe we need another way to warn about this.

Contributor

+1. I think users can only use JDBC dialects via the JDBC data source, so we should document this deprecation in the JDBC data source doc page.

Member Author

In addition to mentioning the deprecation in the docs, how about logging a warning when this class is loaded?
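
A rough sketch of what a warn-once approach could look like, for illustration only. It does not use Spark's actual dialect API: `SketchDialect`, `SketchDerbyDialect`, and `WarningSink` are hypothetical names, the warning text is made up, and the warning fires on the first matching Derby URL rather than at class load, which is an assumption about what would be least noisy.

```scala
import java.util.Locale
import java.util.concurrent.atomic.AtomicBoolean
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a JDBC dialect; not Spark's actual API.
trait SketchDialect {
  def canHandle(url: String): Boolean
}

// Collects warnings in memory so the sketch stays self-contained;
// real code would call a logger instead.
object WarningSink {
  val messages: ListBuffer[String] = ListBuffer.empty[String]
  def warn(msg: String): Unit = messages.synchronized { messages += msg }
}

object SketchDerbyDialect extends SketchDialect {
  // Flips to true the first time a Derby URL is matched,
  // so the warning is emitted at most once per JVM.
  private val warned = new AtomicBoolean(false)

  override def canHandle(url: String): Boolean = {
    val matches = url.toLowerCase(Locale.ROOT).startsWith("jdbc:derby")
    if (matches && warned.compareAndSet(false, true)) {
      // Assumed wording; "4.2.0" is the version used in this PR.
      WarningSink.warn("Derby JDBC support is deprecated since 4.2.0.")
    }
    matches
  }
}
```

With this shape, users who never connect to Derby would see nothing, and users who do would see a single message no matter how many connections they open.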

Contributor

If we don't plan to remove it, why emit the warning log?

Contributor

I agree with @dongjoon-hyun and @cloud-fan.

Member Author

So I will just remove this change and keep the code unchanged.


## Upgrading from Spark SQL 4.1 to 4.2

- Since Spark 4.2, support for Derby JDBC datasource is deprecated and will be removed in a future version.
Member

Shall we remove the phrase `and will be removed in a future version`? We cannot remove this until Apache Spark 5.0. According to the new rapid release plan, we will have 4.2 in 2026 with the existing release cadence, so Apache Spark 5 could be 2027.
