From 2c421efab0dc43954c5518d7e490ab220c365fa0 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 14:54:49 +0200 Subject: [PATCH 01/23] wip --- content/unpack.md | 514 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 514 insertions(+) create mode 100644 content/unpack.md diff --git a/content/unpack.md b/content/unpack.md new file mode 100644 index 00000000..2c42beb3 --- /dev/null +++ b/content/unpack.md @@ -0,0 +1,514 @@ +--- +layout: sip +permalink: /sips/:title.html +stage: implementation +status: under-review +title: SIP-61 - Unroll Default Arguments for Binary Compatibility +--- + +**By: Li Haoyi** + +## History + +| Date | Version | +|---------------|--------------------| +| Feb 14th 2024 | Initial Draft | + + +## Summary + +This proposal provides a syntax to "unpack" a `case class` _type_ into a definition-site +parameter list via the `unpack` keyword, and to "unpack" a `case class` _value_ into a +call-site argument list via `*`: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(unpack config: Config) = doSomethingWith(config) +def downloadAsync(unpack config: Config, ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(unpack config: Config) = doSomethingWith(config) + +// Call with individual parameters +val data = downloadSimple("www.example.com", 1000, 10000) +val futureData = downloadAsync("www.example.com", 1000, 10000, ExecutionContext.global) +val stream = downloadStream(url = "www.example.com", connectTimeout = 1000, readTimeout = 10000) + +// Call with config object +val config = RequestConfig("www.example.com", 1000, 10000) +val data2 = downloadSimple(config*) +val futureData2 = downloadAsync(config*, ExecutionContext.global) +val stream2 = downloadStream(config*) +``` + +## Motivation + +This proposal removes a tremendous amount of boilerplate converting between data structures +and method calls in Scala. 
For example, the code snippet above without this feature would +have the pa: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(url: String, + connectTimeout: Int, + readTimeout: Int) = doSomethingWith(config) +def downloadAsync(url: String, + connectTimeout: Int, + readTimeout: Int, + ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(url: String, + connectTimeout: Int, + readTimeout: Int) = doSomethingWith(config) + +// Call with individual parameters +val data = downloadSimple("www.example.com", 1000, 10000) +val stream = downloadStream(url = "www.example.com", connectTimeout = 1000, readTimeout = 10000) + +// Call with config object +val config = RequestConfig("www.example.com", 1000, 10000) +val data = downloadSimple( + url = config.url, + connectTimeout = config.connectTimeout, + readTimeout = config.readTimeout +) +val futureData = downloadAsync( + url = config.url, + connectTimeout = config.connectTimeout, + readTimeout = config.readTimeout, + ec = ExecutionContext.global +) +val stream = downloadStream( + url = config.url, + connectTimeout = config.connectTimeout, + readTimeout = config.readTimeout +) +``` + +Apart from the huge amounts of code that are required without `unpack` keyword, at both +definition-site and call-site, there are some specific things worth noting: + +1. The "interesting" parts of the code are much harder to spot with all the boilerplate. + For example, `downloadAsync` takes an extra `ec: ExecutionContext`, while the + other two `download` methods do not. This is obvious in the `unpack` implementation, + but invisible in the boilerplate of the status-quo implementation + +2. `Call with individual parameters` is very convenient, but the verbosity + happens in the `Call with config object` use case. 
Both scenarios are + extremely common in practice: sometimes you want to call a method now, sometimes you want + to save the parameters and call the method later. + +An alternative way to write this today would be using the `RequestConfig` object as the +API to the `download` methods: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(config: RequestConfig) = doSomethingWith(config) +def downloadAsync(config: RequestConfig, ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(config: RequestConfig) = doSomethingWith(config) + +// Call with individual parameters +val data = downloadSimple(RequestConfig("www.example.com", 1000, 10000)) +val futureData = downloadAsync(RequestConfig("www.example.com", 1000, 10000), ExecutionContext.global) +val stream = downloadStream(RequestConfig(url = "www.example.com", connectTimeout = 1000, readTimeout = 10000)) + +// Call with config object +val config = RequestConfig("www.example.com", 1000, 10000) +val data = downloadSimple(config) +val futureData = downloadAsync(config, ExecutionContext.global) +val stream = downloadStream(config) +``` + +This removes one set of boilerplate from the `Call with config object` section, but +adds new boilerplate to the `Call with individual parameters` section. Although it is more +concise, this change complicates the API of `download`, requiring users to remember to import +and use the `RequestConfig` wrapper. + +Apart from the boilerplate, some things to note: + +1. The `RequestConfig` object is really just an implementation detail of `download` meant + to share parameters and args between the different `download` methods. From a user + perspective, the name is meaningless and the contents are arbitrary: someone calling + `downloadAsync` would have to pass some params inside a `RequestConfig`, some parameters + outside `RequestConfig`, with no reason why some parameters should go in one place or another + +2. 
If you want to share code between even more methods, you may end up with multiple `FooConfig` + objects that the user has to construct to call your method, possibly nested. The user would + have to import several `Config` classes and instantiate a tree-shaped data structure just to + call these methods. But this tree-structure does not model anything the user cares about, but + instead models the code-sharing relationships between the various `def download` methods + +```scala +case class RequestConfig(url: String, + timeoutConfig: TimeoutConfig) +case class TimeoutConfig(connectTimeout: Int, + readTimeout: Int) +case class AsyncConfig(retry: Boolean, ec: ExecutionContext) +def downloadSimple(config: RequestConfig) = doSomethingWith(config) +def downloadAsync(config: RequestConfig, asyncConfig: AsyncConfig) = doSomethingWith(config) +def downloadStream(config: RequestConfig, asyncConfig: AsyncConfig) = doSomethingWith(config) + +// Call with individual parameters +val data = downloadSimple(RequestConfig("www.example.com", TimeoutConfig(1000, 10000))) +val futureData = downloadAsync( + RequestConfig("www.example.com", TimeoutConfig(1000, 10000)), + AsyncConfig(true, ExecutionContext.global) +) +val stream = downloadStream( + RequestConfig( + url = "www.example.com", + timeoutConfig = TimeoutConfig(connectTimeout = 1000, readTimeout = 10000) + ), + AsyncConfig(retry = true, ec = ExecutionContext.global) +) +``` + +There are other more sophisticated ways that a library author can try to resolve this problem - +e.g. builder patterns - but the fundamental problem is unsolvable today. 
`unpack`/`*` solves +this neatly, allowing the library author to use `unpack` in their definition-site parameter lists +to share parameters between definitions, and the library user can either pass parameters +individually or unpack a configuration object via `*`, resulting in both the definition site +and the call site being boilerplate-free, even in the more involved example above: + +```scala +case class RequestConfig(url: String, + unpack timeoutConfig: TimeoutConfig) +case class TimeoutConfig(connectTimeout: Int, + readTimeout: Int) +case class AsyncConfig(retry: Boolean, ec: ExecutionContext) +def downloadSimple(unpack config: RequestConfig) = doSomethingWith(config) +def downloadAsync(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) +def downloadStream(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) + +// Call with individual parameters +val data = downloadSimple("www.example.com", 1000, 10000) +val futureData = downloadAsync( + "www.example.com", + 1000, + 10000, + true, + ExecutionContext.global +) + +val stream = downloadStream( + url = "www.example.com", + connectTimeout = 1000, + readTimeout = 10000, + retry = true, + ec = ExecutionContext.global +) + +// Call with config object +val config = RequestConfig("www.example.com", TimeoutConfig(1000, 10000)) +val asyncConfig = AsyncConfig(retry = true, ec = ExecutionContext.global) + +val data = downloadSimple(config*) +val futureData = downloadAsync(config*, asyncConfig*) +val stream = downloadStream(config*, asyncConfig*) +``` + +## Applications in the Wild + +### Requests-Scala + +One application for this is the Requests-Scala codebase, which inspired the example above. +In the real code, the list of parameters is substantially longer. `def apply` and `def stream` +share most parameters - but not all of them - and `apply` delegates to `stream` internally. 
+There is also already a `case class Request` object that encapsulates the "common" parameters +between them, which is useful if you want to save a request config to use later: + +```scala +class Requester{ + def apply( + url: String, + auth: RequestAuth = sess.auth, + params: Iterable[(String, String)] = Nil, + headers: Iterable[(String, String)] = Nil, + data: RequestBlob = RequestBlob.EmptyRequestBlob, + readTimeout: Int = sess.readTimeout, + connectTimeout: Int = sess.connectTimeout, + proxy: (String, Int) = sess.proxy, + cert: Cert = sess.cert, + sslContext: SSLContext = sess.sslContext, + cookies: Map[String, HttpCookie] = Map(), + cookieValues: Map[String, String] = Map(), + maxRedirects: Int = sess.maxRedirects, + verifySslCerts: Boolean = sess.verifySslCerts, + autoDecompress: Boolean = sess.autoDecompress, + compress: Compress = sess.compress, + keepAlive: Boolean = true, + check: Boolean = sess.check, + chunkedUpload: Boolean = sess.chunkedUpload, + ): Response = { + ... + stream( + url = url, + auth = auth, + params = params, + blobHeaders = data.headers, + headers = headers, + data = data, + readTimeout = readTimeout, + connectTimeout = connectTimeout, + proxy = proxy, + cert = cert, + sslContext = sslContext, + cookies = cookies, + cookieValues = cookieValues, + maxRedirects = maxRedirects, + verifySslCerts = verifySslCerts, + autoDecompress = autoDecompress, + compress = compress, + keepAlive = keepAlive, + check = check, + chunkedUpload = chunkedUpload, + onHeadersReceived = sh => streamHeaders = sh, + ) + ... 
+ } + def stream( + url: String, + auth: RequestAuth = sess.auth, + params: Iterable[(String, String)] = Nil, + blobHeaders: Iterable[(String, String)] = Nil, + headers: Iterable[(String, String)] = Nil, + data: RequestBlob = RequestBlob.EmptyRequestBlob, + readTimeout: Int = sess.readTimeout, + connectTimeout: Int = sess.connectTimeout, + proxy: (String, Int) = sess.proxy, + cert: Cert = sess.cert, + sslContext: SSLContext = sess.sslContext, + cookies: Map[String, HttpCookie] = Map(), + cookieValues: Map[String, String] = Map(), + maxRedirects: Int = sess.maxRedirects, + verifySslCerts: Boolean = sess.verifySslCerts, + autoDecompress: Boolean = sess.autoDecompress, + compress: Compress = sess.compress, + keepAlive: Boolean = true, + check: Boolean = true, + chunkedUpload: Boolean = false, + redirectedFrom: Option[Response] = None, + onHeadersReceived: StreamHeaders => Unit = null, + ): geny.Readable = ... + + def apply(r: Request, data: RequestBlob, chunkedUpload: Boolean): Response = + apply( + r.url, + r.auth, + r.params, + r.headers, + data, + r.readTimeout, + r.connectTimeout, + r.proxy, + r.cert, + r.sslContext, + r.cookies, + r.cookieValues, + r.maxRedirects, + r.verifySslCerts, + r.autoDecompress, + r.compress, + r.keepAlive, + r.check, + chunkedUpload, + ) + + def stream( + r: Request, + data: RequestBlob, + chunkedUpload: Boolean, + onHeadersReceived: StreamHeaders => Unit, + ): geny.Writable = + stream( + url = r.url, + auth = r.auth, + params = r.params, + blobHeaders = Seq.empty[(String, String)], + headers = r.headers, + data = data, + readTimeout = r.readTimeout, + connectTimeout = r.connectTimeout, + proxy = r.proxy, + cert = r.cert, + sslContext = r.sslContext, + cookies = r.cookies, + cookieValues = r.cookieValues, + maxRedirects = r.maxRedirects, + verifySslCerts = r.verifySslCerts, + autoDecompress = r.autoDecompress, + compress = r.compress, + keepAlive = r.keepAlive, + check = r.check, + chunkedUpload = chunkedUpload, + redirectedFrom = None, 
+ onHeadersReceived = onHeadersReceived, + ) +} + +case class Request( + url: String, + auth: RequestAuth = RequestAuth.Empty, + params: Iterable[(String, String)] = Nil, + headers: Iterable[(String, String)] = Nil, + readTimeout: Int = 0, + connectTimeout: Int = 0, + proxy: (String, Int) = null, + cert: Cert = null, + sslContext: SSLContext = null, + cookies: Map[String, HttpCookie] = Map(), + cookieValues: Map[String, String] = Map(), + maxRedirects: Int = 5, + verifySslCerts: Boolean = true, + autoDecompress: Boolean = true, + compress: Compress = Compress.None, + keepAlive: Boolean = true, + check: Boolean = true, +) +``` + +Requests-Scala is like this way because `requests.get(url = "...", data = ..., readTimeout = ...)` +is the API that users want, which can also be seen by the popularity of the upstream Python Requests +library. However, providing this call-site API requires huge amounts of boilerplate, whereas +with `unpack` it could be defined as follows: + +```scala +class Requester{ + def apply( + unpack request: Request, + chunkedUpload: Boolean = sess.chunkedUpload, + ): Response = { + ... + stream( + request*, + chunkedUpload = chunkedUpload, + onHeadersReceived = sh => streamHeaders = sh, + ) + ... + } + def stream( + unpack request: Request, + chunkedUpload: Boolean = false, + redirectedFrom: Option[Response] = None, + onHeadersReceived: StreamHeaders => Unit = null, + ): geny.Readable = ... 
+} + +case class Request( + url: String, + auth: RequestAuth = RequestAuth.Empty, + params: Iterable[(String, String)] = Nil, + headers: Iterable[(String, String)] = Nil, + readTimeout: Int = 0, + connectTimeout: Int = 0, + proxy: (String, Int) = null, + cert: Cert = null, + sslContext: SSLContext = null, + cookies: Map[String, HttpCookie] = Map(), + cookieValues: Map[String, String] = Map(), + maxRedirects: Int = 5, + verifySslCerts: Boolean = true, + autoDecompress: Boolean = true, + compress: Compress = Compress.None, + keepAlive: Boolean = true, + check: Boolean = true, +) +``` + +Things to note: +* There is a massive reduction in boilerplate from 147 lines to 40 lines, and + the code is much clearer as now the differences between `def apply`, `def stream`, + and `case class Request` are obvious at a glance + +* The `def apply(r: Request, data: RequestBlob, chunkedUpload: Boolean)` + and `def stream(r: Request, data: RequestBlob, chunkedUpload: Boolean, onHeadersReceived: StreamHeaders => Unit)` + overloads are no longer necessary. 
If someone has a `Request` object, they can simply call + `requests.get.apply(request*, ...)` or `requests.get.stream(request*, ...)` to pass it in, + without needing a dedicated overload taking a `r: Request` object as the first parameter + +## uPickle + +uPickle has a similar API, where the user can call +```scala +val s: String = upickle.default.write(value, indent = 2, sortKeys = true) + +val baos = new ByteArrayOutputStream() +upickle.default.writeToOutputStream(value, baos, indent = 2, sortKeys = true) + +val b: Array[Byte] = upickle.default.writeToByteArray(value, indent = 2, sortKeys = true) +``` + +This requires definitions such as + +```scala +trait Api { + def write[T: Writer](t: T, + indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false): String + + def writeTo[T: Writer](t: T, + out: java.io.Writer, + indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false): Unit + + def writeToOutputStream[T: Writer](t: T, + out: java.io.OutputStream, + indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false): Unit + + def writeToByteArray[T: Writer](t: T, + indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false): Array[Byte] + + def stream[T: Writer](t: T, + indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false): geny.Writable +} +``` + +With `unpack`, this could be consolidated as: + +```scala +trait Api { + case class WriteConfig(indent: Int = -1, + escapeUnicode: Boolean = false, + sortKeys: Boolean = false) + + def write[T: Writer](t: T, + unpack writeConfig: WriteConfig): String + + def writeTo[T: Writer](t: T, + out: java.io.Writer, + unpack writeConfig: WriteConfig): Unit + def writeToOutputStream[T: Writer](t: T, + out: java.io.OutputStream, + unpack writeConfig: WriteConfig): Unit + + def writeToByteArray[T: Writer](t: T, + unpack writeConfig: WriteConfig): Array[Byte] + + def stream[T: Writer](t: T, + unpack writeConfig: 
WriteConfig): geny.Writable +} +``` + +## Limitations +## Alternatives +## Prior Art + +### uPickle's `@flatten` +### MainArg's `case class` embedding +### Python +### Kotlin +### Javascript \ No newline at end of file From f1e1394264e37b46ba20eb7c38b7e6cd1252ca50 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 14:54:56 +0200 Subject: [PATCH 02/23] wip --- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 2c42beb3..ad3d10f2 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -432,7 +432,7 @@ Things to note: `requests.get.apply(request*, ...)` or `requests.get.stream(request*, ...)` to pass it in, without needing a dedicated overload taking a `r: Request` object as the first parameter -## uPickle +### uPickle uPickle has a similar API, where the user can call ```scala From f66828faf012be66dbb862f1ae7a6e606ace7194 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:22:56 +0200 Subject: [PATCH 03/23] wip --- content/unpack.md | 286 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 281 insertions(+), 5 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index ad3d10f2..9139cf50 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -26,9 +26,9 @@ case class RequestConfig(url: String, connectTimeout: Int, readTimeout: Int) -def downloadSimple(unpack config: Config) = doSomethingWith(config) -def downloadAsync(unpack config: Config, ec: ExecutionContext) = doSomethingWith(config) -def downloadStream(unpack config: Config) = doSomethingWith(config) +def downloadSimple(unpack config: RequestConfig) = doSomethingWith(config) +def downloadAsync(unpack config: RequestConfig, ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(unpack config: RequestConfig) = doSomethingWith(config) // Call with individual parameters val data = downloadSimple("www.example.com", 1000, 10000) @@ -42,6 +42,11 @@ val futureData2 = 
downloadAsync(config*, ExecutionContext.global) val stream2 = downloadStream(config*) ``` +The delegation performed by `unpack` is very similar to inheritance with +`extends`, or composition with `export`. Scala has always had good ways to DRY up repetitive +member definitions, but has so far had no good way to DRY up repetitive parameter lists. +`unpack` provides the way to do so. + ## Motivation This proposal removes a tremendous amount of boilerplate converting between data structures @@ -54,8 +59,8 @@ case class RequestConfig(url: String, readTimeout: Int) def downloadSimple(url: String, - connectTimeout: Int, - readTimeout: Int) = doSomethingWith(config) + connectTimeout: Int, + readTimeout: Int) = doSomethingWith(config) def downloadAsync(url: String, connectTimeout: Int, readTimeout: Int, @@ -503,12 +508,283 @@ trait Api { } ``` +### OS-Lib + +OS-Lib has similar APIs, e.g. +```scala +os.walk(path, preOrder = false, followLinks = true) +os.walk.attrs(path, preOrder = false, followLinks = true) +os.walk.stream(path, preOrder = false, followLinks = true) +``` + +These are defined as + +```scala +object walk{ + def apply( + path: Path, + skip: Path => Boolean = _ => false, + preOrder: Boolean = true, + followLinks: Boolean = false, + maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false + ): IndexedSeq[Path] = { + stream( + path, + skip, + preOrder, + followLinks, + maxDepth, + includeTarget + ).toArray[Path].toIndexedSeq + } + def attrs( + path: Path, + skip: (Path, os.StatInfo) => Boolean = (_, _) => false, + preOrder: Boolean = true, + followLinks: Boolean = false, + maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false + ): IndexedSeq[(Path, os.StatInfo)] = { + stream + .attrs( + path, + skip, + preOrder, + followLinks, + maxDepth, + includeTarget + ) + .toArray[(Path, os.StatInfo)].toIndexedSeq + } + object stream { + def apply( + path: Path, + skip: Path => Boolean = _ => false, + preOrder: Boolean = true, + followLinks: Boolean = false, 
+ maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false + ): Generator[Path] = { + attrs( + path, + (p, _) => skip(p), + preOrder, + followLinks, + maxDepth, + includeTarget + ).map(_._1) + } + def attrs( + path: Path, + skip: (Path, os.StatInfo) => Boolean = (_, _) => false, + preOrder: Boolean = true, + followLinks: Boolean = false, + maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false + ): Generator[(Path, os.StatInfo)] + } +} +``` + +With `unpack`, this could be consolidated into + +```scala +object walk{ + case class Config(path: Path, + skip: Path => Boolean = _ => false, + preOrder: Boolean = true, + followLinks: Boolean = false, + maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false) + + def apply(unpack config: Config): IndexedSeq[Path] = { + stream(config*).toArray[Path].toIndexedSeq + } + def attrs(unpack config: Config): IndexedSeq[(Path, os.StatInfo)] = { + stream.attrs(config*) + .toArray[(Path, os.StatInfo)].toIndexedSeq + } + object stream { + def apply(unpack config: Config): Generator[Path] = { + attrs(path, (p, _) => skip(p), preOrder, followLinks, maxDepth, includeTarget).map(_._1) + } + def attrs(unpack config: Config): Generator[(Path, os.StatInfo)] = ??? + } +} +``` + +Things to note: + +1. A lot of these methods are forwarders/wrappers for each other, purely for convenience, and + `*` can be used to forward the `config` object from the wrapper to the inner method + +2. Sometimes the parameter lists are subtly different, e.g. `walk.stream.apply` and + `walk.stream.attrs` have a different type for `skip`. In such cases `unpack` cannot work + and so the forwarding has to be done manually. + +## Detailed Behavior + +`unpack` unpacks the parameter _name_, _type_, and any _default value_ into the enclosing parameter +list. 
As we saw earlier `unpack` can be performed on any parameter list: `def`s, `class` +constructors, `case class`es: + +```scala +// Definition-site Unpacking +case class RequestConfig(url: String, + unpack timeoutConfig: TimeoutConfig) +case class TimeoutConfig(connectTimeout: Int, + readTimeout: Int) +case class AsyncConfig(retry: Boolean, ec: ExecutionContext) +def downloadSimple(unpack config: RequestConfig) = doSomethingWith(config) +def downloadAsync(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) +def downloadStream(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) +``` + +### Nested and Adjacent Unpacks + +There can be multiple hops, e.g. `downloadSimple` unpacks `RequestConfig`, and `RequestConfig` +unpacks `TimeoutConfig`, and there can be multiple `unpack`s in a single parameter list as shown +in `def downloadAsync` above. + +Any names colliding during `unpack`ing should result in an error, just like if you wrote: + +```scala +def downloadSimple(foo: Int, foo: Int) = ??? +// -- [E161] Naming Error: -------------------------------------------------------- +// 1 |def downloadSimple(foo: Int, foo: Int) = ??? +// | ^^^^^^^^ +// |foo is already defined as parameter foo +// | +// |Note that overloaded methods must all be defined in the same group of toplevel definitions +// 1 error found +``` + +Similar errors should be shown for + +```scala +case class HasFoo(foo: Int) +def downloadSimple(foo: Int, unpack hasFoo: HasFoo) = ??? +``` + +Or + +```scala +case class HasFoo(foo: Int) +case class AlsoHasFoo(foo: Int) +def downloadSimple(unpack hasFoo: HasFoo, unpack alsoHasFoo: AlsoHasFoo) = ??? 
+``` + + +### Generics + +Unpacking should work for generic methods and `case class`es: + +```scala +case class Vector[T](x: T, y: T) +def magnitude[T](unpack v: Vector[T]) +magnitude(x = 5.0, y = 3.0) // 4.0: Double +magnitude(x = 5, y = 3) // 4: Int +``` + +And for generic case classes referenced in non-generic methods: + + +```scala +case class Vector[T](x: T, y: T) +def magnitudeInt(unpack v: Vector[Int]) +magnitudeInt(x = 5, y = 3) // 4: Int +``` + +### Orthogonality + +`unpack` on definitions and `*` on `case class` values are orthogonal: either can be used without +the other. We already saw how you can use `unpack` at the definition-site and just pass parameters +individuall at the call-site: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(unpack config: RequestConfig) = ??? + +val data = downloadSimple("www.example.com", 1000, 10000) +``` + +Similarly, you can define parameters individually at the definition-site and `unpack` a `case class` +with matching fields at the call-site: + +```scala +def downloadSimple(url: String, + connectTimeout: Int, + readTimeout: Int) = ??? + +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +val config = RequestConfig("www.example.com", 1000, 10000) +val data = downloadSimple(config*) +``` + +And you can `unpack` a different `case class` onto an `unpack`-ed parameter list as long +as the names of the parameters line up: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(unpack config: RequestConfig) = ??? 
+ +case class OtherConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +val config = OtherConfig("www.example.com", 1000, 10000) +val data = downloadSimple(config*) +``` + +You can also mix `unpack`-ed and individually passed arguments: + +```scala +case class AsyncConfig(retry: Boolean, ec: ExecutionContext) +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadAsync(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = ??? + +case class OtherConfig(url: String, + connectTimeout: Int, + readTimeout: Int, + retry: Boolean) + +val config = OtherConfig("www.example.com", 1000, 10000, true) +downloadAsync(config*, retry = true) +``` + +### Case Class Construction Semantics + +`unpack` may-or-may-not re-create a `case class` instance passed via `config*` to an +`unpack`ed parameter list. This is left up to the implementation. But for the vast majority +of `case class`es with non-side-effecting constructors and structural equality, whether or +not the `case class` instance is re-created is entirely invisible to the user. + + ## Limitations ## Alternatives +### Automatic Unpacking + ## Prior Art ### uPickle's `@flatten` + ### MainArg's `case class` embedding + ### Python + ### Kotlin + ### Javascript \ No newline at end of file From 84e498b08d529c38c893a34ee7c37dd1ce4ae697 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:34:48 +0200 Subject: [PATCH 04/23] wip --- content/unpack.md | 111 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 110 insertions(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 9139cf50..57f5baeb 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -779,12 +779,121 @@ not the `case class` instance is re-created is entirely invisible to the user. 
## Prior Art + +There is significant + +### Scala `extends` and `export` + +`unpack` is similar to Scala's `extends` and `export` clauses, except rather than applying +to members of a trait it applies to parameters in a parameter list. It serves a similar purpose, +and has similar ways it can be abused: e.g. too-deep chains of `unpack`-ed `case class`es are +confusing just like too-deep inheritance hierarchies with `extends`. + ### uPickle's `@flatten` +uPickle has the `@flatten` annotation which flattens out nested case classes during +JSON serialization. + +```scala +case class Outer(msg: String, @flatten inner: Inner) derives ReadWriter +case class Inner(@flatten inner2: Inner2) derives ReadWriter +case class Inner2(i: Int) derives ReadWriter + +write(Outer("abc", Inner(Inner2(7)))) // {"msg": "abc", "i": 7} +``` + +Like `unpack`, `@flatten` can be used recursively to flatten out +a multi-layer `case class` tree into a single flat JSON object, as shown above + ### MainArg's `case class` embedding +MainArgs allows you to re-use sets of command-line flags - defined by case classes - +in method `def`s that define the sub-command entrypoints of the program: + +```scala +object Main{ + @main + case class Config(@arg(short = 'f', doc = "String to print repeatedly") + foo: String, + @arg(doc = "How many times to print string") + myNum: Int = 2, + @arg(doc = "Example flag") + bool: Flag) + implicit def configParser = ParserForClass[Config] + + @main + def bar(config: Config, + @arg(name = "extra-message") + extraMessage: String) = { + println(config.foo * config.myNum + " " + config.bool.value + " " + extraMessage) + } + @main + def qux(config: Config, + n: Int) = { + println((config.foo * config.myNum + " " + config.bool.value + "\n") * n) + } + + def main(args: Array[String]): Unit = ParserForMethods(this).runOrExit(args) +} +``` +```bash +$ ./mill example.classarg bar --foo cow --extra-message "hello world" +cowcow false hello world + +$ ./mill example.classarg qux 
--foo cow --n 5 +``` + +In this example, you can see how `def bar` and `def qux` both make use of the parameters from +`case class Config`, along with their own unique parameters `extraMessage` or `n`. This +serves a similar purpose as `unpack` would serve in Scala code, de-coupling the possibly-nested +`case class` data structure from the flag parameter list exposed to users (in MainArgs, users +interacting with the program via the CLI). + ### Python +Python's [PEP-692](https://peps.python.org/pep-0692/) defines an `Unpack[_]` marker type +that can be used together with `TypedDict` classes. These work similarly to `unpack` in this +proposal, but use typed-dictionary-based implementation for compatibility with Python's +widespread use of `**kwargs` to forward parameters as a runtime dictionary. + +> ```python +> from typing import TypedDict, Unpack +> +> class Movie(TypedDict): +> name: str +> year: int +> +> def foo(**kwargs: Unpack[Movie]) -> None: ... +> ``` +> +> means that the `**kwargs` comprise two keyword arguments specified by `Movie` +> (i.e. a name keyword of type `str` and a year keyword of type `int`). This indicates +> that the function should be called as follows: +> +> ```python +> kwargs: Movie = {"name": "Life of Brian", "year": 1979} +> +> foo(**kwargs) # OK! +> foo(name="The Meaning of Life", year=1983) # OK! 
+> ``` ### Kotlin -### Javascript \ No newline at end of file +Kotlin has an open extension proposal [KEEP-8214](https://youtrack.jetbrains.com/issue/KT-8214) +to support a `dataarg` modifier on `data class`es that functions identically to `unpack` +in this proposal + +```kotlin +data class Options( + val firstParam: Int = 0, + val secondParam: String = "", + val thirdParam: Boolean = true +) + +fun foo(dataarg options: Options){} + +fun f() { + foo(secondParam = "a", thirdParam = false) + foo(1) + foo() +} +``` \ No newline at end of file From bd4cc826e6e15ed77ee89045196df02b4c3099ca Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:45:19 +0200 Subject: [PATCH 05/23] wip --- content/unpack.md | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 57f5baeb..9e66ebee 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -772,9 +772,36 @@ downloadAsync(config*, retry = true) of `case class`es with non-side-effecting constructors and structural equality, whether or not the `case class` instance is re-created is entirely invisible to the user. +### Name-Based Unpacking + +Unpacking at callsites via `*` is done by-field-name, rather than positionally. That means +that even if the field names are in different orders, it will still work + +```scala +def downloadSimple(url: String, + connectTimeout: Int, + readTimeout: Int) = ??? + +case class RequestConfig(connectTimeout: Int, // Different order! 
+ url: String, + readTimeout: Int) + +val config = RequestConfig(1000, "www.example.com", 10000) +val data = downloadSimple(config*) // OK +// Equivalent to the following, which is allowed today in Scala +val data2 = downloadSimple( + connectTimeout = config.connectTimeout, + url = config.url, + readTimeout = config.readTimeout +) +``` + +In general, we believe that most developers think of their `case class`es as defined by +the field names and types, rather than by the field ordering. So having `*` unpack `case class` +values by field name seems like it would be a lot more intuitive than relying on the parameter +order and hoping it lines up between your `case class` and the parameter list you are unpacking +it into. -## Limitations -## Alternatives ### Automatic Unpacking ## Prior Art From b7874b2054fb82b414855e3e2e75cddb98fca27c Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:45:47 +0200 Subject: [PATCH 06/23] wip --- content/unpack.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 9e66ebee..4ed0b676 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -10,9 +10,9 @@ title: SIP-61 - Unroll Default Arguments for Binary Compatibility ## History -| Date | Version | -|---------------|--------------------| -| Feb 14th 2024 | Initial Draft | +| Date | Version | +|-------------|--------------------| +| 23 Aug 2024 | Initial Draft | ## Summary From 5a24bfdb956e808515fa1e1cd7747ac150b520b6 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:47:05 +0200 Subject: [PATCH 07/23] wip --- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 4ed0b676..3dbcaf02 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -700,7 +700,7 @@ magnitude(x = 5, y = 3) // 4: Int `unpack` on definitions and `*` on `case class` values are orthogonal: either can be used without the other.
We already saw how you can use `unpack` at the definition-site and just pass parameters -individuall at the call-site: +individually at the call-site: ```scala case class RequestConfig(url: String, From 40f078b03cbe8bfcdac8b9ddc082bb4f8b277fa5 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 15:47:36 +0200 Subject: [PATCH 08/23] wip --- content/unpack.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 3dbcaf02..5d60df85 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -802,13 +802,9 @@ values by field name seems like it would be a lot more intuitive than relying on order and hoping it lines up between your `case class` and the parameter list you are unpacking it into. -### Automatic Unpacking ## Prior Art - -There is significant - ### Scala `extends` and `export` `unpack` is similar to Scala's `extends` and `export` clauses, except rather than applying From 7be3b19caed3c956bba6bf5e766f4cb28e107f42 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 16:04:55 +0200 Subject: [PATCH 09/23] wip --- content/unpack.md | 119 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 117 insertions(+), 2 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 5d60df85..410b7c03 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -218,7 +218,7 @@ val futureData = downloadAsync(config*, asyncConfig*) val stream = downloadStream(config*, asyncConfig*) ``` -## Applications in the Wild +## Applications ### Requests-Scala @@ -802,6 +802,100 @@ values by field name seems like it would be a lot more intuitive than relying on order and hoping it lines up between your `case class` and the parameter list you are unpacking it into. +### Binary Compatibility + +parameter lists using `unpack` should be externally indistinguishable from individually-defined +parameters. 
So a library should be able to take a method defining individual parameters + +```scala +def downloadSimple(url: String, + connectTimeout: Int, + readTimeout: Int) +``` + +And later, perhaps in the interest of code sharing, replace it with a method `unpack`ing a ` +case class: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(unpack config: RequestConfig) +``` + +And this should require no changes at any callsites, and should not break binary or tasty +compatibility. + + +### Default parameter values + +As can be seen from some of the other examples in this proposal, `unpack` should include +the default parameter values: + +```scala +case class RequestConfig(url: String, + connectTimeout: Int = 10000, + readTimeout: Int = 10000) + +// These two methods definitions should be equivalent +def downloadSimple(unpack config: RequestConfig) +def downloadSimple(url: String, + connectTimeout: Int = 10000, + readTimeout: Int = 10000) +``` + +Large flat parameter lists often contain default parameters, and usually the user would +want the same default parameter across all use sites. So default parameters should be maintained +when `unpack`ing a `case class` type into the enclosing parameter list. 
+ +### `@unroll` interaction + +`@unroll` annotations on the parameters of a `case class` should be preserved when unpacking +those parameters into a method `def` + +```scala +case class RequestConfig(url: String, + @unroll connectTimeout: Int = 10000, + @unroll readTimeout: Int = 10000) + +// These two methods definitions should be equivalent +def downloadSimple(unpack config: RequestConfig) +def downloadSimple(url: String, + @unroll connectTimeout: Int = 10000, + @unroll readTimeout: Int = 10000) +``` + +We expect that both `unpack` and `unroll` would be used together frequently: `unpack` to +preserve consistency between different methods in the same version, `unroll` to preserve +binary and tasty compatibility of the same method across different versions. The two goals +are orthogonal and a library author can be expected to want both at the same time, and so +`unpack` needs to preserve the semantics of `@unroll` on each individual unpacked parameter. + +### Modifier handling + +`case class` fields can have modifiers like `val`, `var`, `private`, etc. that are not allowed +in method `def`s. 
`unpack` should preserve these modifiers if the enclosing parameter list +belongs to a `class` or `case class`, and strip these modifiers if the enclosing parameter list +belongs to a method `def`: + +```scala +case class RequestConfig(var url: String, + var connectTimeout: Int = 10000, + readTimeout: Int = 10000) + +// These two method definitions should be equivalent +def downloadSimple(unpack config: RequestConfig) +def downloadSimple(url: String, + connectTimeout: Int = 10000, + readTimeout: Int = 10000) +// These two class definitions should be equivalent +class Foo(unpack config: RequestConfig) +class Foo(var url: String, + var connectTimeout: Int = 10000, + readTimeout: Int = 10000) +``` + ## Prior Art @@ -919,4 +1013,25 @@ fun f() { foo(1) foo() } -``` \ No newline at end of file +``` + +## Future Work + +### Support for tuples and named tuples + +For this initial proposal, we limit `unpack` and `*` to only work on `case class`es. This is +enough for the most painful scenarios [discussed above](#applications), and is the closest +match: a `case class` parameter list _exactly_ matches the structure of the enclosing parameter +list we are `unpack`ing the `case class` into, and unpacking values via `*` is also straightforward. +However, we could potentially expand this to allow use of `unpack` and `*` on positional and +named tuples. + +While `unpack`/`*` on `case class`es is most useful for library authors, `*` on tuples +and named tuples could be of great convenience in application code: method bodies often have +local data structures containing tuples or named tuples that get passed to method calls +as parameters, and `*` could make this a lot more convenient than having to write +`foo(tuple._1, tuple._2, tuple._3)` or similar today. + +`unpack` could also be used to unpack a named tuple into a parameter list, which would +work identically to unpacking a `case class` type except a named tuple would not have +any default param values.
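
For reference, the tuple status quo mentioned under "Support for tuples and named tuples" can be sketched in current Scala 3 roughly as follows. This is only an illustration under stated assumptions: the body of `downloadSimple` is a hypothetical stand-in, and `tupleDemo` is a made-up entry point, neither is part of the proposal. It uses only the standard library's `Function3#tupled`.

```scala
// Hypothetical stand-in for a library method; the body is illustrative only.
def downloadSimple(url: String, connectTimeout: Int, readTimeout: Int): String =
  s"$url (connect=$connectTimeout, read=$readTimeout)"

@main def tupleDemo(): Unit =
  val args = ("www.example.com", 1000, 10000)

  // Status quo, option 1: spell out every tuple element by hand.
  val a = downloadSimple(args._1, args._2, args._3)

  // Status quo, option 2: eta-expand the method into a function value,
  // then use Function3#tupled to accept the whole tuple at once.
  val f: (String, Int, Int) => String = downloadSimple
  val b = f.tupled(args)

  // Both calls pass the same arguments, so the results agree.
  assert(a == b)
  println(b)
```

Note that `tupled` is purely positional and loses parameter names and default values, which is part of why a name-aware `*` on tuples and named tuples might be worth investigating separately.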
From 3a4161092f7b692fc456d652dc1f2e6aec535d78 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 23 Aug 2025 16:07:12 +0200 Subject: [PATCH 10/23] wip --- content/unpack.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/content/unpack.md b/content/unpack.md index 410b7c03..192d7d66 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -1035,3 +1035,9 @@ as parameters, and `*` could make this a lot more convenient. than having to wri `unpack` could also be used to unpack a named tuple into a parameter list, which would work identically to unpacking a `case class` type except a named tuple would not have any default param values. + +The other way around, `unpack`ing a `case class` into a named tuple type, or a named tuple +into another named tuple could also be useful. + +All of these ideas for integrating `unpack`/`*` with tuples and named tuples should be +investigated, but for now they are beyond the scope of this proposal. From 14584814a286b3a6a14b5e9f255eef0b54e645ea Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:03:28 +0800 Subject: [PATCH 11/23] wip --- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 192d7d66..83c5c950 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -3,7 +3,7 @@ layout: sip permalink: /sips/:title.html stage: implementation status: under-review -title: SIP-61 - Unroll Default Arguments for Binary Compatibility +title: SIP-XX - Unroll Default Arguments for Binary Compatibility --- **By: Li Haoyi** From 930134019bc187cfb4e5c6b3cfd41f7bab4924c6 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:04:28 +0800 Subject: [PATCH 12/23] wip --- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 83c5c950..341cd065 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -3,7 +3,7 @@ layout: sip permalink: /sips/:title.html stage: implementation 
status: under-review -title: SIP-XX - Unroll Default Arguments for Binary Compatibility +title: SIP-XX - Unpack Case Classes into Parameter Lists and Argument List --- **By: Li Haoyi** From 6448f4cb1da6b995dc144182c8580267182b15b5 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:05:01 +0800 Subject: [PATCH 13/23] wip --- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 341cd065..f1ee897b 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -3,7 +3,7 @@ layout: sip permalink: /sips/:title.html stage: implementation status: under-review -title: SIP-XX - Unpack Case Classes into Parameter Lists and Argument List +title: SIP-XX - Unpack Case Classes into Parameter Lists and Argument Lists --- **By: Li Haoyi** From e0c1cd8c0cfc3400224ecd35c088bcebe3983976 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:39:49 +0800 Subject: [PATCH 14/23] . --- content/unpack.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index f1ee897b..63ba68b4 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -45,7 +45,8 @@ val stream2 = downloadStream(config*) The delegation performed by `unpack` is very similar to inheritance with `extends`, or composition with `export`. Scala has always had good ways to DRY up repetitive member definitions, but has so far had no good way to DRY up repetitive parameter lists. -`unpack` provides the way to do so. +`unpack` provides the way to do so, and removes the dilemma of passing things around as loose +parameters or `case class` values by making it easy to convert between them in both directions. ## Motivation From 32dec1795c156f97addbcdaa0d5ac63f48076875 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:40:59 +0800 Subject: [PATCH 15/23] . 
--- content/unpack.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 63ba68b4..ff9d4ad3 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -52,7 +52,8 @@ parameters or `case class` values by making it easy to convert between them in b This proposal removes a tremendous amount of boilerplate converting between data structures and method calls in Scala. For example, the code snippet above without this feature would -have the pa: +have the parameter list duplicated many times, and calling the methods with data from +a `RequestConfig` object with matching fields also requires lots of duplication: ```scala case class RequestConfig(url: String, From 7b1cc0781d69868360e689389c12db3c25f88425 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:57:10 +0800 Subject: [PATCH 16/23] . --- content/unpack.md | 61 +++++++++++++++++++++++++++++++---------------- 1 file changed, 41 insertions(+), 20 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index ff9d4ad3..6f359385 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -140,7 +140,7 @@ and use the `RequestConfig` wrapper. Apart from the boilerplate, some things to note: 1. The `RequestConfig` object is really just an implementation detail of `download` meant - to shared parameters and args between the different `download` methods. From a user + to share parameters and args between the different `download` methods. From a user perspective, the name is meaningless and the contents are arbitrary: someone calling `downloadAsync` would have to pass some params inside a `RequestConfig`, some parameters outside `RequestConfig`, with no reason why some parameters should go in one place or another @@ -149,7 +149,7 @@ Apart from the boilerplate, some things to note: objects that the user has to construct to call your method, possibly nested.
The user would have to import several `Config` classes and instantiate a tree-shaped data structure just to call these methods. But this tree-structure does not model anything the user cares about, but - instead models the code-sharing relationships between the various `def download` methods + instead models the internal code-sharing relationships between the various `def download` methods ```scala case class RequestConfig(url: String, @@ -157,6 +157,7 @@ case class RequestConfig(url: String, case class TimeoutConfig(connectTimeout: Int, readTimeout: Int) case class AsyncConfig(retry: Boolean, ec: ExecutionContext) + def downloadSimple(config: RequestConfig) = doSomethingWith(config) def downloadAsync(config: RequestConfig, asyncConfig: AsyncConfig) = doSomethingWith(config) def downloadStream(config: RequestConfig, asyncConfig: AsyncConfig) = doSomethingWith(config) @@ -176,12 +177,17 @@ val stream = downloadStream( ) ``` -There are other more sophisticated ways that a library author can try to resolve this problem - +Forcing the user to construct this tree-shaped `case class` data structure is an abstraction leak: +the user has to write code matching the internal implementation details and code sharing of +the `def download` methods, and construct the corresponding `case class` tree, even though they +may really only care about calling a single `downloadAsync` method. + +There are other more sophisticated ways that a library author can try to mitigate this - e.g. builder patterns - but the fundamental problem is unsolvable today. 
`unpack`/`*` solves this neatly, allowing the library author to use `unpack` in their definition-site parameter lists to share parameters between definitions, and the library user can either pass parameters individually or unpack a configuration object via `*`, resulting in both the definition site -and the call site being boilerplate-free, even in the more involved example above: +and the call site being boilerplate-free even in the more involved example below: ```scala case class RequestConfig(url: String, @@ -189,6 +195,7 @@ case class RequestConfig(url: String, case class TimeoutConfig(connectTimeout: Int, readTimeout: Int) case class AsyncConfig(retry: Boolean, ec: ExecutionContext) + def downloadSimple(unpack config: RequestConfig) = doSomethingWith(config) def downloadAsync(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) def downloadStream(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) @@ -279,6 +286,7 @@ class Requester{ ) ... } + def stream( url: String, auth: RequestAuth = sess.auth, @@ -399,6 +407,7 @@ class Requester{ ) ... 
} + def stream( unpack request: Request, chunkedUpload: Boolean = false, @@ -519,7 +528,9 @@ os.walk.attrs(path, preOrder = false, followLinks = true) os.walk.stream(path, preOrder = false, followLinks = true) ``` -These are defined as +These are defined as shown below: each version of `os.walk` has a different return type, and +so needs to be a different method, but they share many parameters and default values, and +require a lot of boilerplate forwarding these internally: ```scala object walk{ @@ -540,6 +551,7 @@ object walk{ includeTarget ).toArray[Path].toIndexedSeq } + def attrs( path: Path, skip: (Path, os.StatInfo) => Boolean = (_, _) => false, @@ -559,6 +571,7 @@ object walk{ ) .toArray[(Path, os.StatInfo)].toIndexedSeq } + object stream { def apply( path: Path, @@ -577,6 +590,7 @@ object walk{ includeTarget ).map(_._1) } + def attrs( path: Path, skip: (Path, os.StatInfo) => Boolean = (_, _) => false, @@ -593,37 +607,44 @@ With `unpack`, this could be consolidated into ```scala object walk{ - case class Config(path: Path, - skip: Path => Boolean = _ => false, - preOrder: Boolean = true, - followLinks: Boolean = false, - maxDepth: Int = Int.MaxValue, - includeTarget: Boolean = false) - - def apply(unpack config: Config): IndexedSeq[Path] = { + case class Config[SkipType](path: Path, + skip: SkipType = _ => false, + preOrder: Boolean = true, + followLinks: Boolean = false, + maxDepth: Int = Int.MaxValue, + includeTarget: Boolean = false) + + def apply(unpack config: Config[os.Path => Boolean]): IndexedSeq[Path] = { stream(config*).toArray[Path].toIndexedSeq } - def attrs(unpack config: Config): IndexedSeq[(Path, os.StatInfo)] = { + def attrs(unpack config: Config[(os.Path, os.StatInfo) => Boolean]): IndexedSeq[(Path, os.StatInfo)] = { stream.attrs(config*) .toArray[(Path, os.StatInfo)].toIndexedSeq } object stream { - def apply(unpack config: Config): Generator[Path] = { + def apply(unpack config: Config[os.Path => Boolean]): Generator[Path] = { attrs(path, 
(p, _) => skip(p), preOrder, followLinks, maxDepth, includeTarget).map(_._1) } - def attrs(unpack config: Config): Generator[(Path, os.StatInfo)] = ??? + def attrs(unpack config: Config[(os.Path, os.StatInfo) => Boolean]): Generator[(Path, os.StatInfo)] = ??? } } ``` Things to note: -1. A lot of these methods are forwarders/wrappers for each other, purely for convenience, and +1. The different `def`s can all share the same `unpack config: Config` parameter to share + the common parameters + +2. The `.attrs` method take a `Config[(os.Path, os.StatInfo) => Boolean]`, while the + `.apply` methods take a `Config[os.Path => Boolean]`, as the shared parameters have some + subtle differences accounted for by the type parameter + +3. A lot of these methods are forwarders/wrappers for each other, purely for convenience, and `*` can be used to forward the `config` object from the wrapper to the inner method -2. Sometimes the parameter lists are subtly different, e.g. `walk.stream.apply` and - `walk.stream.attrs` have a different type for `skip`. In such cases `unpack` cannot work - and so the forwarding has to be done manually. +4. Sometimes the parameter lists are subtly different, e.g. `walk.stream.apply` and + `walk.stream.attrs` have a different type for `skip`. In such cases `*` at the call-site + cannot work and so the forwarding has to be done manually. ## Detailed Behavior From cd0df3957e65576041bba6317d0ec907ca5d1216 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 13:58:14 +0800 Subject: [PATCH 17/23] . 
--- content/unpack.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 6f359385..dd7037a7 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -178,9 +178,9 @@ val stream = downloadStream( ``` Forcing the user to construct this tree-shaped `case class` data structure is an abstraction leak: -the user has to write code matching the internal implementation details and code sharing of -the `def download` methods, and construct the corresponding `case class` tree, even though they -may really only care about calling a single `downloadAsync` method. +the user has to write code matching the internal implementation details and code sharing +relationships of the `def download` methods to construct the corresponding `case class` tree, +even though they may really only care about calling a single `downloadAsync` method. There are other more sophisticated ways that a library author can try to mitigate this - e.g. builder patterns - but the fundamental problem is unsolvable today. `unpack`/`*` solves From c8cdea73458eb6e622da9f2e2cf3a028bd9c4176 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 14:03:16 +0800 Subject: [PATCH 18/23] . 
--- content/unpack.md | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index dd7037a7..74a34766 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -608,24 +608,24 @@ With `unpack`, this could be consolidated into ```scala object walk{ case class Config[SkipType](path: Path, - skip: SkipType = _ => false, + skip: SkipType => Boolean = (_: SkipType) => false, preOrder: Boolean = true, followLinks: Boolean = false, maxDepth: Int = Int.MaxValue, includeTarget: Boolean = false) - def apply(unpack config: Config[os.Path => Boolean]): IndexedSeq[Path] = { + def apply(unpack config: Config[os.Path]): IndexedSeq[Path] = { stream(config*).toArray[Path].toIndexedSeq } - def attrs(unpack config: Config[(os.Path, os.StatInfo) => Boolean]): IndexedSeq[(Path, os.StatInfo)] = { + def attrs(unpack config: Config[(os.Path, os.StatInfo)]): IndexedSeq[(Path, os.StatInfo)] = { stream.attrs(config*) .toArray[(Path, os.StatInfo)].toIndexedSeq } object stream { - def apply(unpack config: Config[os.Path => Boolean]): Generator[Path] = { + def apply(unpack config: Config[os.Path]): Generator[Path] = { attrs(path, (p, _) => skip(p), preOrder, followLinks, maxDepth, includeTarget).map(_._1) } - def attrs(unpack config: Config[(os.Path, os.StatInfo) => Boolean]): Generator[(Path, os.StatInfo)] = ??? + def attrs(unpack config: Config[(os.Path, os.StatInfo)]): Generator[(Path, os.StatInfo)] = ??? } } ``` @@ -635,8 +635,8 @@ Things to note: 1. The different `def`s can all share the same `unpack config: Config` parameter to share the common parameters -2. The `.attrs` method take a `Config[(os.Path, os.StatInfo) => Boolean]`, while the - `.apply` methods take a `Config[os.Path => Boolean]`, as the shared parameters have some +2.
The `.attrs` methods take a `Config[(os.Path, os.StatInfo)]`, while the + `.apply` methods take a `Config[os.Path]`, as the shared parameters have some subtle differences accounted for by the type parameter 3. A lot of these methods are forwarders/wrappers for each other, purely for convenience, and @@ -664,6 +664,10 @@ def downloadAsync(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) def downloadStream(unpack config: RequestConfig, unpack asyncConfig: AsyncConfig) = doSomethingWith(config) ``` +You can `unpack` a `case class` into a method `def` parameter list as we see in +the `def download` methods above, or into a `case class` parameter list as we see in +`case class RequestConfig` above. + ### Nested and Adjacent Unpacks There can be multiple hops, e.g. `downloadSimple` unpacks `RequestConfig`, and `RequestConfig` From 6711e0f06d845e4a7591eba747ad47f754f280c7 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 14:08:23 +0800 Subject: [PATCH 19/23] . --- content/unpack.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 74a34766..2585b254 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -723,6 +723,8 @@ def magnitudeInt(unpack v: Point[Int]) magnitude(x = 5, y = 3) // 4: Int ``` +This is similar to what we saw in the `os.walk` example earlier. + ### Orthogonality `unpack` on definitions and `*` on `case class` values are orthogonal: either can be used without @@ -736,9 +738,13 @@ case class RequestConfig(url: String, def downloadSimple(unpack config: RequestConfig) = ??? -val data = downloadSimple("www.example.com", 1000, 10000) +val data1 = downloadSimple("www.example.com", 1000, 10000) +val data2 = downloadSimple(url = "www.example.com", connectTimeout = 1000, readTimeout = 10000) ``` +When you `unpack` a `case class`, the resulting parameters can be passed as either positional +or named arguments.
+ Similarly, you can define parameters individually at the definition-site and `unpack` a `case class` with matching fields at the call-site @@ -773,7 +779,7 @@ val config = OtherConfig("www.example.com", 1000, 10000) val data = downloadSimple(config*) ``` -Mix `unpack`-ed and individually passed argments: +Or mix `unpack`-ed and individually passed arguments: ```scala case class AsyncConfig(retry: Boolean, ec: ExecutionContext) @@ -789,6 +795,8 @@ case class OtherConfig(url: String, retry: Boolean) val config = OtherConfig("www.example.com", 1000, 10000, true) +// `OtherConfig` matches some of the fields from `unpack config: RequestConfig` and +// `unpack asyncConfig: AsyncConfig`, and we pass the last missing `retry = true` individually downloadAsync(config*, retry = true) ``` From a943cc502a1a0aecec195feeeae1f90186ac2831 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 14:10:58 +0800 Subject: [PATCH 20/23] . --- content/unpack.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 2585b254..3bb88a5c 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -848,8 +848,8 @@ def downloadSimple(url: String, readTimeout: Int) ``` -And later, perhaps in the interest of code sharing, replace it with a method `unpack`ing a ` -case class: +And later, perhaps in the interest of code sharing, replace it with a method `unpack`ing a +`case class`: ```scala case class RequestConfig(url: String, @@ -955,7 +955,7 @@ write(Outer("abc", Inner(Inner2(7)))) // {"msg": "abc", "i": 7} ``` Like `unpack`, `@flatten` can be used recursively to flatten out -a multi-layer `case class` tree into a single flat JSON object, as shown above +a multi-layer `case class` tree into a single flat JSON object, as shown above. ### MainArg's `case class` embedding From 7aff4874c1f4c2a70b5fded12e3dc1cd9e8d7232 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sun, 24 Aug 2025 14:40:58 +0800 Subject: [PATCH 21/23] . 
--- content/unpack.md | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/content/unpack.md b/content/unpack.md index 3bb88a5c..a03f56e9 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -182,12 +182,18 @@ the user has to write code matching the internal implementation details and code relationships of the `def download` methods to construct the corresponding `case class` tree, even though they may really only care about calling a single `downloadAsync` method. -There are other more sophisticated ways that a library author can try to mitigate this - -e.g. builder patterns - but the fundamental problem is unsolvable today. `unpack`/`*` solves -this neatly, allowing the library author to use `unpack` in their definition-site parameter lists -to share parameters between definitions, and the library user can either pass parameters -individually or unpack a configuration object via `*`, resulting in both the definition site -and the call site being boilerplate-free even in the more involved example below: +There are other more sophisticated ways that a library author can try to mitigate this, +e.g. builder patterns. But fundamentally the problem is that the language feature (parameter lists) +has limitations that make people reach for user-land patterns as an alternative, at a cost in +clarity and added indirection. As a _library_ designer that makes sense as the least-bad option given the +constraints, but as a _language_ designer we should strive to just fix the broken language +feature so library designers don't need to jump through these hoops.
+ +`unpack`/`*` solves this neatly, allowing the library author to use `unpack` in their +definition-site parameter lists to share parameters between definitions, and the library +user can either pass parameters individually or unpack a configuration object via `*`, +resulting in both the definition site and the call site being boilerplate-free even in +the more involved example below: ```scala case class RequestConfig(url: String, From 9c2130ae012caf5067e2948475f5b4a5557af16e Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 26 Sep 2025 21:35:23 -0400 Subject: [PATCH 22/23] . --- content/unpack.md | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/content/unpack.md b/content/unpack.md index a03f56e9..51dfc65c 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -806,6 +806,52 @@ val config = OtherConfig("www.example.com", 1000, 10000, true) downloadAsync(config*, retry = true) ``` +Each of the above scenarios is a concrete use case for `unpack` and `*`: I expect it to be common +to use `unpack` at a definition site without `*` at the callsite, and similarly common to use `*` +at a callsite without an `unpack` at the definition site. So we should support the two keywords +being used separately or together, and not mandate that a `*` must correspond to a matching +`unpack`. + +### Binary Compatibility + +An important property of `unpack` is that it can be added or removed after the fact to an existing +method without breaking binary compatibility.
For example, I should be able to go from existing +methods with separate parameters to methods with shared parameters via `unpack` without breaking +binary compatibility: + +**Before** +```scala + +def downloadSimple(url: String, + connectTimeout: Int, + readTimeout: Int) = doSomethingWith(config) +def downloadAsync(url: String, + connectTimeout: Int, + readTimeout: Int, + ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(url: String, + connectTimeout: Int, + readTimeout: Int) = doSomethingWith(config) +``` + +**After** +```scala +case class RequestConfig(url: String, + connectTimeout: Int, + readTimeout: Int) + +def downloadSimple(unpack config: RequestConfig) = doSomethingWith(config) +def downloadAsync(unpack config: RequestConfig, ec: ExecutionContext) = doSomethingWith(config) +def downloadStream(unpack config: RequestConfig) = doSomethingWith(config) +``` + +This is important because it is an exceedingly common workflow as a library evolves: nobody +knows up front exactly how all their parameter lists will evolve over time, and so nobody will +be able to design the perfect `unpack` structure up front to decide which methods will need +to share what parameters with which other methods. A user thus needs to be able to consolidate +parameter lists into `unpack` parameters without breaking binary compatibility. + + ### Case Class Construction Semantics `unpack` may-or-may-not re-create a `case class` instance passed via `config*` to an From a7ebbb787dad60b55f8349a35fc2b551509a58e5 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 26 Sep 2025 21:35:30 -0400 Subject: [PATCH 23/23] .
--- content/unpack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/unpack.md b/content/unpack.md index 51dfc65c..f0efe272 100644 --- a/content/unpack.md +++ b/content/unpack.md @@ -12,7 +12,7 @@ title: SIP-XX - Unpack Case Classes into Parameter Lists and Argument Lists | Date | Version | |-------------|--------------------| -| 23 Aug 2024 | Initial Draft | +| 23 Aug 2025 | Initial Draft | ## Summary