Cache: parallel tar and upload/unpack and download #357

@bertptrs

Cache operations can take a long time. For our builds, they in fact take up the majority of the time. Right now they work roughly as follows:

  • On restore
    • A compressed archive is downloaded to a location
    • After that is complete, the tarball is unpacked into position
  • On store
    • A compressed archive is created and written to a temporary location
    • That tempfile is then uploaded to the cache

For example, this is the code that actually does the restore:

func downloadAndUnpackKey(storage storage.Storage, metricsManager metrics.MetricsManager, key string) {
	downloadStart := time.Now()
	fmt.Printf("Downloading key '%s'...\n", key)
	compressed, err := storage.Restore(key)
	utils.Check(err)

	downloadDuration := time.Since(downloadStart)
	info, _ := os.Stat(compressed.Name())
	fmt.Printf("Download complete. Duration: %v. Size: %v bytes.\n", downloadDuration.String(), files.HumanReadableSize(info.Size()))
	publishMetrics(metricsManager, info, downloadDuration)

	unpackStart := time.Now()
	fmt.Printf("Unpacking '%s'...\n", compressed.Name())
	restorationPath, err := files.Unpack(metricsManager, compressed.Name())
	utils.Check(err)

	unpackDuration := time.Since(unpackStart)
	fmt.Printf("Unpack complete. Duration: %v.\n", unpackDuration)
	fmt.Printf("Restored: %s.\n", restorationPath)

	err = os.Remove(compressed.Name())
	if err != nil {
		fmt.Printf("Error removing %s: %v", compressed.Name(), err)
	}
}

The archives are tar files, a format that supports streaming (de)compression. Thus, it should be possible to interleave the download with the unpacking (and likewise the packing with the upload), reducing overall latency.

In the bash version this would've been as simple as piping the sftp output to tar -x, and vice versa for uploads. In Go this might be slightly trickier, but it is possible in general and would be an easy win for faster builds.
