S3 multipart upload using FileAsyncRequestBody is not limited by configuration: bufferSizeInBytes #6539

@lancezhao-ins

Describe the bug

I'm encountering an issue with the AWS SDK for Java v2 when using FileAsyncRequestBody for S3 multipart uploads. The bufferSizeInBytes configuration does not appear to be respected, so more request bodies can be in flight than the configured buffer allows, which can lead to memory issues.

In FileAsyncRequestBodySplitHelper, the atomic counter numAsyncRequestBodiesInFlight is incremented in doSendAsyncRequestBody() when a part of the multipart upload is created. When that part finishes uploading, the counter is supposed to be decremented once in startNextRequestBody(). However, I found that startNextRequestBody() is called twice when a part finishes, so the counter drops by two and two new parts are created instead of one each time a part finishes uploading.
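The effect of the double call can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the SDK's code: the class DoubleDecrementDemo, the method peakInFlight, and the numbers passed to it are hypothetical, and the model simply assumes that each completed part triggers the completion callback a fixed number of times.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of the in-flight accounting described above (hypothetical, not SDK code). */
final class DoubleDecrementDemo {

    /**
     * Simulates sending totalParts parts with a cap of maxInFlight.
     * callbacksPerPart is how many times the completion callback fires per finished part.
     * Returns the peak number of parts that were really in flight.
     */
    static int peakInFlight(int totalParts, int maxInFlight, int callbacksPerPart) {
        AtomicInteger counter = new AtomicInteger(0); // what the helper believes is in flight
        int actual = 0;                               // what is really in flight
        int started = 0;
        int peak = 0;
        while (true) {
            // Start new parts while the counter says there is room.
            while (started < totalParts && counter.get() < maxInFlight) {
                counter.incrementAndGet();
                actual++;
                started++;
                peak = Math.max(peak, actual);
            }
            if (actual == 0) {
                break; // every part has been started and has finished
            }
            // One part finishes; the callback decrements the counter once per invocation.
            actual--;
            for (int i = 0; i < callbacksPerPart; i++) {
                counter.decrementAndGet();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("single callback peak: " + peakInFlight(20, 4, 1));
        System.out.println("double callback peak: " + peakInFlight(20, 4, 2));
    }
}
```

With one callback per part the peak stays at the cap. With two callbacks per part, every completion frees two slots while only one part has really finished, so the real number of in-flight parts keeps growing past the cap.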

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The numAsyncRequestBodiesInFlight counter should be decremented only once when each part completes its upload, so that shouldSendMore() limits the number of in-flight request bodies to at most totalBufferSize / bufferPerAsyncRequestBody.
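As a worked example of that bound, the following sketch computes the cap for illustrative sizes. The class InFlightLimit and both methods are hypothetical stand-ins; the check shown is an assumption about the general shape of shouldSendMore(), not a quote of the SDK's implementation, and the 8 MiB and 32 MiB values are made up for the example.

```java
/** Hypothetical stand-in for the buffer check described in the issue (not SDK code). */
final class InFlightLimit {

    /** Assumed shape of the check: send another body only if it still fits in the total buffer. */
    static boolean shouldSendMore(int inFlight, long bufferPerAsyncRequestBody, long totalBufferSize) {
        long currentlyUsed = (long) inFlight * bufferPerAsyncRequestBody;
        return currentlyUsed + bufferPerAsyncRequestBody <= totalBufferSize;
    }

    /** The expected upper bound on in-flight request bodies. */
    static long maxInFlight(long bufferPerAsyncRequestBody, long totalBufferSize) {
        return totalBufferSize / bufferPerAsyncRequestBody;
    }

    public static void main(String[] args) {
        long perBody = 8L * 1024 * 1024;   // illustrative: 8 MiB per request body
        long total = 32L * 1024 * 1024;    // illustrative: 32 MiB total buffer
        System.out.println("cap: " + maxInFlight(perBody, total));
        System.out.println("room with 3 in flight: " + shouldSendMore(3, perBody, total));
        System.out.println("room with 4 in flight: " + shouldSendMore(4, perBody, total));
    }
}
```

Under these assumed sizes the cap is four request bodies, and the check only holds if the in-flight count passed to it is accurate.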

Current Behavior

The numAsyncRequestBodiesInFlight counter is decremented twice when each part completes its upload, so two parts are created instead of one. As a result, the actual number of in-flight request bodies can exceed totalBufferSize / bufferPerAsyncRequestBody.

Reproduction Steps

  • Use S3TransferManager to upload a large file.
  • Monitor the behavior of the atomic counter and the threads for inflight request bodies.
  • Example code:
final Upload s3Upload = s3TransferManager.upload(UploadRequest.builder()
        .putObjectRequest(putObjectRequest)
        .addTransferListener(progressListener)
        .requestBody(FileAsyncRequestBody.builder()
                .path(file.toPath())
                .chunkSizeInBytes(chunkSizeInBytes)
                .build())
        .build());

Possible Solution

Use private Set<Long> inflightRequestsStartPositions = Collections.synchronizedSet(new HashSet<>()); instead of private AtomicInteger numAsyncRequestBodiesInFlight = new AtomicInteger(0);, and add or remove each part's startPosition as the key instead of calling addAndGet/decrementAndGet.
Then use inflightRequestsStartPositions.size() instead of numAsyncRequestBodiesInFlight.get() in shouldSendMore().

This ensures that even if startNextRequestBody() is called twice with the same startPosition, the number of in-flight request bodies is still tracked correctly.
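A minimal sketch of this idea is shown here, assuming the tracking can be isolated in a small helper. The class InFlightTracker and its method names (onPartStarted, onPartCompleted, inFlight) are hypothetical and not part of the SDK; only the field name inflightRequestsStartPositions and the use of Collections.synchronizedSet come from the proposal above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: tracks in-flight parts by start position so duplicate completions are harmless. */
final class InFlightTracker {
    private final Set<Long> inflightRequestsStartPositions =
            Collections.synchronizedSet(new HashSet<>());
    private final int maxInFlight;

    InFlightTracker(int maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    /** True while another part may be started without exceeding the cap. */
    boolean shouldSendMore() {
        return inflightRequestsStartPositions.size() < maxInFlight;
    }

    /** Records that the part beginning at startPosition has been sent. */
    void onPartStarted(long startPosition) {
        inflightRequestsStartPositions.add(startPosition);
    }

    /** Returns true only for the first completion signal of a given part; repeats are no-ops. */
    boolean onPartCompleted(long startPosition) {
        return inflightRequestsStartPositions.remove(startPosition);
    }

    int inFlight() {
        return inflightRequestsStartPositions.size();
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker(2);
        tracker.onPartStarted(0L);
        tracker.onPartStarted(8L);
        tracker.onPartCompleted(0L);
        tracker.onPartCompleted(0L); // duplicate callback for the same part
        System.out.println("in flight after duplicate completion: " + tracker.inFlight());
    }
}
```

Because removing a start position that is no longer in the set changes nothing, a second completion callback for the same part does not free an extra slot. Note that in this sketch shouldSendMore() followed by onPartStarted() is not one atomic step, so a real change would presumably still need to guard that sequence against concurrent callers.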

Additional Information/Context

No response

AWS Java SDK version used

2.29.45

JDK version used

11

Operating System and version

macOS Sequoia 15.7.1

Metadata


Labels: bug, needs-triage
