Title: | 'AWS S3' Client Package |
---|---|
Description: | A simple client package for the Amazon Web Services ('AWS') Simple Storage Service ('S3') 'REST' 'API' <https://aws.amazon.com/s3/>. |
Authors: | Thomas J. Leeper [aut], Boettiger Carl [ctb], Andrew Martin [ctb], Mark Thompson [ctb], Tyler Hunt [ctb], Steven Akins [ctb], Bao Nguyen [ctb], Thierry Onkelinx [ctb], Andrii Degtiarov [ctb], Dhruv Aggarwal [ctb], Alyssa Columbus [ctb], Simon Urbanek [cre, ctb] |
Maintainer: | Simon Urbanek <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3.22 |
Built: | 2025-01-02 04:11:32 UTC |
Source: | https://github.com/cloudyr/aws.s3 |
AWS S3 Client Package
A simple client package for the Amazon Web Services (AWS) Simple Storage Service (S3) REST API.
Thomas J. Leeper <[email protected]>
Check whether a bucket exists and is accessible with the current authentication keys.
bucket_exists(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
TRUE if the bucket exists and is accessible, else FALSE.
bucketlist, get_bucket, object_exists
List buckets as a data frame
bucketlist(add_region = FALSE, ...)
bucket_list_df(add_region = FALSE, ...)
add_region |
A logical (by default |
... |
Additional arguments passed to |
bucketlist performs a GET operation on the base S3 endpoint and returns a list of all buckets owned by the authenticated sender of the request. If authentication is successful, this function provides a list of buckets available to the authenticated user. In this way, it can serve as a “hello world!” function to confirm that one's authentication credentials are working correctly.
bucket_list_df and bucketlist are identical.
A data frame of buckets. It can be empty (0 rows, 0 columns) if there are no buckets; otherwise it typically contains at least the columns Bucket and CreationDate.
Copy objects between S3 buckets
copy_object(from_object, to_object = from_object, from_bucket, to_bucket, headers = list(), ...)
copy_bucket(from_bucket, to_bucket, ...)
from_object |
A character string containing the name of the object you want to copy. |
to_object |
A character string containing the name the object should have in the new bucket. |
from_bucket |
A character string containing the name of the bucket you want to copy from. |
to_bucket |
A character string containing the name of the bucket you want to copy into. |
headers |
List of request headers for the REST call. |
... |
Additional arguments passed to |
copy_object copies an object from one bucket to another without bringing it into local memory. For copy_bucket, all objects from one bucket are copied to another (limit 1000 objects). The same keys are used in the new bucket as in the old bucket.
Something...
Deletes an S3 bucket.
delete_bucket(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
TRUE if successful, FALSE otherwise.
Deletes one or more objects from an S3 bucket.
delete_object(object, bucket, quiet = TRUE, ...)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
quiet |
A logical indicating whether (when |
... |
Additional arguments passed to |
object can be a single object key, an object of class “s3_object”, or a list of either.
TRUE if successful; otherwise an object of class aws_error with details of the failure.
Get/Put/Delete the website configuration for a bucket.
delete_website(bucket, ...)
put_website(bucket, request_body, ...)
get_website(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
request_body |
A character string containing an XML request body, as defined in the specification in the API Documentation. |
For put_website and get_website, a list containing the website configuration, if one has been set. For delete_website: TRUE if successful, FALSE otherwise. An aws_error object may be returned if the request failed.
API Documentation: PUT website
API Documentation: GET website
API Documentation: DELETE website
Get/Put acceleration settings or retrieve acceleration status of a bucket.
get_acceleration(bucket, ...)
put_acceleration(bucket, status = c("Enabled", "Suspended"), ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
status |
Character string specifying whether acceleration should be “Enabled” or “Suspended”. |
Transfer acceleration is an AWS feature that enables potentially faster file transfers to and from S3, particularly when making cross-border transfers (such as from a European client location to the ‘us-east-1’ S3 region). Acceleration must be enabled before it can be used. Once enabled, accelerate = TRUE can be passed to any aws.s3 function via s3HTTP. get_acceleration returns the acceleration status of a bucket; put_acceleration enables or suspends acceleration.
For get_acceleration: If acceleration has never been enabled or suspended, the value is NULL. Otherwise, the status is returned (either “Enabled” or “Suspended”). For put_acceleration: If acceleration has never been enabled or suspended, the value is NULL.
API Documentation: PUT Bucket accelerate
API Documentation: GET Bucket accelerate
## Not run:
# bucketlist() returns a data frame; take the first bucket name
b <- bucketlist()
get_acceleration(b[1, 1])
put_acceleration(b[1, 1], "Enabled")
get_acceleration(b[1, 1])
put_acceleration(b[1, 1], "Suspended")

## End(Not run)
Access Control Lists (ACLs) control access to buckets and objects. These functions retrieve and modify ACLs for either objects or buckets.
get_acl(object, bucket, ...)
put_acl(object, bucket, acl = NULL, headers = list(), body = NULL, ...)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
acl |
A character string indicating a “canned” access control list. By default all bucket contents and objects therein are given the ACL “private”. This can later be viewed using |
headers |
List of request headers for the REST call |
body |
A character string containing an XML-formatted ACL. |
get_acl retrieves an XML-formatted ACL for either an object (if specified) or a bucket (if specified).
For get_acl: a character string containing an XML-formatted ACL. For put_acl: if successful, TRUE.
API Reference: GET Object ACL
API Reference: PUT Object ACL
List the contents of an S3 bucket as either a list or data frame
get_bucket(bucket, prefix = NULL, delimiter = NULL, max = NULL, marker = NULL, parse_response = TRUE, ...)
get_bucket_df(bucket, prefix = NULL, delimiter = NULL, max = NULL, marker = NULL, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
prefix |
Character string that limits the response to keys that begin with the specified prefix |
delimiter |
Character string used to group keys. Read the AWS doc for more detail. |
max |
Integer indicating the maximum number of keys to return. The function will recursively access the bucket in case |
marker |
Character string that specifies the key to start with when listing objects in a bucket. Amazon S3 returns object keys in alphabetical order, starting with the key after the marker. |
parse_response |
logical, should we attempt to parse the response? |
... |
Additional arguments passed to |
From the AWS doc: “This implementation of the GET operation returns some or all (up to 1000) of the objects in a bucket. You can use the request parameters as selection criteria to return a subset of the objects in a bucket.” The max and marker arguments can be used to retrieve additional pages of results. Values from a call are stored as attributes.
get_bucket returns a list of objects in the bucket (with class “s3_bucket”), while get_bucket_df returns a data frame (the only difference is the application of the as.data.frame() method to the list of bucket contents). If max is greater than 1000, multiple API requests are executed and the attributes attached to the response object reflect only the final request.
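The marker-based paging described above can be illustrated offline. The following is a minimal sketch in base R, not the package's implementation: `list_page` is a hypothetical stand-in for a single listing request, and the key names are invented for illustration.

```r
# Pretend bucket contents: 2500 keys in sorted order (invented for illustration)
all_keys <- sort(sprintf("file-%04d.csv", 1:2500))

# Hypothetical stand-in for ONE listing request: returns at most `page_size`
# keys that sort strictly after `marker` (or from the start if marker is NULL).
list_page <- function(keys, marker = NULL, page_size = 1000L) {
  if (!is.null(marker)) keys <- keys[keys > marker]
  head(keys, page_size)
}

collected <- character(0)
marker <- NULL
n_requests <- 0L
repeat {
  page <- list_page(all_keys, marker)
  n_requests <- n_requests + 1L
  if (length(page) == 0L) break
  collected <- c(collected, page)
  marker <- page[length(page)]      # last key seen becomes the next marker
  if (length(page) < 1000L) break   # a short page means nothing is left
}

length(collected)
n_requests
```

With a real bucket, the same idea would apply by passing the last key of one `get_bucket()` result as `marker` in the next call; the details of how the package reports truncation are not shown here.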
## Not run:
# basic usage
b <- bucketlist()
get_bucket(b[1,1])
get_bucket_df(b[1,1])

# bucket names with dots
## this (default) should work:
get_bucket("this.bucket.has.dots", url_style = "path")
## this probably won't:
#get_bucket("this.bucket.has.dots", url_style = "virtual")

## End(Not run)
Get/Put/Delete the bucket access policy for a bucket.
get_bucket_policy(bucket, parse_response = TRUE, ...)
put_bucket_policy(bucket, policy, ...)
delete_bucket_policy(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
parse_response |
A logical indicating whether to return the response as is, or parse and return as a list. Default is |
... |
Additional arguments passed to |
policy |
A character string containing a bucket policy. |
Bucket policies regulate who has what access to a bucket and its contents. The header argument can be used to specify “canned” policies and put_bucket_policy can be used to specify a more complex policy. The AWS Policy Generator can be useful for creating the appropriate JSON policy structure.
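As an illustration of the JSON structure a policy string might take, the sketch below builds a single-statement policy in base R. The bucket name and the `Sid` value are hypothetical, and the statement (public read access to all objects) is only an example of the standard AWS policy layout, not a recommendation. The calls that would send it to S3 are left commented out because they require valid credentials.

```r
bucket <- "myexamplebucket"  # hypothetical bucket name

# Build a one-statement policy as a JSON string; %s is replaced by the bucket name
policy <- sprintf(
  paste0(
    '{"Version":"2012-10-17",',
    '"Statement":[{',
    '"Sid":"ExampleReadObjects",',
    '"Effect":"Allow",',
    '"Principal":"*",',
    '"Action":"s3:GetObject",',
    '"Resource":"arn:aws:s3:::%s/*"',
    '}]}'
  ),
  bucket
)
cat(policy, "\n")

## Not run (requires credentials and an existing bucket):
# put_bucket_policy(bucket, policy)
# get_bucket_policy(bucket)
```

Building the string with `sprintf` keeps the example dependency-free; a JSON library could be used instead if one is already available.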
For get_bucket_policy: A character string containing the JSON representation of the policy, if one has been set. For delete_bucket_policy and put_bucket_policy: TRUE if successful, FALSE otherwise.
API Documentation API Documentation AWS Policy Generator
Some utility functions for working with S3 objects and buckets
get_bucketname(x, ...)
## S3 method for class 'character'
get_bucketname(x, ...)
## S3 method for class 's3_bucket'
get_bucketname(x, ...)
## S3 method for class 's3_object'
get_bucketname(x, ...)
get_objectkey(x, ...)
## S3 method for class 'character'
get_objectkey(x, ...)
## S3 method for class 's3_object'
get_objectkey(x, ...)
x |
S3 object, s3:// URL or a string |
... |
Ignored. |
get_bucketname returns a character string with the name of the bucket.
get_objectkey returns a character string with the S3 key, which is the part excluding the bucket name and leading slashes.
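To make the bucket/key split concrete, here is a minimal base R sketch of how an s3:// URL could be separated into its two parts. The helper name `split_s3_uri` is hypothetical and is not part of the package; the package's own functions may handle additional input types and edge cases.

```r
# Hypothetical helper (not part of aws.s3): split "s3://bucket/key" into parts
split_s3_uri <- function(uri) {
  stopifnot(is.character(uri), length(uri) == 1L)
  path <- sub("^s3://", "", uri)       # drop the scheme
  bucket <- sub("/.*$", "", path)      # everything before the first slash
  key <- if (grepl("/", path, fixed = TRUE)) {
    sub("^[^/]*/+", "", path)          # everything after the bucket and slashes
  } else {
    ""                                 # no key part present
  }
  list(bucket = bucket, key = key)
}

parts <- split_s3_uri("s3://myexamplebucket/folder/mtcars.csv")
parts$bucket
parts$key
```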
Get/Put/Delete the cross origin resource sharing configuration information for a bucket.
get_cors(bucket, ...)
put_cors(bucket, ...)
delete_cors(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
For get_cors: A list with CORS configuration and rules. For delete_cors: TRUE if successful, FALSE otherwise.
API Documentation: PUT cors
API Documentation: GET cors
API Documentation: DELETE cors
Get/Put/Delete bucket-level encryption settings.
get_encryption(bucket, ...)
put_encryption(bucket, algorithm = c("AES256", "KMS"), kms_arn = NULL, ...)
delete_encryption(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
algorithm |
A character string specifying whether to use “AES256” or “KMS” encryption. |
kms_arn |
If |
get_encryption returns the default encryption of a bucket; put_encryption sets the default encryption. delete_encryption deletes the encryption status.
For get_encryption: if encryption has never been set, the value is NULL. Otherwise, the encryption type is returned as a character string. For put_encryption or delete_encryption: a logical TRUE.
API Documentation API Documentation API Documentation
## Not run:
# example bucket
put_bucket("mybucket")

# set and check encryption
put_encryption("mybucket", "AES256")
get_encryption("mybucket")

# delete encryption
delete_encryption("mybucket")

## End(Not run)
Get/Put/Delete the lifecycle configuration information for a bucket.
get_lifecycle(bucket, ...)
put_lifecycle(bucket, request_body, ...)
delete_lifecycle(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
request_body |
A character string containing an XML request body, as defined in the specification in the API Documentation. |
For get_lifecycle: a list with lifecycle configuration, if it has been configured. For delete_lifecycle: TRUE if successful, FALSE otherwise.
API Documentation: PUT lifecycle
API Documentation: GET lifecycle
API Documentation: DELETE lifecycle
Get the AWS region location of a bucket.
get_location(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
A character string containing the region, if one has been set.
Get/put the notification configuration for a bucket.
get_notification(bucket, ...)
put_notification(bucket, request_body, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
request_body |
A character string containing an XML request body, as defined in the specification in the API Documentation. |
A list containing the notification configuration, if one has been set.
API Documentation: GET
API Documentation: PUT
Retrieve an object from an S3 bucket. To check if an object exists, see head_object.
get_object(object, bucket, headers = list(), parse_response = FALSE, as = "raw", ...)
save_object(object, bucket, file = basename(object), headers = list(), overwrite = TRUE, ...)
select_object(object, bucket, request_body, headers = list(), parse_response = FALSE, ...)
s3connection(object, bucket, headers = list(), ...)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
headers |
List of request headers for the REST call. |
parse_response |
Passed through to |
as |
Passed through to |
... |
Additional arguments passed to |
file |
An R connection, or file name specifying the local file to save the object into. |
overwrite |
A logical indicating whether to overwrite |
request_body |
For |
get_object retrieves an object into memory as a raw vector. This page describes get_object and several wrappers that provide additional useful functionality.
save_object saves an object to a local file without bringing it into memory.
s3connection provides a connection interface to an S3 object.
select_object uses the SELECT API to select part of a CSV or JSON object. This requires a fairly tedious request body, which users will have to construct themselves according to the documentation.
Some users may find the raw vector response format of get_object unfamiliar. The object will also carry attributes, including “content-type”, which may be useful for deciding how to subsequently process the vector. Two common strategies are as follows. For text content types, running rawToChar may be the most useful first step to make the response human-readable. Alternatively, converting the raw vector into a connection using rawConnection may also be useful, as that can often then be passed to parsing functions just like a file connection would be.
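The two strategies for handling a raw vector can be tried offline. In the minimal sketch below, the raw vector is created locally with charToRaw as a stand-in for what get_object would return for a small CSV object; the CSV content is invented for illustration, and a real response would also carry attributes such as “content-type”.

```r
# Stand-in for a get_object() result: the bytes of a small CSV file
x <- charToRaw("mpg,cyl\n21,6\n22.8,4\n")

# Strategy 1: convert the raw bytes to a character string, then parse the text
txt <- rawToChar(x)
df <- utils::read.csv(text = txt)

# Strategy 2: wrap the raw vector in a connection and read it like a file
con <- rawConnection(x)
lines <- readLines(con)
close(con)

df
lines
```

Converting to text first is convenient for small text objects, while the connection approach avoids building one large string and works with functions that expect a connection.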
Higher-level functions
If file = NULL, a raw object. Otherwise, a character string containing the file name that the object is saved to.
API Documentation: GET Object
API Documentation: GET Object torrent
API Documentation: SELECT Object
get_bucket, object_exists, head_object, put_object, delete_object
## Not run:
# get an object in memory
## create bucket (put_bucket() returns TRUE, so keep the name separately)
b <- "myexamplebucket"
put_bucket(b)

## save a dataset to the bucket
s3save(mtcars, bucket = b, object = "mtcars")
obj <- get_bucket(b)
## get the object in memory
x <- get_object(obj[[1]])
load(rawConnection(x))
"mtcars" %in% ls()

# save an object locally
y <- save_object(obj[[1]], file = obj[[1]][["Key"]])
y %in% dir()

# return object using 'S3 URI' syntax, with progress bar
get_object("s3://myexamplebucket/mtcars", show_progress = TRUE)

# return parts of an object
## use 'Range' header to specify bytes
get_object(object = obj[[1]], headers = list('Range' = 'bytes=1-120'))

# example of streaming connection
## setup an object in the bucket
s3write_using(mtcars, bucket = b, object = "mtcars.csv", FUN = utils::write.csv)

## setup the connection
con <- s3connection("mtcars.csv", bucket = b)

## line-by-line read
while(length(x <- readLines(con, n = 1L))) {
  print(x)
}

## use data.table::fread without saving object to file
library(data.table)
s3write_using(as.data.table(mtcars), bucket = b, object = "mtcars2.csv", FUN = data.table::fwrite)
fread(get_object("mtcars2.csv", bucket = b, as = "text"))

## cleanup
close(con)
delete_bucket("myexamplebucket")

## End(Not run)
Get/Put/Delete the replication configuration for a bucket.
get_replication(bucket, ...)
put_replication(bucket, request_body, ...)
delete_replication(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
request_body |
A character string containing an XML request body, as defined in the specification in the API Documentation. |
get_replication gets the current replication policy. delete_replication deletes the replication policy for a bucket.
For get_replication: A list containing the replication configuration, if one has been set. For delete_replication: TRUE if successful, FALSE otherwise.
API Documentation: PUT replication
API Documentation: GET replication
API Documentation: DELETE replication
Get/Put the requestPayment subresource for a bucket.
get_requestpayment(bucket, ...)
put_requestpayment(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
A list containing the requestPayment information, if set.
Get/Put/Delete the tag set for a bucket.
get_tagging(bucket, ...)
put_tagging(bucket, tags = list(), ...)
delete_tagging(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
tags |
A list containing key-value pairs of tag names and values. |
A list containing the tag set, if one has been set. For delete_tagging: TRUE if successful, FALSE otherwise.
API Documentation: PUT tagging
API Documentation: GET tagging
API Documentation: DELETE tagging
## Not run:
put_tagging("mybucket", tags = list(foo = "1", bar = "2"))
get_tagging("mybucket")
delete_tagging("mybucket")

## End(Not run)
Retrieves a Bencoded dictionary (BitTorrent) for an object from an S3 bucket.
get_torrent(object, bucket, ...)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
Something.
Get a list of multipart uploads for a bucket.
get_uploads(bucket, ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
A list containing the multipart upload information.
Get/Put versioning settings or retrieve versions of bucket objects.
get_versions(bucket, ...)
get_versioning(bucket, ...)
put_versioning(bucket, status = c("Enabled", "Suspended"), ...)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
status |
Character string specifying whether versioning should be “Enabled” or “Suspended”. |
get_versioning returns the versioning status of a bucket; put_versioning sets the versioning status. get_versions returns information about bucket versions.
For get_versioning: If versioning has never been enabled or suspended, the value is NULL. Otherwise, the status is returned (either “Enabled” or “Suspended”). For put_versioning: If versioning has never been enabled or suspended, the value is NULL. Otherwise, the status is returned (either “Enabled” or “Suspended”).
For get_versions: A list.
API Documentation API Documentation API Documentation
## Not run:
put_versioning("mybucket")
get_versioning("mybucket")
get_versions("mybucket")

## End(Not run)
These functions are deprecated.
getobject(...)
saveobject(...)
headobject(...)
copyobject(...)
copybucket(...)
putbucket(...)
putobject(...)
deleteobject(...)
getbucket(...)
deletebucket(...)
bucketexists(...)
... |
Arguments passed to updated versions of each function. |
Check if an object from an S3 bucket exists. To retrieve the object, see get_object.
head_object(object, bucket, ...)
object_exists(object, bucket, ...)
object_size(object, bucket, ...)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
head_object is a low-level API wrapper that checks whether an object exists by executing an HTTP HEAD request; this can be useful for checking object headers such as “content-length” or “content-type”. object_exists is sugar that returns only the logical.
object_size returns the size of the object (from the “content-length” attribute returned by head_object).
head_object returns a logical. object_exists returns TRUE if the object exists and is accessible, else FALSE. object_size returns an integer, which is NA if the request fails.
API Documentation: HEAD Object
bucket_exists, get_object, put_object, delete_object
## Not run:
# get an object in memory
## create bucket (put_bucket() returns TRUE, so keep the name separately)
b <- "myexamplebucket"
put_bucket(b)

## save a dataset to the bucket
s3save(mtcars, bucket = b, object = "mtcars")

# check that object exists
object_exists("mtcars", "myexamplebucket")
object_exists("s3://myexamplebucket/mtcars")

# get the object's size
object_size("s3://myexamplebucket/mtcars")

# get the object
get_object("s3://myexamplebucket/mtcars")

## End(Not run)
Creates a new S3 bucket.
put_bucket(
  bucket,
  region = Sys.getenv("AWS_DEFAULT_REGION"),
  acl = c("private", "public-read", "public-read-write", "aws-exec-read", "authenticated-read", "bucket-owner-read", "bucket-owner-full-control"),
  location_constraint = region,
  headers = list(),
  ...
)
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
region |
A character string containing the AWS region. If missing, defaults to value of environment variable AWS_DEFAULT_REGION. |
acl |
A character string indicating a “canned” access control list. By default all bucket contents and objects therein are given the ACL “private”. This can later be viewed using |
location_constraint |
A character string specifying a location constraint. If |
headers |
List of request headers for the REST call. |
... |
Additional arguments passed to |
Bucket policies regulate who has what access to a bucket and its contents. The header argument can be used to specify “canned” policies and put_bucket_policy can be used to specify a more complex policy. The AWS Policy Generator can be useful for creating the appropriate JSON policy structure.
TRUE if successful.
API Documentation AWS Policy Generator
bucketlist, get_bucket, delete_bucket, put_object, put_encryption, put_versioning
## Not run:
put_bucket("examplebucket")

# set a "canned" ACL to, e.g., make bucket publicly readable
put_bucket("examplebucket", headers = list(`x-amz-acl` = "public-read"))

## End(Not run)
Stores an object into an S3 bucket
put_object(
  what,
  object,
  bucket,
  multipart = FALSE,
  acl = NULL,
  file,
  headers = list(),
  verbose = getOption("verbose", FALSE),
  show_progress = getOption("verbose", FALSE),
  partsize = 1e+08,
  ...
)
put_folder(folder, bucket, ...)
what |
character vector, raw vector or a connection (see Details section for important change in 0.3.22!) |
object |
A character string containing the name the object should have in S3 (i.e., its "object key"). If missing, an attempt is made to infer it. |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
multipart |
A logical indicating whether to use multipart uploads. See http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html. If the content is smaller than |
acl |
A character string indicating a “canned” access control list. By default all bucket contents and objects therein are given the ACL “private”. This can later be viewed using |
file |
string, path to a file to store. Mutually exclusive with |
headers |
List of request headers for the REST call. If |
verbose |
A logical indicating whether to be verbose. Default is given by |
show_progress |
A logical indicating whether to show a progress bar for uploads. Default is given by |
partsize |
numeric, size of each part when using multipart upload. AWS imposes a minimum size (currently 5MB), so setting too low a value may fail. Note that it can be set to |
... |
Additional arguments passed to |
folder |
A character string containing a folder name. (A trailing slash is not required.) |
This provides a generic interface for storing objects to S3. Some convenience wrappers are provided for common tasks: e.g., s3save and s3saveRDS.
Note that S3 is a flat file store. So there is no folder hierarchy as in a traditional hard drive. However, S3 allows users to create pseudo-folders by prepending object keys with foldername/. The put_folder function is provided as a high-level convenience function for creating folders. This is not actually necessary as objects with slashes in their key will be displayed in the S3 web console as if they were in folders, but it may be useful for creating an empty directory (which is possible in the web console).
IMPORTANT: In aws.s3 versions before 0.3.22 the first positional argument was file, and put_object changed behavior depending on whether the file could be found or not. This is inherently very dangerous, since put_object would store only the filename whenever there was any problem with the input. Therefore the first argument was changed to what, which is always the content to store and now also supports connections. file is still available as a named argument and can be set instead of what; it is always interpreted as a filename, and an error is raised if the file does not exist.
When using connections in what it is preferable that they are either unopened or open in binary mode. This condition is mandatory for multipart uploads. Text connections are inherently much slower and may not deliver identical results since they mangle line endings. put_object will automatically open unopened connections and always closes the connection before returning.
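The difference between binary and text connections can be seen locally with base R. This is a sketch only: the file content is made up for illustration, and the final put_object call is shown as a comment because it assumes valid credentials and an existing bucket named "myexamplebucket".

```r
# Write a small file with CRLF line endings so the difference is visible
tmp <- tempfile(fileext = ".csv")
original <- charToRaw("a,b\r\n1,2\r\n")
writeBin(original, tmp)

# Binary connection: the bytes are delivered unchanged
con <- file(tmp, "rb")
bytes <- readBin(con, what = "raw", n = file.size(tmp))
close(con)
identical(bytes, original)

# Text connection: content arrives line by line, without the line terminators
con <- file(tmp, "rt")
lines <- readLines(con)
close(con)
lines

# Assumed usage (not run here; needs credentials and an existing bucket):
# con <- file(tmp, "rb")   # or leave it unopened: file(tmp)
# put_object(what = con, object = "data.csv", bucket = "myexamplebucket")
# put_object() closes the connection itself, so no close(con) is needed

unlink(tmp)
```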
If successful, TRUE.
put_bucket, get_object, delete_object, put_encryption
## Not run: 
library("datasets")

# write file to S3
tmp <- tempfile()
on.exit(unlink(tmp))
utils::write.csv(mtcars, file = tmp)
# put object with an upload progress bar
put_object(file = tmp, object = "mtcars.csv", bucket = "myexamplebucket", show_progress = TRUE)

# create a "folder" in a bucket (NOT required! Folders are really just 0-length files)
put_folder("example", bucket = "myexamplebucket")
## write object to the "folder"
put_object(file = tmp, object = "example/mtcars.csv", bucket = "myexamplebucket")

# write serialized, in-memory object to S3
x <- rawConnection(raw(), "w")
utils::write.csv(mtcars, x)
put_object(rawConnectionValue(x), object = "mtcars.csv", bucket = "myexamplebucketname")

# use `headers` for server-side encryption
## require appropriate bucket policy
## encryption can also be set at the bucket-level using put_encryption
put_object(file = tmp, object = "mtcars.csv", bucket = "myexamplebucket",
  headers = c('x-amz-server-side-encryption' = 'AES256'))

# alternative "S3 URI" syntax:
put_object(rawConnectionValue(x), object = "s3://myexamplebucketname/mtcars.csv")
close(x)

# read the object back from S3
read.csv(text = rawToChar(get_object(object = "s3://myexamplebucketname/mtcars.csv")))

# multi-part uploads for objects over 5MB
x <- rnorm(3e6)
saveRDS(x, tmp)
put_object(file = tmp, object = "rnorm.rds", bucket = "myexamplebucket",
  show_progress = TRUE, multipart = TRUE, partsize=1e6)
identical(x, s3readRDS("s3://myexamplebucket/rnorm.rds"))

## End(Not run)
This is the workhorse function for executing API requests for S3.
s3HTTP(
  verb = "GET",
  bucket = "",
  path = "",
  query = NULL,
  headers = list(),
  request_body = "",
  write_disk = NULL,
  write_fn = NULL,
  accelerate = FALSE,
  dualstack = FALSE,
  parse_response = TRUE,
  check_region = FALSE,
  url_style = c("path", "virtual"),
  base_url = Sys.getenv("AWS_S3_ENDPOINT", "s3.amazonaws.com"),
  verbose = getOption("verbose", FALSE),
  show_progress = getOption("verbose", FALSE),
  region = NULL,
  key = NULL,
  secret = NULL,
  session_token = NULL,
  use_https = TRUE,
  ...
)
verb |
A character string containing an HTTP verb, defaulting to “GET”. |
bucket |
A character string with the name of the bucket, or an object of class “s3_bucket”. If the latter and a region can be inferred from the bucket object attributes, then that region is used instead of |
path |
A character string with the name of the object to put in the bucket (sometimes called the object or 'key name' in the AWS documentation.) |
query |
Any query arguments, passed as a named list of key-value pairs. |
headers |
A list of request headers for the REST call. |
request_body |
A character string containing request body data. |
write_disk |
If |
write_fn |
If set to a function and |
accelerate |
A logical indicating whether to use AWS transfer acceleration, which can produce significant speed improvements for cross-country transfers. Acceleration only works with buckets that do not have dots in bucket name. |
dualstack |
A logical indicating whether to use “dual stack” requests, which can resolve to either IPv4 or IPv6. See http://docs.aws.amazon.com/AmazonS3/latest/dev/dual-stack-endpoints.html. |
parse_response |
A logical indicating whether to return the response as is, or parse and return as a list. Default is |
check_region |
A logical indicating whether to check the value of |
url_style |
A character string specifying either “path” (the default), or “virtual”-style S3 URLs. |
base_url |
A character string specifying the base hostname for the request (it is a misnomer, the actual URL is constructed from this name, region and |
verbose |
A logical indicating whether to be verbose. Default is given by |
show_progress |
A logical indicating whether to show a progress bar for downloads and uploads. Default is given by |
region |
A character string containing the AWS region. Ignored if region can be inferred from |
key |
A character string containing an AWS Access Key ID. If missing, defaults to value stored in environment variable AWS_ACCESS_KEY_ID. |
secret |
A character string containing an AWS Secret Access Key. If missing, defaults to value stored in environment variable AWS_SECRET_ACCESS_KEY. |
session_token |
Optionally, a character string containing an AWS temporary Session Token. If missing, defaults to value stored in environment variable AWS_SESSION_TOKEN. |
use_https |
Optionally, a logical indicating whether to use HTTPS requests. Default is |
... |
Additional arguments passed to an HTTP request function, such as |
This is mostly an internal function for executing API requests. In almost all cases, users do not need to access this directly.
The S3 response, or the relevant error.
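The url_style argument refers to the two general S3 addressing conventions. The sketch below illustrates them with a hypothetical helper, `build_s3_url()`, which is not part of aws.s3. It ignores regions, transfer acceleration and dual-stack endpoints, so it should not be read as the way s3HTTP assembles its request URLs.

```r
# Hypothetical helper illustrating path-style vs. virtual-hosted-style URLs.
# Regions, acceleration and dual-stack endpoints are deliberately left out.
build_s3_url <- function(bucket, path,
                         url_style = c("path", "virtual"),
                         base_url = "s3.amazonaws.com",
                         use_https = TRUE) {
  url_style <- match.arg(url_style)
  scheme <- if (use_https) "https" else "http"
  path <- sub("^/+", "", path)  # drop leading slashes from the key
  if (url_style == "path") {
    # bucket is the first path component
    sprintf("%s://%s/%s/%s", scheme, base_url, bucket, path)
  } else {
    # bucket becomes part of the hostname
    sprintf("%s://%s.%s/%s", scheme, bucket, base_url, path)
  }
}

path_url <- build_s3_url("myexamplebucket", "mtcars.csv")
virtual_url <- build_s3_url("myexamplebucket", "mtcars.csv", url_style = "virtual")
path_url
virtual_url
```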
Save/load R object(s) to/from S3
s3save(..., object, bucket, envir = parent.frame(), opts = NULL)

s3save_image(object, bucket, opts = NULL)

s3load(object, bucket, envir = parent.frame(), ...)
... |
For |
object |
For |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
envir |
For |
opts |
Additional arguments passed to |
For s3save, a logical, invisibly. For s3load, NULL invisibly.
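The envir argument behaves like its counterpart in base R's save and load. The following local sketch uses a temporary file in place of a bucket; it assumes, but does not confirm, that s3save and s3load follow the same semantics once the file has been transferred.

```r
# Local analogue using base R save()/load() and a temporary file
tmp <- tempfile(fileext = ".Rdata")
a <- 1:3
b <- "text"
save(a, b, file = tmp)            # compare: s3save(a, b, object = ..., bucket = ...)

# Load into a separate environment so the global workspace is not overwritten
e <- new.env()
loaded <- load(tmp, envir = e)    # compare: s3load(object = ..., bucket = ..., envir = e)
sort(ls(e))
identical(e$a, a)

unlink(tmp)
```

Loading into a fresh environment, as in the package example, avoids silently replacing objects of the same name in the calling environment.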
## Not run: 
# create bucket
b <- put_bucket("myexamplebucket")

# save a dataset to the bucket
s3save(mtcars, iris, object = "somedata.Rdata", bucket = b)
get_bucket(b)

# load the data from bucket
e <- new.env()
s3load(object = "somedata.Rdata", bucket = b, envir = e)
ls(e)

# cleanup
rm(e)
delete_object(object = "somedata.Rdata", bucket = "myexamplebucket")
delete_bucket("myexamplebucket")

## End(Not run)
Serialization interface to read/write R objects to S3
s3saveRDS(
  x,
  object = paste0(as.character(substitute(x)), ".rds"),
  bucket,
  compress = TRUE,
  ...
)

s3readRDS(object, bucket, ...)
x |
For |
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
compress |
A logical. See |
... |
Additional arguments passed to |
Note that early versions of s3saveRDS from aws.s3 <= 0.2.4 unintentionally serialized objects to big endian format (due to defaults in serialize). This can create problems when attempting to read these files using readRDS. The function attempts to catch the issue and read accordingly, but may fail. The solution used internally is unserialize(memDecompress(get_object(), "gzip")).
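The fallback expression quoted above can be tried locally. In this sketch the raw vector `payload` is a made-up stand-in for the bytes that get_object would return; it is produced in memory with base R, so no bucket is involved.

```r
x <- list(n = 1:5, label = "example")

# Serialize to a raw vector and compress it in memory;
# `payload` stands in for the raw vector returned by get_object()
payload <- memCompress(serialize(x, connection = NULL), type = "gzip")

# The fallback described above: decompress, then unserialize
y <- unserialize(memDecompress(payload, type = "gzip"))
identical(x, y)
```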
For s3saveRDS, a logical. For s3readRDS, an R object.
Steven Akins <[email protected]>
## Not run: 
# create bucket
b <- put_bucket("myexamplebucket")

# save a single object to s3
s3saveRDS(x = mtcars, bucket = "myexamplebucket", object = "mtcars.rds")

# restore it under a different name
mtcars2 <- s3readRDS(object = "mtcars.rds", bucket = "myexamplebucket")
identical(mtcars, mtcars2)

# cleanup
delete_object(object = "mtcars.rds", bucket = "myexamplebucket")
delete_bucket("myexamplebucket")

## End(Not run)
Source R code (a la source) from S3
s3source(object, bucket, ..., opts = NULL)
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
... |
Additional arguments passed to |
opts |
Additional arguments passed to |
See source
## Not run: 
# create bucket
b <- put_bucket("myexamplebucket")

# save some code to the bucket
cat("x <- 'hello world!'\nx", file = "example.R")
# use the named `file` argument: the first positional argument is content, not a path
put_object(file = "example.R", object = "example.R", bucket = b)
get_bucket(b)

# source the code from the bucket
s3source(object = "example.R", bucket = b, echo = TRUE)

# cleanup
unlink("example.R")
delete_object(object = "example.R", bucket = b)
delete_bucket("myexamplebucket")

## End(Not run)
Sync files/directories to/from S3
s3sync(
  path = ".",
  bucket,
  prefix = "",
  direction = c("upload", "download"),
  verbose = TRUE,
  create = FALSE,
  ...
)
path |
string, path to the directory to synchronize, it will be expanded as needed (NOTE: older versions had a |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
prefix |
string, if set to a non-empty string, the leading part of the objects in the bucket must have that prefix; other objects are not considered. In practice, this allows the imitation of sub-directories in the bucket, and in that case it is typically required that the trailing slash is included in the prefix. |
direction |
A character vector specifying whether to “upload” and/or “download” files. By default, |
verbose |
A logical indicating whether to be verbose (the default is |
create |
logical, if |
... |
Additional arguments passed to |
s3sync synchronizes specified files to an S3 bucket.
If the bucket does not exist, it is created (unless create=FALSE). Similarly, if local directories do not exist (corresponding to leading portions of object keys), they are created, recursively. Object keys are generated based on files and local files are named (and organized into directories) based on object keys. A slash is interpreted as a directory level.
Local objects are copied to S3 and S3 objects are copied locally. This copying is performed conditionally. Objects existing locally but not in S3 are uploaded using put_object. Objects existing in S3 but not locally are saved using save_object. If objects exist in both places, the MD5 checksum for each is compared; when identical, no copying is performed. If the checksums differ, local files are replaced with the bucket version if the local file is older, and the S3 object is replaced if the local file is newer. If checksums differ but modified times match (which seems unlikely), a warning is issued. Note that multi-part files don't have a full MD5 sum recorded in S3, so they cannot be compared and thus are always assumed to be different.
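The conditional copying rules described above can be written down as a small decision function. This is a sketch, not the package's implementation: `sync_action()` is a hypothetical helper, a missing object is represented by an NA checksum, and the multi-part special case is not modelled.

```r
# Hypothetical helper (not part of aws.s3) mirroring the rules described above.
# An NA checksum means the object does not exist on that side.
sync_action <- function(local_md5, remote_md5, local_mtime, remote_mtime) {
  if (is.na(local_md5))  return("download")    # exists only in S3
  if (is.na(remote_md5)) return("upload")      # exists only locally
  if (identical(local_md5, remote_md5)) return("skip")
  if (local_mtime > remote_mtime) return("upload")    # local file is newer
  if (local_mtime < remote_mtime) return("download")  # local file is older
  "warn"                                        # checksums differ, times match
}

# MD5 of a local file via the tools package, which ships with R
tmp <- tempfile()
writeLines("hello", tmp)
local_md5 <- unname(tools::md5sum(tmp))
unlink(tmp)

t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
sync_action(local_md5, local_md5, t0, t0)          # "skip"
sync_action(local_md5, "different", t0 + 60, t0)   # "upload"
sync_action(local_md5, "different", t0, t0 + 60)   # "download"
sync_action(NA_character_, "abc", t0, t0)          # "download"
sync_action(local_md5, "different", t0, t0)        # "warn"
```

Comparing checksums first means that files with identical content are never copied, regardless of their timestamps.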
A logical.
get_bucket, put_object, save_object
## Not run: 
put_bucket("examplebucket")

# sync all files in current directory to bucket (upload-only)
s3sync(bucket = "examplebucket", direction = "upload")

# two-way sync
s3sync(bucket = "examplebucket")

# full sync between a subset of the bucket and a test directory in user's home
# corresponding roughly to:
#   aws s3 sync ~/test s3://examplebucket/test/
#   aws s3 sync s3://examplebucket/test/ ~/test
s3sync("~/test", "examplebucket", prefix="test/", region="us-east-2")

## End(Not run)
Read/write objects from/to S3 using a custom function
s3write_using(x, FUN, ..., object, bucket, opts = NULL)

s3read_using(FUN, ..., object, bucket, opts = NULL, filename = NULL)
x |
For |
FUN |
For |
... |
Additional arguments to |
object |
Character string with the object key, or an object of class “s3_object”. In most cases, if |
bucket |
Character string with the name of the bucket, or an object of class “s3_bucket”. |
opts |
Optional additional arguments passed to |
filename |
Optional string, name of the temporary file that will be created. If not specified, |
For s3write_using, a logical, invisibly. For s3read_using, the output of FUN applied to the file from object.
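Both functions hand a temporary file to FUN. The sketch below imitates that pattern locally, with file.copy standing in for the transfer to and from S3. `write_using_local()` and `read_using_local()` are hypothetical names, and the internals of the real functions may differ.

```r
# Hypothetical local stand-ins showing the temp-file pattern.
# `store` is a local path that plays the role of the S3 object.
write_using_local <- function(x, FUN, ..., store) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  FUN(x, tmp, ...)                          # FUN writes x to a temporary file
  file.copy(tmp, store, overwrite = TRUE)   # stand-in for the upload step
}

read_using_local <- function(FUN, ..., store) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  file.copy(store, tmp)                     # stand-in for the download step
  FUN(tmp, ...)                             # FUN reads the temporary file
}

store <- tempfile(fileext = ".csv")
ok <- write_using_local(head(mtcars), utils::write.csv, store = store)
back <- read_using_local(utils::read.csv, row.names = 1, store = store)
all.equal(back$mpg, head(mtcars)$mpg)
unlink(store)
```

Any function that takes the data as its first argument and a file path as its second can serve as the writer, and any function that takes a file path as its first argument can serve as the reader.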
s3saveRDS, s3readRDS, put_object, get_object
## Not run: 
library("datasets")
# create bucket
b <- put_bucket("myexamplebucket")

# save a dataset to the bucket as a csv
if (require("utils")) {
  s3write_using(mtcars, FUN = write.csv, object = "mtcars.csv", bucket = b)
}

# load dataset from the bucket as a csv
if (require("utils")) {
  s3read_using(FUN = read.csv, object = "mtcars.csv", bucket = b)
}

# cleanup
delete_object(object = "mtcars.csv", bucket = b)
delete_bucket(bucket = b)

## End(Not run)