AWS S3 (2) - Advanced Usage

Posted Aug 25, 2024

By KimHyungkeun

1. Amazon S3 – Moving between Storage Classes

You can transition objects between storage classes
For infrequently accessed objects, move them to Standard IA
For archive objects that you don't need fast access to, move them to Glacier or Glacier Deep Archive
Moving objects can be automated using Lifecycle Rules

2. Amazon S3 – Lifecycle Rules

Transition actions – configure objects to transition to another storage class
- Move objects to Standard IA class 60 days after creation
- Move to Glacier for archiving after 6 months
Expiration actions – configure objects to expire (delete) after some time
- Access log files can be set to delete after 365 days
- Can be used to delete old versions of files (if versioning is enabled)
- Can be used to delete incomplete Multi-Part uploads
Rules can be created for a certain prefix (example: s3://mybucket/mp3/*)
Rules can be created for certain object tags (example: Department: Finance)
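
The rules in this section can be written down as a lifecycle configuration. The following is a minimal sketch in Python that only builds the configuration as a dictionary, in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule IDs, the `logs/` prefix, the 90-day limit for old versions and the 7-day limit for incomplete uploads are assumptions made for illustration and do not come from the notes above.

```python
import json


def build_lifecycle_configuration():
    """Return a lifecycle configuration mirroring the rules described above."""
    return {
        "Rules": [
            {
                # Transition actions for objects under the mp3/ prefix.
                "ID": "mp3-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": "mp3/"},
                "Transitions": [
                    {"Days": 60, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "GLACIER"},  # about 6 months
                ],
                # Assumed values: clean up old versions and unfinished uploads.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 90},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
            {
                # Expiration action for access logs (the prefix is an assumption).
                "ID": "expire-access-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Expiration": {"Days": 365},
            },
            {
                # A rule scoped by object tag instead of prefix.
                "ID": "finance-tiering",
                "Status": "Enabled",
                "Filter": {"Tag": {"Key": "Department", "Value": "Finance"}},
                "Transitions": [{"Days": 60, "StorageClass": "STANDARD_IA"}],
            },
        ]
    }


config = build_lifecycle_configuration()
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, this dictionary could be passed as `LifecycleConfiguration=config` to `put_bucket_lifecycle_configuration` for a bucket you own.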

3. Amazon S3 Analytics – Storage Class Analysis

Helps you decide when to transition objects to the right storage class
Recommendations for Standard and Standard IA
Does NOT work for One-Zone IA or Glacier
Report is updated daily
24 to 48 hours to start seeing data analysis
Good first step to put together Lifecycle Rules (or improve them)!

4. S3 – Requester Pays

In general, bucket owners pay for all Amazon S3 storage and data transfer costs associated with their bucket
With Requester Pays buckets, the requester, instead of the bucket owner, pays the cost of the request and of the data download from the bucket
Helpful when you want to share large datasets with other accounts
The requester must be authenticated in AWS (cannot be anonymous)
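
On the requester side, a download from a Requester Pays bucket has to state explicitly that the caller accepts the charges. The sketch below assumes a boto3-style S3 client is passed in and uses the `RequestPayer="requester"` parameter of `get_object`. The bucket name, the key and the `FakeS3Client` stand-in are hypothetical and exist only so the example runs without AWS credentials.

```python
import io


def download_as_requester(s3_client, bucket, key):
    """Download an object, acknowledging that the requester pays for it."""
    response = s3_client.get_object(
        Bucket=bucket,
        Key=key,
        RequestPayer="requester",  # without this, the request is rejected
    )
    return response["Body"].read()


class FakeS3Client:
    """Stand-in for a real S3 client; records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def get_object(self, **kwargs):
        self.calls.append(kwargs)
        return {"Body": io.BytesIO(b"dataset")}


fake = FakeS3Client()
data = download_as_requester(fake, "shared-datasets", "big/file.csv")
print(data, fake.calls[0]["RequestPayer"])
```

In real use, `s3_client` would be `boto3.client("s3")` created with the requester's own credentials, which is why anonymous access cannot work here.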

5. S3 Event Notifications

S3:ObjectCreated, S3:ObjectRemoved, S3:ObjectRestore, S3:Replication…
Object name filtering possible (*.jpg)
Use case: generate thumbnails of images uploaded to S3
Can create as many "S3 events" as desired

S3 event notifications typically deliver events in seconds but can sometimes take a minute or longer
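
The thumbnail use case can be sketched as a notification configuration that fires on object creation and filters on the `.jpg` suffix. The code below only builds the dictionary in the shape boto3's `put_bucket_notification_configuration` expects. Sending the event to a Lambda function is an assumption for illustration, and the function ARN and the `Id` value are hypothetical.

```python
import json


def build_notification_configuration(lambda_arn, suffix=".jpg"):
    """Notify a Lambda function when an object with the given suffix is created."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "thumbnail-on-upload",  # hypothetical identifier
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Object name filtering: only keys ending in the suffix match.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


notification = build_notification_configuration(
    "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail"  # placeholder ARN
)
print(json.dumps(notification, indent=2))
```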

7. S3 Event Notifications with Amazon EventBridge

Advanced filtering options with JSON rules (metadata, object size, name…)
Multiple destinations – e.g. Step Functions, Kinesis Streams / Firehose…
EventBridge capabilities – archive, replay events, reliable delivery
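
As an illustration of the JSON filtering rules, the sketch below builds an EventBridge event pattern that matches newly created `.jpg` objects larger than 1 MiB in one bucket. The bucket name and the size threshold are assumptions, and the pattern is only serialized here, not registered with EventBridge.

```python
import json


def build_event_pattern(bucket, suffix=".jpg", min_size=1024 * 1024):
    """Event pattern for S3 'Object Created' events, filtered by name and size."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket]},
            "object": {
                "key": [{"suffix": suffix}],              # filter on object name
                "size": [{"numeric": [">", min_size]}],   # filter on object size
            },
        },
    }


pattern = build_event_pattern("mybucket")
pattern_json = json.dumps(pattern)
print(pattern_json)
```

The resulting JSON string is the kind of value an EventBridge rule takes as its event pattern. The rule's targets would then be the destinations listed above.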

8. S3 – Baseline Performance

Amazon S3 automatically scales to high request rates, latency 100-200 ms
Your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.
There are no limits to the number of prefixes in a bucket.
Example (object path => prefix)
- bucket/folder1/sub1/file => /folder1/sub1/
- bucket/folder1/sub2/file => /folder1/sub2/
- bucket/1/file => /1/
- bucket/2/file => /2/
If you spread reads evenly across all four prefixes, you can achieve 22,000 requests per second for GET and HEAD (4 prefixes × 5,500)
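
The arithmetic behind that figure can be checked with a short sketch. It derives the prefix from each example path above and multiplies the number of distinct prefixes by the per-prefix baseline. The helper names are made up for this example.

```python
GET_HEAD_PER_PREFIX = 5500   # GET/HEAD requests per second per prefix
WRITE_PER_PREFIX = 3500      # PUT/COPY/POST/DELETE requests per second per prefix


def prefix_of(path):
    """Map 'bucket/folder1/sub1/file' to '/folder1/sub1/'."""
    parts = path.split("/")
    middle = parts[1:-1]  # drop the bucket name and the file name
    return "/" + "/".join(middle) + "/" if middle else "/"


paths = [
    "bucket/folder1/sub1/file",
    "bucket/folder1/sub2/file",
    "bucket/1/file",
    "bucket/2/file",
]

prefixes = sorted({prefix_of(p) for p in paths})
total_reads = len(prefixes) * GET_HEAD_PER_PREFIX
total_writes = len(prefixes) * WRITE_PER_PREFIX

print(prefixes)
print(total_reads, total_writes)
```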

9. S3 Performance

Multi-Part upload
- Recommended for files > 100 MB, required for files > 5 GB
- Can help parallelize uploads (speed up transfers)
S3 Transfer Acceleration
- Increases transfer speed by transferring the file to an AWS edge location, which forwards the data to the S3 bucket in the target region
- Compatible with multi-part upload
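
To make the multi-part idea concrete, the sketch below splits a file of a given size into numbered parts that could then be uploaded in parallel. It only plans the parts and uploads nothing. The 100 MiB default part size is an assumption, while the 5 MiB minimum part size and the 10,000-part maximum are S3's documented multi-part limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_parts(total_size, part_size=100 * MIB):
    """Return a list of (part_number, offset, length) tuples covering the file."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    number = 1  # part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


parts = plan_parts(250 * MIB)
for number, offset, length in parts:
    print(number, offset, length)
```

Each tuple corresponds to one independent upload request, which is what allows the parts to be sent concurrently and retried individually.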

10. S3 Performance – S3 Byte-Range Fetches

Parallelize GETs by requesting specific byte ranges
Better resilience in case of failures (only the failed byte range needs to be retried)
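
A byte-range fetch can be sketched as follows: split the object into inclusive ranges, fetch them concurrently, and join the chunks in order. The `fetch_range` callable is a placeholder for whatever issues the real GET with a `Range: bytes=start-end` header. Here it reads from an in-memory payload so the example runs without S3.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total_size, chunk_size):
    """Split [0, total_size) into inclusive (start, end) ranges."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def range_header(start, end):
    """Format the HTTP Range header value for one chunk."""
    return f"bytes={start}-{end}"


def parallel_fetch(fetch_range, total_size, chunk_size, max_workers=4):
    """Fetch all ranges concurrently and reassemble them in order."""
    ranges = byte_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of the input ranges.
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(chunks)


# Stand-in for an S3 object; a real fetch would send range_header(start, end).
payload = bytes(range(256)) * 4


def fake_fetch(start, end):
    return payload[start:end + 1]


result = parallel_fetch(fake_fetch, len(payload), chunk_size=300)
print(byte_ranges(10, 4), range_header(0, 3), result == payload)
```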

11. S3 Select & Glacier Select

Retrieve less data using SQL by performing server-side filtering
Can filter by rows & columns (simple SQL statements)
Less network transfer, less CPU cost client-side
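
A request for server-side filtering might look like the following sketch, which builds the parameters for boto3's `select_object_content` call for a CSV object with a header row. The bucket, the key and the column names `name` and `city` are hypothetical, and no request is actually sent.

```python
def build_select_request(bucket, key, city):
    """Parameters asking S3 to return only two columns of the matching rows."""
    # Escape single quotes so the value is a valid SQL string literal.
    safe_city = city.replace("'", "''")
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # Column filter (name, city) plus row filter (WHERE clause).
        "Expression": (
            "SELECT s.name, s.city FROM S3Object s "
            f"WHERE s.city = '{safe_city}'"
        ),
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


params = build_select_request("mybucket", "data/users.csv", "Seoul")
print(params["Expression"])
```

Because the filtering runs on the S3 side, only the selected rows and columns travel over the network to the client.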

12. S3 Batch Operations

Perform bulk operations on existing S3 objects with a single request, for example:
- Modify object metadata & properties
- Copy objects between S3 buckets
- Encrypt un-encrypted objects
- Modify ACLs, tags
- Restore objects from S3 Glacier
- Invoke a Lambda function to perform a custom action on each object
A job consists of a list of objects, the action to perform, and optional parameters
S3 Batch Operations manages retries, tracks progress, sends completion notifications, and generates reports
You can use S3 Inventory to get the object list and use S3 Select to filter your objects
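
The "list of objects" part of a job can be supplied as a CSV manifest with one `bucket,key` row per object. The sketch below builds such a manifest from a list of keys and applies a local filter, which stands in for the Inventory-plus-Select step described above. The bucket name, the keys and the `keep` predicate are assumptions for illustration. Keys are URL-encoded, which is how I understand the CSV manifest format to expect them.

```python
import csv
import io
from urllib.parse import quote


def build_manifest(bucket, keys, keep=lambda key: True):
    """Return CSV manifest text with one 'bucket,key' row per selected object."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for key in keys:
        if keep(key):
            # quote() leaves '/' as-is and percent-encodes spaces and the like.
            writer.writerow([bucket, quote(key)])
    return buffer.getvalue()


keys = ["photos/a b.jpg", "logs/x.txt", "photos/c.jpg"]
manifest = build_manifest("mybucket", keys, keep=lambda k: k.startswith("photos/"))
print(manifest)
```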

Infra, AWS

This post is licensed under CC BY 4.0 by the author.
